YouTube Vision

Created by minbang930, a year ago
Overview

What is YouTube Vision?

YouTube Vision is an MCP server that uses the Google Gemini Vision API to analyze YouTube videos. It can describe and summarize a video, answer questions about its content, and extract its key moments.

How do I use YouTube Vision?

You can install YouTube Vision via Smithery or run it directly with npx. Either way, set your Google Gemini API key as an environment variable (GEMINI_API_KEY) and add the server to your MCP client's settings.
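
The client configuration can also be generated rather than written by hand. The sketch below is a hypothetical helper, not part of YouTube Vision itself: the function name `buildYoutubeVisionConfig` and the `McpServerEntry` type are assumptions made for illustration, and the default model simply mirrors the value shown under Server Config. It builds the `mcpServers` entry and refuses to proceed when no API key is supplied.

```typescript
// Hypothetical helper (an illustration, not shipped with YouTube Vision):
// builds the "mcpServers" entry for an MCP client's settings file.
interface McpServerEntry {
  command: string;
  args: string[];
  env: Record<string, string>;
}

function buildYoutubeVisionConfig(
  apiKey: string | undefined,
  // Assumed default: the model name used in the example Server Config.
  modelName: string = "gemini-2.0-flash",
): { mcpServers: Record<string, McpServerEntry> } {
  // Fail early instead of writing a config that cannot authenticate.
  if (!apiKey || apiKey.trim() === "") {
    throw new Error("GEMINI_API_KEY is not set");
  }
  return {
    mcpServers: {
      "youtube-vision": {
        command: "npx",
        args: ["-y", "youtube-vision"],
        env: {
          GEMINI_API_KEY: apiKey,
          GEMINI_MODEL_NAME: modelName,
        },
      },
    },
  };
}
```

Passing the result through `JSON.stringify(config, null, 2)` yields text that can be pasted into the client's settings file. Checking the key up front is a design choice for this sketch: a missing key surfaces immediately rather than at the first tool call.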

Key features of YouTube Vision

  • Analyzes YouTube videos using the Gemini Vision API.
  • Provides tools for general description, summarization, question answering, and key moment extraction.
  • Lists the available Gemini models that support content generation.
  • Lets you choose the Gemini model via an environment variable (GEMINI_MODEL_NAME).

Use cases of YouTube Vision

  1. Generating summaries of educational videos.
  2. Extracting key moments from tutorials for quick reference.
  3. Answering specific questions about video content.

FAQ about YouTube Vision

  • What is required to use YouTube Vision?

You need Node.js (version 18 or higher) and a Google Gemini API key.

  • Can I modify the code?

Yes, you can clone the repository and modify the code as needed.

  • Is there a cost associated with using the Gemini API?

The Gemini API has free and paid tiers, and their usage policies may differ, so review the terms carefully before relying on either one.

Server Config

{
  "mcpServers": {
    "youtube-vision": {
      "command": "npx",
      "args": [
        "-y",
        "youtube-vision"
      ],
      "env": {
        "GEMINI_API_KEY": "YOUR_GEMINI_API_KEY",
        "GEMINI_MODEL_NAME": "gemini-2.0-flash"
      }
    }
  }
}

Project Info

  • Created: a year ago
  • Updated: a year ago
  • Author: minbang930
