YouTube Vision

Created By
minbang930, a year ago
Overview

What is YouTube Vision?

YouTube Vision is an MCP server that uses the Google Gemini Vision API to analyze YouTube videos. It can describe and summarize a video, answer questions about it, and extract its key moments.

How to use YouTube Vision?

Install YouTube Vision via Smithery or run it with npx. In either case, provide your Google Gemini API key through the GEMINI_API_KEY environment variable and register the server in your MCP client's settings.
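As a rough sketch of the configuration step, the snippet below builds the entry that goes into an MCP client's settings, reading the key from the environment instead of hard-coding it. The helper name build_server_entry is hypothetical; the command, arguments, and variable names mirror the Server Config listed on this page.

```python
import json
import os


def build_server_entry(api_key: str, model: str = "gemini-2.0-flash") -> dict:
    """Return an mcpServers entry for YouTube Vision (hypothetical helper)."""
    if not api_key:
        raise ValueError("GEMINI_API_KEY is empty; set it before configuring the client")
    return {
        "mcpServers": {
            "youtube-vision": {
                "command": "npx",
                "args": ["-y", "youtube-vision"],
                "env": {
                    "GEMINI_API_KEY": api_key,
                    "GEMINI_MODEL_NAME": model,
                },
            }
        }
    }


if __name__ == "__main__":
    key = os.environ.get("GEMINI_API_KEY", "")
    if key:
        # Paste the printed JSON into the MCP client's settings file.
        print(json.dumps(build_server_entry(key), indent=2))
    else:
        print("Set GEMINI_API_KEY before generating the config")
```

Where the settings file lives depends on the MCP client, so the sketch only prints the JSON rather than writing it anywhere.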

Key features of YouTube Vision

  • Analyzes YouTube videos using the Gemini Vision API.
  • Provides tools for general description, summarization, question answering, and key moment extraction.
  • Lists available Gemini models supporting content generation.
  • Lets you choose the Gemini model through the GEMINI_MODEL_NAME environment variable.
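The model-listing feature can be illustrated with a small sketch against the public Gemini REST API, whose model list reports the generation methods each model supports. This is an assumption about how such a listing might be produced, not the server's actual code; the function names are hypothetical, and the fetch step needs network access and a valid key.

```python
import json
import urllib.request

# Public Gemini REST endpoint for listing models.
MODELS_URL = "https://generativelanguage.googleapis.com/v1beta/models"


def content_generation_models(payload: dict) -> list[str]:
    """Return names of models whose supportedGenerationMethods include generateContent."""
    return [
        model["name"]
        for model in payload.get("models", [])
        if "generateContent" in model.get("supportedGenerationMethods", [])
    ]


def fetch_models(api_key: str) -> dict:
    """Fetch the first page of the model list (requires network and a valid key)."""
    request = urllib.request.Request(MODELS_URL, headers={"x-goog-api-key": api_key})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

A typical call would be content_generation_models(fetch_models(key)); any of the returned names could then be a candidate value for GEMINI_MODEL_NAME, possibly without the leading "models/" prefix.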

Use cases of YouTube Vision

  1. Generating summaries of educational videos.
  2. Extracting key moments from tutorials for quick reference.
  3. Answering specific questions about video content.
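To make the use cases above concrete, here is a minimal sketch of a Gemini generateContent request body that pairs a text prompt with a YouTube URL. The request shape follows the Gemini API's video-understanding format as publicly documented; whether YouTube Vision builds its requests this way is an assumption, and the prompts and function name are purely illustrative.

```python
import json

# Hypothetical prompts, one per use case listed above.
PROMPTS = {
    "summary": "Summarize this video in a short paragraph.",
    "key_moments": "List the key moments of this video with timestamps.",
    "question": "Answer this question about the video: {question}",
}


def build_video_request(video_url: str, prompt: str) -> dict:
    """Build a generateContent body with a text part and a YouTube file part."""
    if not video_url.startswith(("https://www.youtube.com/", "https://youtu.be/")):
        raise ValueError(f"not a YouTube URL: {video_url!r}")
    return {
        "contents": [
            {
                "parts": [
                    {"text": prompt},
                    {"file_data": {"file_uri": video_url}},
                ]
            }
        ]
    }


if __name__ == "__main__":
    body = build_video_request("https://www.youtube.com/watch?v=VIDEO_ID", PROMPTS["summary"])
    print(json.dumps(body, indent=2))
```

The body would be sent as JSON to the generateContent endpoint of the chosen model; VIDEO_ID is a placeholder, not a real video.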

FAQ for YouTube Vision

  • What is required to use YouTube Vision?

You need Node.js (version 18 or higher) and a Google Gemini API key.

  • Can I modify the code?

Yes, you can clone the repository and modify the code as needed.

  • Is there a cost associated with using the Gemini API?

The Gemini API has free and paid tiers, and usage policies may differ between them, so review the terms carefully before relying on either.
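The requirements named in the first FAQ answer (Node.js 18 or higher and a Gemini API key) can be checked before launching the server. The following is a sketch only; the function names are hypothetical, and it assumes node is on the PATH and that the key is supplied as GEMINI_API_KEY.

```python
import os
import re
import shutil
import subprocess


def node_major(version_text: str) -> int:
    """Parse the major version from `node --version` output such as 'v18.17.0'."""
    match = re.match(r"v?(\d+)\.", version_text.strip())
    if match is None:
        raise ValueError(f"unrecognized Node.js version: {version_text!r}")
    return int(match.group(1))


def check_prerequisites(min_major: int = 18) -> list[str]:
    """Return a list of problems; an empty list means both requirements look met."""
    problems = []
    node = shutil.which("node")
    if node is None:
        problems.append("Node.js not found on PATH")
    else:
        result = subprocess.run(
            [node, "--version"], capture_output=True, text=True, check=True
        )
        if node_major(result.stdout) < min_major:
            problems.append(f"Node.js {result.stdout.strip()} is older than v{min_major}")
    if not os.environ.get("GEMINI_API_KEY"):
        problems.append("GEMINI_API_KEY is not set")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("Problem:", problem)
```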

Server Config

{
  "mcpServers": {
    "youtube-vision": {
      "command": "npx",
      "args": [
        "-y",
        "youtube-vision"
      ],
      "env": {
        "GEMINI_API_KEY": "YOUR_GEMINI_API_KEY",
        "GEMINI_MODEL_NAME": "gemini-2.0-flash"
      }
    }
  }
}
Project Info

Created At: a year ago
Updated At: a year ago
Author Name: minbang930
Star: -
Language: -
License: -
