MCP YOLOE: Zero-Shot Object Detection & Segmentation

Created By
rjn32s, 4 months ago
Provide your AI agents with "eyes." This server enables open-vocabulary object detection and instance segmentation using naturally phrased text prompts (e.g., "detect the laptop next to the coffee").
Overview

MCP-YOLO

MCP-YOLO is a Model Context Protocol (MCP) server that gives AI agents computer vision capabilities. Unlike traditional YOLO models, which detect only a fixed list of object classes, this server uses zero-shot (open-vocabulary) detection to find and segment objects you describe in a text prompt.

Key Features

  • Zero-Shot Detection: Detect arbitrary objects using natural language prompts.
  • Precision Segmentation: Get exact polygon masks for every detected object.
  • Flexible Inputs: Works with local file paths, remote image URLs, and Base64 strings.
  • Agent-First: Designed specifically for integration with Claude, IDEs, and autonomous workspace agents.
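
The three input kinds listed under Flexible Inputs can be told apart with a small heuristic. The sketch below is not taken from the server's code: `classify_image_input` is a hypothetical helper, and the 64-character threshold for raw Base64 is an assumption chosen only to keep short file names from being misread.

```python
import base64
import binascii
from urllib.parse import urlparse


def classify_image_input(value: str) -> str:
    """Return 'url', 'base64', or 'path' for an image reference string.

    Hypothetical helper; the real server may use different rules.
    """
    parsed = urlparse(value)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "url"
    # Data URIs carry their Base64 payload after ";base64,".
    if value.startswith("data:image/") and ";base64," in value:
        return "base64"
    # Assumption: only long strings that decode strictly are treated as Base64.
    if len(value) >= 64:
        try:
            base64.b64decode(value, validate=True)
            return "base64"
        except (binascii.Error, ValueError):
            pass
    # Everything else is treated as a local file path.
    return "path"
```

A heuristic like this can still misread a long, extension-less path as Base64, so an explicit input-type field is the safer design when the caller knows what it is sending.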

Example Usage

Ask your agent to:

"Find the 'vintage typewriter' in this image and give me its exact coordinates."

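Once the agent has a polygon mask for the requested object, coordinates such as a bounding box or pixel area follow from simple geometry. The sketch below assumes the mask arrives as a list of `(x, y)` pixel vertices; the server's actual response format is not shown here, and `polygon_bbox` and `polygon_area` are hypothetical helper names.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def polygon_bbox(polygon: List[Point]) -> Tuple[float, float, float, float]:
    """Return (x_min, y_min, x_max, y_max) enclosing the polygon."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def polygon_area(polygon: List[Point]) -> float:
    """Return the polygon's area in square pixels (shoelace formula)."""
    total = 0.0
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]  # wrap around to close the polygon
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Assumed example mask: a 100 x 50 pixel rectangle.
mask = [(10.0, 20.0), (110.0, 20.0), (110.0, 70.0), (10.0, 70.0)]
print(polygon_bbox(mask))
print(polygon_area(mask))
```
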
Performance

Uses the YOLOE26-L architecture, which balances high precision (55.0 mAP) with fast inference (~6.2 ms on T4 GPUs).
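
To put the latency figure in context, a per-image latency converts to a rough upper bound on throughput. This is a back-of-the-envelope sketch that assumes one image per inference call and ignores image loading, pre-processing, and MCP transport overhead, so real end-to-end rates will be lower.

```python
def max_throughput_fps(latency_ms: float) -> float:
    """Convert a per-image latency in milliseconds to images per second."""
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    return 1000.0 / latency_ms


# About 161 images per second at ~6.2 ms, model time only.
print(round(max_throughput_fps(6.2), 1))
```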

Server Config

{
  "mcpServers": {
    "mcp-yolo": {
      "command": "uvx",
      "args": [
        "mcp-yolo"
      ]
    }
  }
}
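
If your MCP client already has a configuration with other servers, the entry above needs to be merged in rather than pasted over the whole file. The sketch below shows one way to do that on an in-memory dictionary; `add_mcp_yolo` and the `other-server` entry are hypothetical, and the location of the config file depends on your client, so reading and writing it is left out.

```python
import json
from typing import Any, Dict


def add_mcp_yolo(config: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of config with the mcp-yolo server entry added."""
    merged = dict(config)
    # Copy the server map so the caller's dictionary is not modified.
    servers = dict(merged.get("mcpServers", {}))
    servers["mcp-yolo"] = {"command": "uvx", "args": ["mcp-yolo"]}
    merged["mcpServers"] = servers
    return merged


# Hypothetical existing config with one unrelated server.
existing = {
    "mcpServers": {
        "other-server": {"command": "npx", "args": ["other-server"]}
    }
}
print(json.dumps(add_mcp_yolo(existing), indent=2))
```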

Project Info

Created At: 4 months ago
Updated At: 4 months ago
Author Name: rjn32s