🦊 MCPBench: A Benchmark for Evaluating MCP Servers

Created By
modelscope, a year ago
An evaluation benchmark for MCP servers
Overview

What is MCPBench?

MCPBench is an evaluation framework for assessing the performance of MCP Servers, with a focus on Web Search and Database Query tasks. It compares servers such as Brave Search and DuckDuckGo on three metrics: task completion accuracy, latency, and token consumption.

How to use MCPBench?

To use MCPBench, install the required dependencies, configure your LLM key and endpoint, launch the MCP server with the appropriate configuration, and run evaluations for Web Search or Database Query tasks.

Key features of MCPBench

  • Supports evaluation of multiple MCP Servers
  • Measures task completion accuracy, latency, and token consumption
  • Compatible with local and remote MCP Servers
  • Provides datasets for evaluation
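The three metrics named above (task completion accuracy, latency, and token consumption) can be illustrated with a minimal sketch. This is not MCPBench's own code: the `TaskResult` record and the `summarize` function are hypothetical names, and averaging latency and tokens per task is an assumption about how such numbers might be aggregated.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class TaskResult:
    """One evaluated task (hypothetical record layout, not MCPBench's own)."""
    correct: bool      # whether the answer matched the reference
    latency_s: float   # wall-clock seconds spent on the task
    tokens: int        # total LLM tokens consumed by the task


def summarize(results: list[TaskResult]) -> dict[str, float]:
    """Aggregate per-task results into accuracy, mean latency, and mean tokens."""
    if not results:
        raise ValueError("no results to summarize")
    return {
        "accuracy": sum(r.correct for r in results) / len(results),
        "mean_latency_s": mean(r.latency_s for r in results),
        "mean_tokens": mean(r.tokens for r in results),
    }


if __name__ == "__main__":
    # Made-up numbers for illustration only, not benchmark results.
    demo = [
        TaskResult(correct=True, latency_s=2.0, tokens=400),
        TaskResult(correct=False, latency_s=4.0, tokens=600),
    ]
    print(summarize(demo))
```

Keeping one record per task makes it straightforward to compare two servers on the same dataset by calling `summarize` once per server.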

Use cases of MCPBench

  1. Evaluating the performance of different web search engines.
  2. Comparing database query efficiency across various MCP Servers.
  3. Analyzing the impact of different configurations on server performance.

MCPBench FAQ

  • What types of servers can be evaluated with MCPBench?

MCPBench can evaluate both Web Search and Database Query servers.

  • Is there a specific Python version required?

Yes, MCPBench requires Python version >= 3.11.

  • Where can I find the evaluation report?

The evaluation report is available in the project repository.
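The Python version requirement from the FAQ (>= 3.11) can be checked before installing dependencies. The sketch below uses only the standard library; the helper name `python_is_supported` is hypothetical and is not part of MCPBench.

```python
import sys

# Minimum version stated in the FAQ above.
MIN_PYTHON = (3, 11)


def python_is_supported(version_info=sys.version_info) -> bool:
    """Return True when the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= MIN_PYTHON


if __name__ == "__main__":
    if not python_is_supported():
        found = ".".join(str(part) for part in sys.version_info[:3])
        sys.exit(f"Python >= 3.11 is required, found {found}")
    print("Python version OK")
```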

Project Info

  • Created At: a year ago
  • Updated At: a year ago
  • Author Name: modelscope
  • Stars: 93
  • Language: Python
  • License: Apache-2.0
