Crawlbase Web MCP Server

Created by crawlbase, 10 months ago
Crawlbase Web MCP Server is a Model Context Protocol (MCP) server that connects AI agents and large language models (LLMs) with real-time web data. Built on Crawlbase’s web scraping and crawling infrastructure, it handles JavaScript rendering, anti-bot protection, and web data extraction at scale, and it works with MCP clients such as Claude, Cursor, and Windsurf. The result is a production-ready pipeline of live, structured data for AI workflows and applications.
Overview

What is Crawlbase MCP?

Crawlbase MCP is a Model Context Protocol (MCP) server that bridges AI agents and the live web. Instead of relying on outdated training data, your LLMs can now fetch fresh, structured, real-time content — powered by Crawlbase’s proven crawling infrastructure trusted by 70,000+ developers worldwide.

It handles the complexity of scraping for you:

  • JavaScript rendering for modern web apps
  • Proxy rotation & anti-bot evasion
  • Structured outputs (HTML, Markdown, screenshots)

How It Works

  • Get Free Crawlbase Tokens → Sign up at Crawlbase and get free Normal and JavaScript tokens.
  • Add MCP Config → Connect Crawlbase MCP to Claude, Cursor, or Windsurf by updating your mcpServers config.
  • Start Crawling → Use commands like crawl, crawl_markdown, or crawl_screenshot to bring live web data into your AI agent.
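
MCP clients such as Claude or Cursor issue these commands for you, but it can help to see what a tool invocation looks like on the wire. In MCP, a client calls a tool by sending a JSON-RPC 2.0 request with the method tools/call. The sketch below builds such a request for the three tool names listed above. The build_tool_call helper and the "url" argument name are assumptions made for illustration; the listing does not document the tools' parameters, so check the schemas the server reports via tools/list before relying on them.

```python
import json
from itertools import count

# Tool names taken from the "Start Crawling" step above.
CRAWLBASE_TOOLS = ("crawl", "crawl_markdown", "crawl_screenshot")

# JSON-RPC request ids should be unique within a session.
_request_ids = count(1)


def build_tool_call(tool: str, url: str) -> dict:
    """Build an MCP tools/call request (JSON-RPC 2.0) for a Crawlbase tool.

    Hypothetical helper: the "url" argument name is an assumption and is
    not confirmed by the listing.
    """
    if tool not in CRAWLBASE_TOOLS:
        raise ValueError(f"unknown tool: {tool!r}")
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": {"url": url}},
    }


if __name__ == "__main__":
    request = build_tool_call("crawl_markdown", "https://example.com")
    print(json.dumps(request, indent=2))
```

With the stdio transport, the client writes a message like this to the server process's standard input and reads the result from its standard output.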

Features

  • Real-time web scraping for AI agents
  • JavaScript rendering (SPAs & dynamic pages)
  • Proxy rotation to bypass blocks & captchas
  • Structured outputs (HTML, Markdown, Screenshots)
  • Seamless MCP integration with Claude, Cursor & Windsurf

Use Cases

  • Research with up-to-date articles & reports
  • Monitor e-commerce products & prices
  • Fetch real-time news & financial data
  • Aggregate content for data pipelines
  • Power AI agents with fresh, accurate information

Server Config

{
  "mcpServers": {
    "crawlbase": {
      "type": "stdio",
      "command": "npx",
      "args": [
        "@crawlbase/mcp@latest"
      ],
      "env": {
        "CRAWLBASE_TOKEN": "your_token_here",
        "CRAWLBASE_JS_TOKEN": "your_js_token_here"
      }
    }
  }
}
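
If your client already has an mcpServers config with other servers in it, pasting the block above by hand risks overwriting them. The following is a minimal sketch of merging the Crawlbase entry into an existing JSON config file with Python's standard library. The function names and the file name mcp.json are hypothetical; the actual config location depends on your client.

```python
import json
from pathlib import Path


def crawlbase_entry(token: str, js_token: str) -> dict:
    """Return the server entry from the config above with tokens filled in."""
    return {
        "type": "stdio",
        "command": "npx",
        "args": ["@crawlbase/mcp@latest"],
        "env": {
            "CRAWLBASE_TOKEN": token,
            "CRAWLBASE_JS_TOKEN": js_token,
        },
    }


def add_crawlbase(config_path: Path, token: str, js_token: str) -> dict:
    """Merge the Crawlbase entry into a JSON config, keeping other servers."""
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    else:
        config = {}
    # Create the mcpServers object if missing, then add or replace "crawlbase".
    config.setdefault("mcpServers", {})["crawlbase"] = crawlbase_entry(token, js_token)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

For example, add_crawlbase(Path("mcp.json"), "your_token_here", "your_js_token_here") would write the entry into a file named mcp.json in the current directory, creating the file if it does not exist.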

Project Info

  • Created: 10 months ago
  • Updated: 9 months ago
  • Author: crawlbase
