AnyCrawl - Turn Any Website Into LLM-Ready Content

Created By

any4ai, 9 months ago

Turn any website into LLM-ready content and retrieve search engine results pages (SERP).

Overview

What is AnyCrawl?

AnyCrawl is a web scraping and crawling tool that turns any website into LLM-ready content and integrates with LLM clients via the Model Context Protocol (MCP).

How to use AnyCrawl?

To use AnyCrawl, sign up on the AnyCrawl website to receive an API key, then set it as the ANYCRAWL_API_KEY environment variable. Then start the AnyCrawl MCP server with the command from the server config (npx -y anycrawl-mcp-server).
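As a rough illustration of the setup described above, the following Python sketch checks that ANYCRAWL_API_KEY is set and assembles the launch command for the MCP server. The command itself is taken from the server config on this page; the helper name build_launch is hypothetical, and the sketch deliberately stops short of starting the process.

```python
import os


def build_launch(environ=None):
    """Return (command, env) for starting the AnyCrawl MCP server.

    `build_launch` is a hypothetical helper for illustration, not part of AnyCrawl.
    """
    # Copy the environment so the caller's mapping is never mutated.
    environ = dict(os.environ if environ is None else environ)
    api_key = environ.get("ANYCRAWL_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("ANYCRAWL_API_KEY is not set; sign up to obtain a key")
    # Command taken from the server config listed on this page.
    command = ["npx", "-y", "anycrawl-mcp-server"]
    return command, environ


if __name__ == "__main__":
    cmd, env = build_launch({"ANYCRAWL_API_KEY": "<YOUR_TOKEN>"})
    print(" ".join(cmd))
```

The returned pair could then be handed to subprocess.run(command, env=env); failing early on a missing key tends to give a clearer error than letting the server start without one.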

Key features of AnyCrawl

Web scraping with multiple output formats
Configurable website crawling with depth limits
Search engine integration for scraping search results
Support for multiple engines like Playwright, Cheerio, and Puppeteer
Flexible output options including Markdown, HTML, and structured JSON
Non-blocking async operations with status monitoring
Robust error handling and logging
Multiple deployment modes available
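The "non-blocking async operations with status monitoring" item above suggests a submit-then-poll workflow. The sketch below shows that general pattern in Python only; it does not call AnyCrawl. The status lookup is passed in as a function, and the status strings "completed" and "failed" are assumptions made for this example, not values confirmed by the source.

```python
import time


def poll_until_done(get_status, interval=1.0, timeout=60.0, sleep=time.sleep):
    """Call get_status() until it reports a terminal state or the timeout passes.

    The terminal states "completed" and "failed" are assumptions for this sketch.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status in ("completed", "failed"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {status!r} after {timeout} seconds")
        # Wait before asking again so the status endpoint is not hammered.
        sleep(interval)


if __name__ == "__main__":
    # Stand-in for a real status lookup: "pending" twice, then "completed".
    states = iter(["pending", "pending", "completed"])
    print(poll_until_done(lambda: next(states), interval=0.0))
```

Injecting get_status and sleep keeps the loop independent of any particular client and makes it easy to exercise without network access.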

Use cases of AnyCrawl

Extracting data from specific web pages for analysis.
Crawling entire websites for content aggregation.
Integrating web search results into applications.
Automating data collection for research purposes.

FAQ from AnyCrawl

How do I get started with AnyCrawl?

Sign up on the AnyCrawl website to receive your API key and follow the setup instructions.

Is there a free tier available?

Yes. Signing up is free and includes 1,500 credits, enough to crawl nearly 1,500 pages.

What output formats does AnyCrawl support?

AnyCrawl supports Markdown, HTML, text, screenshots, and structured JSON.


Server Config

{
  "mcpServers": {
    "anycrawl-mcp": {
      "command": "npx",
      "args": [
        "-y",
        "anycrawl-mcp-server"
      ],
      "env": {
        "ANYCRAWL_API_KEY": "<YOUR_TOKEN>",
        "ANYCRAWL_BASE_URL": "https://api.anycrawl.dev",
        "LOG_LEVEL": "info"
      }
    }
  }
}
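If an MCP client config already lists other servers, the AnyCrawl entry has to be merged in rather than pasted over the file. The following Python sketch shows one way to do that, using the same keys and values as the server config above. The function name add_anycrawl_server and the "other-server" entry are hypothetical and exist only for this example.

```python
import json


def add_anycrawl_server(config, api_key):
    """Return a copy of an MCP client config with the AnyCrawl entry added.

    `add_anycrawl_server` is a hypothetical helper for illustration.
    """
    # Copy the existing server map so the input config is left untouched.
    servers = dict(config.get("mcpServers", {}))
    servers["anycrawl-mcp"] = {
        "command": "npx",
        "args": ["-y", "anycrawl-mcp-server"],
        "env": {
            "ANYCRAWL_API_KEY": api_key,
            "ANYCRAWL_BASE_URL": "https://api.anycrawl.dev",
            "LOG_LEVEL": "info",
        },
    }
    return {**config, "mcpServers": servers}


if __name__ == "__main__":
    # "other-server" is a made-up entry standing in for a pre-existing server.
    existing = {"mcpServers": {"other-server": {"command": "uvx", "args": ["other"]}}}
    print(json.dumps(add_anycrawl_server(existing, "<YOUR_TOKEN>"), indent=2))
```

Returning a new dictionary instead of editing in place makes it simpler to compare the result against the original before writing it back to disk.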

Project Info

Created At

9 months ago

Updated At

9 months ago

Author Name

any4ai

