mcp-server-webcrawl

Created by pragmar, a year ago

Bridge the gap between your web crawler and AI language models using Model Context Protocol (MCP). With mcp-server-webcrawl, your AI client filters and analyzes web content under your direction or autonomously, extracting insights from your web content. Support for WARC, wget, InterroBot, Katana, and SiteOne crawlers is available out of the gate. The server includes a full-text search interface with boolean support, resource filtering by type, HTTP status, and more.


Overview

What is mcp-server-webcrawl?

mcp-server-webcrawl is an open-source server that bridges the gap between web crawlers and AI language models using the Model Context Protocol (MCP). It allows AI clients to filter and analyze web content, extracting insights either under user direction or autonomously.

How do I use mcp-server-webcrawl?

Install mcp-server-webcrawl with pip:

pip install mcp-server-webcrawl

Then start the server, naming the crawler and pointing it at your crawl data:

mcp-server-webcrawl --crawler wget --datasrc /path/to/wget/archives/
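The install and launch steps above can also be scripted. The following is a minimal sketch, not part of the project itself: the helper names build_launch_command and is_installed are hypothetical, and the only flags used are the --crawler and --datasrc options shown above. Only the wget value is confirmed here; values for the other crawlers are not listed on this page, so the crawler name is passed through unchecked.

```python
import shlex
import shutil


def build_launch_command(crawler: str, datasrc: str) -> list[str]:
    """Return the argv list for starting mcp-server-webcrawl (hypothetical helper)."""
    return ["mcp-server-webcrawl", "--crawler", crawler, "--datasrc", datasrc]


def is_installed(executable: str = "mcp-server-webcrawl") -> bool:
    """Return True when the executable is found on PATH (hypothetical helper)."""
    return shutil.which(executable) is not None


if __name__ == "__main__":
    argv = build_launch_command("wget", "/path/to/wget/archives/")
    # Print the command as a single shell-safe string.
    print(shlex.join(argv))
    if not is_installed():
        print("mcp-server-webcrawl not found on PATH; run: pip install mcp-server-webcrawl")
```

Building the command as a list rather than a string keeps paths containing spaces intact if the list is later handed to a process launcher.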

Key features of mcp-server-webcrawl

Compatibility with Claude Desktop
Full-text search interface with boolean support
Resource filtering by type and HTTP status
Support for the wget, WARC, InterroBot, Katana, and SiteOne crawlers
Ability to augment your LLM knowledge base
ChatGPT support is coming soon

Use cases of mcp-server-webcrawl

Analyzing web content for research purposes
Extracting insights from large datasets collected by web crawlers
Augmenting AI language models with crawled web data

FAQ about mcp-server-webcrawl

Is mcp-server-webcrawl free to use?

Yes! mcp-server-webcrawl is free and open-source.

What are the system requirements?

It requires Claude Desktop and Python 3.10 or higher.

Which crawlers are supported?

It supports wget, WARC, InterroBot, Katana, and SiteOne crawlers.


Server Config

{
  "mcpServers": {
    "webcrawl": {
      "command": "mcp-server-webcrawl",
      "args": [
        "--crawler",
        "wget",
        "--datasrc",
        "/path/to/wget/archives/"
      ]
    }
  }
}
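If a client configuration already lists other MCP servers, the entry above has to be merged in rather than pasted over them. The sketch below shows one way to do that, assuming the configuration is plain JSON with a top-level mcpServers object as in the example. The function name add_webcrawl_server and the "other-server" entry are hypothetical and used only for illustration.

```python
import copy
import json


def add_webcrawl_server(config: dict, crawler: str, datasrc: str,
                        name: str = "webcrawl") -> dict:
    """Return a copy of config with an mcpServers entry for mcp-server-webcrawl.

    Hypothetical helper: the input dict is left unmodified.
    """
    updated = copy.deepcopy(config)
    # Create the mcpServers object if the config does not have one yet.
    servers = updated.setdefault("mcpServers", {})
    servers[name] = {
        "command": "mcp-server-webcrawl",
        "args": ["--crawler", crawler, "--datasrc", datasrc],
    }
    return updated


if __name__ == "__main__":
    # "other-server" is a made-up placeholder for an existing entry.
    existing = {"mcpServers": {"other": {"command": "other-server", "args": []}}}
    merged = add_webcrawl_server(existing, "wget", "/path/to/wget/archives/")
    print(json.dumps(merged, indent=2))
```

Returning a copy instead of editing in place makes it easy to review the merged result before writing it back to the client's configuration file.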

Project Info

Created At: a year ago
Updated At: a year ago
Author Name: pragmar
Homepage: https://github.com/pragmar/mcp_server_webcrawl
