Docsearch MCP

Created By

PatrickKoss, 8 months ago

ripgrep for docs but via MCP

Overview

What is Docsearch MCP?

Docsearch MCP is a local-first document search and indexing system that provides hybrid semantic and keyword search across local files (including PDFs) and Confluence pages through the Model Context Protocol (MCP). It is designed to assist AI tools in accessing documentation, codebases, and research materials.

How to use Docsearch MCP?

Install Docsearch MCP via npm or run it with Docker. Then index your documents and search them from the command-line interface (CLI).

Key features of Docsearch MCP

Hybrid Search: Combines full-text keyword search with vector similarity search.
Multi-Source Indexing: Index local files and Confluence spaces.
PDF Support: Extract and search text from PDF documents.
Image Search: AI-powered image description and search.
Database Flexibility: Supports SQLite and PostgreSQL.
Real-time Updates: Automatic re-indexing with file watching.
Multiple Output Formats: Supports text, JSON, and YAML outputs.

Use cases of Docsearch MCP

Searching through code documentation.
Indexing and searching academic papers.
Assisting AI tools in retrieving relevant documentation.

FAQ about Docsearch MCP

Can Docsearch MCP index all file types?

It supports a range of file types, including code files, PDFs, and images.

Is Docsearch MCP free to use?

Yes, it is open-source and free to use.

How does the hybrid search work?

It combines traditional keyword search with semantic search using vector embeddings.
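As an illustration of how two result lists can be merged, the sketch below uses reciprocal rank fusion (RRF), a common technique for combining keyword and vector rankings. This listing does not say which fusion method Docsearch MCP actually uses, so treat RRF, the function name, the constant k=60, and the sample file names as assumptions made for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document earns 1 / (k + rank) from every list it appears in;
    k dampens the influence of any single top-ranked hit.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results from a full-text index and an embedding index.
keyword_hits = ["readme.md", "api.md", "setup.md"]
vector_hits = ["api.md", "faq.md", "readme.md"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Documents that appear in both lists rise to the top, which is the practical benefit a hybrid approach aims for: an exact keyword match and a semantically similar passage reinforce each other.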

Server Config

{
  "mcpServers": {
    "docsearch": {
      "command": "npx",
      "args": [
        "docsearch-mcp",
        "start"
      ],
      "env": {
        "OPENAI_API_KEY": "your-openai-key",
        "EMBEDDINGS_PROVIDER": "openai",
        "FILE_ROOTS": ".,../other-project",
        "DB_PATH": "/path/to/your/index.db"
      }
    }
  }
}
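The config above can also be generated rather than edited by hand. The following is a minimal sketch that rebuilds the same structure; the helper name build_server_config is hypothetical, and the comma-separated format of FILE_ROOTS is inferred from the example value ".,../other-project" rather than confirmed by documentation.

```python
import json


def build_server_config(file_roots, db_path, api_key="your-openai-key"):
    """Return an MCP client config dict mirroring the example above.

    Assumption: FILE_ROOTS is a comma-separated list of directories,
    as the sample value suggests.
    """
    env = {
        "OPENAI_API_KEY": api_key,
        "EMBEDDINGS_PROVIDER": "openai",
        "FILE_ROOTS": ",".join(file_roots),
        "DB_PATH": db_path,
    }
    return {
        "mcpServers": {
            "docsearch": {
                "command": "npx",
                "args": ["docsearch-mcp", "start"],
                "env": env,
            }
        }
    }


config = build_server_config([".", "../other-project"], "/path/to/your/index.db")
print(json.dumps(config, indent=2))
```

Keep the real API key out of version control; the placeholder value here is the same one shown in the example config.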

Project Info

Created At

8 months ago

Updated At

7 months ago

Author Name

PatrickKoss
