Markdown Word Counter MCP

Created By
LoganUpdating, a year ago
Overview

Markdown Word Counter MCP Server

A Model Context Protocol (MCP) server that provides word counting functionality for Markdown files, with support for both Chinese characters and English words.

Features

  • Accurate Word Counting: Properly handles both Chinese characters and English words
  • Minimal Processing by Default: Only normalizes whitespace, preserves all content
  • Configurable Processing: Control what gets processed with flexible options
  • Markdown-Aware: Optionally removes Markdown syntax while preserving content
  • HTML Tag Removal: Optionally strips HTML tags from content
  • Link Processing: Optionally extracts text from Markdown links
  • Two Tools Available:
    • count_words: Get total word count with options
    • detailed_word_count: Get detailed statistics with processing info

Processing Options

By default, only whitespace is normalized. All other processing is optional:

  • remove_html_tags (default: false) - Remove HTML tags like <div>, <p>
  • process_markdown_links (default: false) - Extract text from [text](url) links
  • remove_markdown_headers (default: false) - Remove header markers #, ##, etc.
  • remove_markdown_lists (default: false) - Remove list markers -, *, +
  • normalize_whitespace (default: true) - Collapse multiple spaces/newlines
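
The defaults listed above can be pictured as a single options object that callers override selectively. The following is a minimal sketch, not the server's actual code: the `ProcessingOptions` type, the `DEFAULT_OPTIONS` constant, and the `resolveOptions` helper are hypothetical names introduced for illustration, and only the option names and default values come from the list above.

```typescript
// Hypothetical type mirroring the documented option names.
interface ProcessingOptions {
  remove_html_tags: boolean;
  process_markdown_links: boolean;
  remove_markdown_headers: boolean;
  remove_markdown_lists: boolean;
  normalize_whitespace: boolean;
}

// Documented defaults: only whitespace normalization is on.
const DEFAULT_OPTIONS: ProcessingOptions = {
  remove_html_tags: false,
  process_markdown_links: false,
  remove_markdown_headers: false,
  remove_markdown_lists: false,
  normalize_whitespace: true,
};

// Merge caller-supplied values over the defaults (assumed behavior).
function resolveOptions(
  overrides: Partial<ProcessingOptions> = {},
): ProcessingOptions {
  return { ...DEFAULT_OPTIONS, ...overrides };
}

// Enabling one option leaves every other option at its default.
console.log(resolveOptions({ remove_html_tags: true }));
```

Under this assumption, a request that passes only file_path is processed with whitespace normalization and nothing else.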

Installation

  1. Clone or download this repository
  2. Install dependencies:
    npm install
    
  3. Build the project:
    npm run build
    

Usage

As MCP Server

Add to your MCP client configuration (e.g., mcp_settings.json):

{
  "mcpServers": {
    "markdown-word-counter": {
      "command": "node",
      "args": [
        "/path/to/markdown-word-counter-mcp/build/index.js"
      ]
    }
  }
}

Available Tools

count_words

Count total words in a Markdown file.

Parameters:

  • file_path (string, required): Path to the Markdown file
  • remove_html_tags (boolean, optional): Remove HTML tags (default: false)
  • process_markdown_links (boolean, optional): Process Markdown links (default: false)
  • remove_markdown_headers (boolean, optional): Remove headers (default: false)
  • remove_markdown_lists (boolean, optional): Remove list markers (default: false)
  • normalize_whitespace (boolean, optional): Normalize whitespace (default: true)

Example (Default - Minimal Processing):

{
  "file_path": "document.md"
}

Example (Enable All Processing):

{
  "file_path": "document.md",
  "remove_html_tags": true,
  "process_markdown_links": true,
  "remove_markdown_headers": true,
  "remove_markdown_lists": true
}

detailed_word_count

Get detailed word count statistics with processing information.

Parameters:

  • Same as count_words

Example:

{
  "file_path": "document.md",
  "remove_markdown_headers": false
}
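
The parameter objects shown above are what an MCP client sends as the `arguments` of a `tools/call` request. As a rough illustration, the sketch below builds that JSON-RPC message by hand; the `ToolCallRequest` type and the `buildToolCall` helper are hypothetical names, and in practice an MCP client library would normally construct and send this message for you.

```typescript
// Shape of an MCP tools/call request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: wrap a tool name and its parameters in a request.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Request a detailed count with link processing enabled.
const request = buildToolCall(1, "detailed_word_count", {
  file_path: "document.md",
  process_markdown_links: true,
});

console.log(JSON.stringify(request, null, 2));
```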

Standalone Usage

You can also run the server directly:

npm start

Development

  • npm run dev: Watch mode for development
  • npm run build: Build the TypeScript code
  • npm start: Run the built server

Word Counting Logic

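The server counts words using the following steps. Steps 1 to 3 run only when the corresponding option is enabled:

  1. Remove HTML tags (remove_html_tags): Strips all HTML markup
  2. Process Markdown links (process_markdown_links): Extracts link text, removes URLs
  3. Remove Markdown syntax (remove_markdown_headers, remove_markdown_lists): Header markers (#), list markers (-, *, +)
  4. Normalize whitespace (normalize_whitespace, enabled by default): Collapses multiple spaces and newlines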
  5. Count separately:
    • Chinese characters: Using Unicode range [\u4e00-\u9fa5]
    • English words: Using word boundaries \b\w+\b
  6. Sum totals: Chinese chars + English words = Total words
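
The counting step (steps 5 and 6) can be sketched with the two patterns named above. This is an illustrative reconstruction rather than the server's source: the `WordCount` type and the `countWords` function are hypothetical names, and only the two regular expressions and the summing rule are taken from the list.

```typescript
// Hypothetical result shape for illustration.
interface WordCount {
  chineseCharacters: number;
  englishWords: number;
  total: number;
}

function countWords(text: string): WordCount {
  // Each character in the range U+4E00 to U+9FA5 counts as one word.
  const chineseCharacters = (text.match(/[\u4e00-\u9fa5]/g) ?? []).length;
  // Each run of ASCII letters, digits, or underscores counts as one word.
  const englishWords = (text.match(/\b\w+\b/g) ?? []).length;
  return {
    chineseCharacters,
    englishWords,
    total: chineseCharacters + englishWords,
  };
}

console.log(countWords("Hello 世界, this is a test"));
// → { chineseCharacters: 2, englishWords: 5, total: 7 }
```

In JavaScript regular expressions, `\w` matches only ASCII letters, digits, and the underscore, so Chinese characters are never counted twice. The same property means that numbers count as words and that a contraction such as "don't" is split into two matches.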

Examples

Default Processing (Minimal - Only Normalize Whitespace)

For a file containing:

# Hello 世界

This is a [link](http://example.com) with **bold** text.

- 中文测试
- English test

Default result (preserves all Markdown syntax):

  • Chinese Characters: 6 (世界中文测试)
  • English Words: 13 (Hello This is a link http example com with bold text English test)
  • Total Words: 19

Full Processing (All Options Enabled)

Same file with all processing enabled:

{
  "file_path": "document.md",
  "remove_html_tags": true,
  "process_markdown_links": true,
  "remove_markdown_headers": true,
  "remove_markdown_lists": true,
  "normalize_whitespace": true
}

Result with full processing:

  • Chinese Characters: 6 (世界中文测试)
  • English Words: 10 (Hello This is a link with bold text English test)
  • Total Words: 16
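
To see why the link URL drops out of the word list once processing is enabled, it helps to look at the text that reaches the counter. The sketch below applies one plausible regex per option to the sample file; the `Options` type, the `preprocess` function, and every regular expression in it are assumptions made for illustration, not the server's actual implementation.

```typescript
// Hypothetical option bag; names follow the documented parameters.
interface Options {
  remove_html_tags?: boolean;
  process_markdown_links?: boolean;
  remove_markdown_headers?: boolean;
  remove_markdown_lists?: boolean;
  normalize_whitespace?: boolean;
}

function preprocess(text: string, opts: Options = {}): string {
  let out = text;
  // Assumed pattern: drop anything that looks like <tag ...>.
  if (opts.remove_html_tags) out = out.replace(/<[^>]+>/g, "");
  // Assumed pattern: keep the link text, drop the URL.
  if (opts.process_markdown_links) out = out.replace(/\[([^\]]*)\]\([^)]*\)/g, "$1");
  // Assumed pattern: strip leading #, ##, ... markers but keep the heading text.
  if (opts.remove_markdown_headers) out = out.replace(/^#{1,6}[ \t]+/gm, "");
  // Assumed pattern: strip leading -, *, + list markers.
  if (opts.remove_markdown_lists) out = out.replace(/^[ \t]*[-*+][ \t]+/gm, "");
  // On by default: collapse runs of whitespace (trimming is an assumption).
  if (opts.normalize_whitespace ?? true) out = out.replace(/\s+/g, " ").trim();
  return out;
}

const sample = [
  "# Hello 世界",
  "",
  "This is a [link](http://example.com) with **bold** text.",
  "",
  "- 中文测试",
  "- English test",
].join("\n");

console.log(preprocess(sample));
// → "# Hello 世界 This is a [link](http://example.com) with **bold** text. - 中文测试 - English test"

console.log(
  preprocess(sample, {
    remove_html_tags: true,
    process_markdown_links: true,
    remove_markdown_headers: true,
    remove_markdown_lists: true,
  }),
);
// → "Hello 世界 This is a link with **bold** text. 中文测试 English test"
```

With default options the URL survives, so http, example, and com are picked up as extra English words. With link processing enabled only the link text remains. The bold markers are left in place in both cases because no documented option removes them, and the word pattern ignores them anyway.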

License

MIT

Project Info

  • Created At: a year ago
  • Updated At: a year ago
  • Author Name: LoganUpdating