Agents Md Generator

Created By

nushey, 2 months ago

MCP server to create and update agents.md efficiently

Overview

agents-md-generator

MCP server that analyzes codebases with tree-sitter and generates AGENTS.md files.

Compatible with any MCP-capable client: Claude Code, Gemini CLI, Cursor, Windsurf, and others.

How it works: The server does all the heavy lifting locally — AST parsing, incremental change detection, environment variable scanning, entry point detection. It writes a compact structured payload to disk and returns step-by-step instructions to your AI client. The client reads the payload and writes AGENTS.md. No large data travels over the MCP wire.

Supported Languages

Python · C# · TypeScript · JavaScript · Go

Installation

See INSTALLATION.md for the full guide including prerequisites and troubleshooting.

Requirements: Python 3.11+, uv, Git, and any MCP-compatible client.

Claude Code

claude mcp add agents-md uvx agents-md-generator

Or add it manually to ~/.claude.json (Linux/macOS) or %USERPROFILE%\.claude.json (Windows):

{
  "mcpServers": {
    "agents-md": {
      "command": "uvx",
      "args": ["agents-md-generator"]
    }
  }
}

Gemini CLI

Add it to ~/.gemini/settings.json:

{
  "mcpServers": {
    "agents-md": {
      "command": "uvx",
      "args": ["agents-md-generator"]
    }
  }
}

Other MCP clients (Cursor, Windsurf, etc.)

The server uses stdio transport. Add this entry to your client's MCP config under mcpServers:

"agents-md": {
  "command": "uvx",
  "args": ["agents-md-generator"]
}

Restart your client — uvx downloads the package automatically on first run.

Usage

Once registered, ask your AI client:

"Generate the AGENTS.md for this project"

The client will call generate_agents_md automatically.

Tool Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `project_path` | string | `"."` | Path to the project root |
| `force_full_scan` | boolean | `false` | Ignore the cache and rescan everything from scratch |

Note on force_full_scan: Use this only when explicitly requested. When asking your AI client to improve or update an existing AGENTS.md, leave it as false; the incremental scan already provides all the data needed.

What Gets Generated

The generated AGENTS.md follows the agents.md open standard. It is written as a README for AI agents, not as documentation for humans. Sections include:

Project Overview — tech stack and top-level architecture shape
Architecture & Data Flow — detected layers or domains with data flow direction
Conventions & Patterns — naming rules, export contracts, import rules, and how to add new entities end-to-end
Environment Variables — variables detected in source files and .env.example
Setup Commands — exact install and run commands from package.json, Makefile, etc.
Development Workflow — build, watch, and dev server commands
Testing Instructions — test commands and framework info (if detected)
Code Style — lint/format commands (if config files detected)
Build and Deployment — CI pipeline info (if detected)

Sections with no detected data are omitted entirely.

How Incremental Scanning Works

First run (cold start): All git-tracked source files are parsed with tree-sitter and cached
Subsequent runs: Only files whose SHA-256 hash changed since the last scan are re-parsed
Semantic diff: For modified files, only changed public symbols are included in the payload
No source changes? The tool stops and asks whether you want to improve the existing AGENTS.md content anyway
Private symbols and test file internals are excluded from both cache and payload — only the public API surface matters for AGENTS.md
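
The change-detection step above can be sketched in a few lines of Python. This is a minimal illustration, not the server's actual code: the function names and the cache shape (a mapping from file path to hex digest) are assumptions.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def changed_files(paths: list[Path], cached: dict[str, str]) -> list[Path]:
    """Return the files whose current hash differs from the cached one.

    Files missing from the cache count as changed, so an empty cache
    (cold start) causes every file to be re-parsed.
    """
    return [p for p in paths if cached.get(str(p)) != sha256_of(p)]
```

Hashing file contents rather than comparing modification times means a file that was touched but not edited is not re-parsed.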

How Large Payloads Are Streamed

For large codebases the analysis payload can be too big to return inline over the MCP wire. The server handles this transparently through a second tool: get_payload_chunk.

Flow:

1. generate_agents_md runs the full analysis, writes the payload to disk, and returns a small response with total_chunks and instructions
2. The client calls get_payload_chunk(project_path, chunk_index=0), then increments chunk_index until the response contains has_more: false
3. The client concatenates all data fields in order and parses the result as JSON
4. The payload file is automatically deleted after the last chunk is read

This flow is pure MCP — no filesystem access required from the client side. Any MCP-compatible client can follow it.
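
From the client's point of view, the flow above amounts to a simple loop. The sketch below assumes each response is a dictionary with a string `data` field and a boolean `has_more` field. `fetch_chunk` and `make_fake_server` are hypothetical stand-ins for a real MCP tool call, used only so the example runs on its own.

```python
import json
from typing import Any, Callable

# Hypothetical stand-in for an MCP tool call; a real client would invoke
# get_payload_chunk through its own MCP session instead.
ChunkFetcher = Callable[[str, int], dict[str, Any]]


def read_payload(fetch_chunk: ChunkFetcher, project_path: str) -> Any:
    """Fetch chunks in order until has_more is false, then parse the JSON."""
    parts: list[str] = []
    chunk_index = 0
    while True:
        response = fetch_chunk(project_path, chunk_index)
        parts.append(response["data"])
        if not response["has_more"]:
            break
        chunk_index += 1
    return json.loads("".join(parts))


def make_fake_server(payload: dict, chunk_size: int) -> ChunkFetcher:
    """Build an in-memory fetcher that slices a JSON string into chunks."""
    text = json.dumps(payload)
    chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    def fetch(project_path: str, chunk_index: int) -> dict[str, Any]:
        return {
            "data": chunks[chunk_index],
            "has_more": chunk_index < len(chunks) - 1,
        }

    return fetch
```

Note that the chunks are only parsed after concatenation: an individual chunk is an arbitrary slice of the JSON text and is generally not valid JSON by itself.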

Cache and Payload Location

All runtime artifacts are stored outside your project, in the user cache directory:

~/.cache/agents-md-generator/<project-hash>/cache.json  ← incremental scan cache

The <project-hash> is a SHA-256 hash of the project's absolute path, so it is unique per project. Nothing is written to your repository.

Note: The server also writes a temporary payload.json to this directory during analysis, but it is managed entirely by the get_payload_chunk tool and deleted automatically after the last chunk is read. You never need to access it directly.
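
To locate a project's cache directory by hand, the path derivation can be approximated as follows. This is a sketch under stated assumptions: the server may normalize the path, encode it, or truncate the digest differently, so the directory name produced here is not guaranteed to match the real one.

```python
import hashlib
from pathlib import Path


def cache_dir_for(project_path: str) -> Path:
    """Derive a per-project cache directory from the absolute project path."""
    absolute = str(Path(project_path).resolve())
    # Assumption: full hex digest of the UTF-8 encoded path; the real
    # server may truncate or encode the hash differently.
    project_hash = hashlib.sha256(absolute.encode("utf-8")).hexdigest()
    return Path.home() / ".cache" / "agents-md-generator" / project_hash
```

Because the hash is taken over the absolute path, moving or renaming a project directory yields a new cache location and therefore a cold start.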

Project Configuration

Create .agents-config.json at your project root to customize behavior. This file is optional — all fields have defaults.

{
  "impact_threshold": "medium",
  "exclude": [
    "**/node_modules/**",
    "**/bin/**",
    "**/obj/**",
    "**/.git/**",
    "**/dist/**",
    "**/build/**",
    "**/__pycache__/**",
    "**/*.min.js",
    "**/*.min.css",
    "**/*.bundle.js",
    "**/vendor/**",
    "**/packages/**",
    "**/.venv/**",
    "**/venv/**",
    "**/bower_components/**",
    "**/app/lib/**",
    "**/wwwroot/lib/**",
    "**/wwwroot/libs/**",
    "**/static/vendor/**",
    "**/public/vendor/**",
    "**/assets/vendor/**",
    "**/site-packages/**"
  ],
  "include": [],
  "languages": "auto",
  "agents_md_path": "./AGENTS.md",
  "max_file_size_bytes": 1048576,
  "dir_aggregation_threshold": 8
}
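
Because every field has a default, a loader only needs to overlay the user's file on top of the defaults. The sketch below assumes a shallow, key-by-key override (a user-supplied `exclude` list replaces the default list rather than extending it); how the server actually merges values is not stated here. `load_config` is a hypothetical name, and the `exclude` default is shortened for brevity.

```python
import json
from pathlib import Path

# Defaults copied from the example above; "exclude" is shortened here.
DEFAULTS = {
    "impact_threshold": "medium",
    "exclude": ["**/node_modules/**", "**/.git/**", "**/dist/**"],
    "include": [],
    "languages": "auto",
    "agents_md_path": "./AGENTS.md",
    "max_file_size_bytes": 1048576,
    "dir_aggregation_threshold": 8,
}


def load_config(project_root: str) -> dict:
    """Return the defaults, overridden by .agents-config.json if present."""
    config = dict(DEFAULTS)
    config_file = Path(project_root) / ".agents-config.json"
    if config_file.is_file():
        # Assumption: user values replace defaults key by key.
        config.update(json.loads(config_file.read_text(encoding="utf-8")))
    return config
```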

Options

| Key | Default | Description |
| --- | --- | --- |
| `impact_threshold` | `"medium"` | Minimum change impact to include in the incremental payload (see Impact Threshold) |
| `exclude` | (see above) | Glob patterns to exclude from analysis |
| `include` | `[]` | If non-empty, only analyze files matching these patterns |
| `languages` | `"auto"` | `"auto"` detects all supported languages, or pass a list like `["typescript", "python"]` |
| `agents_md_path` | `"./AGENTS.md"` | Output path for the generated file |
| `max_file_size_bytes` | `1048576` | Files larger than this are skipped (default: 1 MB) |
| `dir_aggregation_threshold` | `8` | Directories with this many or more files of the same language are collapsed into a single directory summary instead of per-file entries. Reduces payload size significantly on large codebases. Set to a high number to disable. |

You can commit .agents-config.json to share exclusion rules and thresholds with your team.
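
The effect of `dir_aggregation_threshold` can be illustrated with a small grouping function. This is only a sketch of the idea: the extension map, the summary wording, and the function name are assumptions, and the server's real summaries carry more information than a file count.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Assumption: a tiny extension-to-language map, for illustration only.
LANGUAGE_BY_SUFFIX = {
    ".py": "python",
    ".cs": "csharp",
    ".ts": "typescript",
    ".js": "javascript",
    ".go": "go",
}


def aggregate(files: list[str], threshold: int = 8) -> list[str]:
    """Collapse directories holding `threshold` or more same-language files."""
    groups: dict[tuple[str, str], list[str]] = defaultdict(list)
    for f in files:
        p = PurePosixPath(f)
        language = LANGUAGE_BY_SUFFIX.get(p.suffix)
        if language:
            groups[(str(p.parent), language)].append(f)

    entries: list[str] = []
    for (directory, language), members in sorted(groups.items()):
        if len(members) >= threshold:
            # One summary line replaces the per-file entries.
            entries.append(f"{directory}/ ({len(members)} {language} files)")
        else:
            entries.extend(sorted(members))
    return entries
```

With the default of 8, a directory of eight C# files becomes a single line, while two Python files in a sibling directory are still listed individually.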

Impact Threshold

The impact_threshold controls which symbol changes are included in incremental scan payloads. Changes below the threshold are silently ignored — AGENTS.md is not regenerated for them.

| Change type | Symbol kind | Extra condition | Impact |
| --- | --- | --- | --- |
| any | any | Has HTTP decorator (`@HttpGet`, `@app.route`, `@Get`, …) | `high` |
| `added` or `removed` | `class`, `interface`, `struct` | (none) | `high` |
| `removed` | `method` | public | `high` |
| `modified` | any | public | `medium` |
| `added` | `function` or `method` | public | `medium` |
| any | any | none of the above | `low` |
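
Read top to bottom, the table behaves like a chain of rules where the first match wins. The sketch below encodes it that way and adds the threshold comparison. The `SymbolChange` type and both function names are hypothetical, and the first-match ordering is an assumption drawn from the "none of the above" row.

```python
from dataclasses import dataclass

LEVELS = {"low": 0, "medium": 1, "high": 2}


@dataclass
class SymbolChange:
    change_type: str  # "added", "removed", or "modified"
    kind: str  # "class", "interface", "struct", "method", "function", ...
    is_public: bool = True
    has_http_decorator: bool = False


def classify_impact(change: SymbolChange) -> str:
    """Apply the table's rows top to bottom; the first match wins."""
    if change.has_http_decorator:
        return "high"
    if (change.change_type in ("added", "removed")
            and change.kind in ("class", "interface", "struct")):
        return "high"
    if (change.change_type == "removed" and change.kind == "method"
            and change.is_public):
        return "high"
    if change.change_type == "modified" and change.is_public:
        return "medium"
    if (change.change_type == "added"
            and change.kind in ("function", "method") and change.is_public):
        return "medium"
    return "low"


def passes_threshold(change: SymbolChange, threshold: str = "medium") -> bool:
    """Keep a change only if its impact is at or above the threshold."""
    return LEVELS[classify_impact(change)] >= LEVELS[threshold]
```

Under these rules a removed public function (as opposed to a method) falls through to `low`, so it would only appear in the payload when the threshold is set to `"low"`.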

Choosing a threshold:

"high" — Only regenerate AGENTS.md for breaking or structural changes. Best for large, stable codebases where minor additions are frequent.
"medium" (default) — Regenerate when the public API surface grows or changes. Suitable for most projects.
"low" — Regenerate on any public symbol change. Best for early-stage projects where the architecture is still evolving.

What the Analysis Detects

Environment Variables

The server scans all source files for environment variable references using language-specific patterns:

| Language | Pattern detected |
| --- | --- |
| JavaScript / TypeScript | `process.env.VAR_NAME` |
| Python | `os.environ['VAR']`, `os.getenv('VAR')` |
| Go | `os.Getenv("VAR")` |
| Ruby | `ENV['VAR']` |
| Rust | `env!("VAR")`, `var("VAR")` |

It also parses .env.example, .env.template, and .env.sample files at the project root.
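
A pattern-based scan of this kind can be approximated with regular expressions. The expressions below are illustrative guesses covering three of the languages listed; they are not the server's own patterns, and `find_env_vars` is a hypothetical name.

```python
import re

# Illustrative regexes, one or more per language; not the server's own.
ENV_PATTERNS = {
    "javascript": [r"process\.env\.([A-Za-z_][A-Za-z0-9_]*)"],
    "python": [
        r"os\.environ\[['\"]([A-Za-z_][A-Za-z0-9_]*)['\"]\]",
        r"os\.getenv\(\s*['\"]([A-Za-z_][A-Za-z0-9_]*)['\"]",
    ],
    "go": [r"os\.Getenv\(\s*\"([A-Za-z_][A-Za-z0-9_]*)\""],
}


def find_env_vars(source: str, language: str) -> set[str]:
    """Return the environment variable names referenced in `source`."""
    names: set[str] = set()
    for pattern in ENV_PATTERNS.get(language, []):
        names.update(re.findall(pattern, source))
    return names
```

A textual scan like this only finds literal names; a variable read through a computed key (for example `os.environ[name]`) would not be detected.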

Entry Points

Files named index, main, app, server, program, bootstrap, or startup (with any supported extension) are detected as entry points and annotated with their inferred role (e.g., "HTTP server bootstrap", "Electron main process").
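
The filename check can be sketched as below. The extension set and the case-insensitive comparison (so that `Program.cs` and `Startup.cs` match) are assumptions; role inference is not shown.

```python
from pathlib import Path

ENTRY_POINT_STEMS = {
    "index", "main", "app", "server", "program", "bootstrap", "startup",
}
# Assumption: one plausible extension set for the five supported languages.
SOURCE_EXTENSIONS = {".py", ".cs", ".ts", ".tsx", ".js", ".jsx", ".go"}


def is_entry_point(path: str) -> bool:
    """Return True when the file name matches a known entry point name."""
    p = Path(path)
    return (p.stem.lower() in ENTRY_POINT_STEMS
            and p.suffix.lower() in SOURCE_EXTENSIONS)
```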

Public API Surface

Tree-sitter parses each source file and extracts public symbols — classes, functions, methods, interfaces — filtering out private/protected members and underscore-prefixed symbols. For classes and structs, constructors (when they have parameters) and public properties are also included, revealing dependency injection patterns and data shapes. Interface methods are always included as they define the public contract. These are used to detect naming conventions, DI patterns, and export contracts across layers.

Architectural Distillation

For large codebases, the tool applies several heuristics to ensure the payload remains high-signal:

Boilerplate Suppression: Common directories like Migrations, bin, obj, and Properties are automatically flagged and collapsed in the project structure, preventing them from bloating the directory listing.
Low-Entropy Summarization: Files that primarily contain data structures (DTOs, Entities) with no logic methods are "minified". Instead of listing every property, the tool provides a high-level summary (e.g., "Contains 25 DTO classes").
Semantic Clustering: The aggregator groups these minified summaries at the directory level, allowing the consuming AI to understand entire data layers through a single line of signal.
Instruction Prioritization: Foundation mandates (instructions) are placed at the very top of the payload, ensuring the AI agent understands the project's "Rules of Engagement" before processing the code architecture.

Credits

AGENTS.md format based on the open agents.md standard.

