mcp-proxyml

Created by proxyml, a month ago

MCP server for the ProxyML API. Gives Claude (and other MCP clients) direct access to ProxyML's surrogate modelling and explainability tools.

Prerequisites

A ProxyML API key — sign up at proxyml.ai
uv (required for the uvx installation method). Install it with the standalone installer:

curl -LsSf https://astral.sh/uv/install.sh | sh

Or, if you use pip:

pip install uv

Installation

Claude Desktop

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "proxyml": {
      "command": "uvx",
      "args": ["mcp-proxyml"],
      "env": {
        "PROXYML_API_KEY": "your-api-key-here"
      }
    }
  }
}

The config file is at:

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json

Then restart Claude Desktop.
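
If you already have other servers configured, the entry has to be merged rather than pasted over the file. A minimal sketch of that merge, assuming the config is plain JSON with a top-level "mcpServers" object as shown above; the helper name add_proxyml_server is hypothetical and not part of mcp-proxyml:

```python
import json


def add_proxyml_server(config: dict, api_key: str) -> dict:
    """Return a copy of the config with the proxyml entry added.

    Existing entries under "mcpServers" are kept; the input is not mutated.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["proxyml"] = {
        "command": "uvx",
        "args": ["mcp-proxyml"],
        "env": {"PROXYML_API_KEY": api_key},
    }
    merged["mcpServers"] = servers
    return merged


# Example: merge into a config that already lists another (made-up) server.
existing = {"mcpServers": {"other": {"command": "other-server"}}}
print(json.dumps(add_proxyml_server(existing, "your-api-key-here"), indent=2))
```

Write the result back to the config path for your platform, then restart Claude Desktop.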

Claude Code

claude mcp add-json proxyml '{"command":"uvx","args":["mcp-proxyml"],"env":{"PROXYML_API_KEY":"your-api-key-here"}}'

Other MCP clients

pip install mcp-proxyml
PROXYML_API_KEY=your-key mcp-proxyml

Environment variables

Variable	Required	Description
`PROXYML_API_KEY`	Yes	Your ProxyML API key
`PROXYML_BASE_URL`	No	Overrides the API base URL (default: https://api.proxyml.ai)

Security & permissions

What this server can access

Filesystem — proxyml_infer_schema reads one CSV file at the path you provide. No other tool reads or writes local files.

Network — outbound HTTPS only, to https://api.proxyml.ai (or the URL set in PROXYML_BASE_URL). No other hosts are contacted.

Environment — only PROXYML_API_KEY and PROXYML_BASE_URL are read. No other environment variables are accessed.

No shell execution, no local ports, no other system resources.

What is sent to ProxyML's servers

Tool	What is transmitted
`proxyml_infer_schema`	Nothing — CSV is read and processed locally only
`proxyml_put_schema`	Feature schema: column names, types, and constraints
`proxyml_get_schema`	Schema name (string)
`proxyml_synthesize_data`	Schema name, sample count, and optionally one feature vector
`proxyml_train_surrogate`	Synthetic feature vectors + your model's predictions for those vectors
`proxyml_predict_batch`	Feature vectors you pass explicitly
`proxyml_explain_local` / `proxyml_explain_local_batch`	Feature vector(s) you pass explicitly
`proxyml_find_counterfactual`	One feature vector and a target label
`proxyml_list_surrogates` / `proxyml_get_summary` / `proxyml_export_surrogate` / `proxyml_diff_models` / `proxyml_detect_drift`	Version IDs and/or threshold values (no feature data)
`proxyml_get_usage`	Nothing

The typical workflow is designed so that raw training data never leaves your environment. proxyml_infer_schema derives column metadata from your CSV locally; proxyml_synthesize_data generates samples server-side from that schema; and proxyml_train_surrogate sends only those synthetic samples plus your model's predictions for them. Your original dataset is not transmitted.
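
To make the "metadata only" idea concrete, here is a rough sketch of what local schema inference could look like. It is not ProxyML's actual algorithm: the function name infer_schema and the output layout (type, min, max, categories) are assumptions for illustration only.

```python
import csv
import io


def infer_schema(csv_text: str) -> dict:
    """Derive per-column metadata from CSV text (hypothetical schema layout).

    Numeric columns are summarised by their range; everything else is
    treated as categorical. No rows are kept in the result.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        return {}
    schema = {}
    for name in rows[0].keys():
        values = [row[name] for row in rows]
        try:
            numbers = [float(value) for value in values]
        except ValueError:
            # At least one value is not a number: treat the column as categorical.
            schema[name] = {"type": "categorical", "categories": sorted(set(values))}
        else:
            schema[name] = {"type": "numeric", "min": min(numbers), "max": max(numbers)}
    return schema


sample = "age,income,segment\n34,42000,a\n51,58000,b\n"
print(infer_schema(sample))
```

The point of the sketch is that only column-level summaries survive; the individual rows are discarded before anything could be uploaded.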

If you use proxyml_explain_local, proxyml_explain_local_batch, proxyml_predict_batch, or proxyml_find_counterfactual with real data points rather than synthetic ones, those feature vectors will be sent to the ProxyML API.

Authentication

Your API key is sent on every request as an X-API-KEY HTTP header over TLS. It is read from the PROXYML_API_KEY environment variable and is never logged or written to disk by this server.
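
As a rough illustration of the behaviour described above, the sketch below reads the two documented environment variables and builds the header. The function name client_settings is hypothetical, and the server's real implementation may differ.

```python
import os

DEFAULT_BASE_URL = "https://api.proxyml.ai"


def client_settings(environ=None):
    """Return (base_url, headers) built from the two documented variables.

    `environ` defaults to os.environ; pass a dict to test without
    touching the real environment.
    """
    env = os.environ if environ is None else environ
    api_key = env.get("PROXYML_API_KEY")
    if not api_key:
        raise RuntimeError("PROXYML_API_KEY is not set")
    base_url = env.get("PROXYML_BASE_URL", DEFAULT_BASE_URL).rstrip("/")
    # The key travels only in the request header; it is never printed or stored.
    return base_url, {"X-API-KEY": api_key}


base_url, headers = client_settings({"PROXYML_API_KEY": "example-key"})
print(base_url, sorted(headers))
```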

Tools

Schema

Tool	Description
`proxyml_infer_schema`	Infer a feature schema from a local CSV file — no data sent to the server
`proxyml_get_schema`	Retrieve a stored schema by name
`proxyml_put_schema`	Upload or replace a feature schema

Training

Tool	Description
`proxyml_synthesize_data`	Generate synthetic samples from the stored schema
`proxyml_train_surrogate`	Train a linear surrogate on samples scored by your model
`proxyml_list_surrogates`	List trained surrogate models, newest first
`proxyml_predict_batch`	Get surrogate predictions for a list of instances

Explainability

Tool	Description
`proxyml_get_summary`	Feature importances and model summary
`proxyml_export_surrogate`	Full coefficient export for audit and governance
`proxyml_explain_local`	Per-feature contribution breakdown for a single instance
`proxyml_explain_local_batch`	Per-feature contributions for multiple instances in one call
`proxyml_find_counterfactual`	Find the nearest point that flips the prediction
`proxyml_diff_models`	Compare feature importances between two surrogate versions

CI/CD

Tool	Description
`proxyml_detect_drift`	Compare two versions and return a structured pass/fail against coefficient and fidelity thresholds

Account

Tool	Description
`proxyml_get_usage`	Current tier, request count, and quota — useful as a pre-flight check

Typical workflow

1. proxyml_infer_schema      — point at a CSV, get a schema back
2. proxyml_put_schema        — upload it
3. proxyml_synthesize_data   — generate synthetic samples
4. [score samples with your model]
5. proxyml_train_surrogate   — send samples + predictions, get a surrogate
6. proxyml_get_summary       — see which features drive predictions
7. proxyml_explain_local     — explain a specific decision
8. proxyml_find_counterfactual — find what would need to change

Steps 1–2 are one-time setup. Steps 3–5 can be repeated to retrain as your model changes; use proxyml_diff_models to compare versions.
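
Step 4 is the only step that runs on your side. A minimal sketch of it, assuming the synthetic samples arrive as a list of feature dictionaries (the exact shape returned by proxyml_synthesize_data is an assumption here) and using a toy stand-in for your model:

```python
def score_samples(samples, predict):
    """Apply `predict` to each sample, preserving order."""
    return [predict(sample) for sample in samples]


def toy_model(sample):
    """Stand-in for your real model: approve when income covers twice the loan."""
    return 1 if sample["income"] - 2 * sample["loan_amount"] > 0 else 0


# Pretend these came back from proxyml_synthesize_data.
samples = [
    {"income": 42000, "loan_amount": 15000},
    {"income": 20000, "loan_amount": 15000},
]
predictions = score_samples(samples, toy_model)
print(predictions)
```

The samples and predictions must stay aligned by position, since step 5 sends both together.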

Agentic workflows

Drift detection in CI/CD

proxyml_detect_drift is designed for use in deployment pipelines. It wraps proxyml_diff_models and applies thresholds to produce a structured pass/fail:

On model deployment:
1. proxyml_train_surrogate          — train surrogate on new model version
2. proxyml_detect_drift(a, b)       — compare against previous version
   → passed: false                  — block deployment or flag for review
   → passed: true                   — proceed

Thresholds can be tuned per use case:

proxyml_detect_drift(
  version_a="<previous>",
  version_b="<new>",
  coefficient_threshold=0.15,   # tighter for high-stakes models
  fidelity_threshold=0.02
)
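
The exact rule proxyml_detect_drift applies is not documented here. One plausible reading, shown purely as an assumption, is that the check fails when any coefficient moves by more than coefficient_threshold or fidelity drops by more than fidelity_threshold. The function drift_check and its result fields other than passed are hypothetical:

```python
def drift_check(coef_a, coef_b, fidelity_a, fidelity_b,
                coefficient_threshold=0.15, fidelity_threshold=0.02):
    """Compare two surrogate versions and return a pass/fail summary.

    Assumed rule: fail if any coefficient shifts by more than
    coefficient_threshold, or fidelity falls by more than fidelity_threshold.
    """
    features = set(coef_a) | set(coef_b)
    max_shift = max(
        (abs(coef_b.get(name, 0.0) - coef_a.get(name, 0.0)) for name in features),
        default=0.0,
    )
    fidelity_drop = fidelity_a - fidelity_b
    passed = max_shift <= coefficient_threshold and fidelity_drop <= fidelity_threshold
    return {
        "passed": passed,
        "max_coefficient_shift": max_shift,
        "fidelity_drop": fidelity_drop,
    }


previous = {"income": 0.50, "age": 0.10}
new = {"income": 0.90, "age": 0.10}
print(drift_check(previous, new, fidelity_a=0.90, fidelity_b=0.89)["passed"])
```

In a pipeline, the passed flag would typically map to the job's exit status so that a failed check blocks the deployment.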

Dev model validation without production data

Validate a model trained in a lower environment by comparing its predictions against a surrogate of the production model. No production data is required in the dev environment.

This workflow requires a step the MCP server can't do on its own (scoring with your dev model), but works naturally in Claude Code where the agent can execute code directly:

1. proxyml_synthesize_data(num_points=100)   → synthetic samples
2. [agent runs: dev_predictions = dev_model.predict(samples)]
3. proxyml_predict_batch(samples)            → surrogate predictions
4. [agent computes MAE and compares to tolerance]

Example prompt for Claude Code:

Using ProxyML, validate my dev model against the production surrogate.
Synthesize 100 samples from the "default" schema, score them with my model
at dev_model.predict(), get surrogate predictions with proxyml_predict_batch,
then compute the mean absolute error and tell me whether it's within 0.1.

The surrogate acts as a proxy for production behaviour: if the dev model agrees with it within tolerance, the dev model is likely behaving consistently with the production model.
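
The comparison in step 4 is a plain mean absolute error. A small sketch of what the agent might compute, with made-up prediction values and the 0.1 tolerance from the example prompt:

```python
def mean_absolute_error(predictions_a, predictions_b):
    """Average absolute difference between two equally long prediction lists."""
    if len(predictions_a) != len(predictions_b) or not predictions_a:
        raise ValueError("prediction lists must be non-empty and the same length")
    total = sum(abs(a - b) for a, b in zip(predictions_a, predictions_b))
    return total / len(predictions_a)


def within_tolerance(dev_predictions, surrogate_predictions, tolerance=0.1):
    """True when the dev model agrees with the surrogate within `tolerance`."""
    return mean_absolute_error(dev_predictions, surrogate_predictions) <= tolerance


dev = [0.20, 0.50, 0.90]        # illustrative dev-model outputs
surrogate = [0.25, 0.45, 0.80]  # illustrative surrogate outputs
print(within_tolerance(dev, surrogate))
```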

Counterfactual investigation

When a model makes a decision that needs explaining — a rejected loan application, a flagged transaction, a declined insurance quote — chain proxyml_explain_local and proxyml_find_counterfactual to answer both "why?" and "what would need to change?":

1. proxyml_explain_local(instance)              → which features drove this decision
2. proxyml_find_counterfactual(instance, target) → nearest point that flips it

Example prompt:

My model rejected this application: [age=34, income=42000, loan_amount=15000, ...].
Using ProxyML, explain why it was rejected and find the minimum changes that
would result in an approval. Highlight which changes are realistic given that
age is immutable.

Claude will call proxyml_explain_local to surface the top contributing features, then proxyml_find_counterfactual with the target outcome, and interpret the difference in plain language.
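
Because the surrogate is linear, both questions have simple closed forms. The sketch below is an illustration of the idea rather than ProxyML's actual method: it assumes a binary decision where a score above zero means approval, features that are already scaled to comparable units, and Euclidean distance as the notion of "nearest". All function names and numbers are hypothetical.

```python
def local_contributions(weights, instance):
    """Per-feature contribution of a linear model: weight times value."""
    return {name: weights[name] * instance[name] for name in weights}


def score(weights, bias, instance):
    """Linear score; a positive value is read as 'approve' in this sketch."""
    return bias + sum(local_contributions(weights, instance).values())


def nearest_counterfactual(weights, bias, instance, immutable=(), margin=1e-6):
    """Smallest Euclidean change to mutable features that flips the sign of the score."""
    mutable = {name: w for name, w in weights.items() if name not in immutable}
    norm_sq = sum(w * w for w in mutable.values())
    if norm_sq == 0:
        raise ValueError("no mutable feature influences the score")
    current = score(weights, bias, instance)
    # Aim just past the decision boundary, on the opposite side.
    target = margin if current <= 0 else -margin
    step = (target - current) / norm_sq
    changed = dict(instance)
    for name, w in mutable.items():
        changed[name] = instance[name] + step * w
    return changed


weights = {"age": 0.5, "income": 2.0, "loan_amount": -1.0}
bias = -3.0
applicant = {"age": 1.0, "income": 1.0, "loan_amount": 1.5}  # scaled features

print(local_contributions(weights, applicant))
print(nearest_counterfactual(weights, bias, applicant, immutable=("age",)))
```

Passing age as immutable keeps it fixed, so the suggested change falls entirely on income and loan_amount.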

Iterative surrogate improvement

When proxyml_train_surrogate returns a low fidelity warning or other training diagnostic, the agent can use it to guide the next iteration rather than stopping:

1. proxyml_train_surrogate(samples, predictions)
   → warning: "Surrogate fidelity is low (R²=0.52)..."
2. proxyml_synthesize_data(num_points=500)       — increase sample count
3. [re-score with model]
4. proxyml_train_surrogate(larger_samples, predictions)
5. proxyml_detect_drift(v1, v2)                  — confirm improvement, not regression

Example prompt:

Train a surrogate for my regression model using the "default" schema with 200
samples. If fidelity is below 0.7, keep doubling the sample count and retraining
until it passes or you reach 1600 samples. Use proxyml_detect_drift after each
retrain to confirm the model is improving rather than just changing.

The training warnings (convergence, sparsity, class imbalance, high correlation) are designed to be actionable: the agent can read them and decide whether to adjust num_points, revisit the schema, or flag for human review.
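
The doubling strategy from the example prompt can be sketched as a small loop. Here train is a hypothetical callable that wraps synthesize, score, and train for a given sample count and returns the reported fidelity; the fidelity values below are invented for the demonstration.

```python
def train_until_fidelity(train, start=200, limit=1600, target=0.7):
    """Double the sample count until fidelity reaches `target` or `limit` is hit.

    Returns the list of (sample_count, fidelity) attempts, oldest first.
    """
    history = []
    count = start
    while True:
        fidelity = train(count)
        history.append((count, fidelity))
        if fidelity >= target or count * 2 > limit:
            return history
        count *= 2


# Stand-in for the real synthesize/score/train round trip.
fake_results = {200: 0.52, 400: 0.61, 800: 0.74, 1600: 0.80}
print(train_until_fidelity(fake_results.__getitem__))
```

If the last attempt in the returned history is still below the target, that is the point to revisit the schema or hand over to a human.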

Governance report

Claude can generate a governance report from existing tools without a dedicated endpoint. Example prompt:

Using ProxyML, generate a governance report for surrogate version <id>.
Include: task type, training date, fidelity metrics, top 5 features by importance,
any warnings from training, and a plain-English summary of what drives predictions.
Format it as a structured document suitable for attaching to a deployment ticket.

Claude will call proxyml_get_summary (and proxyml_list_surrogates to find metadata) and compose the report.
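
For a sense of what the composed report might contain, here is a sketch that formats a summary into plain text. The field names (task_type, fidelity, feature_importances, warnings) are assumptions about what proxyml_get_summary returns, not a documented response format.

```python
def governance_report(version_id, summary, top_n=5):
    """Format a surrogate summary as a short plain-text report."""
    ranked = sorted(
        summary["feature_importances"].items(),
        key=lambda item: abs(item[1]),
        reverse=True,
    )[:top_n]
    lines = [
        f"Governance report for surrogate {version_id}",
        f"Task type: {summary['task_type']}",
        f"Fidelity: {summary['fidelity']:.2f}",
        "Top features by importance:",
    ]
    lines += [
        f"  {rank}. {name} ({value:+.3f})"
        for rank, (name, value) in enumerate(ranked, start=1)
    ]
    warnings = summary.get("warnings") or ["none"]
    lines.append("Training warnings: " + "; ".join(warnings))
    return "\n".join(lines)


example_summary = {
    "task_type": "classification",
    "fidelity": 0.91,
    "feature_importances": {"income": 0.42, "loan_amount": -0.31, "age": 0.05},
    "warnings": [],
}
print(governance_report("<id>", example_summary))
```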


Project Info

Created: a month ago
Updated: 6 days ago
Author: proxyml