Sail MCP Server for Spark SQL

Created By

lakehq, a year ago

Sail is an open-source computation framework that serves as a drop-in replacement for Apache Spark (SQL and DataFrame API) in both single-host and distributed settings. The built-in MCP server in Sail exposes tools for LLM agents to register datasets and execute Spark SQL queries.
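As a rough illustration of how an agent would discover those tools, the sketch below builds the JSON-RPC messages that the Model Context Protocol defines for the opening handshake and for listing tools over the stdio transport. It only constructs the messages and does not assume any Sail tool names. The client name, client version, and protocol version string are placeholder assumptions, not values taken from Sail.

```python
import json

# Assumption: a protocol revision that the server accepts; adjust as needed.
PROTOCOL_VERSION = "2024-11-05"


def frame(message: dict) -> str:
    """Serialize one JSON-RPC message as a single newline-terminated line,
    which is how the MCP stdio transport delimits messages."""
    return json.dumps(message, separators=(",", ":")) + "\n"


def handshake_messages(client_name: str = "example-client") -> list:
    """Return the messages a client sends to initialize and then list tools.

    A real client must wait for the server's reply to `initialize`
    before sending the `notifications/initialized` notification.
    """
    return [
        {
            "jsonrpc": "2.0",
            "id": 1,
            "method": "initialize",
            "params": {
                "protocolVersion": PROTOCOL_VERSION,
                "capabilities": {},
                # Hypothetical client identity, for illustration only.
                "clientInfo": {"name": client_name, "version": "0.0.1"},
            },
        },
        # Notifications carry no "id" because no response is expected.
        {"jsonrpc": "2.0", "method": "notifications/initialized"},
        {"jsonrpc": "2.0", "id": 2, "method": "tools/list"},
    ]
```

Writing each framed message to the standard input of the MCP server process, and reading one line per response from its standard output, should return the tool list that the server actually advertises.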

# sail

# spark


Overview

What is Sail?

Sail is a unified platform designed for stream processing, batch processing, and compute-intensive workloads, including AI tasks. It serves as a drop-in replacement for Spark SQL and the Spark DataFrame API, functioning in both single-host and distributed environments.

How to use Sail?

To use Sail, install it with pip (pip install "pysail[spark]"), or build it from source for optimized performance. Then start the Sail server from the command line or through the Python API, or deploy it on Kubernetes for distributed processing.
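A minimal sketch of the Python API route follows, assuming pysail[spark] is installed. The SparkConnectServer class, its start(background=...) method, and port 50051 follow the pattern recalled from the Sail project documentation; treat them as assumptions and check them against the version you install. The helper function names are hypothetical.

```python
def remote_url(host: str = "localhost", port: int = 50051) -> str:
    """Build the Spark Connect URL that PySpark uses to reach a Sail server."""
    return f"sc://{host}:{port}"


def start_local_server(port: int = 50051):
    """Start a Sail Spark Connect server in the background.

    The import is done lazily so this module still loads on a machine
    where pysail is not installed.
    """
    from pysail.spark import SparkConnectServer  # assumption: pysail API

    server = SparkConnectServer(port=port)
    server.start(background=True)
    return server


def connect(host: str = "localhost", port: int = 50051):
    """Open a PySpark session against the Sail server instead of a Spark cluster."""
    from pyspark.sql import SparkSession

    return SparkSession.builder.remote(remote_url(host, port)).getOrCreate()


# Usage (requires pysail[spark]):
# server = start_local_server()
# spark = connect()
# spark.sql("SELECT 1 + 1").show()
```

The only Sail-specific part of the client code is the remote URL; the rest is ordinary PySpark, which is what the drop-in replacement claim amounts to in practice.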

Key features of Sail

Unified processing for stream, batch, and AI workloads.
Drop-in replacement for Spark SQL and DataFrame API.
Supports local and distributed server setups.
Easy integration with PySpark.

Use cases of Sail

Real-time data analytics and processing.
Batch processing of large datasets.
AI model training and inference in a distributed environment.

FAQ about Sail

Is Sail compatible with existing Spark applications?

Yes! Sail is designed to be a drop-in replacement for Spark SQL and DataFrame API.

Can I run Sail on Kubernetes?

Yes! Sail can be deployed on Kubernetes for distributed processing.

What support options are available for Sail?

LakeSail offers flexible enterprise support options for Sail.


Server Config

{
  "mcpServers": {
    "sail": {
      "command": "sail",
      "args": [
        "spark",
        "mcp-server",
        "--transport",
        "stdio"
      ]
    }
  }
}
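If the MCP client already has a configuration file with other servers in it, the entry above needs to be merged in without overwriting them. The following is a small sketch of that merge using only the standard library. The function names are hypothetical, and the location of the client's configuration file is left to the caller because it differs between clients.

```python
import json
from pathlib import Path


def sail_entry() -> dict:
    """Return a fresh copy of the Sail server entry from the config above."""
    return {
        "command": "sail",
        "args": ["spark", "mcp-server", "--transport", "stdio"],
    }


def add_sail_server(config: dict) -> dict:
    """Return a new config with the "sail" server added under "mcpServers".

    Other top-level keys and other servers are preserved, and the input
    dictionary is not modified.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["sail"] = sail_entry()
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path) -> None:
    """Merge the Sail entry into the JSON config file at `path`,
    creating the file if it does not exist yet."""
    config = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(add_sail_server(config), indent=2) + "\n")
```

Calling update_config_file with the path your client reads its MCP settings from registers the server; the sail command must be on the PATH of the process that launches it.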

Project Info

Created At

a year ago

Updated At

a year ago

Author Name

lakehq

