Local Speech-to-Text MCP Server

Created By

SmartLittleApps, a year ago

A high-performance Model Context Protocol (MCP) server providing local speech-to-text transcription using whisper.cpp, optimized for Apple Silicon.

# apple

# mcp


Overview

What is Local Speech-to-Text MCP Server?

Local Speech-to-Text MCP Server is a high-performance Model Context Protocol (MCP) server that provides local speech-to-text transcription using whisper.cpp, specifically optimized for Apple Silicon devices.

How to use Local Speech-to-Text MCP Server?

To use the server, clone the repository from GitHub, install its dependencies, and configure your MCP client to connect to the server. Once connected, the client can ask the server to transcribe audio files in formats such as WAV, FLAC, MP3, and M4A.
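
The client configuration step can be sketched as follows. This is a minimal sketch, assuming the client reads the `mcpServers` JSON layout used by common MCP clients such as Claude Desktop and that the server is started with Node over stdio. The server name `local-stt` and the entry-point path are hypothetical placeholders, not values from the repository; check its README for the real ones.

```typescript
// Shape of one entry under "mcpServers" in a typical MCP client config file.
interface McpServerEntry {
  command: string;
  args: string[];
  env?: Record<string, string>;
}

interface McpClientConfig {
  mcpServers: Record<string, McpServerEntry>;
}

// Build a config that launches the server with Node.
// `serverName` and `entryPoint` are hypothetical; take the real values
// from the repository's README.
function buildClientConfig(
  serverName: string,
  entryPoint: string,
  env: Record<string, string> = {}
): McpClientConfig {
  const entry: McpServerEntry = { command: "node", args: [entryPoint] };
  // Only emit "env" when there is something to pass through.
  if (Object.keys(env).length > 0) {
    entry.env = env;
  }
  return { mcpServers: { [serverName]: entry } };
}

const clientConfig = buildClientConfig(
  "local-stt",                     // hypothetical server name
  "/path/to/server/dist/index.js"  // hypothetical build output path
);

// Paste the printed JSON into the MCP client's configuration file.
const clientConfigJson = JSON.stringify(clientConfig, null, 2);
console.log(clientConfigJson);
```

Building the object in code rather than hand-writing JSON makes it harder to produce a malformed config file, which most clients reject silently.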

What are the key features of Local Speech-to-Text MCP Server?

100% local processing for complete privacy
Optimized for Apple Silicon, transcribing at more than 15x real-time speed
Speaker diarization to identify and separate multiple speakers
Universal audio support with automatic conversion from various formats
Multiple output formats, including txt, json, vtt, srt, and csv
Low memory footprint of less than 2 GB
Full TypeScript support for modern development
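
To make the srt output format listed above concrete, here is a minimal sketch of rendering timed transcript segments as SubRip text. The `Segment` shape, including the optional `speaker` label, is a hypothetical illustration and not the server's actual data model; only the SRT layout itself (index, `HH:MM:SS,mmm --> HH:MM:SS,mmm`, text, blank line) is standard.

```typescript
// Hypothetical segment shape; the server's real output type may differ.
interface Segment {
  startMs: number;
  endMs: number;
  text: string;
  speaker?: string; // e.g. a label produced by speaker diarization
}

// Left-pad a non-negative integer with zeros to the given width.
function zeroPad(value: number, width: number): string {
  let s = String(value);
  while (s.length < width) {
    s = "0" + s;
  }
  return s;
}

// Format milliseconds as an SRT timestamp: HH:MM:SS,mmm
function toSrtTimestamp(ms: number): string {
  const total = Math.max(0, Math.round(ms));
  const hours = Math.floor(total / 3600000);
  const minutes = Math.floor((total % 3600000) / 60000);
  const seconds = Math.floor((total % 60000) / 1000);
  const millis = total % 1000;
  return (
    zeroPad(hours, 2) + ":" + zeroPad(minutes, 2) + ":" +
    zeroPad(seconds, 2) + "," + zeroPad(millis, 3)
  );
}

// Render segments as SRT cues, numbered from 1 and separated by blank lines.
function toSrt(segments: Segment[]): string {
  return segments
    .map((seg, i) => {
      const label = seg.speaker ? seg.speaker + ": " : "";
      const timing = toSrtTimestamp(seg.startMs) + " --> " + toSrtTimestamp(seg.endMs);
      return (i + 1) + "\n" + timing + "\n" + label + seg.text.trim() + "\n";
    })
    .join("\n");
}

console.log(
  toSrt([
    { startMs: 0, endMs: 1500, text: "Welcome, everyone.", speaker: "SPEAKER_00" },
    { startMs: 1500, endMs: 4200, text: "Thanks for joining." },
  ])
);
```

The vtt format differs mainly in its `WEBVTT` header line and its use of a period instead of a comma before the milliseconds, so the same segment list can feed both renderers.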

What are the use cases of Local Speech-to-Text MCP Server?

Transcribing meetings or lectures for documentation.
Creating subtitles for videos from audio content.
Assisting in accessibility by providing text for spoken content.

FAQ for Local Speech-to-Text MCP Server

Is the transcription process cloud-based?

No, all processing is done locally, ensuring privacy.

What audio formats are supported?

The server supports WAV, FLAC, MP3, M4A, and more, with automatic conversion capabilities.
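
The automatic conversion mentioned above might look like the sketch below. It assumes ffmpeg as the converter, which this page does not state, and targets the 16 kHz mono 16-bit PCM WAV input that whisper.cpp has traditionally expected. The helper names are hypothetical, and the code only builds the argument list without running anything.

```typescript
// Extensions named in the answer above; the server may accept more.
const KNOWN_EXTENSIONS = ["wav", "flac", "mp3", "m4a"];

// Return the lowercase extension of a path, or "" if it has none.
function extensionOf(filePath: string): string {
  const fileName = filePath.slice(filePath.lastIndexOf("/") + 1);
  const dot = fileName.lastIndexOf(".");
  return dot <= 0 ? "" : fileName.slice(dot + 1).toLowerCase();
}

// True when the file has one of the extensions listed in the FAQ.
function isKnownAudioFile(filePath: string): boolean {
  return KNOWN_EXTENSIONS.indexOf(extensionOf(filePath)) !== -1;
}

// Arguments for an ffmpeg call that resamples to 16 kHz, downmixes to mono,
// and encodes 16-bit PCM. Assumption: ffmpeg is installed and is the tool
// used for conversion.
function buildFfmpegArgs(inputPath: string, outputWavPath: string): string[] {
  return ["-i", inputPath, "-ar", "16000", "-ac", "1", "-c:a", "pcm_s16le", outputWavPath];
}

const inputFile = "/recordings/meeting.M4A"; // hypothetical input file
if (isKnownAudioFile(inputFile)) {
  console.log("ffmpeg " + buildFfmpegArgs(inputFile, "/tmp/meeting.wav").join(" "));
}
```

Keeping the argument list as an array rather than a single string avoids shell-quoting problems if it is later passed to a process-spawning API.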

Do I need a HuggingFace account for speaker diarization?

Yes, a HuggingFace token is required for speaker diarization functionality.

Project Info

Created At

a year ago

Updated At

a year ago

Author Name

SmartLittleApps

Language

TypeScript

License

MIT license
