Datahub

Created By
acryldata, a year ago
Overview

What is Datahub?

Datahub is a Model Context Protocol (MCP) server for DataHub that lets AI agents query metadata and context about your data ecosystem.

How to use Datahub?

To use Datahub, authenticate against your DataHub instance in one of two ways: run datahub init to create a global ~/.datahubenv file, or set the DATAHUB_GMS_URL and DATAHUB_GMS_TOKEN environment variables shown under Server Config.
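As a rough pre-flight check, the sketch below reports which of the two authentication options is available before the server is launched. The variable names DATAHUB_GMS_URL and DATAHUB_GMS_TOKEN come from the Server Config on this page; the helper name detect_auth_source and the rule that environment variables are checked before the file are assumptions made for illustration, not documented behavior of the server.

```python
import os
from pathlib import Path

# Variable names as they appear in the Server Config on this page.
REQUIRED_VARS = ("DATAHUB_GMS_URL", "DATAHUB_GMS_TOKEN")


def detect_auth_source(env=None, home=None):
    """Return "env", "file", or None (hypothetical helper, for illustration).

    Assumption: environment variables are checked before ~/.datahubenv.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    # Both variables must be present and non-empty to count as configured.
    if all(env.get(name) for name in REQUIRED_VARS):
        return "env"
    # Otherwise fall back to the file that `datahub init` creates.
    if (home / ".datahubenv").is_file():
        return "file"
    return None


if __name__ == "__main__":
    source = detect_auth_source()
    print(f"credentials found via: {source or 'nothing configured'}")
```

If the check finds nothing, either run datahub init or export the two variables before starting the server.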

Key features of Datahub

  • Search across all entity types with arbitrary filters.
  • Fetch metadata for any entity.
  • Traverse the lineage graph, both upstream and downstream.
  • List SQL queries associated with a dataset.
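To make the lineage feature concrete, here is a toy breadth-first traversal over a small in-memory graph. It only illustrates what walking lineage upstream and downstream means; the dataset names, the DOWNSTREAM mapping, and the invert and traverse helpers are all hypothetical and do not reflect how the server or DataHub implements lineage.

```python
from collections import deque

# Hypothetical lineage edges: each key feeds the datasets listed as its value.
DOWNSTREAM = {
    "raw.orders": ["staging.orders"],
    "staging.orders": ["marts.revenue", "marts.customers"],
    "marts.revenue": [],
    "marts.customers": [],
}


def invert(edges):
    """Build the upstream view (consumer -> producers) from downstream edges."""
    upstream = {node: [] for node in edges}
    for producer, consumers in edges.items():
        for consumer in consumers:
            upstream.setdefault(consumer, []).append(producer)
    return upstream


def traverse(edges, start, max_hops=None):
    """Breadth-first walk returning {dataset: hop distance}, excluding start."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        # Stop expanding once the hop limit is reached.
        if max_hops is not None and seen[node] >= max_hops:
            continue
        for neighbour in edges.get(node, []):
            if neighbour not in seen:
                seen[neighbour] = seen[node] + 1
                queue.append(neighbour)
    del seen[start]
    return seen


if __name__ == "__main__":
    print("downstream:", traverse(DOWNSTREAM, "staging.orders"))
    print("upstream:  ", traverse(invert(DOWNSTREAM), "staging.orders"))
```

The same walk serves both directions: downstream follows the edges as given, while upstream follows the inverted edges.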

Use cases of Datahub

  1. Enabling AI agents to understand data context and metadata.
  2. Facilitating data lineage tracking for compliance and auditing.
  3. Assisting data scientists in querying and retrieving relevant data efficiently.

FAQ

  • What is the purpose of Datahub?

Datahub exposes the metadata held in your DataHub instance to AI agents, so they can query and understand your data ecosystem.

  • How do I authenticate with Datahub?

You can authenticate using the datahub init command or by setting environment variables for your DataHub instance.

  • Is Datahub compatible with both OSS and Cloud versions?

Yes! Datahub supports both DataHub OSS and DataHub Cloud.

Server Config

{
  "mcpServers": {
    "datahub": {
      "command": "uvx",
      "args": [
        "mcp-server-datahub"
      ],
      "env": {
        "DATAHUB_GMS_URL": "<your-datahub-url>",
        "DATAHUB_GMS_TOKEN": "<your-datahub-token>"
      }
    }
  }
}
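If you prefer to generate this file rather than edit it by hand, the following sketch builds the same structure and prints it as JSON. The command, args, and environment variable names are copied from the config above; the function name build_server_config and the example URL http://localhost:8080 are assumptions for illustration only.

```python
import json


def build_server_config(gms_url, gms_token):
    """Return the mcpServers entry shown above with the given values filled in."""
    return {
        "mcpServers": {
            "datahub": {
                "command": "uvx",
                "args": ["mcp-server-datahub"],
                "env": {
                    "DATAHUB_GMS_URL": gms_url,
                    "DATAHUB_GMS_TOKEN": gms_token,
                },
            }
        }
    }


if __name__ == "__main__":
    # Placeholder values; substitute your own instance URL and token.
    config = build_server_config("http://localhost:8080", "<your-datahub-token>")
    print(json.dumps(config, indent=2))
```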

Project Info

Created At: a year ago
Updated At: a year ago
Author Name: acryldata
