//beforeyouship — LLM Cost Modeling From Your Editor
@Indiegoing
Query realistic LLM cost models without leaving your editor. beforeyouship models
the **true monthly cost** of an LLM app architecture — retries, prompt caching,
batch discounts, infra overhead, and 3×/10× growth — across GPT-5.x, Claude,
Gemini, DeepSeek, and more. Not a token calculator: a planning tool for the
design phase, before you commit to a stack.
**No API key needed to try it** — demo mode covers the six free-tier models.
A Pro key from [beforeyouship.dev](https://beforeyouship.dev) unlocks the full
18-model catalog.
## What you can ask
- "How much will a RAG chatbot cost at 10,000 requests/day?"
- "Compare Claude Haiku vs Gemini Flash pricing for my workload"
- "What's the cheapest model for a multi-step agent at scale?"
- "Show me current per-token prices for Anthropic models"
## Tools
### `estimate_cost`
Full cost model for an architecture at a given usage level. Returns
Naive / Realistic / Worst Case monthly cost per model, 3×/10× growth
scenarios, and an opinionated recommendation with reasoning.
### `get_model_prices`
Current per-1M-token pricing — input, output, cached input, batch — with
context windows and staleness metadata.
### `list_archetypes`
Seven preset architecture patterns (simple chatbot, chatbot with history,
RAG pipeline, multi-model router, coding assistant, document processor,
multi-step agent) used as starting points for estimates.
## Setup
**Claude Code:**
```bash
claude mcp add --transport http beforeyouship https://beforeyouship.dev/api/mcp
```
**Cursor / other clients** — add a remote server:
```json
{
"mcpServers": {
"beforeyouship": {
"type": "streamable-http",
"url": "https://beforeyouship.dev/api/mcp"
}
}
}
```
Add an `Authorization: Bearer bys_...` header with a Pro key for the full catalog.
## Try it
> Estimate the monthly cost of a RAG pipeline at 10,000 requests/day