MCP Integration
Industry-standard config on the outside. Cost-controlled architecture on the inside.
The Model Context Protocol (MCP) is an open
protocol led by Anthropic that lets AI agents plug into external tools —
filesystems, GitHub, databases, design files — through a single config file.
Claude Desktop, Cursor, Zed, and others speak it; hundreds of off-the-shelf
MCP servers already exist.
OpenClacky speaks MCP too. The config format is identical to Claude Desktop
and Cursor. But the architecture underneath is different — because we want
to fix the token-explosion problem that most MCP integrations have.
What everyone else pays for
The straightforward approach is to dump every tool schema from every MCP
server into the system prompt:
- A typical GitHub MCP server is around 6,000 tokens of tool schema
- Plug in three or four servers and your system prompt blows past 30,000 tokens
- That's a fixed cost paid on every single turn
- And once your system prompt is that long, a single cache miss is brutally expensive
A lot of users excitedly install five or six MCP servers, then discover their
bill has tripled. This is why.
OpenClacky's approach: one constant tool, on-demand catalogs
Our goal: identical user config experience, but main system prompt cost is
essentially independent of how many servers you have.
Three things happen together:
1. The main agent gets exactly one extra tool: mcp_call
If mcp.json contains any servers at all, the main agent registers a single
MCP-related tool:
mcp_call(server, tool, arguments)
Its schema doesn't grow with the number of servers. Whether you have 1 or
50 MCP servers, this tool always costs about 80 tokens in the main context.
Zero-MCP users don't see the tool at all. Cost: 0.
2. Each server is a one-line summary
In the agent's Skill list, every MCP server appears as one line:
mcp:github — Search repos, read issues, open PRs on GitHub.
About 50 tokens per server, regardless of whether the server exposes 5 tools
or 50.
3. Tool catalogs load on demand — and only into a subagent
When the main agent decides to use a server, it calls:
invoke_skill("mcp:github", "<task>")
That forks a subagent. The subagent starts up, and only then does the
server's full tool list (each tool's inputSchema) get injected — as the
first user message in the subagent's history. The subagent uses that schema
to construct mcp_call invocations and returns results to the parent.
The important parts:
- The main agent's context is never polluted with tool schemas
- The subagent has full, precise tool definitions and won't guess parameters
- Parent and child agents share the exact same tool registry, so prompt cache is preserved across forks
Cost comparison
Suppose you've installed 5 MCP servers, each exposing ~10 tools:
| Naive approach | OpenClacky | |
|---|---|---|
| Main system-prompt addition | ~30,000 tokens | ~330 tokens |
| Per-turn fixed cost | 30,000 × turns | 330 × turns |
| Adding a 6th server | +6,000 tokens | +50 tokens |
| Cost when MCP is unused | 0 | 0 |
The numbers themselves aren't magic — we just put the schemas in the right
place. But over a long session, the difference compounds into an order of
magnitude.
How to configure
Same format as Claude Desktop / Cursor. Drop this in ~/.clacky/mcp.json:
{
"mcpServers": {
"filesystem": {
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-filesystem", "/Users/me/projects"],
"description": "Read/write files inside the projects folder."
},
"github": {
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-github"],
"env": { "GITHUB_PERSONAL_ACCESS_TOKEN": "ghp_xxx" },
"description": "Search repos, read issues, open PRs."
}
}
}
Project-level config also works: put .clacky/mcp.json at your project root
and it only applies inside that project.
The only OpenClacky-specific field is description — that's what the main
agent sees, and a clear one-liner significantly improves how accurately it
picks the right server.
A few details worth knowing
Lazy process startup. MCP server processes are not spawned when OpenClacky
boots. The first actual call to a server triggers spawn; idle servers shut
down after 5 minutes. This is consistent with OpenClacky's "no gateway"
philosophy — we don't run processes you aren't using. See the
Tech Deep Dive for context.
Config layering. ~/.clacky/mcp.json (global) and
<project>/.clacky/mcp.json (per-project) both get loaded. Project-level
overrides global on name collision.
Slash commands. Every MCP server gets an automatic /mcp-<name> slash
command, callable directly from the CLI or Web UI.
Relationship to Skills. Internally, an MCP server is a synthesized virtual
Skill. Skills are OpenClacky's core abstraction (see the
Tech Deep Dive), and MCP rides that abstraction
instead of bolting on a parallel one. That's a big part of why the
main-context cost stays this small.
Recommended setup
Light use (no impact on main-context speed): 1–3 servers, picking your
highest-frequency ones — typically filesystem + GitHub.
Heavy use: feel free to install 10+ servers. OpenClacky's architecture
makes "lots of MCP servers" no longer a tax — the main context grows linearly
at ~50 tokens per server, so even 20 servers only adds ~1,000 tokens.