MCP Integration

Industry-standard config on the outside. Cost-controlled architecture on the inside.

The Model Context Protocol (MCP) is an open
protocol led by Anthropic that lets AI agents plug into external tools —
filesystems, GitHub, databases, design files — through a single config file.
Claude Desktop, Cursor, Zed, and others speak it; hundreds of off-the-shelf
MCP servers already exist.

OpenClacky speaks MCP too. The config format is identical to Claude Desktop
and Cursor. But the architecture underneath is different — because we want
to fix the token-explosion problem that most MCP integrations have.

What everyone else pays for

The straightforward approach is to dump every tool schema from every MCP
server into the system prompt:

A typical GitHub MCP server is around 6,000 tokens of tool schema
Plug in three or four servers and your system prompt blows past 30,000 tokens
That's a fixed cost paid on every single turn
And once your system prompt is that long, a single cache miss is brutally expensive

A lot of users excitedly install five or six MCP servers, then discover their
bill has tripled. This is why.

OpenClacky's approach: one constant tool, on-demand catalogs

Our goal: identical user config experience, but main system prompt cost is
essentially independent of how many servers you have.

Three things happen together:

1. The main agent gets exactly one extra tool: `mcp_call`

If mcp.json contains any servers at all, the main agent registers a single
MCP-related tool:

mcp_call(server, tool, arguments)

Its schema doesn't grow with the number of servers. Whether you have 1 or
50 MCP servers, this tool always costs about 80 tokens in the main context.

Zero-MCP users don't see the tool at all. Cost: 0.

2. Each server is a one-line summary

In the agent's Skill list, every MCP server appears as one line:

mcp:github — Search repos, read issues, open PRs on GitHub.

About 50 tokens per server, regardless of whether the server exposes 5 tools
or 50.

3. Tool catalogs load on demand — and only into a subagent

When the main agent decides to use a server, it calls:

invoke_skill("mcp:github", "<task>")

That forks a subagent. The subagent starts up, and only then does the
server's full tool list (each tool's inputSchema) get injected — as the
first user message in the subagent's history. The subagent uses that schema
to construct mcp_call invocations and returns results to the parent.

The important parts:

The main agent's context is never polluted with tool schemas
The subagent has full, precise tool definitions and won't guess parameters
Parent and child agents share the exact same tool registry, so prompt cache is preserved across forks

Cost comparison

Suppose you've installed 5 MCP servers, each exposing ~10 tools:

	Naive approach	OpenClacky
Main system-prompt addition	~30,000 tokens	~330 tokens
Per-turn fixed cost	30,000 × turns	330 × turns
Adding a 6th server	+6,000 tokens	+50 tokens
Cost when MCP is unused	0	0

The numbers themselves aren't magic — we just put the schemas in the right
place. But over a long session, the difference compounds into an order of
magnitude.

How to configure

Same format as Claude Desktop / Cursor. Drop this in ~/.clacky/mcp.json:

{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/Users/me/projects"],
      "description": "Read/write files inside the projects folder."
    },
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": { "GITHUB_PERSONAL_ACCESS_TOKEN": "ghp_xxx" },
      "description": "Search repos, read issues, open PRs."
    }
  }
}

Project-level config also works: put .clacky/mcp.json at your project root
and it only applies inside that project.

The only OpenClacky-specific field is description — that's what the main
agent sees, and a clear one-liner significantly improves how accurately it
picks the right server.

A few details worth knowing

Lazy process startup. MCP server processes are not spawned when OpenClacky
boots. The first actual call to a server triggers spawn; idle servers shut
down after 5 minutes. This is consistent with OpenClacky's "no gateway"
philosophy — we don't run processes you aren't using. See the
Harness Engineering for context.

Config layering. ~/.clacky/mcp.json (global) and
<project>/.clacky/mcp.json (per-project) both get loaded. Project-level
overrides global on name collision.

Slash commands. Every MCP server gets an automatic /mcp-<name> slash
command, callable directly from the CLI or Web UI.

Relationship to Skills. Internally, an MCP server is a synthesized virtual
Skill. Skills are OpenClacky's core abstraction (see the
Harness Engineering), and MCP rides that abstraction
instead of bolting on a parallel one. That's a big part of why the
main-context cost stays this small.

Recommended setup

Light use (no impact on main-context speed): 1–3 servers, picking your
highest-frequency ones — typically filesystem + GitHub.

Heavy use: feel free to install 10+ servers. OpenClacky's architecture
makes "lots of MCP servers" no longer a tax — the main context grows linearly
at ~50 tokens per server, so even 20 servers only adds ~1,000 tokens.

See a complete mcp.json example → · Back to Core Features →