Harness Benchmark · 2026-04-30

On par with Claude Code,
well ahead of the rest.

Three tasks, four agents, identical conditions.

3-task totals

The chart first

Total cost, request count, and cache hit rate for all four agents under identical prompt, model, and skill. Numbers come from per-request OpenRouter CSV logs.

Total cost
Per-request OpenRouter billing CSV · 3 tasks × 4 agents, summed request by request · completed on 2026-04-30 within the same time window

OpenClacky (this project): $5.10
Claude Code: $5.49
OpenClaw: $15.70
Hermes: $30.14
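
As a quick check on the totals above, the sketch below expresses each agent's cost as a multiple of OpenClacky's. The dollar figures are copied from the chart; the variable names are illustrative only.

```python
# Total cost per agent in USD, copied from the chart above.
total_cost = {
    "OpenClacky": 5.10,
    "Claude Code": 5.49,
    "OpenClaw": 15.70,
    "Hermes": 30.14,
}

# Express every total as a multiple of the OpenClacky total.
baseline = total_cost["OpenClacky"]
ratios = {agent: cost / baseline for agent, cost in total_cost.items()}

for agent, ratio in ratios.items():
    print(f"{agent:12s} ${total_cost[agent]:6.2f}  {ratio:.2f}x")
```

Hermes comes out at roughly 5.9 times the OpenClacky total, which is the gap the next section rounds to 6×.
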
Why the 6× gap

The direct cause: cache × request count

Per-prompt pricing is almost identical. The 6× total-cost gap between OpenClacky and Hermes comes from two factors: OpenClacky finished the same tasks with far fewer requests (51 vs. 218) and a much higher cache hit rate (90.6% vs. 60.3%).

Cache hit rate
OpenClacky: 90.6%
Claude Code: 95.2%
OpenClaw: 88.7%
Hermes: 60.3%

Request count
OpenClacky: 51
Claude Code: 70
OpenClaw: 81
Hermes: 218

OpenClacky: 51 requests + 90.6% hit rate → $5.10. Hermes: 218 requests + 60.3% hit rate → $30.14.
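
One way to read these figures together is to split each total into request count times average cost per request. The sketch below does that using only the numbers reported on this page. The average is a derived quantity, not something taken from the billing logs, and it also reflects factors not broken out here, such as context size per request.

```python
# Reported totals (USD), request counts, and cache hit rates from this page.
agents = {
    "OpenClacky":  {"cost": 5.10,  "requests": 51,  "hit_rate": 0.906},
    "Claude Code": {"cost": 5.49,  "requests": 70,  "hit_rate": 0.952},
    "OpenClaw":    {"cost": 15.70, "requests": 81,  "hit_rate": 0.887},
    "Hermes":      {"cost": 30.14, "requests": 218, "hit_rate": 0.603},
}

# Derived average: total cost divided by request count.
per_request = {name: a["cost"] / a["requests"] for name, a in agents.items()}

for name, a in agents.items():
    print(
        f"{name:12s} {a['requests']:4d} requests  "
        f"hit rate {a['hit_rate']:.1%}  "
        f"avg ${per_request[name]:.3f} per request"
    )
```

The total is the product of the two columns, so an agent can fall behind either by sending more requests or by paying more per request.
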

Artifact comparison

Same prompt, four very different outputs.

Marketing / PPT HTML outputs are embedded inline; social-content text outputs are listed as files.

01 · guizang-ppt-skill

10-page Horizontal-Swipe Business Deck (single HTML)

guizang-ppt-skill · AI-Agent industry trend talk

OpenClacky: $1.23
Claude Code: $1.45
OpenClaw: $5.07
Hermes: $10.96
02 · marketing-psychology

AI Customer-Service SaaS Marketing Plan + Live Homepage

marketing-psychology skill · dual deliverable

OpenClacky: $1.72
Claude Code: $1.20
OpenClaw: $7.47
Hermes: $4.65
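
The totals cover three tasks, but only two are itemized with costs above. Assuming each reported total is the exact sum of three per-task costs (an assumption, since the third figure is not listed here), the remainder can be derived by subtraction. A minimal sketch:

```python
# Reported 3-task totals and the two itemized task costs (USD), from this page.
totals = {"OpenClacky": 5.10, "Claude Code": 5.49, "OpenClaw": 15.70, "Hermes": 30.14}
task_01 = {"OpenClacky": 1.23, "Claude Code": 1.45, "OpenClaw": 5.07, "Hermes": 10.96}
task_02 = {"OpenClacky": 1.72, "Claude Code": 1.20, "OpenClaw": 7.47, "Hermes": 4.65}

# Implied cost of the remaining task, under the assumption that
# total = task 01 + task 02 + remaining task. Derived, not reported.
implied_remaining = {
    agent: round(totals[agent] - task_01[agent] - task_02[agent], 2)
    for agent in totals
}

for agent, cost in implied_remaining.items():
    print(f"{agent:12s} ${cost:.2f}")
```
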
Test conditions

Controlled variables

To keep the results comparable, all four agents were run within the same time window, using identical prompts, the same underlying model, and the same skill versions.

Identical prompt
Identical model (claude-opus-4-7)
Identical skill (same version pre-installed everywhere)
Separate OpenRouter API keys
Per-request CSV reconciliation, no estimates
Single run, no cherry-picking
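
For readers who want to repeat the per-request reconciliation on their own logs, here is a minimal sketch of summing a billing CSV request by request. The column names `agent` and `cost_usd` and the sample rows are hypothetical. The actual OpenRouter export format is not shown on this page, so adjust the field names to match your file.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal


def reconcile(csv_file):
    """Sum request count and cost per agent from a per-request CSV.

    Assumes hypothetical columns 'agent' and 'cost_usd', one row per request.
    """
    totals = defaultdict(lambda: {"requests": 0, "cost_usd": Decimal("0")})
    for row in csv.DictReader(csv_file):
        entry = totals[row["agent"]]
        entry["requests"] += 1
        # Decimal avoids float rounding drift when adding many small amounts.
        entry["cost_usd"] += Decimal(row["cost_usd"])
    return dict(totals)


# Illustrative rows only; these are not figures from the benchmark.
sample = io.StringIO(
    "agent,cost_usd\n"
    "agent-a,0.10\n"
    "agent-a,0.05\n"
    "agent-b,0.20\n"
)
totals = reconcile(sample)
for agent, entry in totals.items():
    print(agent, entry["requests"], entry["cost_usd"])
```

With a separate API key per agent, as in the conditions above, each key's export could be tagged with its agent name before being passed to a function like this.
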

Want the harness engineering details?

The tech deep dive covers cache design, pipeline decomposition, and every design trade-off.
