MCP Tools
Compression, retrieval, and stats as MCP tools for Claude Code, Cursor, and any MCP-compatible host.
Headroom's MCP server exposes compression, retrieval, and observability as tools that any MCP-compatible AI coding tool can call -- Claude Code, Cursor, Codex, and more. No proxy required.
Installation
# MCP tools only (lightweight)
pip install "headroom-ai[mcp]"
# Or with the proxy
pip install "headroom-ai[proxy]"
Setup for Claude Code
# Register with Claude Code (one-time)
headroom mcp install
# Start Claude Code — it now has headroom tools
claude
Claude Code can now compress content on demand, retrieve originals, and check session stats.
For automatic compression of all traffic, also run the proxy:
# Terminal 1
headroom proxy
# Terminal 2
ANTHROPIC_BASE_URL=http://127.0.0.1:8787 claude
Tools
headroom_compress
Compress content on demand. The LLM calls this when it wants to shrink large content before reasoning over it.
Parameters:
content (required) -- text to compress (files, JSON, logs, search results)
Returns:
compressed -- compressed text
hash -- key for retrieving the original later
original_tokens / compressed_tokens / savings_percent
transforms -- which compression algorithms were applied
Example flow:
Claude: Let me compress this large output to save context space.
-> headroom_compress(content="[5000 lines of grep results...]")
<- {
"compressed": "[key matches with context...]",
"hash": "a1b2c3d4e5f6...",
"original_tokens": 12000,
"compressed_tokens": 3200,
"savings_percent": 73.3,
"transforms": ["router:search:0.27"]
}
The original is stored locally for 1 hour. If the LLM needs the full content later, it calls headroom_retrieve.
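The numbers in the example response are consistent with each other, and the arithmetic can be checked with a few lines of Python. This is only a sketch: the function name and the rounding to one decimal place are assumptions made for illustration, not a description of how Headroom computes the field.

```python
def savings_percent(original_tokens: int, compressed_tokens: int) -> float:
    """Percentage of tokens removed by compression, rounded to one decimal.

    Hypothetical helper for illustration; the rounding rule is an assumption.
    """
    if original_tokens <= 0:
        raise ValueError("original_tokens must be positive")
    saved = original_tokens - compressed_tokens
    return round(100 * saved / original_tokens, 1)


# Token counts taken from the example response above.
example = {"original_tokens": 12000, "compressed_tokens": 3200}
print(savings_percent(**example))  # 73.3
```

In the example, 8,800 of 12,000 tokens are removed, which matches the reported savings_percent of 73.3.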
headroom_retrieve
Retrieve original uncompressed content by hash.
Parameters:
hash (required) -- hash key from a previous compression
query (optional) -- search within the original to return only matching items
Returns:
original_content (full retrieval) or results (filtered search)
source -- "local" or "proxy"
Retrieval checks the local store first, then falls back to the proxy's store. Hashes from either source work transparently.
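The local-first lookup order can be sketched as follows. Plain dictionaries stand in for the two stores, and the function name and the None return value for a miss are assumptions; the real tool talks to Headroom's own stores and may report misses differently.

```python
from typing import Optional


def retrieve(hash_key: str, local_store: dict, proxy_store: dict) -> Optional[dict]:
    """Look up a hash in the local store first, then fall back to the proxy store.

    Hypothetical sketch: both stores are plain dicts mapping hash -> original text.
    """
    if hash_key in local_store:
        return {"original_content": local_store[hash_key], "source": "local"}
    if hash_key in proxy_store:
        return {"original_content": proxy_store[hash_key], "source": "proxy"}
    return None  # neither store has the entry


local = {"a1b2": "full grep output"}
proxy = {"c3d4": "full API response"}

print(retrieve("a1b2", local, proxy)["source"])  # local
print(retrieve("c3d4", local, proxy)["source"])  # proxy
print(retrieve("ffff", local, proxy))            # None
```

Because the caller only passes a hash, it does not need to know which store produced it; the source field reports where the content was found.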
headroom_stats
Session compression statistics.
Returns:
compressions, retrievals, tokens_saved, savings_percent
estimated_cost_saved_usd
recent_events -- last 10 compression/retrieval events
sub_agents -- stats from sub-agent MCP instances
combined -- main + sub-agent totals
proxy -- request count, cache hits, cost saved (if proxy is running)
Sub-agent stats are aggregated via a shared stats file at ~/.headroom/session_stats.jsonl.
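As a rough illustration of aggregating a JSONL stats file, the sketch below sums counters across one-record-per-line JSON. The record layout is an assumption: the field names are borrowed from the tool's return values, and the actual contents of session_stats.jsonl are not documented here.

```python
import json
import tempfile
from collections import Counter
from pathlib import Path


def combine_stats(path: Path) -> dict:
    """Sum per-instance counters from a JSONL file (one JSON object per line).

    Assumed record shape: {"compressions": int, "tokens_saved": int}.
    """
    totals = Counter()
    for line in path.read_text().splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        record = json.loads(line)
        totals["compressions"] += record.get("compressions", 0)
        totals["tokens_saved"] += record.get("tokens_saved", 0)
    return dict(totals)


# Demo against a temporary file rather than the real ~/.headroom path.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "session_stats.jsonl"
    demo.write_text(
        '{"compressions": 2, "tokens_saved": 8800}\n'
        '{"compressions": 1, "tokens_saved": 1200}\n'
    )
    combined = combine_stats(demo)

print(combined)  # {'compressions': 3, 'tokens_saved': 10000}
```

An append-only, line-per-record file is a common way to let several processes share stats, since each writer only appends a complete line.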
Streamable HTTP Transport (Remote / Docker)
For agents running on a different machine than the Headroom proxy (e.g., Docker, cloud), MCP tools are available over HTTP using the MCP Streamable HTTP protocol.
Proxy auto-exposes /mcp
When you run headroom proxy, MCP tools are automatically available at /mcp:
headroom proxy # → http://host:8787/mcp
Remote agents connect with:
{
"mcpServers": {
"headroom": {
"url": "http://proxy-host:8787/mcp"
}
}
}
Standalone HTTP server
Run MCP tools without the full proxy:
headroom mcp serve --transport http --port 8080
Remote install
Configure Claude Code to use remote MCP over HTTP:
headroom mcp install --remote http://proxy-host:8787/mcp
This writes URL-based config instead of the default command-based config.
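For hosts that are configured by hand, a URL-based entry can be generated programmatically. The sketch below reproduces the shape of the JSON shown under "Proxy auto-exposes /mcp"; the helper name is hypothetical, and where a given host stores this file (or exactly what headroom mcp install --remote writes) is not specified here.

```python
import json


def remote_mcp_config(url: str, server_name: str = "headroom") -> dict:
    """Build a URL-based mcpServers entry (hypothetical helper)."""
    return {"mcpServers": {server_name: {"url": url}}}


config = remote_mcp_config("http://proxy-host:8787/mcp")
print(json.dumps(config, indent=2))
```

The printed JSON can be merged into the host's MCP settings; the url value should point at whichever machine runs the proxy or the standalone HTTP server.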
Protocol
The Streamable HTTP transport implements the MCP specification:
POST /mcp -- Send JSON-RPC requests (tool calls, list tools)
GET /mcp -- Server-sent events stream (server-initiated messages)
DELETE /mcp -- Terminate session
Stateless mode by default -- each request is independent, no session tracking needed.
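To make the POST /mcp shape concrete, the sketch below builds (but does not send) JSON-RPC 2.0 requests using the standard MCP method names tools/list and tools/call. The helper name and the proxy-host URL are placeholders, and whether a given deployment expects an initialize handshake first is not covered here; an MCP client library would normally handle those details.

```python
import json
import urllib.request


def build_mcp_request(url, method, params=None, request_id=1):
    """Build a JSON-RPC 2.0 POST request for an MCP Streamable HTTP endpoint.

    Hypothetical helper: constructs the request object only, nothing is sent.
    """
    payload = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        payload["params"] = params
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Streamable HTTP clients accept either a JSON body or an SSE stream.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )


endpoint = "http://proxy-host:8787/mcp"  # placeholder host from the examples above

list_tools = build_mcp_request(endpoint, "tools/list")
call = build_mcp_request(
    endpoint,
    "tools/call",
    {"name": "headroom_compress", "arguments": {"content": "..."}},
    request_id=2,
)

print(call.get_method(), call.full_url)
print(call.data.decode("utf-8"))
```

Passing either request to urllib.request.urlopen would send it; that step is left out so the example runs without a live server.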
CLI commands
# Install — local (stdio, default)
headroom mcp install
# Install — remote (HTTP, for Docker/network)
headroom mcp install --remote http://proxy-host:8787/mcp
# Install — custom proxy URL
headroom mcp install --proxy-url http://host:9000
# Overwrite existing config
headroom mcp install --force
# Serve — stdio (default, called by Claude Code)
headroom mcp serve
# Serve — HTTP (for remote agents)
headroom mcp serve --transport http --port 8080
# Serve — debug mode
headroom mcp serve --debug
# Check status
headroom mcp status
# Uninstall
headroom mcp uninstall
Cross-tool compatibility
| Tool | Transport | Setup |
|---|---|---|
| Claude Code (local) | stdio | headroom mcp install |
| Claude Code (remote) | HTTP | headroom mcp install --remote http://host:8787/mcp |
| Cursor | stdio / HTTP | Add to MCP settings |
| Docker agents | HTTP | Point to http://proxy:8787/mcp |
| Any MCP host | stdio / HTTP | headroom mcp serve (add --transport http for HTTP) |
Architecture
MCP only (no proxy)
The LLM calls headroom_compress on demand. Compression happens locally in the MCP process. Originals are stored in a local CompressionStore with 1-hour TTL.
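The idea of a store whose entries expire after a fixed time can be sketched in a few lines. This is not Headroom's CompressionStore: the class name, the truncated SHA-256 key, and the injectable clock are all assumptions chosen to keep the example small and testable.

```python
import hashlib
import time
from typing import Callable, Optional


class TTLStore:
    """Toy in-memory store whose entries expire after ttl_seconds (hypothetical)."""

    def __init__(self, ttl_seconds: float = 3600,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, content)

    def put(self, content: str) -> str:
        # Assumed key scheme: first 16 hex chars of the content's SHA-256 digest.
        key = hashlib.sha256(content.encode("utf-8")).hexdigest()[:16]
        self._entries[key] = (self._clock() + self._ttl, content)
        return key

    def get(self, key: str) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, content = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # drop expired entries lazily on read
            return None
        return content


# A fake clock makes the 1-hour expiry observable without waiting.
now = [0.0]
store = TTLStore(ttl_seconds=3600, clock=lambda: now[0])
key = store.put("original content")

print(store.get(key))  # original content
now[0] = 3601
print(store.get(key))  # None
```

Expiring lazily on read keeps the sketch free of background threads; a long-running process would likely also need periodic cleanup so unread entries do not accumulate.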
MCP + Proxy (full setup)
The proxy compresses all traffic at the HTTP level (before the LLM sees content). MCP tools operate after the LLM receives content. They handle different data and do not double-compress.
headroom_retrieve checks the local store first, then falls back to the proxy's store.
Remote (HTTP transport)
Remote Agent (any machine)
|
| POST /mcp (JSON-RPC)
v
Headroom Proxy :8787/mcp (Streamable HTTP)
|
| in-process access
v
Compression Pipeline + CompressionStore
The proxy's /mcp endpoint shares the same compression store and pipeline as the proxy itself -- no HTTP round-trips to self.
Troubleshooting
"MCP SDK not installed" -- Run pip install "headroom-ai[mcp]".
"Proxy not running" -- Start the proxy with headroom proxy in another terminal. Only needed for proxy-backed retrieval.
"Entry not found or expired" -- Local content expires after 1 hour, proxy content after 5 minutes.
Claude doesn't see headroom tools -- Run headroom mcp status, restart Claude Code, and verify with /mcp inside Claude Code.