Headroom

MCP Tools

Compression, retrieval, and stats as MCP tools for Claude Code, Cursor, and any MCP-compatible host.

Headroom's MCP server exposes compression, retrieval, and observability as tools that any MCP-compatible AI coding tool can call -- Claude Code, Cursor, Codex, and more. No proxy required.

Installation

# MCP tools only (lightweight)
pip install "headroom-ai[mcp]"

# Or with the proxy
pip install "headroom-ai[proxy]"

Setup for Claude Code

# Register with Claude Code (one-time)
headroom mcp install

# Start Claude Code — it now has headroom tools
claude

Claude Code can now compress content on demand, retrieve originals, and check session stats.

For automatic compression of all traffic, also run the proxy:

# Terminal 1
headroom proxy

# Terminal 2
ANTHROPIC_BASE_URL=http://127.0.0.1:8787 claude

Tools

headroom_compress

Compress content on demand. The LLM calls this when it wants to shrink large content before reasoning over it.

Parameters:

  • content (required) -- text to compress (files, JSON, logs, search results)

Returns:

  • compressed -- compressed text
  • hash -- key for retrieving the original later
  • original_tokens / compressed_tokens / savings_percent
  • transforms -- which compression algorithms were applied

Example flow:

Claude: Let me compress this large output to save context space.

-> headroom_compress(content="[5000 lines of grep results...]")

<- {
    "compressed": "[key matches with context...]",
    "hash": "a1b2c3d4e5f6...",
    "original_tokens": 12000,
    "compressed_tokens": 3200,
    "savings_percent": 73.3,
    "transforms": ["router:search:0.27"]
   }

The original is stored locally for 1 hour. If the LLM needs the full content later, it calls headroom_retrieve.
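The savings_percent in the response is simply the relative token reduction. As a quick sketch of the arithmetic using the numbers from the example above:

```python
def savings_percent(original_tokens: int, compressed_tokens: int) -> float:
    """Relative token reduction, as a percentage (rounded to one decimal)."""
    return round((original_tokens - compressed_tokens) / original_tokens * 100, 1)

savings_percent(12000, 3200)  # → 73.3, matching the example response
```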

headroom_retrieve

Retrieve original uncompressed content by hash.

Parameters:

  • hash (required) -- hash key from a previous compression
  • query (optional) -- search within the original to return only matching items

Returns:

  • original_content (full retrieval) or results (filtered search)
  • source -- "local" or "proxy"

Retrieval checks the local store first, then falls back to the proxy's store. Hashes from either source work transparently.
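The fallback order can be sketched roughly like this (a minimal sketch, not Headroom's actual implementation; `local_store` and `proxy_store` are hypothetical stand-ins for the real stores, modeled here as plain dicts):

```python
def retrieve(hash_key, local_store, proxy_store, query=None):
    """Look up an original by hash: local store first, then the proxy's store."""
    for source, store in (("local", local_store), ("proxy", proxy_store)):
        entry = store.get(hash_key)
        if entry is None:
            continue  # not in this store (or expired) -- try the next one
        if query:
            # Filtered search: return only the lines matching the query.
            results = [line for line in entry.splitlines() if query in line]
            return {"results": results, "source": source}
        return {"original_content": entry, "source": source}
    return {"error": "Entry not found or expired"}
```

Either way the caller gets a `source` field back, so hashes minted locally and hashes minted by the proxy behave identically.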

headroom_stats

Session compression statistics.

Returns:

  • compressions, retrievals, tokens_saved, savings_percent
  • estimated_cost_saved_usd
  • recent_events -- last 10 compression/retrieval events
  • sub_agents -- stats from sub-agent MCP instances
  • combined -- main + sub-agent totals
  • proxy -- request count, cache hits, cost saved (if proxy is running)

Sub-agent stats are aggregated via a shared stats file at ~/.headroom/session_stats.jsonl.
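Aggregating a shared JSONL file amounts to summing per-event deltas. A minimal sketch (the event field names here are illustrative assumptions, not Headroom's actual schema):

```python
import json

def aggregate_stats(jsonl_text: str) -> dict:
    """Sum compression/retrieval counts and token savings from JSONL events."""
    totals = {"compressions": 0, "retrievals": 0, "tokens_saved": 0}
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        event = json.loads(line)
        if event["type"] == "compress":
            totals["compressions"] += 1
            totals["tokens_saved"] += event["original_tokens"] - event["compressed_tokens"]
        elif event["type"] == "retrieve":
            totals["retrievals"] += 1
    return totals
```

Because JSONL is append-only, the main process and sub-agents can each append events and any reader can recompute combined totals at any time.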

Streamable HTTP Transport (Remote / Docker)

For agents running on a different machine than the Headroom proxy (e.g., Docker, cloud), MCP tools are available over HTTP using the MCP Streamable HTTP protocol.

Proxy auto-exposes /mcp

When you run headroom proxy, MCP tools are automatically available at /mcp:

headroom proxy  # → http://host:8787/mcp

Remote agents connect with:

{
  "mcpServers": {
    "headroom": {
      "url": "http://proxy-host:8787/mcp"
    }
  }
}

Standalone HTTP server

Run MCP tools without the full proxy:

headroom mcp serve --transport http --port 8080

Remote install

Configure Claude Code to use remote MCP over HTTP:

headroom mcp install --remote http://proxy-host:8787/mcp

This writes URL-based config instead of the default command-based config.
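For comparison, the default command-based entry looks roughly like this (a sketch; exact keys may vary by Claude Code version):

```json
{
  "mcpServers": {
    "headroom": {
      "command": "headroom",
      "args": ["mcp", "serve"]
    }
  }
}
```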

Protocol

The Streamable HTTP transport implements the MCP specification:

  • POST /mcp -- Send JSON-RPC requests (tool calls, list tools)
  • GET /mcp -- Server-sent events stream (server-initiated messages)
  • DELETE /mcp -- Terminate session

Stateless mode by default -- each request is independent, so no session tracking is needed.
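A remote agent's tool call is an ordinary JSON-RPC 2.0 request POSTed to /mcp. As a sketch of the request body (the method and params shape follow the MCP specification; the host and port are assumptions):

```python
import json

def make_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Build an MCP tools/call request body (JSON-RPC 2.0)."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })

body = make_tool_call(1, "headroom_compress", {"content": "big log output..."})
# POST this body to http://proxy-host:8787/mcp with
# Content-Type: application/json (responses may arrive as JSON or SSE).
```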

CLI commands

# Install — local (stdio, default)
headroom mcp install

# Install — remote (HTTP, for Docker/network)
headroom mcp install --remote http://proxy-host:8787/mcp

# Install — custom proxy URL
headroom mcp install --proxy-url http://host:9000

# Overwrite existing config
headroom mcp install --force

# Serve — stdio (default, called by Claude Code)
headroom mcp serve

# Serve — HTTP (for remote agents)
headroom mcp serve --transport http --port 8080

# Serve — debug mode
headroom mcp serve --debug

# Check status
headroom mcp status

# Uninstall
headroom mcp uninstall

Cross-tool compatibility

Tool                  | Transport    | Setup
Claude Code (local)   | stdio        | headroom mcp install
Claude Code (remote)  | HTTP         | headroom mcp install --remote http://host:8787/mcp
Cursor                | stdio / HTTP | Add to MCP settings
Docker agents         | HTTP         | Point to http://proxy:8787/mcp
Any MCP host          | stdio / HTTP | headroom mcp serve or --transport http

Architecture

MCP only (no proxy)

The LLM calls headroom_compress on demand. Compression happens locally in the MCP process. Originals are stored in a local CompressionStore with 1-hour TTL.

MCP + Proxy (full setup)

The proxy compresses all traffic at the HTTP level (before the LLM sees content). MCP tools operate after the LLM receives content. They handle different data and do not double-compress.

headroom_retrieve checks the local store first, then falls back to the proxy's store.

Remote (HTTP transport)

Remote Agent (any machine)
    |
    | POST /mcp (JSON-RPC)
    v
Headroom Proxy :8787/mcp (Streamable HTTP)
    |
    | in-process access
    v
Compression Pipeline + CompressionStore

The proxy's /mcp endpoint shares the same compression store and pipeline as the proxy itself -- no HTTP round-trips to self.

Troubleshooting

"MCP SDK not installed" -- Run pip install "headroom-ai[mcp]".

"Proxy not running" -- Start the proxy with headroom proxy in another terminal; it is only needed for proxy-backed retrieval.

"Entry not found or expired" -- Local content expires after 1 hour, proxy content after 5 minutes.

Claude doesn't see headroom tools -- Run headroom mcp status, restart Claude Code, and verify with /mcp inside Claude Code.
