SmartCrusher
Statistical JSON and array compression that keeps important items and drops the rest, achieving 70-90% token reduction.
SmartCrusher is Headroom's compressor for JSON tool outputs. It analyzes arrays statistically, keeps the important items (errors, anomalies, relevant matches), and drops the rest. This is the compressor that fires automatically when ContentRouter detects JSON arrays.
How It Works
SmartCrusher doesn't blindly truncate arrays. It scores each item across five dimensions:
- First/Last items -- Context for pagination and recency
- Error items -- 100% preservation of error states (never dropped)
- Anomalies -- Statistical outliers (> 2 standard deviations from the mean)
- Relevant items -- Matches to the user's query via BM25/embeddings
- Change points -- Significant transitions in data
The result: a 1,000-item array becomes ~50 items with all the information the LLM actually needs.
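As a rough illustration of how those rules combine (this is not Headroom's actual implementation; the `select_items` helper and the `latency_ms` field are invented for this sketch), the first/last, error, and anomaly dimensions might look like:

```python
import statistics

def select_items(items, keep_first=3, keep_last=2, std_threshold=2.0):
    """Illustrative sketch of SmartCrusher-style selection: keep the first
    and last items, every error, and statistical outliers."""
    keep = set(range(keep_first)) | set(range(len(items) - keep_last, len(items)))

    # Error items are always preserved
    keep |= {i for i, it in enumerate(items) if it.get("status") == "error"}

    # Anomalies: numeric values more than `std_threshold` std devs from the mean
    values = [it["latency_ms"] for it in items if "latency_ms" in it]
    if len(values) >= 2:
        mean, std = statistics.fmean(values), statistics.pstdev(values)
        if std > 0:
            keep |= {
                i for i, it in enumerate(items)
                if abs(it.get("latency_ms", mean) - mean) > std_threshold * std
            }

    return [items[i] for i in sorted(keep)]
```

Relevance scoring and change-point detection would add further indices to the same `keep` set before the final slice.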
What Gets Preserved
| Category | Preserved | Why |
|---|---|---|
| Errors | 100% | Critical for debugging |
| First N | 100% | Context and pagination |
| Last N | 100% | Recency |
| Anomalies | All | Unusual values matter |
| Relevant | Top K | Match user's query |
| Others | Sampled | Statistical representation |
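The "Relevant" row relies on BM25 or embeddings; a crude keyword-overlap stand-in conveys the idea (the `relevance_score` and `top_k_relevant` helpers below are hypothetical, not part of Headroom's API):

```python
def relevance_score(item_text: str, query: str) -> float:
    """Fraction of query terms appearing in the item (crude BM25 stand-in)."""
    query_terms = set(query.lower().split())
    item_terms = set(item_text.lower().split())
    if not query_terms:
        return 0.0
    return len(query_terms & item_terms) / len(query_terms)

def top_k_relevant(items, query, k=5, threshold=0.3):
    """Keep at most k items whose score clears the relevance threshold."""
    scored = [(relevance_score(str(it), query), it) for it in items]
    scored = [(s, it) for s, it in scored if s > threshold]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [it for _, it in scored[:k]]
```

Real BM25 additionally weights rare terms and normalizes for document length, and embeddings catch paraphrases that term overlap misses.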
Quick Start
```typescript
import { HeadroomClient } from "headroom-ai";

// SmartCrusher fires automatically for JSON tool outputs.
// Identifier and method names below are reconstructed; check the
// Headroom docs for the exact client API.
const client = new HeadroomClient();

const messages = [
  { role: "system" as const, content: "You are a helpful assistant." },
  { role: "user" as const, content: "Find errors in the last 24 hours" },
  {
    role: "tool" as const,
    content: JSON.stringify({ results: new Array(1000).fill({ status: "ok" }) }),
    tool_call_id: "call_1",
  },
];

const result = await client.transform(messages);
console.log(`Tokens saved: ${result.tokensSaved}`);
// SmartCrusher keeps errors, anomalies, and relevant items
```

```python
from headroom import SmartCrusher

crusher = SmartCrusher()

# Before: 1000 search results (45,000 tokens)
tool_output = {"results": ["...1000 items..."]}

# After: ~50 important items (4,500 tokens) -- 90% reduction
compressed = crusher.crush(tool_output, query="user's question")
```

Configuration
```typescript
import { HeadroomClient } from "headroom-ai";

// Configure via the Headroom proxy or HeadroomClient.
// Identifier and option names are reconstructed; check the Headroom docs.
const client = new HeadroomClient();

const result = await client.transform(messages, {
  model: "gpt-4o",
  tokenBudget: 10000, // SmartCrusher will reduce JSON to fit
});
console.log(`Transforms: ${result.transformsApplied}`);
// ["smart_crusher", "cache_aligner"]
```

```python
from headroom import SmartCrusher, SmartCrusherConfig

config = SmartCrusherConfig(
    min_tokens_to_crush=200,    # Only compress if > 200 tokens
    max_items_after_crush=50,   # Keep at most 50 items
    keep_first=3,               # Always keep first 3 items
    keep_last=2,                # Always keep last 2 items
    relevance_threshold=0.3,    # Keep items with relevance > 0.3
    anomaly_std_threshold=2.0,  # Keep items > 2 std dev from mean
    preserve_errors=True,       # Always keep error items
)

crusher = SmartCrusher(config)
compressed = crusher.crush(tool_output, query="find payment failures")
```

Configuration Options
| Option | Default | Description |
|---|---|---|
| min_tokens_to_crush | 200 | Only compress arrays with more than this many tokens |
| max_items_after_crush | 50 | Maximum items to keep after compression |
| keep_first | 3 | Always keep the first N items |
| keep_last | 2 | Always keep the last N items |
| relevance_threshold | 0.3 | Minimum relevance score to keep an item |
| anomaly_std_threshold | 2.0 | Standard deviation threshold for anomaly detection |
| preserve_errors | True | Always keep items containing error states |
Example: Before and After
Consider a tool that returns 1,000 search results:
```python
# Before compression: 45,000 tokens
{
    "results": [
        {"id": 1, "status": "ok", "message": "Success", "timestamp": "..."},
        {"id": 2, "status": "ok", "message": "Success", "timestamp": "..."},
        # ... 995 more "ok" results ...
        {"id": 998, "status": "error", "message": "Connection timeout", "timestamp": "..."},
        {"id": 999, "status": "ok", "message": "Success", "timestamp": "..."},
        {"id": 1000, "status": "ok", "message": "Success", "timestamp": "..."},
    ]
}

# After SmartCrusher: 4,500 tokens (90% reduction)
# Kept: first 3, last 2, the error at id=998, statistical sample
```

The LLM sees the structure, the error, and a representative sample -- everything it needs to answer "find errors in the last 24 hours" without wading through 1,000 identical success responses.
Automatic Routing
You don't need to call SmartCrusher directly. The ContentRouter detects JSON arrays and routes them to SmartCrusher automatically. Direct usage is available when you want fine-grained control over the configuration.
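As a rough sketch of that routing decision (the `should_crush` helper is hypothetical; the real ContentRouter also weighs token counts and other content types), detection might look like:

```python
import json

def should_crush(tool_output: str, min_items: int = 10) -> bool:
    """Hypothetical sketch: route to SmartCrusher only when the payload
    parses as JSON and contains a reasonably large array somewhere."""
    try:
        data = json.loads(tool_output)
    except (ValueError, TypeError):
        return False  # not JSON; other compressors may apply

    def largest_array(node) -> int:
        """Size of the biggest list anywhere in the parsed structure."""
        if isinstance(node, list):
            return max([len(node)] + [largest_array(v) for v in node])
        if isinstance(node, dict):
            return max((largest_array(v) for v in node.values()), default=0)
        return 0

    return largest_array(data) >= min_items
```

A payload like `{"results": [...50 items...]}` would route to SmartCrusher, while plain text or a two-item array would not.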