Text & Log Compression

Specialized compressors for search results, build logs, diffs, and general text. Each preserves what matters for its content type.

Headroom provides specialized compressors for text-based content that isn't JSON or source code. Each one understands the structure of its content type and preserves what the LLM needs while dropping the noise.

| Compressor | Input Type | What It Preserves | Typical Savings |
|---|---|---|---|
| SearchCompressor | grep/ripgrep output | Relevant matches, file diversity | 80-95% |
| LogCompressor | Build/test logs | Errors, stack traces, summaries | 85-95% |
| DiffCompressor | Unified diffs | Changed lines, context | 60-80% |
| TextCompressor | General text | Relevant paragraphs, anchors | 60-80% |
| LLMLinguaCompressor | Any text (max compression) | Semantic meaning via ML | 80-95% |

SearchCompressor

Compresses search results (grep, ripgrep, ag) while keeping the matches that matter.

from headroom.transforms import SearchCompressor

search_results = """
src/utils.py:42:def process_data(items):
src/utils.py:43:    \"\"\"Process items.\"\"\"
src/models.py:15:class DataProcessor:
src/models.py:89:    def process(self, items):
... hundreds more matches ...
"""

compressor = SearchCompressor()
result = compressor.compress(search_results, context="find process")

print(f"Compressed {result.original_match_count} matches to {result.compressed_match_count}")
print(result.compressed)

What gets preserved:

  • Exact query matches (lines containing the search term)
  • High-relevance matches (scored by BM25 similarity)
  • File diversity (results from different files are kept)
  • First/last matches (context from start and end)
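The selection strategy above can be illustrated with a self-contained sketch. This is a conceptual approximation, not Headroom's actual implementation (which scores relevance with BM25); here a simple term-overlap score stands in, and the `select_matches` helper is hypothetical:

```python
# Conceptual sketch of match selection: rank matches by relevance to the
# query, guarantee one match per file (file diversity), then fill the
# remaining slots with the highest-scoring leftovers.

def select_matches(matches, query, max_results=50):
    """matches: list of (file, line_no, text) tuples."""
    terms = set(query.lower().split())

    def score(match):
        words = set(match[2].lower().split())
        return len(terms & words) / len(terms)  # fraction of query terms hit

    ranked = sorted(matches, key=score, reverse=True)

    kept, seen_files = [], set()
    # First pass: one match per file preserves file diversity.
    for m in ranked:
        if m[0] not in seen_files:
            kept.append(m)
            seen_files.add(m[0])
    # Second pass: top up with the best remaining matches.
    for m in ranked:
        if len(kept) >= max_results:
            break
        if m not in kept:
            kept.append(m)
    return kept[:max_results]


matches = [
    ("src/utils.py", 42, "def process_data(items):"),
    ("src/utils.py", 43, "Process items."),
    ("src/models.py", 89, "def process(self, items):"),
]
kept = select_matches(matches, "process", max_results=2)
```

With `max_results=2`, the diversity pass keeps one match from each file rather than two from `src/utils.py`.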

Configuration

from headroom.transforms import SearchCompressor, SearchCompressorConfig

config = SearchCompressorConfig(
    max_results=50,                # Keep up to 50 matches
    preserve_file_diversity=True,  # Ensure different files represented
    relevance_threshold=0.3,       # Minimum relevance score to keep
)

compressor = SearchCompressor(config)

LogCompressor

Compresses build and test output while preserving errors, warnings, and summaries.

from headroom.transforms import LogCompressor

build_output = """
===== test session starts =====
collected 500 items
tests/test_foo.py::test_1 PASSED
... hundreds of passed tests ...
tests/test_bar.py::test_fail FAILED
AssertionError: expected 5, got 3
===== 1 failed, 499 passed =====
"""

compressor = LogCompressor()
result = compressor.compress(build_output)

print(result.compressed)
print(f"Compression ratio: {result.compression_ratio:.1%}")

What gets preserved:

  • Errors and failures (any line with ERROR, FAILED, Exception)
  • Warnings
  • Full stack traces for debugging
  • Test/build summary lines
  • Section headers (structural markers like =====)

What gets dropped:

  • Hundreds of PASSED lines
  • Verbose success output
  • Repeated patterns
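The keep/drop split can be sketched as a line classifier. This is illustrative only, assuming a simple regex over severity markers; Headroom's real heuristics (stack-trace grouping, repeated-pattern collapsing) are richer:

```python
import re

# Lines matching any of these signals survive; everything else is noise.
KEEP = re.compile(r"FAILED|ERROR|WARN|Error|Exception|Traceback|=====")

def compress_log(log: str) -> str:
    kept = [line for line in log.splitlines() if KEEP.search(line)]
    return "\n".join(kept)

log = """===== test session starts =====
tests/test_foo.py::test_1 PASSED
tests/test_bar.py::test_fail FAILED
AssertionError: expected 5, got 3
===== 1 failed, 499 passed ====="""

print(compress_log(log))
```

The PASSED line is dropped, while the section headers, the failure, and its assertion message all survive.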

DiffCompressor

Compresses unified diffs while keeping the actual changes and enough context to understand them.

from headroom.transforms import DiffCompressor

diff_output = """
diff --git a/src/main.py b/src/main.py
--- a/src/main.py
+++ b/src/main.py
@@ -42,7 +42,7 @@
 def process(items):
-    return [x for x in items]
+    return [x.strip() for x in items if x]
"""

compressor = DiffCompressor()
result = compressor.compress(diff_output)

print(result.compressed)
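The core idea, keeping changed lines plus minimal surrounding context, can be sketched in a few lines. This is an assumed behavior, not Headroom's exact algorithm, and `trim_diff` is a hypothetical helper:

```python
# Keep file headers, hunk headers, and +/- lines from a unified diff,
# plus `context` unchanged lines on either side of each change;
# drop the rest of the context.

def trim_diff(diff: str, context: int = 1) -> str:
    lines = diff.strip().splitlines()
    changed = {i for i, l in enumerate(lines)
               if l.startswith(("+", "-", "@@", "diff "))}
    keep = set()
    for i in changed:
        keep.update(range(max(0, i - context), min(len(lines), i + context + 1)))
    return "\n".join(l for i, l in enumerate(lines) if i in keep)


diff_output = """diff --git a/f b/f
--- a/f
+++ b/f
@@ -1,6 +1,6 @@
 ctx1
 ctx2
 ctx3
-old line
+new line
 ctx4
 ctx5
 ctx6"""

print(trim_diff(diff_output))
```

Unchanged lines adjacent to the change (`ctx3`, `ctx4`) are kept for context; the rest are trimmed.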

TextCompressor

General-purpose text compression with anchor preservation. Best for documentation, README files, and prose content.

from headroom.transforms import TextCompressor

long_text = """
... thousands of lines of documentation ...
"""

compressor = TextCompressor()
result = compressor.compress(long_text, context="authentication")

print(result.compressed)

What gets preserved:

  • Paragraphs relevant to the context query
  • Headers and section markers
  • Document structure and organization
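Context-driven paragraph selection can be sketched as a word-overlap filter. This is a conceptual illustration (the real TextCompressor also preserves headers and document structure), and `relevant_paragraphs` is a hypothetical helper:

```python
import re

def relevant_paragraphs(text, context, threshold=0.5):
    """Keep paragraphs whose word overlap with the context query
    meets the threshold."""
    terms = set(re.findall(r"\w+", context.lower()))
    kept = []
    for para in text.split("\n\n"):
        words = set(re.findall(r"\w+", para.lower()))
        if terms and len(terms & words) / len(terms) >= threshold:
            kept.append(para)
    return kept

doc = (
    "Login uses token-based authentication.\n\n"
    "The build system is driven by make.\n\n"
    "Authentication tokens expire after one hour."
)
for para in relevant_paragraphs(doc, "authentication"):
    print(para)
```

Only the two authentication-related paragraphs survive; the build-system paragraph is dropped as irrelevant to the query.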

LLMLingua (Optional, Maximum Compression)

For maximum compression on any text, Headroom integrates with Microsoft's LLMLingua-2, a BERT-based token classifier trained via GPT-4 distillation. It achieves up to 20x compression while preserving semantic meaning.

from headroom.transforms import LLMLinguaCompressor, LLMLinguaConfig

config = LLMLinguaConfig(
    device="auto",                # auto, cuda, cpu, mps
    code_compression_rate=0.4,    # Conservative for code
    json_compression_rate=0.35,   # Moderate for JSON
    text_compression_rate=0.25,   # Aggressive for text
)

compressor = LLMLinguaCompressor(config)
result = compressor.compress(long_output)

print(f"Before: {result.original_tokens} tokens")
print(f"After: {result.compressed_tokens} tokens")
print(f"Saved: {result.savings_percentage:.1f}%")

LLMLingua is opt-in

LLMLingua adds ~2GB of model weights and 50-200ms latency per request. Install it only when you need maximum compression: pip install "headroom-ai[llmlingua]"

Memory Management

from headroom.transforms import unload_llmlingua_model, is_llmlingua_model_loaded

# Check if model is loaded
print(is_llmlingua_model_loaded())  # True

# Free ~1GB RAM when done
unload_llmlingua_model()

Content Type Detection

If you're building your own routing logic, you can use the content type detector directly:

from headroom.transforms import detect_content_type, ContentType

content = "src/main.py:42:def process():"

detection = detect_content_type(content)
if detection.content_type == ContentType.SEARCH_RESULTS:
    result = SearchCompressor().compress(content, context="process")
elif detection.content_type == ContentType.BUILD_OUTPUT:
    result = LogCompressor().compress(content)
elif detection.content_type == ContentType.PLAIN_TEXT:
    result = TextCompressor().compress(content, context="process")

When Each Compressor Is Used

The ContentRouter selects the right compressor automatically. Here's when each fires:

| Content Pattern | Compressor | Detection Signal |
|---|---|---|
| `file:line:content` lines | SearchCompressor | grep/ripgrep output format |
| pytest, npm, cargo markers | LogCompressor | Build tool output patterns |
| `---`/`+++` and `@@` markers | DiffCompressor | Unified diff format |
| Prose, documentation | TextCompressor | Fallback for non-structured text |
| Any (max compression mode) | LLMLinguaCompressor | Explicitly enabled |
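The detection signals above can be approximated with simple heuristics. This is a rough sketch under assumed rules; the real detector (`detect_content_type`) returns a `ContentType` rather than a string, and its heuristics are more robust:

```python
import re

def guess_content_type(content: str) -> str:
    """Order matters: diff markers are the most distinctive, then
    grep-style file:line: prefixes, then build/test keywords."""
    lines = content.strip().splitlines()
    if any(l.startswith(("--- ", "+++ ", "@@")) for l in lines):
        return "diff"
    if sum(bool(re.match(r"[^:\n]+:\d+:", l)) for l in lines) > len(lines) // 2:
        return "search_results"
    if any(re.search(r"PASSED|FAILED|ERROR|warning:", l) for l in lines):
        return "build_output"
    return "plain_text"


print(guess_content_type("src/main.py:42:def process():"))
```

Note the ordering: a pytest line like `tests/test_x.py::test_ok PASSED` must not be mistaken for a grep match, which the `:\d+:` requirement prevents.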

Performance

| Compressor | Typical Input | Output | Speed |
|---|---|---|---|
| SearchCompressor | 1,000 matches | 30-50 matches | ~2ms |
| LogCompressor | 5,000 lines | 100-200 lines | ~3ms |
| DiffCompressor | Large diff | Changed hunks only | ~2ms |
| TextCompressor | 10,000 chars | 2,000 chars | ~2ms |
| LLMLinguaCompressor | Any text | 5-20% of original | 50-200ms |