OpenAI SDK
Auto-compress messages in the OpenAI Node.js SDK with a single withHeadroom() wrapper.
Headroom wraps the OpenAI Node.js SDK to automatically compress messages before every chat.completions.create() call. All other methods (embeddings, images, audio) pass through unchanged.
Installation
```bash
npm install headroom-ai openai
```

Proxy required
The TypeScript SDK sends messages to a local Headroom proxy for compression. Start the proxy before using the SDK:
```bash
pip install "headroom-ai[proxy]"
headroom proxy
```

Quick start
```typescript
import { withHeadroom } from 'headroom-ai/openai';
import OpenAI from 'openai';

const client = withHeadroom(new OpenAI());

// Messages are compressed automatically before sending
const response = await client.chat.completions.create({
  model: 'gpt-4o',
  messages: longConversation,
});
```

That's it. Every call to client.chat.completions.create() compresses the messages first. The response format is identical to the unwrapped client.
How it works
withHeadroom() returns a proxy around your OpenAI client that intercepts chat.completions.create():
- Extracts `messages` from the request params
- Sends them to the Headroom proxy's `/v1/compress` endpoint
- Replaces the original messages with the compressed result
- Forwards the request to OpenAI as normal
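The steps above can be sketched with a plain JavaScript Proxy. This is a minimal illustration, not the headroom-ai source: `sketchWithHeadroom` and the injected `compress` callback are hypothetical names, and in the real wrapper the compression step is an HTTP POST to the proxy's `/v1/compress` endpoint rather than an in-process function.

```typescript
// Minimal sketch of the interception, assuming compression is injected
// as a callback (the real wrapper calls the local proxy over HTTP).
type Message = { role: string; content: string | null };
type Params = { messages: Message[] } & Record<string, unknown>;

type ChatClient = {
  chat: { completions: { create: (params: Params) => Promise<unknown> } };
};

function sketchWithHeadroom<T extends ChatClient>(
  client: T,
  compress: (msgs: Message[]) => Promise<Message[]>,
): T {
  return new Proxy(client, {
    get(target, prop, receiver) {
      // Only `chat` is intercepted; every other property passes through.
      if (prop !== 'chat') return Reflect.get(target, prop, receiver);
      return {
        ...target.chat,
        completions: {
          ...target.chat.completions,
          create: async (params: Params) => {
            // Extract and compress the messages, then forward the
            // otherwise-unchanged request to the underlying client.
            const messages = await compress(params.messages);
            return target.chat.completions.create({ ...params, messages });
          },
        },
      };
    },
  });
}
```

Because only the `chat` property is trapped, methods like `embeddings.create()` resolve against the original client unchanged.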
All other client methods are untouched:
```typescript
import { withHeadroom } from 'headroom-ai/openai';
import OpenAI from 'openai';

const client = withHeadroom(new OpenAI());

// These pass through unchanged
const embedding = await client.embeddings.create({
  model: 'text-embedding-3-small',
  input: 'Hello world',
});
```

Options
Pass compression options as the second argument:
```typescript
import { withHeadroom } from 'headroom-ai/openai';
import OpenAI from 'openai';

const client = withHeadroom(new OpenAI(), {
  model: 'gpt-4o',
  proxyUrl: 'http://localhost:8787',
});
```

Streaming
Streaming works normally. Compression happens before the request is sent:
```typescript
import { withHeadroom } from 'headroom-ai/openai';
import OpenAI from 'openai';

const client = withHeadroom(new OpenAI());

const stream = await client.chat.completions.create({
  model: 'gpt-4o',
  messages: longConversation,
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? '');
}
```

Tool calling
Tool call messages and tool results are compressed like any other message content. Large tool outputs (JSON arrays, logs) see the biggest savings:
```typescript
import { withHeadroom } from 'headroom-ai/openai';
import OpenAI from 'openai';

const client = withHeadroom(new OpenAI());

const response = await client.chat.completions.create({
  model: 'gpt-4o',
  messages: [
    { role: 'user', content: 'Search for recent errors' },
    {
      role: 'assistant',
      content: null,
      tool_calls: [{ id: 'call_1', type: 'function', function: { name: 'search', arguments: '{"q":"errors"}' } }],
    },
    {
      role: 'tool',
      tool_call_id: 'call_1',
      content: hugeJsonResult, // Compressed automatically
    },
  ],
  tools: [{ type: 'function', function: { name: 'search', parameters: {} } }],
});
```