Glean 拾遗
Recent picks

2 picks · chronological

06-16

A Local-First Context Compression Layer for AI Agents: Library, Proxy, and MCP in One Stack

Headroom is a local-first context compression layer built for AI coding agents. It cuts token consumption by 60-95% by compressing tool outputs, logs, files, and RAG results before they reach the LLM, while maintaining answer accuracy. It can be used as a Python/TypeScript library, a transparent proxy, a CLI wrapper for popular agents, or an MCP server, so it slots into existing workflows. Internally, it combines JSON structure-aware compression, AST-based code minification, and a custom fine-tuned model, backed by CCR, a novel reversible compression system that guarantees original data is never lost. It suits engineers who rely heavily on coding agents and want to cut API costs without changing their current toolchain.

github.com · 18 min · Agents · AST-Minification · Context Engineering

06-16

The Context Compression Layer for AI Agents: 60–95% Fewer Tokens, Zero Accuracy Loss

Headroom is a local-first context compression layer for AI agents that cuts token usage from tool outputs, logs, files, and RAG chunks by 60–95% before they reach the LLM, while preserving accuracy. It offers library, proxy, MCP server, and agent wrapper modes, and uses a content router to select the best compressor for JSON, code, or prose. Reversible compression keeps originals retrievable on demand. It also provides cross-agent memory and `headroom learn` for mining failed sessions. It suits engineers who run coding agents daily and anyone who wants to lower LLM costs without changing their workflow.

github.com · 18 min · Agent Architecture · AI-Memory · Context Engineering