
Can a reasoning model's thinking trace compress context effectively?

Does the raw reasoning trace produced by a thinking model naturally function as a context compressor without specialized training or modules? And how does this compare to dedicated compression methods?

Note · 2026-05-28 · sourced from Context Engineering

Context compression — shortening long inputs with minimal information loss to accelerate inference — has been treated as a separate problem requiring specialized compression modules or compression-specific training. Thinking as Compression (TaC) collapses that separation. Its claim is that thinking is already a process of distilling and compressing information, so a reasoning model's thinking trace can serve directly as the compressed context. The trace focuses on key information, skips redundancies, revisits important evidence, and links scattered facts into a compact form — exactly the operations a compressor performs.

Empirically, simply prompting a thinking model to generate a trace and using that trace as the shortened context already outperforms most representative compression methods, with no dedicated compressor and no compression-specific training. The intrinsic capability was hiding in plain sight: the same mechanism that lets reasoning models solve hard problems also produces a usable compression of their inputs.
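The pipeline described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: the `ThinkingModel` and `ReaderModel` callables, the prompt template, and the whitespace-based `compression_ratio` are all hypothetical stand-ins, and the toy models return canned strings so the example runs without any real model.

```python
from typing import Callable, Tuple

# Hypothetical interfaces (assumptions, not a real library API):
# a thinking model maps a prompt to (thinking_trace, final_answer);
# a reader model maps a prompt to an answer.
ThinkingModel = Callable[[str], Tuple[str, str]]
ReaderModel = Callable[[str], str]


def build_prompt(context: str, question: str) -> str:
    """Assumed prompt template; the source does not specify one."""
    return f"Context:\n{context}\n\nQuestion: {question}"


def compress_with_trace(context: str, question: str, thinker: ThinkingModel) -> str:
    """Run the thinking model on the full context and keep only its trace."""
    trace, _answer = thinker(build_prompt(context, question))
    return trace.strip()


def answer_from_trace(trace: str, question: str, reader: ReaderModel) -> str:
    """Give a downstream model the trace in place of the original context."""
    return reader(build_prompt(trace, question))


def compression_ratio(original: str, compressed: str) -> float:
    """Whitespace-token ratio, a crude proxy for real tokenizer counts."""
    return len(compressed.split()) / max(len(original.split()), 1)


# Toy stand-ins so the sketch runs end to end without a real model.
def toy_thinker(prompt: str) -> Tuple[str, str]:
    return ("The launch moved to Thursday.", "Thursday")


def toy_reader(prompt: str) -> str:
    return "Thursday" if "Thursday" in prompt else "unknown"


if __name__ == "__main__":
    context = (
        "The team met on Monday. Lunch was pasta. "
        "The launch moved to Thursday. The weather was mild."
    )
    question = "When is the launch?"
    trace = compress_with_trace(context, question, toy_thinker)
    print(trace)
    print(answer_from_trace(trace, question, toy_reader))
    print(round(compression_ratio(context, trace), 3))
```

Discarding the thinker's own final answer is a deliberate choice in this sketch: the trace is treated as the reusable artifact, and a separate reader consumes it. In practice the two callables would wrap real model calls, and token counts would come from the model's tokenizer rather than whitespace splitting.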

The conceptual payoff is a reframing of what a thinking trace is. It is usually read as a means to an answer; TaC reads it as a reusable artifact: a compressed representation of the relevant context that downstream models can consume. This rhymes with a deeper equivalence in the vault. As the note "Can text-trained models compress images better than specialized tools?" argues, modeling and compression are two views of the same operation, and TaC extends that identity from the model's weights to its traces. It also connects to evidence that trace verbosity is largely separable from reasoning content: per the note "Can minimal reasoning chains match full explanations?", the compressive core of a trace is small.

Counterpoint that motivates the companion finding: raw traces struggle with budget control and shortcut behaviors, so "the trace is a compressor" needs a control mechanism to be reliable.

Why it matters: it removes the need for a whole class of dedicated compression machinery by repurposing reasoning the model already does.


— "Thinking as Compression: Your Reasoning Model is Secretly a Context Compressor", https://arxiv.org/abs/2605.28713

Original note title

a reasoning model's thinking trace is itself an effective context compressor requiring no dedicated module