Can a reasoning model's thinking trace compress context effectively?
Does the raw reasoning trace produced by a thinking model naturally function as a context compressor without specialized training or modules? And how does this compare to dedicated compression methods?
Context compression — shortening long inputs with minimal information loss to accelerate inference — has been treated as a separate problem requiring specialized compression modules or compression-specific training. Thinking as Compression (TaC) collapses that separation. Its claim is that thinking is already a process of distilling and compressing information, so a reasoning model's thinking trace can serve directly as the compressed context. The trace focuses on key information, skips redundancies, revisits important evidence, and links scattered facts into a compact form — exactly the operations a compressor performs.
Empirically, simply prompting a thinking model to generate a trace and using that trace as the shortened context already outperforms most representative compression methods, with no dedicated compressor and no compression-specific training. The intrinsic capability was hiding in plain sight: the same mechanism that lets reasoning models solve hard problems also produces a usable compression of their inputs.
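The prompting recipe described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: the `ThinkingModel` and `AnswerModel` callables, the prompt wording, the whitespace word count used as a token proxy, and the toy stand-in models are all hypothetical and exist only to show the data flow (long context in, trace out, trace reused as the shortened context).

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical interfaces (assumptions, not a real library API):
# a thinking model maps a prompt to (thinking_trace, final_answer);
# a reader model maps a prompt to an answer string.
ThinkingModel = Callable[[str], tuple[str, str]]
AnswerModel = Callable[[str], str]


@dataclass
class Compressed:
    trace: str    # the thinking trace, reused as the shortened context
    ratio: float  # original words / trace words (crude token proxy)


def compress_with_thinking(context: str, question: str,
                           thinker: ThinkingModel) -> Compressed:
    """Prompt a thinking model and keep its trace as the compressed context."""
    prompt = (f"Context:\n{context}\n\nQuestion: {question}\n"
              "Think through the evidence relevant to the question.")
    trace, _answer = thinker(prompt)  # the final answer is discarded here
    n_context = len(context.split())
    n_trace = max(len(trace.split()), 1)  # guard against an empty trace
    return Compressed(trace=trace, ratio=n_context / n_trace)


def answer_from_trace(compressed: Compressed, question: str,
                      reader: AnswerModel) -> str:
    """A downstream model sees only the trace, never the original context."""
    prompt = f"Context:\n{compressed.trace}\n\nQuestion: {question}"
    return reader(prompt)


# Toy stand-ins so the sketch runs without any model backend.
def toy_thinker(prompt: str) -> tuple[str, str]:
    return ("Paris is the capital of France.", "Paris")


def toy_reader(prompt: str) -> str:
    return "Paris" if "Paris" in prompt else "unknown"


if __name__ == "__main__":
    long_context = ("France is a country in Europe. " * 10
                    + "Its capital is Paris.")
    question = "What is the capital of France?"
    compressed = compress_with_thinking(long_context, question, toy_thinker)
    print(f"compression ratio: {compressed.ratio:.1f}x")
    print(answer_from_trace(compressed, question, toy_reader))
```

Note that nothing in this sketch bounds the length of the trace; the ratio is only measured after the fact, which is the budget-control gap the counterpoint in this note points at.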
The conceptual payoff is a reframing of what a thinking trace is. It is usually read as a means to an answer; TaC reads it as a reusable artifact: a compressed representation of the relevant context that downstream models can consume. This rhymes with a deeper equivalence in the vault. As "Can text-trained models compress images better than specialized tools?" argues, modeling and compression are two views of the same operation, and TaC extends that identity from the model's weights to its traces. It also connects to evidence that trace verbosity is largely separable from reasoning content: per "Can minimal reasoning chains match full explanations?", the compressive core of a trace is small.
Counterpoint, which motivates the companion finding: raw traces struggle with budget control and shortcut behaviors, so "the trace is a compressor" needs a control mechanism to be reliable.
Why it matters: TaC removes the need for a whole class of dedicated compression machinery by repurposing reasoning the model already does.
— "Thinking as Compression: Your Reasoning Model is Secretly a Context Compressor", https://arxiv.org/abs/2605.28713
Related concepts in this collection
- Can text-trained models compress images better than specialized tools?
  Do general-purpose language models trained only on text outperform domain-specific compressors like PNG and FLAC on their native data? This tests whether compression ability is universal or requires domain specialization.
  the foundational modeling-is-compression identity TaC extends to thinking traces
- Can minimal reasoning chains match full explanations?
  Does removing all explanatory text from chain-of-thought reasoning preserve accuracy? This tests whether verbose intermediate steps are necessary for solving problems or just artifacts of how language models are trained.
  evidence that a trace's compressive core is small and verbosity is separable
- Can we steer reasoning toward brevity without retraining?
  This explores whether model reasoning style occupies learnable geometric directions in activation space, and whether we can shift toward concise thinking by steering through that space without expensive retraining.
  a different route to compact traces, via activation steering rather than prompting the trace as context
- Can thinking traces be made reliably budget-controllable?
  Raw thinking traces compress well but ignore budget targets and take shortcuts. Can reward optimization make them controllable and useful for deployment?
  extends: the companion finding this note's counterpoint anticipates. Raw traces lack budget control, so TaC-C adds reward training to make "the trace is a compressor" reliable
Original note title: a reasoning model's thinking trace is itself an effective context compressor requiring no dedicated module