Activation Steering for Chain-of-Thought Compression

Paper · arXiv 2507.04742 · Published July 7, 2025
Context Engineering · Reasoning Methods · CoT · ToT

Large language models (LLMs) excel at complex reasoning when they include intermediate steps, known as chains of thought (CoTs). However, these rationales are often overly verbose, even for simple problems, leading to wasted context, increased latency, and higher energy consumption. We observe that verbose, English-heavy CoTs and concise, math-centric CoTs occupy distinct regions in the model’s residual-stream activation space. By extracting and injecting a steering vector to transition between these modes, we can reliably shift generation toward more concise reasoning, effectively compressing CoTs without retraining. We formalize this approach as Activation-Steered Compression (ASC), an inference-time technique that shortens reasoning traces by directly modifying hidden representations. In addition, we provide a theoretical analysis of the impact of ASC on the output distribution, derived from a closed-form KL-divergence-bounded constraint to regulate steering strength. Using only 50 paired verbose and concise examples, ASC achieves up to 67.43% reduction in CoT length on MATH500 and GSM8K datasets, while maintaining accuracy across 7B, 8B, and 32B parameter models. As a training-free method, ASC introduces negligible runtime overhead and, on MATH500, delivers an average 2.73× speedup in end-to-end reasoning wall-clock time on an 8B model. This makes ASC a practical and efficient tool for streamlining the deployment of reasoning-capable LLMs in latency- or cost-sensitive settings. The code is available at https://github.com/ArminAzizi98/ASC.
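The extract-and-inject step described in the abstract can be sketched in a few lines. This is a hypothetical illustration, not the authors' implementation: it assumes the steering vector is the difference of mean residual-stream activations between concise and verbose traces (random stand-ins below), unit-normalized, and added to a hidden state with a strength `alpha`.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16  # residual-stream dimension; toy size for illustration

# Activations from 50 paired verbose/concise CoT examples (random stand-ins).
verbose_acts = rng.normal(loc=0.0, scale=1.0, size=(50, d))
concise_acts = rng.normal(loc=0.5, scale=1.0, size=(50, d))

# Steering vector: difference of per-mode mean activations,
# pointing from the verbose region toward the concise region.
v = concise_acts.mean(axis=0) - verbose_acts.mean(axis=0)
v = v / np.linalg.norm(v)  # unit-normalize so alpha alone sets strength

def steer(hidden, alpha):
    """Inject the steering vector into a residual-stream hidden state."""
    return hidden + alpha * v

h = rng.normal(size=d)
h_steered = steer(h, alpha=4.0)

# The steered state's projection onto v grows by exactly alpha.
print(round(float(h_steered @ v - h @ v), 2))  # → 4.0
```

In a real deployment the injection would happen inside the forward pass (e.g. via a hook on a chosen transformer layer) rather than on a standalone vector, but the arithmetic is the same.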

Why Activation Steering for CoT Compression? Existing methods for compressing CoT reasoning can be broadly categorized into three approaches: (i) retraining-based methods that fine-tune models to produce shorter rationales, using techniques such as knowledge distillation [34] or embedding reasoning within compact latent tokens [16]; (ii) prompt-engineering strategies that employ carefully designed instructions to encourage models to “reason briefly,” utilize contrastive demonstrations, or favor symbolic sketches over verbose prose [2, 33]; and (iii) heuristic early-exit mechanisms that halt generation once a confidence or entropy threshold is reached, trading completeness for speed [32]. Activation steering offers an effective middle ground. It is lightweight, requires only the addition of a single vector during inference, and directly reshapes hidden representations to enable on-the-fly compression. Because it does not update model parameters, the method is deployment-agnostic, making it equally applicable to open-source and closed-source checkpoints. Moreover, it is orthogonal to, and compatible with, the three categories of CoT compression described above. Finally, steering aligns well with interpretability: by nudging hidden states toward the sub-manifold of focused and concise reasoning, it exposes a controllable axis linking latent representations to computational efficiency.
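Because the intervention is a single added vector, its strength must be regulated; the abstract mentions a closed-form KL-divergence-bounded constraint for this. The sketch below illustrates the idea only numerically, with a made-up `logits_fn` standing in for the model: among candidate strengths, keep the largest whose steered next-token distribution stays within a divergence budget of the unsteered one.

```python
import numpy as np

def softmax(z):
    z = z - z.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def kl(p, q):
    """KL divergence D(p || q) in nats."""
    return float(np.sum(p * np.log(p / q)))

def max_alpha_under_kl(logits_fn, budget, alphas):
    """Largest candidate steering strength whose steered output
    distribution stays within `budget` nats of the unsteered one."""
    p = softmax(logits_fn(0.0))
    best = 0.0
    for a in alphas:
        if kl(p, softmax(logits_fn(a))) <= budget:
            best = a
    return best

rng = np.random.default_rng(0)
base = rng.normal(size=10)   # unsteered next-token logits (toy)
shift = rng.normal(size=10)  # per-unit-alpha effect of steering (toy)
logits_fn = lambda a: base + a * shift

alpha = max_alpha_under_kl(logits_fn, budget=0.05,
                           alphas=np.linspace(0.0, 2.0, 41))
```

The paper derives the admissible strength in closed form rather than by search; this grid scan is only meant to make the constraint concrete.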

Verbose Chain of Thought (603 tokens)

First, I need to add the three functions f(x), g(x), and h(x) together. To do this, I should start by writing down each function separately.

So, f(x) is -3x^2 + x - 4, g(x) is -5x^2 + 3x - 8, and h(x) is 5x^2 + 5x + 1.

Next, I’ll add them term by term. That means I’ll combine the coefficients of the x^2 terms, the x terms, and the constant terms separately.

Let’s start with the x^2 terms. For f(x), the coefficient is -3, for g(x) it’s -5, and for h(x) it’s 5. Adding these together: -3 + (-5) + 5.

Wait, that’s -3 - 5 + 5. Hmm, that simplifies to (-3 - 5) + 5, which is -8 + 5, resulting in -3.

Okay, so the x^2 term in the sum is -3x^2.

Now, moving on to the x terms. For f(x), the coefficient is 1, for g(x) it’s 3, and for h(x) it’s 5. Adding these: 1 + 3 + 5.

That’s straightforward: 1 + 3 is 4, plus 5 is 9. So the x term is 9x.

Finally, the constant terms. For f(x), it’s -4, for g(x) it’s -8, and for h(x) it’s 1. Adding these: -4 + (-8) + 1.

That’s -4 - 8 + 1, which is (-12) + 1, resulting in -11.

Putting it all together, the sum is -3x^2 + 9x - 11.

Wait, let me double-check my calculations to make sure I didn’t make any mistakes.

For the x^2 terms: -3 (from f) + (-5) (from g) + 5 (from h) = (-3 - 5) + 5 = -8 + 5 = -3. That seems correct.

For the x terms: 1 (from f) + 3 (from g) + 5 (from h) = 1 + 3 + 5 = 9. That’s correct.

For the constants: -4 (from f) + (-8) (from g) + 1 (from h) = (-4 - 8) + 1 = -12 + 1 = -11. That’s correct.

So, combining these, the final polynomial is -3x^2 + 9x - 11.

I think that’s all. I don’t see any mistakes in the addition.

Final Answer
The sum of the polynomials is \boxed{-3x^2 + 9x - 11}.

Concise Chain of Thought (251 tokens)

\[ f(x) + g(x) + h(x) = (-3x^2 + x - 4) + (-5x^2 + 3x - 8) + (5x^2 + 5x + 1) \]

Combine like terms:

1. **Quadratic terms:**
   \[ -3x^2 - 5x^2 + 5x^2 = (-3 - 5 + 5)x^2 = -3x^2 \]

2. **Linear terms:**
   \[ x + 3x + 5x = (1 + 3 + 5)x = 9x \]

3. **Constant terms:**
   \[ -4 - 8 + 1 = -11 \]

Combine results:
\[ f(x) + g(x) + h(x) = -3x^2 + 9x - 11 \]

Final answer:
\[ \boxed{-3x^2 + 9x - 11} \]

Chain of drafts (CoD) [31] and the approach of [20] reduce verbosity by embedding explicit length constraints in the prompt. CoD instructs the model to “think step by step” but keep each draft to at most five words, whereas [20] limits the final answer to a user-specified number of sentences as an inference-time intervention. Although such heuristics can shorten outputs, they assume that the model will faithfully obey length directives, a behavior that recent studies show is unreliable for reasoning-oriented LLMs [11]. The closest work to ours is SEAL [4], which constructs its steering vector by manually labeling thought segments as execution, reflection, or transition, and then damping the latter two segment types. In contrast, (i) we learn a single verbosity axis from paired verbose-vs.-concise CoTs without any manual labels, (ii) rely solely on off-the-shelf prompts to generate training pairs, and (iii) obtain a domain-agnostic vector that generalizes across reasoning tasks. Our method therefore provides a taxonomy-free, training-free complement to SEAL’s category-based calibration.