Can capability boundary collapse be addressed by operating at representational rather than token level?

This explores whether the apparent ceiling on what a model can do — its 'capability boundary collapse' — is really a token-level artifact that loosens once you reason in representation, embedding, or latent space instead of in surface text.

This reads the question as: when a model seems to hit a wall, is the wall in the model's actual reasoning, or in the medium it's forced to reason through — strings of tokens? The corpus leans hard toward the second answer, and that reframing is the interesting part. Several notes converge on the idea that the boundary you observe is an artifact of measuring and operating at the token surface, and that it softens or moves when you go down a level.

The sharpest version comes from work showing that the exploration–exploitation trade-off in RLVR is a measurement artifact: when you inspect hidden states with effective-rank metrics instead of token outputs, exploration and exploitation barely correlate, and you can push both at once for real accuracy gains (Is the exploration-exploitation trade-off actually fundamental?). A 'fundamental' boundary turned out to be an illusion of the level you looked at. Parallel to this, what looks like a reasoning cliff is often an execution cliff: models that know an algorithm still fail to grind it out in text-only generation, but solve the same problems once given tools to execute (Are reasoning model collapses really failures of reasoning?). The collapse is in the channel, not the competence.

If the token surface is the bottleneck, the natural move is to compute somewhere richer. Large Concept Models reason over whole-sentence embeddings in a language-agnostic space before decoding, planning at the paragraph level rather than emitting tokens one at a time (Can reasoning happen at the sentence level instead of tokens?). A whole family of latent-reasoning architectures (depth-recurrent models, Heima, Coconut) scales test-time compute by iterating hidden states with no verbalized steps at all, suggesting that writing your thoughts out is a training habit, not a requirement for thinking (Can models reason without generating visible thinking tokens?). This is the most direct 'yes' to the question: move the work off the token track and the ceiling lifts.

But the corpus also complicates the clean dichotomy. Tokens aren't a uniform surface to escape; they're unevenly load-bearing. Only about 20% of tokens are high-entropy 'forking points' that actually carry the learning signal, and training on just those matches or exceeds full-gradient updates (Do high-entropy tokens drive reasoning model improvements?). Models internally rank tokens by function, preserving symbolic-computation tokens while discarding grammar and filler (Which tokens in reasoning chains actually matter most?). So part of 'operating at representation level' may not mean abandoning tokens but identifying which ones are doing representational work. And there's a cost to going fully continuous that the question doesn't anticipate: fine-tuning into continuous spaces is itself implicated elsewhere in catastrophic forgetting, so the level switch isn't free.

What you didn't ask but might want: there's a third path between 'tokens' and 'pure latent space', namely structured abstraction. RLAD trains models to generate reasoning abstractions that enforce breadth-first exploration where deep token chains underthink (Can abstractions guide exploration better than depth alone?), and cognitive tools elicit latent capability that was already present by isolating reasoning operations into modular calls, no retraining needed (Can modular cognitive tools unlock reasoning without training?). Both suggest the capability often already exists inside the model: the boundary is about how you route to it, not whether it's there. Read together, the corpus's answer is: yes, much of what looks like a hard capability boundary is a token-level framing problem, but the fix is less 'abandon tokens for embeddings' and more 'stop assuming the token stream is where the reasoning lives.'

Sources (8 notes)

Is the exploration-exploitation trade-off actually fundamental?

Hidden-state analysis using Effective Rank metrics shows near-zero correlation between exploration and exploitation, revealing the trade-off emerges only at token level. VERL demonstrates simultaneous enhancement achieving 21.4% accuracy gains on Gaokao 2024.
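To make "effective rank" concrete, here is a minimal sketch that assumes the common entropy-based definition: the exponential of the Shannon entropy of the normalized singular values. The note does not say which variant the cited work uses, so treat this as one plausible reading. The function takes singular values as input; in practice they would come from an SVD of a hidden-state matrix, which is not shown here. The example spectra are hypothetical.

```python
import math

def effective_rank(singular_values):
    """Entropy-based effective rank: exp(H(p)), where p is the
    singular-value spectrum normalized to sum to 1.
    Assumption: this is the definition intended by the note."""
    positive = [s for s in singular_values if s > 0]
    total = sum(positive)
    if total == 0:
        return 0.0
    probs = [s / total for s in positive]
    entropy = -sum(p * math.log(p) for p in probs)
    return math.exp(entropy)

# Hypothetical spectra, for illustration only.
flat = [1.0, 1.0, 1.0, 1.0]      # variance spread evenly over 4 directions
peaked = [10.0, 0.1, 0.1, 0.1]   # one direction dominates

print(effective_rank(flat))
print(effective_rank(peaked))
```

A flat spectrum gives an effective rank equal to the number of directions, while a peaked one gives a value close to 1, which is why the metric can serve as a representation-level proxy for how spread out the hidden states are.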

Are reasoning model collapses really failures of reasoning?

Models confined to text-only generation cannot execute multi-step procedures at scale, even when they know the underlying algorithm. Tool-enabled models solve problems beyond the supposed reasoning cliff, suggesting the bottleneck is procedural execution bandwidth.

Can reasoning happen at the sentence level instead of tokens?

Meta's Large Concept Model operates on sentence embeddings rather than tokens, reasoning in a language-agnostic space before decoding to any target language. This hierarchical approach with paragraph-level planning produces more coherent output than flat token generation.

Can models reason without generating visible thinking tokens?

Multiple architectures—depth-recurrent models, Heima, and Coconut—demonstrate that test-time compute scales through hidden state iteration rather than token generation. This suggests verbalization is a training artifact, not a reasoning requirement.

Do high-entropy tokens drive reasoning model improvements?

Only ~20% of tokens exhibit high entropy as pivotal reasoning decision points; RLVR primarily adjusts these forking tokens. Training exclusively on them matches or exceeds full-gradient performance, revealing that the minority carries the learning signal.
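The selection step described above can be sketched as follows: compute the entropy of the next-token distribution at each position, keep the top fraction, and restrict the loss to those positions. This is an illustrative sketch, not the cited method's implementation. The function names, the `keep_fraction` parameter, and the toy distributions and losses are all assumptions made for the example.

```python
import math

def token_entropy(probs):
    """Shannon entropy (in nats) of one next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def forking_token_mask(distributions, keep_fraction=0.2):
    """Mark the highest-entropy positions as True.
    keep_fraction=0.2 mirrors the ~20% figure in the note."""
    entropies = [token_entropy(d) for d in distributions]
    k = max(1, round(len(entropies) * keep_fraction))
    ranked = sorted(range(len(entropies)),
                    key=lambda i: entropies[i], reverse=True)
    keep = set(ranked[:k])
    return [i in keep for i in range(len(entropies))]

def masked_mean(values, mask):
    """Average only the values whose mask entry is True."""
    selected = [v for v, m in zip(values, mask) if m]
    return sum(selected) / len(selected) if selected else 0.0

# Hypothetical per-position distributions over a 4-token vocabulary.
dists = [
    [0.97, 0.01, 0.01, 0.01],  # near-certain continuation
    [0.25, 0.25, 0.25, 0.25],  # a genuine fork
    [0.90, 0.05, 0.03, 0.02],
    [1.00, 0.00, 0.00, 0.00],
    [0.70, 0.10, 0.10, 0.10],
]
per_token_loss = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical values

mask = forking_token_mask(dists)
print(mask)
print(masked_mean(per_token_loss, mask))
```

With five positions and a 20% budget, only the uniform distribution at the second position survives the mask, so the masked loss reduces to that single position's value.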

Which tokens in reasoning chains actually matter most?

Greedy likelihood-preserving pruning reveals six functional token categories; symbolic computation tokens are preferentially preserved while grammar and meta-discourse are pruned first. Student models trained on these pruned chains outperform those trained on frontier-model compression.
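One way to read "greedy likelihood-preserving pruning" is as a loop that repeatedly deletes whichever token costs the least likelihood. The sketch below follows that generic reading and is not taken from the cited work. The `log_likelihood` callable stands in for a real model score, and the toy scorer with its per-token weights is entirely hypothetical, chosen only so that symbolic tokens look expensive to remove.

```python
def greedy_prune(tokens, log_likelihood, target_len):
    """Delete tokens one at a time, each time choosing the deletion
    that leaves the highest score, until target_len tokens remain.
    Returns (kept_tokens, removed_in_order)."""
    kept = list(tokens)
    removed = []
    while len(kept) > target_len:
        best_i = max(
            range(len(kept)),
            key=lambda i: log_likelihood(kept[:i] + kept[i + 1:]),
        )
        removed.append(kept.pop(best_i))
    return kept, removed

# Hypothetical importance weights standing in for a model's likelihood.
TOY_WEIGHTS = {
    "so": 0.1, "we": 0.2, "compute": 0.5,
    "3": 2.0, "*": 2.0, "4": 2.0, "=": 1.5, "12": 3.0,
}

def toy_score(chain):
    """Toy scorer: a higher total weight means more likelihood preserved."""
    return sum(TOY_WEIGHTS[t] for t in chain)

chain = ["so", "we", "compute", "3", "*", "4", "=", "12"]
kept, removed = greedy_prune(chain, toy_score, target_len=5)
print(kept)
print(removed)
```

Under these assumed weights the filler words are removed first and the arithmetic survives, which mirrors the ordering the note reports. With a real model, the scorer would be the likelihood of the final answer given the pruned chain.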

Can abstractions guide exploration better than depth alone?

RLAD jointly trains abstraction and solution generators, showing that allocating test-time compute to diverse abstractions outperforms parallel solution sampling at large budgets. Abstractions create structured breadth-first exploration that prevents the underthinking failure mode of depth-only reasoning chains.

Can modular cognitive tools unlock reasoning without training?

Four cognitive tools implemented as sandboxed LLM calls improved GPT-4.1 on AIME2024 from 26.7% to 43.3% without any RL training. Modularity enforces operation isolation that pure prompting cannot guarantee, eliciting pre-existing reasoning capability.
