Can reasoning happen at the sentence level instead of tokens?
Does moving from token-level to sentence-level reasoning in embedding space preserve the capability for complex reasoning while enabling language-agnostic processing? This challenges assumptions about how LLMs must operate.
Current LLMs operate at the token level — every reasoning step is a next-token prediction. Meta's Large Concept Model (LCM) challenges this by operating at the sentence level, reasoning in an abstract embedding space (SONAR) where each "concept" corresponds to a sentence.
The architectural difference is fundamental. The LCM:
- Does not see tokens — it receives and produces sentence-level embeddings
- Is language-agnostic — the same reasoning process works for any language or modality because SONAR encodes meaning, not surface form
- Separates reasoning from instantiation — reasoning happens once in the abstract space; decoding to a specific language happens afterward and can target any language without re-running the reasoning
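To make that separation concrete, here is a minimal sketch of the encode, reason, decode flow. The sentence encoder/decoder (standing in for SONAR) and the next-concept predictor (standing in for the LCM) are passed in as callables; none of the names below are the actual SONAR or LCM APIs.

```python
# Conceptual sketch only: encode -> reason in embedding space -> decode.
# `encode`, `predict_next`, and `decode` are user-supplied stand-ins, not real APIs.
from typing import Callable, List
import torch

def lcm_generate(
    context_sentences: List[str],
    encode: Callable[[str], torch.Tensor],                   # sentence -> concept embedding (d,)
    predict_next: Callable[[torch.Tensor], torch.Tensor],    # (n, d) history -> next concept (d,)
    decode: Callable[[torch.Tensor, str], str],              # concept + target language -> sentence
    n_new: int,
    target_lang: str,
) -> List[str]:
    # 1. Abstraction: every sentence becomes one fixed-size embedding ("concept"),
    #    whatever its source language.
    concepts = torch.stack([encode(s) for s in context_sentences])   # (n, d)

    # 2. Reasoning: autoregressively predict the next concept from the previous ones.
    #    No tokens appear at this stage.
    for _ in range(n_new):
        nxt = predict_next(concepts)                                  # (d,)
        concepts = torch.cat([concepts, nxt.unsqueeze(0)], dim=0)

    # 3. Instantiation: only now is a surface language chosen; the same predicted
    #    concepts could be decoded into any language the decoder supports.
    return [decode(c, target_lang) for c in concepts[-n_new:]]
```

The point of the sketch is step 2: the autoregressive loop never touches tokens, so step 3 can target any output language without re-running the reasoning.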
The hierarchical structure adds a planning layer. The LCM predicts a sequence of concepts auto-regressively until it produces a "break concept" — analogous to a paragraph break. At that point, a Large Planning Model (LPM) generates a plan to condition the LCM for the next sequence. This two-level architecture (sentence-level prediction + paragraph-level planning) is designed to produce more coherent long-form output than flat token-level generation.
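A rough sketch of that two-level loop follows, under two illustrative assumptions that are not taken from the paper: the break concept is detected by cosine similarity to a dedicated break embedding, and the planner returns a plan embedding used to condition the LCM.

```python
# Illustrative two-level generation loop: sentence-level concept prediction (LCM stand-in)
# plus paragraph-level planning (LPM stand-in). Break detection and plan conditioning
# are assumptions made for this sketch, not the paper's exact mechanics.
from typing import Callable
import torch
import torch.nn.functional as F

def hierarchical_generate(
    plan: torch.Tensor,                                    # (d,) initial plan embedding
    predict_next: Callable[[torch.Tensor, torch.Tensor], torch.Tensor],  # (history, plan) -> (d,)
    make_plan: Callable[[torch.Tensor], torch.Tensor],     # full history -> next plan (planner stand-in)
    break_concept: torch.Tensor,                           # (d,) embedding marking a paragraph break
    n_paragraphs: int = 3,
    max_concepts: int = 64,
    break_threshold: float = 0.9,
) -> torch.Tensor:
    history = torch.empty(0, break_concept.shape[0])
    for _ in range(n_paragraphs):
        # Sentence level: predict concepts until a break concept is emitted.
        for _ in range(max_concepts):
            nxt = predict_next(history, plan)
            history = torch.cat([history, nxt.unsqueeze(0)], dim=0)
            if F.cosine_similarity(nxt, break_concept, dim=0) > break_threshold:
                break
        # Paragraph level: the planner conditions the next sequence of concepts.
        plan = make_plan(history)
    return history  # (total_concepts, d) sequence of concept embeddings
```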
The comparison to JEPA (LeCun, 2022) is instructive: both predict representations in embedding space rather than raw observations. But where JEPA emphasizes learning the representation space via self-supervision, LCM focuses on accurate prediction within an existing embedding space (SONAR). The embedding quality is assumed, not learned end-to-end.
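One way to see the contrast is in the training objective: the embedding space is frozen, and the model only learns to regress the next sentence embedding. The sketch below uses plain MSE over precomputed embeddings (roughly the baseline objective in the LCM work, which also studies diffusion-based variants), with an illustrative 1024-dimensional embedding size and a generic causally masked Transformer as a stand-in predictor.

```python
# Sketch of next-concept regression over frozen sentence embeddings.
# The architecture and dimensions are illustrative, not the paper's configuration.
import torch
import torch.nn as nn

d = 1024                                        # illustrative SONAR-like embedding size
model = nn.TransformerEncoder(                  # causally masked encoder as a stand-in predictor
    nn.TransformerEncoderLayer(d_model=d, nhead=8, batch_first=True), num_layers=4
)
head = nn.Linear(d, d)

def next_concept_loss(concepts: torch.Tensor) -> torch.Tensor:
    """concepts: (batch, n, d) precomputed, frozen sentence embeddings."""
    ctx, target = concepts[:, :-1], concepts[:, 1:]                      # shift by one concept
    mask = nn.Transformer.generate_square_subsequent_mask(ctx.shape[1])  # causal mask
    pred = head(model(ctx, mask=mask))                                   # predict next embeddings
    return torch.mean((pred - target) ** 2)                              # regress embeddings, not tokens
```

Nothing in this objective shapes the representation space itself, which is exactly the division of labor the JEPA comparison highlights.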
This connects to the latent reasoning thread through a different mechanism. The approach in "Can models reason without generating visible thinking tokens?" achieves reasoning without tokens via recurrent depth in continuous space; LCM achieves it via sentence-level embeddings. Both challenge the assumption that verbalized token-by-token generation is necessary for reasoning, but from different angles: depth-recurrent models reason within a single token's representation, while LCM reasons between sentence-level units.
The practical implication: if reasoning can happen at the concept level rather than the token level, then the verbalized chain-of-thought paradigm is not the only path to sophisticated reasoning. The question is whether sentence-level granularity captures enough structure for complex reasoning tasks, or whether some tasks require finer-grained (sub-sentence) reasoning steps.
Related concepts in this collection
- Can models reason without generating visible thinking tokens? Explores whether intermediate reasoning must be verbalized as text tokens, or if models can think in hidden continuous space. Challenges a foundational assumption about how language models scale their reasoning capabilities. Relation to this note: alternative latent reasoning via recurrent depth; LCM is a third approach (sentence-level, not token-level or depth-recurrent).
- Can models reason without generating visible thinking steps? Do machine reasoning systems actually require verbalized chains of thought, or can they solve complex problems through hidden computation? This challenges how we measure and understand reasoning. Relation to this note: the shared question of whether reasoning requires verbalization; LCM says no, operating at sentence granularity.
- Do embedding dimensions fundamentally limit retrievable document combinations? Can single-vector embeddings represent any top-k document subset a user might need? Research using communication complexity theory suggests there are hard geometric limits independent of training data or model architecture. Relation to this note: LCM relies entirely on embedding quality (SONAR); the mathematical limits of embeddings constrain what LCM can represent.
Original note title: Large Concept Models enable sentence-level reasoning in a language-agnostic embedding space — hierarchical abstraction beyond token-level processing