LLM Reasoning and Architecture

Can LLMs handle multiple tasks at once during inference?

Do language models maintain multiple distinct in-context learning tasks simultaneously in their internal representations, and if so, what prevents them from actually generating outputs for more than one task?

Note · 2026-02-23 · sourced from MechInterp

LLMs can perform multiple, computationally distinct in-context learning tasks simultaneously during a single inference call — task superposition. This emerges even when models are trained to learn one task at a time.

The distinction between superposition types matters: task superposition describes a model computing several distinct tasks in parallel within one forward pass, which is not the same as feature superposition, where a layer encodes more features than it has dimensions. The former is about what the model computes; the latter is about how representations are packed.

Three scaling findings:

  1. Larger models can solve more ICL tasks in parallel
  2. Larger models better calibrate their output distribution across simultaneous tasks
  3. Task vectors can be composed via arithmetic operations to steer behavior (see the sketch after this list)

The critical limitation: generation collapse. After the first token is generated, the model converges on predicting tokens for a single task, so the parallel computation is never expressed in the output. The first token acts as a commitment point that collapses the superposition. "Superposed decoding" algorithms attempt to keep multiple continuations alive through decoding but remain early-stage.

The Waluigi effect — outputs collapsing to unintended simulacra — is one consequence of task superposition. When the model maintains multiple task interpretations simultaneously, generation collapse can resolve to an unintended one. This connects to the "LLMs as multiverse generators" perspective: the model simultaneously represents multiple possible continuations, and decoding forces a collapse.

The practical implication: the model's representational capacity for parallel computation far exceeds what standard autoregressive decoding can exploit. The bottleneck is not in representation but in generation — a single token sequence can only express one task at a time.

Task superposition has implications for ICL-based sequential decision making. As argued in "Why do trajectories matter more than individual examples for in-context learning?", presenting multiple trajectories in context enables ICL of new tasks. Task superposition may be the representational mechanism that makes this possible: the model maintains multiple task interpretations from the in-context trajectories simultaneously, extracting the shared structure needed for generalization. Generation collapse then explains why the model commits to a single policy despite potentially representing multiple viable strategies.




LLMs perform multiple ICL tasks simultaneously in superposition — but generation collapse after the first token prevents practical use