SYNTHESIS NOTE
Psychology, Society, and Alignment · Model Architecture and Internals · Reasoning, Retrieval, and Evaluation

Where does mode collapse in language models really come from?

Researchers investigate whether mode collapse—when models narrow to repetitive outputs—stems from training algorithms or the preference data itself. Understanding the root cause is crucial for fixing diversity loss in creative and synthetic tasks.

Synthesis note · 2026-06-03 · sourced from Evaluations

Post-training alignment narrows LLM output to a few favored responses, a failure known as mode collapse, which undermines creative writing, social simulation, pluralistic alignment, and synthetic-data generation. Prior work blames algorithmic causes (inadequate reward models, majority-favoring optimization). This paper relocates the cause to the data: a pervasive typicality bias in preference data, where annotators systematically prefer familiar, typical text (a well-established cognitive-psychology effect). Mode collapse is thus an inherent property of the preference data itself, a claim the paper formalizes theoretically and verifies on real preference datasets.
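A toy numerical sketch can make the data-level argument more concrete. It assumes, purely for illustration, that a preference for typical text acts like raising the base model's distribution to a power `gamma > 1` and renormalizing. The function names, the four-outcome distribution, and the `gamma` values are all hypothetical and are not taken from the paper.

```python
import math

def sharpen(probs, gamma):
    """Raise each probability to the power gamma, then renormalize.

    Assumption for illustration only: gamma > 1 stands in for the pull
    toward typical (already high-probability) responses.
    """
    powered = [p ** gamma for p in probs]
    total = sum(powered)
    return [p / total for p in powered]

def entropy(probs):
    """Shannon entropy in nats; zero-probability entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0)

# Hypothetical base distribution over four candidate responses.
base = [0.4, 0.3, 0.2, 0.1]

for gamma in (1.0, 2.0, 5.0):
    sharpened = sharpen(base, gamma)
    print(gamma, [round(p, 3) for p in sharpened], round(entropy(sharpened), 3))
```

Under this assumption, larger `gamma` values concentrate mass on the most typical response and lower the entropy, which is the qualitative shape of mode collapse described above.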

The fix follows from the diagnosis and is strikingly cheap: Verbalized Sampling (VS), a training-free prompting strategy that asks the model to verbalize a distribution over responses with their probabilities ("generate 5 jokes and their probabilities") rather than a single answer. Across creative writing, dialogue simulation, open-ended QA, and synthetic data, VS lifts diversity 1.6-2.1× over direct prompting without sacrificing factual accuracy or safety — and, tellingly, more capable models benefit more, suggesting the diversity was latent and suppressed rather than absent.
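As a rough sketch of how a VS-style prompt could be wired up, the snippet below builds the prompt, parses a reply, and samples one response in proportion to its verbalized probability. The prompt wording, the `probability | response` line format, and every function name here are assumptions made for illustration; the note does not specify them, and the reply is a hand-written stand-in rather than real model output.

```python
import random
import re

def build_vs_prompt(task, k=5):
    """Ask for k candidate responses, each with a verbalized probability."""
    return (
        f"{task}\n"
        f"Generate {k} responses and their probabilities.\n"
        "Write one per line as: <probability> | <response>"
    )

# Hypothetical line format: a number, a pipe, then the response text.
LINE_RE = re.compile(r"^\s*([0-9]*\.?[0-9]+)\s*\|\s*(.+?)\s*$")

def parse_vs_reply(reply):
    """Extract (response, probability) pairs; skip lines that do not match."""
    pairs = []
    for line in reply.splitlines():
        match = LINE_RE.match(line)
        if match:
            pairs.append((match.group(2), float(match.group(1))))
    return pairs

def sample_response(pairs, rng=random):
    """Draw one response with weight equal to its verbalized probability.

    The weights are treated as relative, so they need not sum to 1.
    """
    texts = [text for text, _ in pairs]
    weights = [weight for _, weight in pairs]
    return rng.choices(texts, weights=weights, k=1)[0]

# Hand-written stand-in for a model reply, not real model output.
example_reply = """0.40 | Joke A
0.35 | Joke B
0.25 | Joke C"""

pairs = parse_vs_reply(example_reply)
print(build_vs_prompt("Tell me a joke about coffee.", k=3))
print(sample_response(pairs, random.Random(0)))
```

Sampling from the parsed list, rather than always taking the top entry, is what lets lower-probability responses surface across repeated calls.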

This reframes the vault's diversity-collapse thread at its source. Where "Does outcome-based RL diversity loss spread across unsolved problems?" locates collapse in RL dynamics, VS adds the data-level origin and a decoding-time remedy. It also offers a practical lever for "Why do LLMs generate novel ideas from narrow ranges?": verbalize the distribution to surface the suppressed tail.

Original note title

mode collapse is a data-level property of preference data driven by typicality bias not an algorithmic artifact — and verbalized sampling restores diversity training-free