Can models trained on many imperfect experts outperform each one?
Do generative models trained on diverse, imperfect human experts develop an implicit consensus that surpasses any individual contributor? This note explores whether aggregating diverse perspectives at training time, rather than at inference time, can denoise individual human biases.
The Transcendence paper formalizes a surprising property: generative models trained on many experts with diverse capacities and biases can outperform any single expert. The mechanism is implicit majority voting. When trained on games from diverse human chess players, the model's cross-entropy optimization converges on the consensus behavior, which, by the wisdom-of-the-crowd effect, is often better than any individual contributor.
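The link between cross-entropy training and consensus can be checked in a toy setting: the categorical distribution that minimizes cross-entropy against data pooled equally from several experts is exactly the average of their distributions. A minimal sketch, with made-up expert distributions over four moves in one position:

```python
import numpy as np

rng = np.random.default_rng(0)
n_moves = 4

# Hypothetical expert move distributions for a single position.
experts = np.array([
    [0.60, 0.30, 0.05, 0.05],
    [0.50, 0.10, 0.35, 0.05],
    [0.55, 0.05, 0.10, 0.30],
])

# Pool equal numbers of moves from each expert. The maximum-likelihood
# (equivalently, minimum cross-entropy) categorical fit to the pooled
# data is just the empirical frequency of each move.
samples = np.concatenate([
    rng.choice(n_moves, size=20000, p=p) for p in experts
])
fitted = np.bincount(samples, minlength=n_moves) / len(samples)

print("mixture mean:  ", experts.mean(axis=0))
print("CE-optimal fit:", np.round(fitted, 3))
```

With enough pooled samples, the fitted distribution matches the mixture mean, which is the sense in which the trained model implicitly encodes an average over its experts.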
Low-temperature sampling is the key enabler. At low temperature, the model's output distribution concentrates on its highest-probability predictions, which encode the consensus; in the zero-temperature limit, sampling reduces to an argmax over the averaged expert distributions, formally a majority vote among the experts. The advantage comes primarily from performing much better on a small subset of states — likely the critical, outcome-determining positions where individual human biases diverge most and the crowd wisdom is most valuable.
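The temperature mechanics can be sketched directly (the learned distribution below is made up for illustration): sampling at temperature T raises each probability to the power 1/T and renormalizes, so as T falls, the mass collapses onto the modal, consensus move.

```python
import numpy as np

def sample_probs(probs, temperature):
    """Temperature-rescale a distribution: p_i^(1/T) / sum_j p_j^(1/T)."""
    logp = np.log(probs) / temperature
    logp -= logp.max()          # stabilize before exponentiating
    p = np.exp(logp)
    return p / p.sum()

# Hypothetical learned distribution over 4 moves: the model's average of
# many experts puts the most (but not overwhelming) mass on the consensus.
learned = np.array([0.40, 0.25, 0.20, 0.15])

for T in (1.0, 0.5, 0.1):
    print(T, np.round(sample_probs(learned, T), 3))
```

At T=1 the model plays like an average expert; near T=0 almost all the mass sits on the consensus move, which is the majority-vote behavior.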
Diversity in the training data is a necessary condition. Without diversity, there is no denoising — a model trained on clones of one expert can only approach that expert's level. The practical conditions for transcendence: (1) diverse training sources with different biases, (2) a task where individual biases are uncorrelated (so they cancel under aggregation), and (3) low-temperature decoding to extract the consensus.
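These conditions can be illustrated with a toy position (all numbers hypothetical): each expert overrates a different "pet" move by more than the best move's true margin, so every individual expert's greedy play is wrong, yet the argmax of the averaged policy recovers the best move. A mixture of identically biased clones gets no such benefit.

```python
import numpy as np

def softmax(logits):
    e = np.exp(logits - logits.max())
    return e / e.sum()

true_scores = np.array([1.0, 0.0, 0.0, 0.0, 0.0])   # move 0 is objectively best

def expert_policy(pet_move, bias=1.2):
    """A biased expert: systematically overrates one 'pet' move."""
    logits = true_scores.copy()
    logits[pet_move] += bias          # bias exceeds the true margin of move 0
    return softmax(logits)

def reward_sampling(policy):          # expected reward when playing at T=1
    return policy[0]                  # = probability of the best move

def reward_greedy(policy):            # low-temperature (argmax) play
    return float(np.argmax(policy) == 0)

diverse = [expert_policy(pet) for pet in (1, 2, 3, 4)]   # uncorrelated biases
clones  = [expert_policy(1)   for _ in range(4)]         # identical bias

mix_diverse = np.mean(diverse, axis=0)   # what cross-entropy training learns
mix_clones  = np.mean(clones, axis=0)

print("best single expert (sampling):", max(reward_sampling(p) for p in diverse))
print("diverse mixture, low temp:    ", reward_greedy(mix_diverse))
print("clone mixture, low temp:      ", reward_greedy(mix_clones))
```

Because each wrong move is favored by only one of the four diverse experts while all four put consistent mass on move 0, averaging dilutes every bias below the consensus; the clone mixture just reproduces the shared bias, so low-temperature play stays wrong.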
This connects to, but is distinct from, "Why does majority voting outperform more complex inference methods?". That note describes inference-time majority voting over multiple samples from one model; transcendence describes training-time majority voting, implicitly encoded in a single model's weights through diverse training data. The mechanism is analogous (aggregation denoises) but operates at a different stage: training rather than inference.
The implication for LLM training is provocative: the "average" of many imperfect human demonstrations may be better than any individual human demonstration, provided the imperfections are diverse rather than correlated. This challenges the assumption that training data quality should be maximized per-example; quantity and diversity of perspectives may matter as much as individual quality.
Source: Training Fine Tuning
Related concepts in this collection
- "Why does majority voting outperform more complex inference methods?" — Simple majority voting across independent samples often matches or beats sophisticated alternatives like Best-of-N and sequential revision. What makes this basic approach so hard to beat for reasoning models? (inference-time voting analog; this note is the training-time version)
- "Does voting discard useful reasoning from losing chains?" — When multiple reasoning chains compete through majority voting, intermediate steps from non-winning chains are discarded. Could extracting and mixing those intermediate facts improve both the final answer and our ability to understand the reasoning? (shows limits of pure voting; transcendence may have similar limits)
- "Does training on AI-generated content permanently degrade model quality?" — When generative models train on outputs from previous models, do the resulting models lose rare patterns permanently? The question matters because future training data will inevitably contain synthetic content. (counterpoint: while diversity enables transcendence, synthetic data collapses diversity)
- "Can generative and discriminative models reach agreement?" — Generative and discriminative decoding often produce conflicting answers. Can a game-theoretic framework force these two complementary procedures to reconcile their predictions into a single, more reliable output? (related consensus mechanism: transcendence achieves consensus across diverse training experts at training time, while the Consensus Game achieves consensus between generative and discriminative decoding modes at inference time; both extract a signal more reliable than any single perspective)
Original note title
generative models transcend their training experts through implicit majority voting that denoises diverse human biases