INQUIRING LINE

Can individually accurate agents still fail at population-level representation?

This explores whether AI agents that model individual people well can still misrepresent a whole population — i.e., whether per-person accuracy adds up to a faithful crowd, or quietly flattens it.


The corpus suggests yes, for a specific and underappreciated reason: individual accuracy and population fidelity are different targets, and optimizing the first can silently sabotage the second. The clearest case comes from work showing GPT-4.5 out-predicting *every individual human* at judging social appropriateness, while all the models share the same set of systematic errors on unwritten norms (Can AI learn social norms better than humans?). Each agent is individually superb, but a population built from them would not be a diverse society: it would be one very confident person cloned a thousand times, every copy wrong in exactly the same places. That correlated error is invisible at the individual level and catastrophic at the population level.
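The contrast between correlated and uncorrelated error can be made concrete with a toy simulation. This is a minimal sketch, not a reproduction of the cited study: the function name `simulate`, the binary judgment task, the 90% accuracy figure, and the agent and item counts are all illustrative assumptions. It compares agents whose mistakes fall on independently chosen items with agents who all miss the same items.

```python
import random


def simulate(n_agents, n_items, accuracy, shared_errors, seed=0):
    """Return (mean per-agent accuracy, accuracy of the majority vote).

    Hypothetical setup: every agent answers n_items binary questions and
    gets exactly round((1 - accuracy) * n_items) of them wrong.
    """
    rng = random.Random(seed)
    truth = [rng.randint(0, 1) for _ in range(n_items)]
    n_wrong = round((1 - accuracy) * n_items)
    # One fixed set of items that every agent misses in the shared case.
    shared_wrong = set(rng.sample(range(n_items), n_wrong))

    answers = []
    for _ in range(n_agents):
        if shared_errors:
            wrong = shared_wrong
        else:
            # Each agent misses its own independently drawn set of items.
            wrong = set(rng.sample(range(n_items), n_wrong))
        answers.append([1 - t if i in wrong else t for i, t in enumerate(truth)])

    per_agent = [
        sum(a == t for a, t in zip(ans, truth)) / n_items for ans in answers
    ]
    # Majority vote per item; use an odd n_agents so there are no ties.
    majority = [int(2 * sum(column) > n_agents) for column in zip(*answers)]
    majority_acc = sum(m == t for m, t in zip(majority, truth)) / n_items
    return sum(per_agent) / n_agents, majority_acc


if __name__ == "__main__":
    for shared in (False, True):
        mean_acc, crowd_acc = simulate(101, 200, 0.9, shared_errors=shared)
        label = "shared errors" if shared else "independent errors"
        print(f"{label}: per-agent {mean_acc:.2f}, majority vote {crowd_acc:.2f}")
```

In both conditions each agent scores the same individually. With independent errors the majority vote should come out close to perfect, while with shared errors the crowd is exactly as wrong as any one agent, because there is nothing for aggregation to cancel.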

The interview-based agents, which replicate their source participants' responses about 85% as accurately as those participants replicate their own answers, sharpen the puzzle (Can AI agents learn people better from interviews than surveys?). Each agent is faithful to its source person, but the remaining gap, together with the finding that accuracy was driven by *factual content* rather than linguistic style, means an aggregate of these agents can smooth over the very variance that makes a population a population. High per-agent fidelity does not guarantee that the distribution of disagreement survives.

There's a mechanism here worth seeing directly. Training a model on many diverse experts pushes it toward an *implicit majority vote* that denoises uncorrelated individual errors and beats any single expert (Can models trained on many imperfect experts outperform each one?). That is a feature when you want the best single answer, but it is exactly the failure mode for representation: the same averaging that makes a model individually excellent collapses minority positions and tail behaviors. Convergence toward consensus and faithful population representation pull in opposite directions, so a single optimization is unlikely to deliver both in full.
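The source note attributes the implicit majority vote to low-temperature sampling, and the effect on minority positions can be sketched numerically. The example below assumes a model that has learned the population's answer distribution exactly at temperature 1; the three positions, their shares, and the function name `sharpen` are illustrative, not taken from the cited work. Dividing logits by a temperature `T` is equivalent to raising each probability to the power `1/T` and renormalizing.

```python
def sharpen(probs, temperature):
    """Apply temperature scaling to a categorical distribution.

    Equivalent to softmax(log(p) / temperature): each probability is
    raised to 1/temperature and the result is renormalized.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    powered = [p ** (1.0 / temperature) for p in probs]
    total = sum(powered)
    return [p / total for p in powered]


# Hypothetical population: a majority view, a sizeable minority, a small tail.
population = {"majority": 0.70, "minority": 0.25, "tail": 0.05}

if __name__ == "__main__":
    for temperature in (1.0, 0.5, 0.25):
        shares = sharpen(list(population.values()), temperature)
        row = ", ".join(
            f"{name} {share:.4f}" for name, share in zip(population, shares)
        )
        print(f"T={temperature}: {row}")
```

At temperature 1 the sampled shares match the population. As the temperature drops, the majority position absorbs nearly all of the probability mass and the tail effectively disappears, which is good for picking one best answer and bad for reproducing the spread of views.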

The simulation literature adds the structural version of the trap. LLMs look socially competent when one model puppeteers every interlocutor, but fail systematically once agents hold private information: the omniscient setup hides the grounding work that real interaction requires (Why do LLMs fail when simulating agents with private information?). A population isn't just many accurate individuals; it's individuals with asymmetric, private, sometimes conflicting knowledge. And when agents are actually networked together, coordination degrades with scale: they agree too late or adopt strategies without informing neighbors, and they accept neighbors' information without verification, propagating errors through the group even while each agent stays locally competent (Why do multi-agent systems fail to coordinate at scale?).

So the answer the corpus points to: individual accuracy is a per-node property, while population representation is a property of the *distribution and the interactions* (variance, correlation of errors, private information, network effects). Optimizing agents to be individually right tends to correlate their mistakes and compress their diversity, which is precisely what wrecks population-level fidelity. The thing worth taking away is that 'accurate' and 'representative' are not two points on the same scale. They can actively trade against each other, and a benchmark that only scores individuals will never see the gap.
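A small worked example shows how a per-individual score can hide the gap. This is a sketch under stated assumptions: the 100-person yes/no population, the two agent sets, and the choice of total variation distance as the population-level metric are all illustrative and not drawn from the cited studies. Both agent sets get 84 of 100 people right, but one concentrates its errors on the minority and the other spreads them evenly.

```python
from collections import Counter


def individual_accuracy(truth, predicted):
    """Fraction of people whose answer the matching agent reproduces."""
    return sum(t == p for t, p in zip(truth, predicted)) / len(truth)


def total_variation(truth, predicted):
    """Total variation distance between the two answer distributions."""
    n_truth, n_pred = len(truth), len(predicted)
    c_truth, c_pred = Counter(truth), Counter(predicted)
    keys = set(c_truth) | set(c_pred)
    return 0.5 * sum(abs(c_truth[k] / n_truth - c_pred[k] / n_pred) for k in keys)


# Hypothetical population: 70 people answer "yes", 30 answer "no".
truth = ["yes"] * 70 + ["no"] * 30

# Agent set A: all 16 errors flip minority answers to the majority answer.
flattened = ["yes"] * 70 + ["yes"] * 16 + ["no"] * 14

# Agent set B: 8 errors in each direction, so the overall shares are preserved.
balanced = ["no"] * 8 + ["yes"] * 62 + ["yes"] * 8 + ["no"] * 22

if __name__ == "__main__":
    for name, predicted in (("flattened", flattened), ("balanced", balanced)):
        acc = individual_accuracy(truth, predicted)
        tv = total_variation(truth, predicted)
        print(f"{name}: individual accuracy {acc:.2f}, TV distance {tv:.2f}")
```

Scored person by person, the two agent sets are indistinguishable. Only the distribution-level metric registers that the first set has shrunk the minority from 30% to 14% of the simulated population.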


Sources (5 notes)

Can AI learn social norms better than humans?

GPT-4.5 outperformed every individual human at judging social appropriateness across 555 scenarios, challenging the theory that embodied cultural experience is necessary. However, all AI models share identical systematic errors on unwritten norms.

Can AI agents learn people better from interviews than surveys?

A 1,052-person study found agents built from voice interviews replicated participant responses nearly as well as people replicate their own answers. Factual content, not linguistic style, drove this accuracy—even summary bullet points retained 83% fidelity.

Can models trained on many imperfect experts outperform each one?

Generative models trained on many diverse experts with different biases converge toward consensus behavior through cross-entropy optimization. Low-temperature sampling reveals this implicit majority vote, which outperforms any single expert by denoising uncorrelated individual errors on critical decision states.

Why do LLMs fail when simulating agents with private information?

Research shows LLMs perform well when one model controls all interlocutors but fail systematically when agents possess private information. This reveals that apparent social competence relies on grounding work that models skip in omniscient settings.

Why do multi-agent systems fail to coordinate at scale?

AgentsNet benchmark shows agents fail to coordinate strategies either by agreeing too late or adopting strategies without informing neighbors. Agents accept neighbor information without verification, enabling error propagation while remaining capable of detecting direct conflicts.

Next inquiring lines