Why do different AI models generate similar outputs independently?

This explores why separately-trained AI models tend to land on the same answers — the 'Artificial Hivemind' effect — and what in their construction pushes them toward convergence rather than diversity.

This explores why separately-trained AI models tend to land on the same answers, even when you'd expect different architectures and labs to produce genuine variety. The most direct evidence comes from INFINITY-CHAT, which tested 70+ models across 26K open-ended queries and found that they independently generate strikingly similar — sometimes identical — responses Do different AI models actually produce diverse outputs?. The culprit isn't coordination; it's shared inputs. Models overlap heavily in their training data and lean on near-identical alignment procedures, so the diversity you'd hope to get from running an ensemble of different models largely evaporates.

The corpus suggests the convergence happens in two stages. First, pretraining gives every large model a heavily overlapping picture of the same internet. Then post-training narrows things further: controlled experiments show that reinforcement learning amplifies a single dominant output format from pretraining within the first epoch while actively suppressing the alternatives Does RL training collapse format diversity in pretrained models?. So even where a base model held multiple ways of expressing something, the alignment step collapses them toward one — and since labs use similar RLHF recipes, they collapse toward similar ones. A related finding reinforces this: RL post-training seems to teach output *format* and organization rather than new knowledge, which is why a tiny model can match much larger ones just by adapting its format Can small models reason well by just learning output format?. If what alignment mostly installs is a shared house style, shared style is exactly what you'd expect to see across models.

Here's the part you might not have expected: this convergence pulls models *away* from humans even as it pulls them *toward* each other. Newer generations like GPT-4.5 and o4-mini actually diverge further from human lexical patterns than earlier models did, because RLHF optimizes for what raters score as high-quality, not for what looks human Why do newer AI models diverge further from human writing patterns?. So the shared optimization target is the deep reason for both effects — models cluster together in a region of output space that human raters reward, which is not the same region where human writers naturally sit.

There's also a more fundamental framing worth sitting with: LLM output is produced by sampling from probability distributions over the same kind of training corpus, which is a structurally different operation from how humans use language to address one another Are language models and human speakers doing the same thing?. When many systems are all estimating roughly the same distribution and decoding it the same way, agreement is almost the default rather than a surprise. The genuinely interesting tension is that the same technology is also described as essentially mutable — outputs shift with sampling, prompt wording, and audience Why does AI output change with every prompt and context?. So models are at once highly variable within a single system and highly convergent across systems: the randomness lives in the sampling, but the center of gravity is shared. If you want to go deeper, the practical sting is in the first note — model ensembles promise diversity they can't actually deliver when every member learned from the same place and was aligned the same way.

Sources 6 notes

Do different AI models actually produce diverse outputs?

INFINITY-CHAT analyzed 70+ models across 26K open-ended queries and found an "Artificial Hivemind" effect: models independently generate strikingly similar or identical responses due to overlapping training data and alignment procedures, undermining the diversity benefits of model ensembles.

Does RL training collapse format diversity in pretrained models?

Controlled experiments show RL consistently amplifies one format distribution from pretraining within the first epoch while collapsing alternatives. The winning format depends on model scale, not necessarily performance, and is largely hidden when starting from proprietary pretrained models.

Can small models reason well by just learning output format?

A 1.5B parameter model with LoRA-only post-training matched larger full-parameter RL models on reasoning tasks, suggesting RL teaches output format organization rather than new factual knowledge. This efficiency indicates reasoning and knowledge storage are separable capabilities.

Why do newer AI models diverge further from human writing patterns?

ChatGPT-4.5 and o4-mini show greater lexical diversity differences from human text than earlier models, yet human judges cannot reliably distinguish them. Training objectives like RLHF appear to optimize for quality ratings rather than human-like writing patterns.

Are language models and human speakers doing the same thing?

LLMs produce strings via probability distributions; humans use language to address and relate to others. They share surface form but differ in what produces output, what it does socially, and what receivers should do with it.

Why does AI output change with every prompt and context?

AI outputs exhibit essential mutability—they vary with sampling, prompt wording, and audience interpretation. This is not a defect but a defining feature of tokens as media, making them fundamentally different from fixed commodities and resistant to traditional quality assurance.

Why do different AI models generate similar outputs independently?

Sources 6 notes

Next inquiring lines