INQUIRING LINE

What makes output convergence across models inevitable given input-side homogenization?

This explores whether AI outputs from different models are doomed to look alike because the inputs themselves get flattened first — and whether that convergence is really inevitable or just a stack of design choices.


This reads the question as: if users keep rephrasing their prompts toward what models handle best, does that input-side flattening force every model's output to converge? The corpus suggests convergence is real, but it's driven by pressures at three different stages stacking on top of each other — not by one inevitable law.

The input side is the part the question names directly. Does high-frequency text homogenize user input before generation? describes "Adam's Law": the same distributional property that makes a model accurate on common phrasings also pulls users toward those phrasings, because distinct prompts get quietly flattened at comprehension time. Distinctiveness gets filtered out before generation even starts. So homogenization isn't something the model does to its answer — it's something that happens to your question on the way in.

But the input channel is only the first squeeze. Does RL training collapse format diversity in pretrained models? shows the training stage doing the same thing from the other end: RL post-training amplifies one dominant format from pretraining within the first epoch and collapses the alternatives, and which format wins depends on model scale, not necessarily on which performs better. Meanwhile, Why aren't bigger models better for generating diverse outputs? points at the sampling stage: larger models concentrate probability mass on their preferred outputs, so within a fixed sampling budget a bigger model yields fewer distinct samples. Input flattening, training collapse, and probability-mass concentration are three separate convergence engines that all happen to point in the same direction. That's why Does AI homogenize culture the way mass media did? can observe independent, nominally competing LLMs landing on similar outputs, and argue that this homogeneity is harder to see than that of old mass media, because personalized framing disguises the sameness from any single user.
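The sampling-stage claim can be made concrete with a toy calculation. The sketch below is an illustration under stated assumptions, not a result from the source notes: the 50-item "output space", the linearly spaced scores, and the `sharpness` parameter are all hypothetical stand-ins for how strongly a model concentrates probability mass. It computes the expected number of distinct outputs seen in a fixed budget of independent draws.

```python
import math


def softmax(logits, sharpness=1.0):
    """Turn scores into probabilities; higher `sharpness` concentrates mass on top scores."""
    scaled = [sharpness * x for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def expected_distinct(probs, budget):
    """Expected number of distinct outcomes in `budget` independent draws.

    Outcome i appears at least once with probability 1 - (1 - p_i)^budget,
    and expectations add across outcomes.
    """
    return sum(1.0 - (1.0 - p) ** budget for p in probs)


# Hypothetical toy output space: 50 candidate outputs with linearly spaced scores.
logits = [i / 10 for i in range(50)]
BUDGET = 20  # fixed number of samples drawn from each "model"

diffuse = softmax(logits, sharpness=1.0)       # stand-in for a less concentrated model
concentrated = softmax(logits, sharpness=5.0)  # stand-in for a more concentrated model

diffuse_distinct = expected_distinct(diffuse, BUDGET)
concentrated_distinct = expected_distinct(concentrated, BUDGET)

print(f"diffuse:      {diffuse_distinct:.2f} distinct outputs expected")
print(f"concentrated: {concentrated_distinct:.2f} distinct outputs expected")
```

Under these assumptions the concentrated distribution yields fewer distinct outputs for the same budget, which is the mechanism the note describes. Nothing here measures real models; it only shows why concentration and per-budget diversity trade off.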

Here's the part you might not expect: convergence at the output may not mean convergence underneath. Can identical outputs hide broken internal representations? finds that networks trained with SGD can reproduce the same outputs as evolved networks while having radically different, fractured internal structure, so "the answers look the same" is weak evidence that "the models are the same." And Does setting temperature to zero actually make LLM outputs reliable? adds that even a model repeating the exact same output isn't converging on truth; it's just replaying one draw from its distribution. Sameness and correctness are not the same thing.
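The "replaying one draw" point can be sketched in a few lines. This is a toy illustration, not the method of the source note: the three answers, their weights, and the `generate` helper are hypothetical, and Python's seeded random number generator stands in for a model sampled with a fixed seed.

```python
import random

# Hypothetical answer distribution for a single prompt (illustrative values only).
ANSWERS = ["A", "B", "C"]
WEIGHTS = [0.4, 0.35, 0.25]


def generate(seed):
    """Sample one answer from the toy distribution using a fixed seed."""
    rng = random.Random(seed)
    return rng.choices(ANSWERS, weights=WEIGHTS, k=1)[0]


# Re-running with the same seed replays the same draw every time.
replayed = [generate(seed=0) for _ in range(100)]

# Varying the seed exposes the spread that the fixed seed was hiding.
across_seeds = [generate(seed=s) for s in range(100)]

print("distinct answers, same seed:     ", len(set(replayed)))
print("distinct answers, different seeds:", len(set(across_seeds)))
```

The repeated output is perfectly consistent, yet it says nothing about how much probability the distribution places on that answer, which is the gap between consistency and reliability that the note points to.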

The word doing the most work in your question is "inevitable", and the corpus quietly argues against it. The finding that models around 500M parameters outperform larger ones on diversity per sample shows that diversity is recoverable by choosing a different model; Can models reliably improve themselves without external feedback? shows that the methods that escape diversity collapse all do it the same way: by smuggling in an external anchor (a past model version, a third-party judge, a user correction, a tool result). Convergence is what you get by default when every stage optimizes for the high-frequency center and nothing external pushes back. It looks inevitable only because the counter-pressure has to be added on purpose.


Sources (7 notes)

Does high-frequency text homogenize user input before generation?

Adam's Law shows LLMs flatten distinct prompts at comprehension time as users rephrase toward higher-frequency forms the model handles best. The same distributional property that creates accuracy on common tasks filters out distinctiveness on the input side.

Does RL training collapse format diversity in pretrained models?

Controlled experiments show RL consistently amplifies one format distribution from pretraining within the first epoch while collapsing alternatives. The winning format depends on model scale, not necessarily performance, and is largely hidden when starting from proprietary pretrained models.

Why aren't bigger models better for generating diverse outputs?

Research shows that for synthetic data generation, models around 500M parameters outperform larger ones in output diversity per sample. Larger models concentrate probability mass on preferred outputs, reducing the variety of distinct samples generated within a fixed budget.

Does AI homogenize culture the way mass media did?

AI mass-generates similar flows disguised as personalized outputs, suppressing novelty more deeply than pre-stamped commodities because contextual customization makes homogeneity invisible to individual users. Evidence: independent LLMs converge on similar outputs despite nominal competition.

Can identical outputs hide broken internal representations?

Networks trained with SGD reproduce outputs perfectly while having radically different internal structure than evolved networks, with weight perturbations revealing fractured, entangled representations that prevent transfer to novel contexts or creative recombination.

Does setting temperature to zero actually make LLM outputs reliable?

Fixed seeds and zero temperature replicate the same output repeatedly, but that output remains one draw from the model's probability distribution. McDonald's omega testing across 100 repetitions reveals that consistency does not equal reliability.

Can models reliably improve themselves without external feedback?

Pure self-improvement stalls due to the generation-verification gap, diversity collapse, and reward hacking. Reliable improvement methods succeed by smuggling in external anchors: past model versions, third-party judges, user corrections, or tool feedback.
