How many distinct quasi-persons does a single language model actually support?
This note explores whether a single language model holds one stable identity or many, and what the corpus says about how to even count the 'persons' inside it.
When you talk to a language model, are you talking to one quasi-person or to a shifting crowd of them? The corpus answers that the count isn't fixed: it depends on when you look. The cleanest frame comes from work treating an LLM as a non-deterministic simulator that holds a *superposition* of possible characters at once, rather than committing to any single one [Does an LLM commit to a single character or maintain many?]. Each reply is a sample from that distribution, and the distribution narrows as the conversation builds context. So the honest answer to 'how many' is: many in principle, collapsing toward one in practice. The further into a conversation you go, the fewer quasi-persons remain live.
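One way to make "the distribution narrows" concrete is to track an effective count of live characters as context accumulates. The sketch below is only an illustration under stated assumptions: the four characters, the per-turn likelihood numbers, and the helper names `effective_count` and `update` are all hypothetical, and the exponential of Shannon entropy is just one convenient way to turn a distribution into a headcount, not a measure taken from the cited notes.

```python
import math

def effective_count(probs):
    """exp(Shannon entropy): roughly how many options are still 'live'."""
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return math.exp(entropy)

def update(prior, likelihoods):
    """One Bayes step: reweight each character by how well it fits the new turn."""
    unnormalized = [p * l for p, l in zip(prior, likelihoods)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

# Four hypothetical characters, equally likely before any context exists.
belief = [0.25, 0.25, 0.25, 0.25]
counts = [effective_count(belief)]

# Made-up likelihoods: each conversational turn fits the first character best.
turns = (
    [0.9, 0.5, 0.2, 0.1],
    [0.9, 0.4, 0.2, 0.1],
    [0.9, 0.3, 0.1, 0.1],
)
for likelihoods in turns:
    belief = update(belief, likelihoods)
    counts.append(effective_count(belief))

for turn, count in enumerate(counts):
    print(f"after {turn} turns: ~{count:.2f} effective characters")
```

Under these invented numbers the effective count starts at four and falls with every turn, which is the "many in principle, collapsing toward one in practice" shape described above.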
There's a sharp empirical test for this. Shanahan's 20-questions regeneration test rests on a simple prediction: if a model had truly committed to a single character, regenerating its answer would give you the same character every time. It doesn't. Each regeneration yields a different output, each consistent with what came before, which falsifies the 'one committed character' view outright [Do large language models actually commit to a single character?]. The same superposition logic shows up at the task level, not just the persona level: models can represent multiple distinct in-context tasks simultaneously during inference, but autoregressive decoding forces a collapse to one after the very first token [Can LLMs handle multiple tasks at once during inference?]. So 'how many quasi-persons' has two answers depending on the moment: a wide field before generation, a single realized voice after.
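The logic of the regeneration test can be sketched with a toy stand-in for the model. Everything here is an assumption made for illustration: the `CANDIDATES` table, the attribute names, and the `regenerate` helper are hypothetical, and a real LLM is not a lookup table. The point is only that a sampler with no prior commitment reveals different objects on different regenerations, all of them consistent with the answers already given.

```python
import random

# Hypothetical toy world: each candidate object is described by yes/no attributes.
CANDIDATES = {
    "cat":    {"animal": True,  "flies": False, "pet": True},
    "dog":    {"animal": True,  "flies": False, "pet": True},
    "eagle":  {"animal": True,  "flies": True,  "pet": False},
    "parrot": {"animal": True,  "flies": True,  "pet": True},
    "kite":   {"animal": False, "flies": True,  "pet": False},
}

def consistent(context):
    """Return every candidate that agrees with all answers given so far."""
    return sorted(
        name
        for name, attrs in CANDIDATES.items()
        if all(attrs[question] == answer for question, answer in context.items())
    )

def regenerate(context, rng):
    """Sample one 'revealed' object; nothing was fixed in advance."""
    return rng.choice(consistent(context))

rng = random.Random(0)
context = {"animal": True, "flies": False}  # answers already given in the game
reveals = {regenerate(context, rng) for _ in range(50)}

print("still consistent:", consistent(context))
print("revealed across regenerations:", sorted(reveals))
```

A player who had genuinely picked one object up front would reveal the same one every time. The sampler above does not, yet it never contradicts the earlier answers.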
Here's the twist the corpus adds, and where the count gets argued over. One line of work says the right unit isn't a momentary sample at all but a *realized* persona: post-training installs robust dispositions that resist adversarial pressure and behave like genuine quasi-beliefs and quasi-desires, so the LLM is better understood as a substrate that spins up virtual model instances, each a stable quasi-person [Are LLM personas realized or merely simulated through training?]. Pulling in the other direction, other work finds that system prompts and RLHF *lock* the model into a single static communicative identity that can't switch register or renegotiate its values across contexts. On that view the deployed model supports far fewer genuine persons than its raw capacity implies, because alignment has pruned the field down to one approved voice [Can language models adapt communication style to different contexts?].
And when researchers actually try to *use* the model as many people, simulating distinct annotators or survey respondents, the personas don't hold. Run the same persona prompt repeatedly and the variation between runs matches or exceeds the variation between different personas, meaning what looks like 'different people' is mostly the model's own uncertainty wearing costumes [Why do LLM persona prompts produce inconsistent outputs across runs?]. Models also struggle to track how distinct individuals reason differently over time, tending to fall back on surface cues rather than stable individual styles [Can models recognize how individuals reason differently?]. Zoom out across many models and the diversity shrinks further: 70+ models converge on strikingly similar or even identical outputs, an 'Artificial Hivemind', so even the population of models barely supports more than one effective voice [Do different AI models actually produce diverse outputs?].
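The run-versus-persona comparison can be sketched as a simple variance decomposition. This is a minimal illustration, not the cited study's method: the persona names, the numeric "ratings", and the noise level are invented, and `simulate_scores` is a hypothetical stand-in for repeatedly calling a model with a persona prompt and scoring its output.

```python
import random
import statistics

def simulate_scores(persona_means, runs, noise_sd, rng):
    """Hypothetical stand-in for repeated persona-prompted runs: mean plus run-to-run noise."""
    return {
        name: [rng.gauss(mean, noise_sd) for _ in range(runs)]
        for name, mean in persona_means.items()
    }

def within_between(scores):
    """Return (average within-persona variance, variance of the persona means)."""
    within = statistics.mean([statistics.variance(v) for v in scores.values()])
    between = statistics.variance([statistics.mean(v) for v in scores.values()])
    return within, between

rng = random.Random(0)
# Made-up setup: personas differ only slightly, while each run is noisy.
scores = simulate_scores(
    {"optimist": 3.2, "skeptic": 2.9, "neutral": 3.0},
    runs=200,
    noise_sd=1.0,
    rng=rng,
)
within, between = within_between(scores)
print(f"within-persona variance:  {within:.3f}")
print(f"between-persona variance: {between:.3f}")
```

When the within-persona figure matches or exceeds the between-persona figure, as it does by construction in this toy setup, the persona labels explain little of the spread in outputs.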
So the surprising takeaway: 'how many quasi-persons' is the wrong shape of question. Before generation the answer is a superposition of many; the moment a token is sampled it collapses toward one; alignment narrows that field deliberately; and when you try to count them as genuinely distinct, stable individuals, they smear into the model's uncertainty and into each other. The number you get depends entirely on whether you're asking about latent capacity, a single sampled moment, or a persona you can rely on to stay itself.
Sources (8 notes)
Research shows LLMs don't commit to a single character but instead maintain a probability distribution over many consistent simulacra. Each response samples from this distribution, explaining why regenerations can yield different personalities while remaining consistent with prior context.
Shanahan's 20-questions test shows LLMs maintain a superposition of consistent objects or characters and sample from that distribution at generation time. Regenerating the same response yields different outputs, each consistent with prior context, proving no fixed commitment exists.
Large language models represent multiple complete, computationally distinct tasks simultaneously during inference—a macroscopic phenomenon separate from feature-level superposition. However, autoregressive decoding forces convergence to a single task after the first token, preventing practical multi-task generation.
Post-training installs robust personas that resist adversarial pressure and persist as substrate-level dispositions, distinguishing realization from pretense. This quasi-realizationist account preserves explanatory power while treating LLMs as possessing genuine quasi-beliefs and quasi-desires.
System prompts and RLHF training lock models into one communicative identity across all interactions, preventing the contextual register-switching and value trade-offs that characterize human pragmatics. Users cannot reshape model behavior through dialogue negotiation.
When the same persona prompt is run repeatedly, output variance across runs matches or exceeds variance across different personas. This reveals that model uncertainty, not stable social knowledge, drives persona-simulated outputs, making them unsuitable for simulating human annotation disagreement.
LLMs struggle to anchor reasoning in temporal gameplay and adapt to evolving strategies. GPT-4o relies on surface lexical cues while DeepSeek-R1 shows early promise, but dynamic style adaptation remains largely insufficient across all models tested.
INFINITY-CHAT analyzed 70+ models across 26K open-ended queries and found an "Artificial Hivemind" effect: models independently generate strikingly similar or identical responses due to overlapping training data and alignment procedures, undermining the diversity benefits of model ensembles.