INQUIRING LINE

How does AI persona fidelity compare to interview-based generative agents?

This compares two ways of building AI stand-ins for real people: generic 'persona' prompts that describe a type of person, versus generative agents built from actual interview transcripts of specific individuals — and which one more faithfully predicts how real humans behave.


This explores the gap between two strategies for making AI act like a person: handing a model a persona description versus grounding an agent in a real individual's recorded interview. The corpus suggests these aren't the same task at all, and they fail in different places. The headline comparison: agents built from voice interviews of 1,052 real people replicated those participants' own survey responses about 85% as accurately as the participants reproduced their own answers two weeks later (Can AI agents learn people better from interviews than surveys?). What's striking is *why* it worked: the factual content of what someone said drove the fidelity, not their speaking style, and even compressed bullet-point summaries retained 83% fidelity. Identity here is information, not voice.
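
The 85% figure is easiest to read as a normalized accuracy: how often the agent matches a participant's answers, divided by how often that participant matches their own answers on retest. The sketch below assumes that reading. The function names and the three answer lists are hypothetical illustrations, not data or code from the study.

```python
def agreement(a, b):
    """Fraction of positions where two equal-length answer lists match."""
    if len(a) != len(b) or not a:
        raise ValueError("answer lists must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def normalized_accuracy(agent, wave1, wave2):
    """Agent agreement with wave-1 answers, scaled by the participant's
    own wave-1 vs wave-2 agreement (their retest consistency)."""
    self_consistency = agreement(wave1, wave2)
    if self_consistency == 0:
        raise ValueError("participant never matched their own answers")
    return agreement(agent, wave1) / self_consistency


# Hypothetical answers to ten multiple-choice items (illustrative only).
wave1 = ["a", "b", "c", "d", "a", "b", "c", "d", "a", "b"]  # first survey
wave2 = ["a", "b", "c", "d", "a", "b", "c", "d", "b", "a"]  # retest: 8 of 10 match
agent = ["a", "b", "c", "d", "a", "b", "c", "a", "b", "a"]  # agent: 7 of 10 match

print(f"{normalized_accuracy(agent, wave1, wave2):.3f}")  # 0.875
```

Under this reading, a score of 1.0 would mean the agent predicts a person as well as that person predicts themselves, which is why a raw match rate well below 100% can still count as high fidelity.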

Generic persona simulation lands in a similar-but-shakier neighborhood. When AI personas were used to re-run published marketing experiments, they reproduced about 76% of the main effects (84 of 111). Tellingly, success tracked the *statistical strength* of the original finding: strong effects replicated, while marginal ones were whiffed in both directions, with false positives and false negatives (Can AI personas reliably replicate human experiment results?). So both approaches capture robust, evidence-backed patterns and both get unreliable at the edges. The difference is grounding: interview agents are anchored to one real person's stated facts, while prompted personas are anchored to a description that the model fills in from its training priors.
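
As a quick check on the 76% figure, the source note for this study reports 84 of 111 main effects reproduced. A minimal sketch of that arithmetic (the variable names are illustrative):

```python
from fractions import Fraction

# Counts taken from the source note: 84 of 111 main effects reproduced.
replicated, total = 84, 111

rate = Fraction(replicated, total)  # exact ratio, reduces to 28/37
print(f"{float(rate):.1%}")         # 75.7%
```

Note that this is a share of effects reproduced, not a per-person accuracy, so it is not measured on the same scale as the interview-agent figure.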

And those priors are where prompted personas quietly collapse. One line of work found that LLMs told to 'be' a persona systematically default to the same personality type (ENFJ, ironically the rarest in humans) and resist being moved off it, and that newer, more capable models do not fix this. The upshot is that a model that can simulate anyone ends up being no one (Why do AI personas default to the same personality type?). There's even a structural reason: post-training tethers models to a dominant 'Assistant' axis, a low-dimensional direction that pulls the model back toward its default helpful character, so any assigned persona has to work against that pull (How stable is the trained Assistant personality in language models?). Interview grounding sidesteps this because it supplies the specific, idiosyncratic facts that override the generic default, which is exactly why factual content, not style, carried the fidelity.

The corpus also points to a hybrid escape route. Rather than choosing between thin prompts and full interviews, PersonaAgent treats the persona as an evolving intermediary between memory and action, optimizing it at test time by simulating recent interactions against feedback. The learned personas separate into genuinely user-specific clusters rather than collapsing to a default (Can personas evolve in real time to match what users actually want?). Others attack the consistency problem directly: inverting the usual RL setup to train user simulators for consistency cut persona drift by over 55% (Can training user simulators reduce persona drift in dialogue?). There's also a deeper claim that post-training *realizes* stable dispositions rather than having the model merely perform them (Are RLHF personas performed characters or realized dispositions?).

The thing you didn't know you wanted to know: fidelity to a real person turns out to be a memory-and-grounding problem far more than an acting problem. The interview agents showed you can summarize a human down to bullet points and still predict them. What makes an AI faithfully 'you' is the specific facts it's anchored to, not how well it mimics your voice.


Sources 7 notes

Can AI agents learn people better from interviews than surveys?

A 1,052-person study found agents built from voice interviews replicated participant responses nearly as well as people replicate their own answers. Factual content, not linguistic style, drove this accuracy—even summary bullet points retained 83% fidelity.

Can AI personas reliably replicate human experiment results?

Viewpoints AI reproduced 84 of 111 main effects from Journal of Marketing experiments, with replication success strongly correlated with the strength of the original p-value. Marginal effects showed unreliable performance, with both false positives and false negatives.

Why do AI personas default to the same personality type?

Research shows language models assigned personas systematically default to ENFJ (the rarest human type) and exhibit motivated reasoning that persists across model generations. Persona consistency does not improve with advanced models, suggesting training-induced alignment rather than capability limits.

How stable is the trained Assistant personality in language models?

Research mapping hundreds of character archetypes reveals a low-dimensional persona space where the leading component measures distance from the default Assistant. Emotional and meta-reflective conversations cause predictable drift, but activation capping along this axis mitigates harmful shifts without degrading capabilities.

Can personas evolve in real time to match what users actually want?

PersonaAgent uses structured personas to bridge episodic/semantic memory and personalized actions, optimizing them at test time by simulating recent interactions against textual feedback. Learned personas cluster meaningfully in latent space, suggesting genuine user-specific separation beyond standard post-training drift.

Can training user simulators reduce persona drift in dialogue?

By inverting standard RL setups to train user simulators for consistency using three complementary metrics (prompt-to-line, line-to-line, Q&A consistency) as reward signals, persona drift decreases by over 55%. This approach captures distinct failure types: local drift within turns, global drift across conversations, and factual contradictions.

Are RLHF personas performed characters or realized dispositions?

Post-training installs stable dispositional profiles that persist under adversarial pressure, marking them as realized rather than performed. The stickiness of trained personas across conversations distinguishes them from prompt-induced role-play that collapses under jailbreaks.

Next inquiring lines