Psychology and Social Cognition

Does model capability translate to better persona consistency?

As language models become more advanced, do they naturally become better at maintaining consistent personas across conversations? PersonaGym testing across multiple models and thousands of interactions explores whether scaling helps with persona adherence.

Note · 2026-02-22 · sourced from Personas Personality

The PersonaGym evaluation framework tests 6 open- and closed-source LLMs on persona adherence across 200 personas and 10,000 questions. The finding: Claude 3.5 Sonnet achieves only a 2.97% relative improvement in PersonaScore over GPT 3.5, despite being a far more advanced model on most other measures.
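To make "relative improvement" concrete, here is a minimal sketch of the arithmetic. The function name and the two scores are hypothetical placeholders chosen for illustration; they are not PersonaScore values quoted from the source.

```python
def relative_improvement(new_score: float, baseline_score: float) -> float:
    """Return the relative improvement of new_score over baseline_score, in percent."""
    if baseline_score == 0:
        raise ValueError("baseline_score must be non-zero")
    return (new_score - baseline_score) / baseline_score * 100.0


# Hypothetical placeholder scores (NOT taken from the PersonaGym results).
baseline = 4.00  # assumed score of the older model
newer = 4.12     # assumed score of the newer model

gain = relative_improvement(newer, baseline)
print(f"{gain:.2f}% relative improvement")
```

A relative improvement of a few percent means the newer model's score sits only slightly above the baseline's, which is the pattern the 2.97% figure describes.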

This suggests persona consistency is largely orthogonal to general capability, and that standard training does not improve it. Models get better at reasoning, coding, instruction-following, and knowledge retrieval as they scale, but they do not get meaningfully better at maintaining a consistent persona across varied interactions.

The explanation likely connects to how models are trained. Standard training objectives (next-token prediction, RLHF for helpfulness) optimize for response quality on a per-turn basis. Persona consistency requires cross-turn coherence — remembering what you said earlier, maintaining behavioral patterns, avoiding contradiction with your established character. These are different optimization targets that standard training doesn't address.
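The gap between per-turn quality and cross-turn coherence can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the transcript format, the quality numbers, and both scoring functions are hypothetical, and this is not how PersonaGym computes PersonaScore.

```python
from statistics import mean

# Hypothetical transcript: each turn carries an assumed per-turn quality score
# and the persona facts the model asserted in that turn.
turns = [
    {"quality": 0.90, "claims": {"occupation": "nurse", "city": "Lisbon"}},
    {"quality": 0.95, "claims": {"hobby": "sailing"}},
    {"quality": 0.90, "claims": {"occupation": "teacher"}},  # contradicts turn 0
]


def per_turn_quality(turns):
    """Average quality, judging each turn in isolation."""
    return mean(t["quality"] for t in turns)


def cross_turn_consistency(turns):
    """Fraction of claims that do not contradict an earlier claim about the same attribute."""
    established = {}
    total = 0
    consistent = 0
    for turn in turns:
        for attribute, value in turn["claims"].items():
            total += 1
            if attribute not in established:
                established[attribute] = value  # first mention sets the persona fact
                consistent += 1
            elif established[attribute] == value:
                consistent += 1
    return consistent / total if total else 1.0


print(f"per-turn quality:       {per_turn_quality(turns):.2f}")
print(f"cross-turn consistency: {cross_turn_consistency(turns):.2f}")
```

In this toy transcript every turn scores well on its own, yet one of the four claims contradicts an earlier one. An objective that only looks at individual turns would not see that contradiction.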

As the linked note "Can open language models adopt different personalities through prompting?" discusses, the problem compounds: models resist persona change, and their baseline persona adherence does not improve with scale. More capability means neither more flexibility nor more consistency.

This finding challenges the assumption that "better models will naturally solve persona problems." Dedicated persona training appears necessary, whether through the approaches examined in "Why does supervised learning fail to enforce persona consistency?" or through other methods.



Original note title

persona adherence does not scale with general model capability — advanced models show minimal improvement over basic models