Can persona prompts reliably transfer across different question domains?
This explores whether a persona written into a prompt ("you are an expert physicist…") holds up when you point it at a new kind of question. The corpus is surprisingly blunt: prompt-assigned personas are fragile, and the reasons why point to something deeper than prompt wording. The interesting exception is personas that are *grounded* or *trained* rather than just *prompted*.
Start with the most direct evidence. Assigning expert personas does *not* reliably improve factual accuracy: across six models on graduate-level science and engineering questions, in-domain experts had no significant effect, domain-mismatched experts gave only marginal gains, and low-knowledge personas actively hurt [Do expert personas actually improve LLM factual accuracy?]. So the "transfer" you might hope for (drop in a new domain, keep the persona) barely registers even *within* the right domain. Worse, the same persona prompt run repeatedly produces output variance that matches or exceeds the variance *between different personas* [Why do LLM persona prompts produce inconsistent outputs across runs?]. If a persona isn't stable across reruns of the same question, expecting it to hold across domains is asking a lot. Prompt techniques in general don't transfer cleanly across models either: rephrasing and background-knowledge prompts boost cheap models, while step-by-step reasoning can *reduce* accuracy on strong ones [Do prompt techniques work the same across all LLM tiers?]. Reliability is the exception, not the default.
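The rerun-variance finding above can be checked on your own outputs with a simple comparison. The following is a minimal sketch, not the method used in the cited note: it assumes you have already scored each run numerically, and the function name `variance_split` and all scores are hypothetical, invented for illustration.

```python
from statistics import mean, pvariance


def variance_split(scores_by_persona):
    """Return (within, between) variance for repeated runs grouped by persona."""
    # Within: average of each persona's run-to-run variance on the same prompt.
    within = mean(pvariance(runs) for runs in scores_by_persona.values())
    # Between: variance of the per-persona mean scores.
    between = pvariance([mean(runs) for runs in scores_by_persona.values()])
    return within, between


# Hypothetical scores (not from any study): four reruns per persona.
scores = {
    "physicist": [0.62, 0.48, 0.71, 0.55],
    "historian": [0.58, 0.66, 0.47, 0.60],
    "layperson": [0.51, 0.63, 0.44, 0.57],
}

within, between = variance_split(scores)
# If reruns vary at least as much as personas do, the persona label
# explains little of the output.
persona_is_noise = within >= between
print(f"within={within:.5f} between={between:.5f} noise={persona_is_noise}")
```

If `within` is comparable to or larger than `between`, switching personas changes the output no more than simply rerunning the same prompt does, which is the pattern the note describes.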
The more useful lateral move is to ask *why* prompted personas are flaky, and here the corpus splits the concept in two. One line of work argues that personas installed by post-training are "realized" dispositions that persist under adversarial pressure and across conversations, in sharp contrast to prompt-induced role-play, which collapses under jailbreaks [Are RLHF personas performed characters or realized dispositions?] [Are LLM personas realized or merely simulated through training?]. Read against your question, that's the key insight: a prompt persona is a thin costume that the model's own uncertainty shows through, whereas a trained persona is closer to a stable trait. Transfer failure isn't a prompting bug. It happens because you're steering a surface layer.
So what *does* transfer? Grounding and structure. MAJ-EVAL extracts personas from real domain documents rather than inventing roles, and those document-grounded personas generalize across tasks like summarization and dialogue without manual redesign [Can personas extracted from documents generalize across evaluation tasks?]. PersonaAgent treats the persona as an evolving bridge between memory and action, tuning it at test time so it tracks the actual user instead of a frozen description [Can personas evolve in real time to match what users actually want?]. And on the simulation side, AI personas replicated 76% of published experimental effects (84 of 111), but success was tightly correlated with how strong the original effect was, meaning they transfer where the signal is robust and fail at the margins [Can AI personas reliably replicate human experiment results?]. The pattern across all three: personas hold when something anchors them beyond the prompt string.
One more wrinkle worth knowing: even when a persona *does* hold, holding it can cost you. Persona consistency trades off against discourse coherence. High adherence scores often come from the model parroting its character description while ignoring the actual query, so you have to optimize fidelity and relevance *together* [Do persona consistency metrics actually measure dialogue quality?]. So the real answer to "can persona prompts reliably transfer across domains" is: a bare prompt persona, no. But if you ground it in source material, let it adapt at test time, or train it in, you can buy reliability, at the price of carefully balancing it against staying on-topic.
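To make the fidelity-versus-relevance point concrete, here is a minimal sketch of a combined score. It is an illustrative assumption, not MUDI's method (which, per the source note, uses discourse relations and graph-based coherence modeling): the function `joint_score`, the choice of a harmonic mean, and the example numbers are all hypothetical.

```python
def joint_score(fidelity, relevance):
    """Harmonic mean of two scores in [0, 1]; zero if either score is zero."""
    if fidelity <= 0 or relevance <= 0:
        return 0.0
    return 2 * fidelity * relevance / (fidelity + relevance)


# Hypothetical response that parrots its character sheet and ignores the query.
parroting = joint_score(fidelity=0.95, relevance=0.10)
# Hypothetical response that is moderately in character and on-topic.
balanced = joint_score(fidelity=0.70, relevance=0.70)

print(f"parroting={parroting:.3f} balanced={balanced:.3f}")
```

A harmonic mean is used here only because it penalizes a response that scores high on one axis and low on the other. A metric that looks at persona adherence alone would rank the parroting response first.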
Sources (9 notes)
Do expert personas actually improve LLM factual accuracy? Testing six models on graduate-level science and engineering questions showed in-domain expert personas had no significant impact, domain-mismatched experts produced only marginal gains, and low-knowledge personas actively hurt performance. The widely recommended role-assignment strategy lacks a reliable accuracy benefit.
Why do LLM persona prompts produce inconsistent outputs across runs? When the same persona prompt is run repeatedly, output variance across runs matches or exceeds variance across different personas. This reveals that model uncertainty, not stable social knowledge, drives persona-simulated outputs, making them unsuitable for simulating human annotation disagreement.
Do prompt techniques work the same across all LLM tiers? A 23-prompt benchmark across 12 LLMs shows rephrasing and background-knowledge prompts boost cheap models, while step-by-step reasoning reduces accuracy in high-performance models. Task structure, not generic best practices, determines which prompts help.
Are RLHF personas performed characters or realized dispositions? Post-training installs stable dispositional profiles that persist under adversarial pressure, marking them as realized rather than performed. The stickiness of trained personas across conversations distinguishes them from prompt-induced role-play that collapses under jailbreaks.
Are LLM personas realized or merely simulated through training? Post-training installs robust personas that resist adversarial pressure and persist as substrate-level dispositions, distinguishing realization from pretense. This quasi-realizationist account preserves explanatory power while treating LLMs as possessing genuine quasi-beliefs and quasi-desires.
Can personas extracted from documents generalize across evaluation tasks? MAJ-EVAL automatically extracts stakeholder personas from domain documents via semantic clustering and orchestrates structured three-phase debate, achieving reproducible evaluation that transfers across tasks like summarization and dialogue without manual redesign. The approach grounds personas in real stakeholder perspectives rather than arbitrary roles.
Can personas evolve in real time to match what users actually want? PersonaAgent uses structured personas to bridge episodic/semantic memory and personalized actions, optimizing them at test time by simulating recent interactions against textual feedback. Learned personas cluster meaningfully in latent space, suggesting genuine user-specific separation beyond standard post-training drift.
Can AI personas reliably replicate human experiment results? Viewpoints AI reproduced 84 of 111 main effects from Journal of Marketing experiments, with replication success strongly correlated with original p-value strength. Marginal effects showed unreliable performance, with both false positives and false negatives.
Do persona consistency metrics actually measure dialogue quality? High persona adherence scores often come from copying character descriptions while ignoring query relevance. MUDI jointly optimizes both by using discourse relations and graph-based coherence modeling alongside persona fidelity, showing that persona and context must be optimized together, not separately.