Psychology and Social Cognition

Can we control personality in language models without prompting?

Can lightweight adapter modules enable continuous, fine-grained control over psychological traits in transformer outputs, independent of prompt engineering? This note explores whether architecture-level personality modification outperforms prompt-based approaches.

Note · 2026-02-23 · sourced from Psychology Therapy Practice

PsychAdapter modifies the transformer architecture to accept continuous psychological trait scores as input, enabling generation conditioned on personality, mental health, and demographic variables without consuming context window or relying on prompt engineering. The key difference from prior work: trait influence is applied at every transformer layer via a learned dimension expansion, not just at the input level.
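The per-layer mechanism can be sketched as follows. This is a minimal toy illustration, not the paper's actual implementation: the module names, dimensions, and initialization are assumptions. The core idea it shows is that a small learned matrix expands the trait vector into the hidden dimension at every layer, so conditioning costs no prompt tokens.

```python
import numpy as np

# Toy sketch of PsychAdapter-style trait injection (all names and sizes
# are assumptions for illustration). A vector of Big Five z-scores is
# projected into the hidden dimension by a small learned matrix at
# EVERY layer and added to that layer's hidden states.

HIDDEN_DIM = 64      # toy size; real models use thousands
NUM_LAYERS = 4
NUM_TRAITS = 5       # Big Five: O, C, E, A, N

rng = np.random.default_rng(0)

# One learned expansion matrix per layer: these are the adapter's only
# new parameters, which is why the overhead stays tiny.
trait_projections = [rng.normal(0.0, 0.02, (NUM_TRAITS, HIDDEN_DIM))
                     for _ in range(NUM_LAYERS)]

def apply_layer(hidden, layer_idx, traits):
    """Stand-in for one transformer layer: add the projected trait
    vector to the residual stream (attention/MLP omitted)."""
    trait_bias = traits @ trait_projections[layer_idx]   # (HIDDEN_DIM,)
    return hidden + trait_bias                           # broadcast over tokens

# Condition on high extraversion: (O, C, E, A, N) = (0, 0, +3, 0, 0).
traits = np.array([0.0, 0.0, 3.0, 0.0, 0.0])
hidden = rng.normal(size=(10, HIDDEN_DIM))               # 10 token positions
for layer in range(NUM_LAYERS):
    hidden = apply_layer(hidden, layer, traits)

# Added parameters scale only with layers * traits * hidden_dim.
added_params = NUM_LAYERS * NUM_TRAITS * HIDDEN_DIM
```

Because the trait bias enters the residual stream at every layer rather than only at the input, its influence is not washed out by depth, which is the stated difference from input-level conditioning.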

Training uses social media and blog posts with psychological scores estimated by an empirically trained language-based assessment model. The adapter learns how to weight the psychological scores' contribution to each layer alongside the standard next-word prediction objective. The result: fine-grained, continuous control over personality expression. An input vector of (0, 0, +3, 0, 0) generates text characteristic of high extraversion while remaining average on the other Big Five dimensions. Any combination is possible, including interactions: high openness with low extraversion produces text that captures both traits simultaneously.
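Building these continuous trait vectors is straightforward. The helper below is hypothetical (not from the paper); it just makes the (O, C, E, A, N) ordering and the "unspecified traits default to average" convention explicit, including the interaction case.

```python
# Hypothetical helper (not the paper's API) for building Big Five trait
# vectors in (O, C, E, A, N) order; values are z-scores, 0 = average.
BIG_FIVE = ("openness", "conscientiousness", "extraversion",
            "agreeableness", "neuroticism")

def trait_vector(**levels):
    """Return a 5-dim list of z-scores; unspecified traits default to 0."""
    unknown = set(levels) - set(BIG_FIVE)
    if unknown:
        raise ValueError(f"unknown traits: {sorted(unknown)}")
    return [float(levels.get(t, 0.0)) for t in BIG_FIVE]

high_extraversion = trait_vector(extraversion=3)   # the (0, 0, +3, 0, 0) example
# Interactions: a single vector can encode several traits at once.
open_introvert = trait_vector(openness=2, extraversion=-2)
```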

Expert raters identified the intended trait in generated text with 87.3% average accuracy for the Big Five and 96.7% for depression and life satisfaction. These numbers hold across GPT-2, Gemma (2B), and Llama 3, demonstrating model-agnostic applicability. The total added parameters are less than 0.1% of the base model (55,296 for Gemma 2B vs 2 billion base parameters), making distribution trivial.
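The overhead figure is easy to verify; a quick check using the numbers quoted above:

```python
# Parameter overhead for the Gemma 2B figure quoted in the text.
added = 55_296
base = 2_000_000_000
overhead_pct = added / base * 100
# Roughly 0.0028%, comfortably under the stated 0.1% bound.
```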

Applications extend beyond mental health simulation: customer service training with diverse personalities, crisis worker training with simulated distress levels, machine translation matched to audience education or dialect levels, and research tools that generate coherent text (not isolated words or phrases) for trait analysis. In light of the related note "Do personality traits activate hidden emoji patterns in language models?", PsychAdapter may be activating these pre-existing trait-language circuits through a more precise mechanism than prompting or full fine-tuning.




lightweight psychological trait adapters modify every transformer layer with less than 0.1 percent additional parameters — enabling fine-grained psychological profile control independent of prompting