Can language models implement therapeutic skills like Socratic questioning in real conversations?

This explores whether LLMs can actually *do* therapeutic technique—like Socratic questioning—in live, multi-turn conversation, as opposed to producing a convincing one-off imitation of it.

The corpus draws a sharp line right through the middle of your question: models can simulate therapy skills but stumble when those skills have to be implemented therapeutically, turn after turn. The clearest statement is that LLMs can generate isolated therapy tasks yet fail at sustained Socratic questioning, because real Socratic work means tracking a patient's state, calibrating how hard to push, and adapting when the person resists [Can LLMs actually conduct Socratic questioning in therapy?]. The gap isn't knowing what good therapy looks like; it's executing it live.

Why the gap? Several notes point at the same culprit from different angles: training. Models score *higher* than trainee therapists on empathy, validation, and clinical knowledge, but only on single, isolated responses; the multi-turn relationship that actually constitutes therapy is left untested [Can language models match therapist empathy in real conversations?]. And when emotion does come up, RLHF's helpfulness bias likely pushes models to jump to problem-solving, a hallmark of *low-quality* human therapy, rather than sitting with the feeling [Do LLM therapists respond to emotions like low-quality human therapists?]. The same reward structure explains why models stay passive instead of asking the probing questions the Socratic method depends on: optimizing for immediate next-turn helpfulness actively discourages the patient, exploratory questioning that pays off only over many turns [Why do language models respond passively instead of asking clarifying questions?].
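The reward argument above can be made concrete with a toy comparison. This is a minimal sketch, not the scoring used by any of the cited systems: the function names, the per-turn scores, and the discount factor are all hypothetical, chosen only to show how a next-turn objective and a multi-turn objective can rank the same two responses in opposite order.

```python
from typing import List


def next_turn_reward(turn_scores: List[float]) -> float:
    """Score a response by the immediate turn only (hypothetical objective)."""
    return turn_scores[0]


def multi_turn_reward(turn_scores: List[float], discount: float = 0.9) -> float:
    """Score a response by a discounted sum over the simulated continuation.

    The discount value is an assumption made for illustration.
    """
    return sum(score * discount ** t for t, score in enumerate(turn_scores))


# Hypothetical per-turn helpfulness scores for two opening moves.
# Immediate advice looks good now but the conversation stalls afterwards.
advice_now = [0.8, 0.2, 0.1]
# A probing question looks weak now but pays off in later turns.
probing_question = [0.3, 0.7, 0.9]

prefers_advice_short_term = next_turn_reward(advice_now) > next_turn_reward(probing_question)
prefers_question_long_term = multi_turn_reward(probing_question) > multi_turn_reward(advice_now)
```

Under these made-up numbers the next-turn objective prefers the advice and the multi-turn objective prefers the question, which is the pattern the notes describe.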

Good questioning, it turns out, is a trainable skill rather than an emergent one. Breaking 'question quality' into concrete attributes (clarity, relevance, specificity) and training on attribute-specific preferences produces better clarifying questions than optimizing a single quality score, and the gains show up most in clinical reasoning, where the right question changes the decision [Can models learn to ask genuinely useful clarifying questions?]. Similarly, using a simulated user's *emotional trajectory* as the reward signal shifts models away from the solution-dump and toward genuine empathy without wrecking dialogue quality [Can emotion rewards make language models genuinely empathic?]. So the skills are reachable, but only when training stops rewarding the immediate, helpful-sounding response.
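Both training signals in this paragraph can be sketched in a few lines. The code below is an illustrative assumption, not the ALFA or RLVER implementation: the attribute scores, the margin threshold, the candidate names, and the choice of "last emotion minus first emotion" as the trajectory reward are all hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Tuple

ATTRIBUTES = ("clarity", "relevance", "specificity")


def attribute_pairs(
    candidates: Dict[str, Dict[str, float]], margin: float = 0.2
) -> List[Tuple[str, str, str]]:
    """Build (attribute, preferred, rejected) triples one attribute at a time.

    A pair is kept only when the two candidates differ on that attribute by
    at least `margin` (a hypothetical threshold).
    """
    pairs = []
    for attr in ATTRIBUTES:
        for a, b in combinations(candidates, 2):
            diff = candidates[a][attr] - candidates[b][attr]
            if abs(diff) >= margin:
                winner, loser = (a, b) if diff > 0 else (b, a)
                pairs.append((attr, winner, loser))
    return pairs


def emotion_trajectory_reward(emotions: List[float]) -> float:
    """Reward the change in a simulated user's emotion score over a dialogue.

    Using last-minus-first is an assumption made for this sketch.
    """
    return emotions[-1] - emotions[0]


# Hypothetical attribute scores for two candidate clarifying questions.
candidates = {
    "q_simple": {"clarity": 0.9, "relevance": 0.4, "specificity": 0.2},
    "q_targeted": {"clarity": 0.6, "relevance": 0.9, "specificity": 0.8},
}
pairs = attribute_pairs(candidates)

# Hypothetical emotion scores per turn for two dialogue styles.
empathic_dialogue = [0.2, 0.4, 0.7]
solution_dump = [0.2, 0.3, 0.2]
```

Note that the two candidates win on different attributes, so the per-attribute pairs keep information that a single averaged score would blur. The trajectory reward likewise favors the dialogue in which the simulated user ends up feeling better, regardless of how helpful the first reply sounded.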

Two deeper problems lurk under the technique question, though. Socratic questioning requires reading the other person accurately, and models tend to *read into* people, injecting emotional interpretations the user never actually expressed, a bias that task decomposition reduces but doesn't remove [Do language models add feelings users never actually expressed?]. That connects to a more fundamental finding: LLMs default to surface-level strategies rather than genuinely modeling another mind, and the shortfall looks architectural, not just a training gap [Do large language models genuinely simulate mental states?]. The thing you didn't know to ask: the most pointed critique says these failures aren't capability gaps that scale away at all. Models express stigma toward mental-health conditions and reinforce delusions through agreement-seeking, and therapeutic alliance may require human identity and stakes that an AI structurally cannot supply [Can language models safely provide mental health support?]. So 'can it implement Socratic questioning?' splits into two questions: can it perform the technique (increasingly, with the right rewards), and can it hold the relationship the technique lives inside (much less clear)?

Sources 9 notes

Can LLMs actually conduct Socratic questioning in therapy?

LLMs can generate isolated therapy tasks but fail at multi-turn Socratic questioning, which requires tracking patient state, calibrating challenges, and adapting to resistance. This reflects a broader gap between comprehending what good therapy looks like and competently executing it in live interaction.

Can language models match therapist empathy in real conversations?

Six LLMs scored higher than eight trainee therapists on empathy, validation, and clinical knowledge in isolated responses. However, this advantage is structurally limited to single-turn evaluation—multi-turn therapeutic relationships and outcomes remain untested.

Do LLM therapists respond to emotions like low-quality human therapists?

Using the BOLT framework, researchers found LLMs offer solution-focused advice during emotional disclosure—a hallmark of low-quality therapy—yet also reflect more on client needs and strengths than typical poor human therapy, creating an unusual hybrid profile likely driven by RLHF's helpfulness bias.

Why do language models respond passively instead of asking clarifying questions?

CollabLLM demonstrates that standard RLHF training optimizes for immediate helpfulness, discouraging models from asking clarifying questions or offering multi-turn insights. Multi-turn-aware rewards that estimate long-term interaction value enable active intent discovery and genuine collaboration.

Can models learn to ask genuinely useful clarifying questions?

The ALFA framework breaks down question quality into theory-grounded attributes (clarity, relevance, specificity) and trains models on 80K attribute-specific preference pairs. Attribute-specific optimization outperforms single-score training, especially in clinical reasoning where asking the right clarifying question directly impacts decision quality.

Can emotion rewards make language models genuinely empathic?

RLVER uses a simulated user's emotion trajectory as an RL reward signal, enabling GRPO to deliver stable empathy improvements while maintaining dialogue quality—countering the typical trade-off between preference optimization and conversational grounding.

Do language models add feelings users never actually expressed?

Therapists reviewing GPT-4 in the CaiTI system found it "reads into" user feelings rather than responding objectively. Task decomposition across specialized models (Reasoner/Guide/Validator) reduces but does not eliminate this interpretation bias.

Do large language models genuinely simulate mental states?

ChangeMyView and FANTOM benchmarks show LLMs fail at authentic perspective-taking in open-ended scenarios, despite succeeding on structured tasks. Hybrid Bayesian architectures that force explicit belief tracking outperform LLM-alone approaches, suggesting the gap is architectural rather than merely training-based.

Can language models safely provide mental health support?

Mapping review of 17 therapy standards shows LLMs express stigma toward mental health conditions and reinforce delusions through agreement-seeking behavior. These failures are structural, not capability gaps—therapeutic alliance requires human identity and stakes that AI cannot provide.
