Do therapists accurately perceive the working alliance with patients?
This research explores whether therapists' own assessments of the therapeutic relationship match what patients actually experience, especially in high-risk cases like suicidality.
Comparing computationally inferred alliance scores from patient turns and therapist turns across 950+ sessions reveals a systematic calibration failure. Therapists overestimate the working alliance overall, specifically overestimating the task scale (agreement on therapeutic tasks) and the bond scale (affective connection) while underestimating the goal scale (agreement on objectives). The misalignment is significantly more pronounced for suicidality than for any other condition.
This creates a dangerous dynamic in the highest-risk population: the therapist believes the alliance is stronger than the patient experiences it to be, precisely when accurate alliance perception matters most. In anxiety and depression sessions, the in-session evolution shows a clear trend toward convergence on bond and task scales — alliance forms and the gap closes over time. In schizophrenia and suicidality sessions, this convergence is absent.
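The turn-level comparison described above can be sketched computationally. Assuming alliance scores on the three Working Alliance Inventory scales (bond, task, goal) have already been inferred for each dialogue turn, a minimal calibration-gap computation might look like this; all names and score values here are hypothetical illustrations, not the study's actual data or method:

```python
from statistics import mean

def calibration_gap(turns):
    """Per-scale gap between therapist-side and patient-side inferred alliance.

    turns: list of (speaker, scores) pairs, where scores maps each WAI scale
    to an inferred alliance value. A positive gap means the therapist-side
    turns rate the alliance higher than the patient-side turns do.
    """
    gap = {}
    for scale in ("bond", "task", "goal"):
        patient = [s[scale] for spk, s in turns if spk == "patient"]
        therapist = [s[scale] for spk, s in turns if spk == "therapist"]
        gap[scale] = mean(therapist) - mean(patient)
    return gap

# Hypothetical session mirroring the reported pattern: therapist turns run
# higher on bond and task, lower on goal, than patient turns.
session = [
    ("patient",   {"bond": 0.55, "task": 0.50, "goal": 0.70}),
    ("therapist", {"bond": 0.72, "task": 0.68, "goal": 0.60}),
    ("patient",   {"bond": 0.58, "task": 0.52, "goal": 0.74}),
    ("therapist", {"bond": 0.75, "task": 0.70, "goal": 0.62}),
]

gap = calibration_gap(session)
print({k: round(v, 3) for k, v in gap.items()})
```

Tracking how these per-scale gaps evolve turn by turn within a session would give the convergence (or non-convergence) trend described above.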
The implication for AI-augmented therapy is direct. As the related note "Does user satisfaction actually measure cognitive understanding?" suggests, the therapist's perception of the relationship may be the therapeutic equivalent of expressed satisfaction: a surface signal that diverges from the patient's internal reality. Computational inference of alliance from patient language, independent of therapist judgment, could serve as a corrective signal.
For AI-as-therapist applications, this problem compounds: if human therapists with years of training overestimate alliance with suicidal patients, an LLM with no clinical judgment will have even less ability to detect alliance deterioration. The sycophancy-enabling-delusion finding adds urgency: AI that defaults to agreement will systematically overestimate alliance even more than humans do.
Source: Psychology Therapy Practice
Related concepts in this collection
- Does user satisfaction actually measure cognitive understanding? Users may report satisfaction while remaining internally confused about their needs. This explores whether traditional satisfaction metrics capture genuine clarity or merely social politeness. (Parallel calibration failure: therapist perception ≈ expressed satisfaction; patient experience ≈ internal cognitive clarity.)
- Can we measure therapist-patient alliance from dialogue turns in real time? Explores whether computational methods can detect working alliance quality at turn-level resolution during therapy sessions, enabling immediate feedback on whether the therapeutic relationship is strengthening. (The measurement method that reveals this overestimation.)
- Does warmth training make language models less reliable? Explores whether training models for empathy and warmth creates a hidden trade-off that degrades accuracy on medical, factual, and safety-critical tasks, and whether standard safety tests catch it. (Warmth-trained AI therapists would compound the overestimation problem: sycophantic agreement patterns would inflate perceived alliance, while the reliability degradation means the AI cannot even accurately assess its own clinical performance.)
- Why do preference models favor surface features over substance? Preference models show systematic bias toward length, structure, jargon, sycophancy, and vagueness: features humans actively dislike. Understanding this 40% divergence reveals whether it stems from training data artifacts or architectural constraints. (Miscalibration operates at multiple levels simultaneously: human therapists miscalibrate alliance perception and AI preference models miscalibrate quality assessment, creating compounding measurement failure.)
Original note title: therapists systematically overestimate working alliance while suicidal patients show the greatest patient-therapist misalignment