
Do reasoning scaffolds reshape which empathy skills models develop?

When language models receive identical empathy rewards, does adding explicit reasoning blocks before responses change which capabilities they actually improve? This matters for understanding how training structure, not just training signal, shapes model development.

Note · 2026-02-22 · sourced from Psychology Empathy

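Under RLVER training with identical verifiable emotion rewards, models with and without explicit reasoning scaffolds develop along different axes: thinking models enhance empathy and insight, while non-thinking models focus on action-oriented capabilities.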

This divergence under the same training signal is the key finding. The explicit reasoning scaffold doesn't just improve the model; it redirects what the model improves at. The think-then-say template forces the model to "access and refine higher-order empathetic skills" by externalizing its reasoning about the user's emotional state before it responds.
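To make the think-then-say setup concrete, here is a minimal sketch of how the two conditions could be separated while sharing one reward. The `<think>` tag names, the function names, and the choice to score only the reply are assumptions made for illustration; the note does not say how RLVER delimits the reasoning block or computes its reward.

```python
import re

# Hypothetical delimiters: the note does not specify how the reasoning block is marked.
THINK_OPEN, THINK_CLOSE = "<think>", "</think>"


def split_think_then_say(output: str) -> tuple[str, str]:
    """Split a model output into (reasoning, reply).

    When no reasoning block is present (the non-thinking condition),
    the reasoning is empty and the whole output is treated as the reply.
    """
    pattern = re.escape(THINK_OPEN) + r"(.*?)" + re.escape(THINK_CLOSE)
    match = re.search(pattern, output, flags=re.DOTALL)
    if match is None:
        return "", output.strip()
    reasoning = match.group(1).strip()
    reply = output[match.end():].strip()
    return reasoning, reply


def emotion_reward(output: str, score_fn) -> float:
    """Score only the visible reply (an assumption), so both conditions
    receive the same kind of training signal."""
    _, reply = split_think_then_say(output)
    return float(score_fn(reply))


# Illustrative outputs: same reply, with and without a reasoning block.
thinking_output = (
    "<think>The user sounds discouraged after the rejection.</think>"
    "That sounds really hard. What part stung most?"
)
direct_output = "That sounds really hard. What part stung most?"
```

With a placeholder scorer such as `len`, both outputs receive the same reward because only the reply is scored. Under that assumption the scaffold changes what the model does before answering, not what it is rewarded for.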

This connects to the broader reasoning literature in two ways:

First, it parallels "Does RL teach reasoning or just when to use it?": the thinking scaffold provides a pre-existing mechanism (extended deliberation), and RL teaches the model when and how to apply that mechanism to empathetic dialogue. On this reading, the capability was latent, and RL surfaces it through the scaffold.

Second, it complicates "When does explicit reasoning actually help model performance?" Empathy is arguably a "continuous nuanced judgment" task, yet the thinking scaffold helps. The resolution may be that the scaffold here works not by imposing logical structure on empathy, but by creating space for the model to deliberate about social context before committing to a response.


Source: Psychology Empathy

Original note title

Thinking and non-thinking models develop distinct empathy profiles under RL training — thinking models enhance empathy and insight while non-thinking models focus on action-oriented capabilities