Do language models show the same content effects humans do?
Do LLMs reproduce human reasoning biases, such as endorsing conclusions because they are believable rather than because they follow logically, across different logical tasks? This matters because converging patterns across independent tasks suggest a fundamental architectural property rather than a task-specific quirk.
Lampinen et al. evaluate three logical reasoning tasks (Natural Language Inference, syllogism validity judgment, and the Wason selection task) and find that LMs reproduce the same content-sensitivity patterns humans show:
- NLI: accuracy depends on whether the believable completion matches the logically correct one.
- Syllogisms: validity judgments are biased by whether the conclusion is believable, reproducing Evans et al.'s belief-bias effect, in which humans endorse invalid syllogisms with believable conclusions roughly 90% of the time.
- Wason: accuracy improves when the conditional rule is instantiated as a familiar social rule rather than an abstract pattern.
Three independent task structures with different logical demands all produce the same content-form entanglement.
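A minimal sketch of how a belief-bias probe of the syllogism task can be scored, assuming a HuggingFace causal LM. The model name `gpt2`, the helper `completion_logprob`, and the two items are illustrative assumptions, not the paper's actual setup: the idea is to cross logical validity with conclusion believability and compare the log-probability the model assigns to answering "valid" versus "invalid".

```python
# Sketch: probing belief-bias on syllogism validity judgments.
# Assumes a HuggingFace causal LM; "gpt2" and the item set are illustrative only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def completion_logprob(prompt: str, completion: str) -> float:
    """Sum of log-probabilities the model assigns to `completion` after `prompt`."""
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    full_ids = tokenizer(prompt + completion, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    log_probs = torch.log_softmax(logits, dim=-1)
    total = 0.0
    # Score only the completion tokens; each is predicted from the previous position.
    for pos in range(prompt_ids.shape[1], full_ids.shape[1]):
        total += log_probs[0, pos - 1, full_ids[0, pos]].item()
    return total

# Items cross logical validity with conclusion believability (invented examples).
items = [
    {"premises": "All flowers need water. Roses need water.",
     "conclusion": "Therefore, roses are flowers.",  # believable but invalid
     "valid": False},
    {"premises": "All fish can swim. Sharks are fish.",
     "conclusion": "Therefore, sharks can swim.",    # believable and valid
     "valid": True},
]

for item in items:
    prompt = (f"{item['premises']} {item['conclusion']}\n"
              "Is this argument logically valid? Answer:")
    endorsed = completion_logprob(prompt, " valid") > completion_logprob(prompt, " invalid")
    print(item["conclusion"], "| endorsed:", endorsed, "| actually valid:", item["valid"])
```

Belief-bias shows up when the believable-but-invalid item is endorsed more readily than an unbelievable invalid item with the same logical form.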
The pattern matters because each individual task could be dismissed as a quirk; three tasks converging on the same signature licenses calling content-form entanglement an architectural property rather than a benchmark artifact. The mechanistic vault notes establish why at the circuit level: "How do language models perform syllogistic reasoning internally?" shows the formal-circuit plus world-knowledge-contamination structure that produces belief bias. This insight contributes the behavioral isomorphism: not just that the circuit produces some kind of contamination, but that the contamination's signature matches human error patterns item for item, including continuous response measures (LM token-probability distributions track human reaction times).
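One way to make the item-for-item claim concrete is to correlate per-item LM endorsement probabilities with per-item human endorsement rates. A hedged sketch follows; the numbers are placeholders for illustration, not data from Lampinen et al.:

```python
# Sketch: item-level comparison of LM endorsement probability with human endorsement rates.
# The arrays below are invented placeholders, not results from the paper.
import numpy as np
from scipy.stats import pearsonr, spearmanr

# Per-item P(model answers "valid"), e.g. a softmax over the "valid"/"invalid" log-probs above.
lm_endorsement = np.array([0.91, 0.88, 0.34, 0.22, 0.76, 0.41])
# Per-item fraction of human participants endorsing the same syllogisms.
human_endorsement = np.array([0.93, 0.85, 0.40, 0.18, 0.71, 0.37])

r, p = pearsonr(lm_endorsement, human_endorsement)
rho, p_rank = spearmanr(lm_endorsement, human_endorsement)
print(f"Pearson r = {r:.2f} (p = {p:.3f}), Spearman rho = {rho:.2f}")
```

A strong item-level correlation, rather than a mere match in overall accuracy, is what distinguishes a behavioral isomorphism from a coarse similarity.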
This converges with "Do large language models reason symbolically or semantically?" from the opposite direction. That note shows reasoning collapses when semantics are stripped; Lampinen et al. show reasoning improves with believable semantics and degrades with unbelievable semantics. Both findings point at the same property: the model is doing something like in-context semantic reasoning, where logical form is one input among others rather than the dominant computational frame. Calling this "reasoning" or "not reasoning" is the wrong question; the right question is what kind of reasoning, and the answer is reasoning that is constitutively content-sensitive, showing the same item-level patterns in humans and LMs alike.
Source: Linguistics, NLP, NLU
Paper: Language models show human-like content effects on reasoning tasks
Related concepts in this collection
- Do large language models reason symbolically or semantically?
  Can LLMs follow explicit logical rules when those rules contradict their training knowledge? Testing whether reasoning operates independently of semantic associations reveals what computational mechanisms actually drive LLM multi-step inference.
  Relation: opposite-direction confirmation (stripping semantics breaks reasoning).
- How do language models perform syllogistic reasoning internally?
  Does formal symbolic reasoning exist as a distinct neural circuit in LLMs, or is it inevitably contaminated by world knowledge associations? Understanding the mechanism could reveal whether pure logical reasoning is separable from semantic inference.
  Relation: mechanistic explanation for the belief-bias signature.
Original note title
content effects in LLMs are behavioral confirmation that semantic content and logical form are not separable in transformer reasoning — across NLI syllogisms and Wason