INQUIRING LINE

Why do language models reproduce human EPA structure despite different architecture?

This explores why LLMs recover the same three-axis affective structure humans use to organize meaning — Evaluation (good/bad), Potency (strong/weak), Activity (active/passive), the dimensions Osgood found across human cultures — even though a transformer is nothing like a brain.


The corpus doesn't have a paper on EPA by name, but it offers a strong lateral answer: the structure was never in the architecture to begin with. It is in language, and LLMs learn it the way they learn everything, by compressing relational patterns from text. One note argues directly that LLMs operationalize Saussure's *langue*: meaning built entirely from how words relate to other words, with no external referent or embodied grounding required (see "Can language models learn meaning without engaging the world?"). If affective meaning is already encoded in how humans use words relative to one another, a model that compresses those relations can be expected to recover the same low-dimensional scaffold, regardless of whether it's made of neurons or matrices.
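The idea that meaning can be read off "how words relate to other words" can be made concrete with a toy co-occurrence count. The sketch below is an illustration only, not how an LLM is trained: the four sentences, the function names `cooccurrence_vectors` and `cosine`, and the sentence-level co-occurrence window are all assumptions chosen to keep the example small.

```python
import math
from collections import Counter, defaultdict

# Hypothetical mini-corpus, invented for this illustration.
SENTENCES = [
    "the kind friend helped",
    "the gentle friend helped",
    "the cruel enemy attacked",
    "the vicious enemy attacked",
]

def cooccurrence_vectors(sentences):
    """Represent each word by counts of the other words sharing its sentence."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            for j, other in enumerate(tokens):
                if i != j:
                    vectors[word][other] += 1
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)

vectors = cooccurrence_vectors(SENTENCES)
print("kind ~ gentle:", cosine(vectors["kind"], vectors["gentle"]))
print("kind ~ cruel: ", cosine(vectors["kind"], vectors["cruel"]))
```

In this toy corpus "kind" ends up much closer to "gentle" than to "cruel", purely because of the company each word keeps. Nothing in the code refers to what the words denote, which is the relational point the paragraph above makes.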

The surprising part is how *geometric* this recovery turns out to be. The Polar Probe work shows that models spontaneously lay out syntactic relations in a structured coordinate system, encoding both the type and the direction of a relationship through distance and angle in activation space (see "How do language models encode syntactic relations geometrically?"). That's the same flavor of result you'd expect for EPA: a handful of interpretable axes emerging in the internal geometry because they're statistically efficient ways to represent the data, not because anyone built them in. Architecture shapes *how* this happens: deep-and-thin models compose abstract concepts layer by layer rather than spreading them across width (see "Does depth matter more than width for tiny language models?"). But the destination, a compact relational structure, is driven by the corpus, not the wiring.
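What "an interpretable axis in the internal geometry" could look like can be sketched with made-up vectors. This is not the Polar Probe method and it uses no real model activations: the 3-d numbers in `TOY_VECTORS`, the pole-word pairs, and the helper names `axis_from_poles` and `project` are all hypothetical, chosen so the arithmetic is easy to follow.

```python
import math

# Toy 3-d "embeddings", invented for illustration (not taken from any model).
TOY_VECTORS = {
    "good":    [0.9, 0.1, 0.0],
    "bad":     [-0.9, 0.1, 0.0],
    "strong":  [0.1, 0.9, 0.0],
    "weak":    [0.1, -0.9, 0.0],
    "hero":    [0.7, 0.6, 0.2],
    "villain": [-0.7, 0.5, 0.3],
}

def axis_from_poles(pos, neg, vectors):
    """Unit vector pointing from the negative pole word to the positive one."""
    diff = [p - n for p, n in zip(vectors[pos], vectors[neg])]
    norm = math.sqrt(sum(d * d for d in diff))
    return [d / norm for d in diff]

def project(word, axis, vectors):
    """Scalar position of a word along an axis (dot product with the unit axis)."""
    return sum(w * a for w, a in zip(vectors[word], axis))

# Assumed pole pairs standing in for Evaluation and Potency.
evaluation = axis_from_poles("good", "bad", TOY_VECTORS)
potency = axis_from_poles("strong", "weak", TOY_VECTORS)

for word in ("hero", "villain"):
    e = project(word, evaluation, TOY_VECTORS)
    p = project(word, potency, TOY_VECTORS)
    print(f"{word}: evaluation={e:+.2f} potency={p:+.2f}")
```

With these invented numbers, "hero" and "villain" land on opposite sides of the evaluation axis while both score positive on potency. Whether a real model's activations separate this cleanly is an empirical question the sketch does not answer.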

There's a deeper framing here worth sitting with: from the outside, humans and LLMs are categorically different systems, but as *participants in the same discourse* they draw on the same symbolic substrate (see "Do humans and LLMs differ fundamentally or just superficially?"). EPA convergence is exactly what that view predicts: the shared structure lives in the language both parties use, so it shows up in both regardless of the machinery underneath. The question's premise ("despite different architecture") quietly assumes affective structure ought to come from brain-like hardware. The corpus suggests it comes from the data instead.

But the same notes that explain the convergence also warn you not to over-read it. Models routinely learn surface generalizations that mimic deep structure, passing tests on cues like sentence length, word choice, and orthography while missing the underlying grammar (see "Can models pass tests while missing the actual grammar?"). So reproducing EPA-shaped geometry doesn't prove a model *means* good and bad the way you do; it may have captured the statistical shadow of human affect without the thing that casts it. And the structure it absorbs is whatever the text overrepresented: the cultural-flattening work shows low-resource cultures being represented internally through dominant-culture proxies (see "Do LLMs represent low-resource cultures through dominant cultural proxies?"). If EPA looks universal in a model, that may partly reflect whose affective language dominated the training corpus, not a culture-free law of meaning.

The thing you didn't know you wanted to know: the EPA puzzle inverts. The real surprise isn't that a non-brain reproduces human affective structure. It's that this structure was apparently sitting in plain language all along, recoverable by anything that compresses relations hard enough. That quietly raises the question of how much of human meaning is "in our heads" versus already laid down in the words we share.


Sources (6 notes)

Can language models learn meaning without engaging the world?

Research shows LLMs learn culturally situated discourse patterns by compressing relational structure from text, demonstrating that fluent language generation requires no external referents or embodied grounding.

How do language models encode syntactic relations geometrically?

The Polar Probe shows LLMs represent syntactic type and direction through both distance and angular position between embeddings, nearly doubling accuracy over distance-only methods. This demonstrates neural networks spontaneously learn structured, symbolic-compatible geometry.

Does depth matter more than width for tiny language models?

MobileLLM shows deep-and-thin architectures yield 2.7–4.3% accuracy gains over balanced designs at 125M–350M scale by composing abstract concepts through layers rather than spreading parameters across width.

Do humans and LLMs differ fundamentally or just superficially?

This note applies Habermas's observer/participant distinction to AI: from outside, humans and LLMs are utterly different; from within shared discourse, both draw on the same symbolic substrate, making the difference structural rather than absolute.

Can models pass tests while missing the actual grammar?

BabyLM evaluations showed models can produce correct outputs by relying on sentence length, word choice, and orthography rather than grammatical structure. Standard benchmarks cannot distinguish these two generalization types without tests specifically designed to rule out surface heuristics.

Do LLMs represent low-resource cultures through dominant cultural proxies?

Mechanistic interpretability analysis reveals that low-resource cultures, such as those of Ethiopia and Algeria, are structurally represented through high-resource cultural proxies in internal model states, not just in output. This architectural bias persists even when models can produce correct surface-level answers.
