
Where does hierarchical structure in language models come from?

Do LLMs build hierarchical concept geometry through dedicated mechanisms, or does it emerge naturally from word co-occurrence patterns in training data? Understanding the source matters for interpreting what representations actually reveal about model computation.

Note · 2026-05-28 · sourced from MechInterp

A recurring interpretability finding is that LLM representations encode hypernymy — the is-a relation between general and specific concepts — geometrically, with broad categories and their sub-categories arranged in nested, near-orthogonal structure. The tempting reading is functional: the model built a hierarchy mechanism because hierarchy is useful. This paper argues the opposite. Starting from the empirically verified assumption that words closer on the WordNet hypernym graph co-occur more often, it characterizes the spectrum of the embedding Gram matrix and shows that, under mild positivity and decay conditions on the co-occurrence kernel, the leading eigenvectors reproduce the taxonomy. Hierarchical concept geometry emerges from the spectral structure of pairwise word statistics; no hierarchy-specific functional mechanism is required.

The explanatory payoff is that this account is more predictive than the functional one. Rather than postulating hierarchical orthogonality from functional desiderata, it derives that the same geometry should appear outside LLMs — in plain word2vec embeddings — and should carry a specific coarse-to-fine spectral organization. Both predictions are confirmed.
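The spectral claim can be illustrated on a toy taxonomy. The sketch below is not the paper's derivation or code. It assumes a balanced binary tree of eight leaf "words" and a hypothetical co-occurrence kernel `exp(-ALPHA * distance)`, chosen only because it is positive and decays with graph distance. The names `tree_distance`, `coarsest_block`, and `ALPHA` are introduced here for illustration. The script eigendecomposes the kernel and reports, for each eigenvector in descending eigenvalue order, the largest block of leaves on which it is constant.

```python
import numpy as np

DEPTH = 3          # toy balanced binary taxonomy with 2**DEPTH leaf "words"
N = 2 ** DEPTH
ALPHA = 0.5        # hypothetical decay rate of co-occurrence with graph distance


def tree_distance(i: int, j: int) -> int:
    """Path length between leaves i and j in the balanced binary tree."""
    steps = 0
    while i != j:  # climb from both leaves until they meet at the common ancestor
        i //= 2
        j //= 2
        steps += 1
    return 2 * steps


def coarsest_block(v: np.ndarray, tol: float = 1e-8) -> int:
    """Largest aligned block size on which v is constant (N = whole tree, 1 = single leaves)."""
    size = N
    while size > 1:
        blocks = v.reshape(-1, size)
        spread = blocks.max(axis=1) - blocks.min(axis=1)
        if np.all(spread < tol):
            return size
        size //= 2
    return 1


# Positive kernel that decays with taxonomic distance, standing in for co-occurrence.
dist = np.array([[tree_distance(i, j) for j in range(N)] for i in range(N)])
kernel = np.exp(-ALPHA * dist)

# eigh returns eigenvalues in ascending order; reverse to put leading directions first.
eigvals, eigvecs = np.linalg.eigh(kernel)
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]

levels = [coarsest_block(eigvecs[:, k]) for k in range(N)]
print(levels)
```

Under these assumptions the printed list is `[8, 4, 2, 2, 1, 1, 1, 1]`: the leading eigenvector is uniform, the next separates the two top-level branches, the following two resolve the mid-level categories, and the last four resolve individual siblings. No hierarchy-specific step appears anywhere in the script; the nested, coarse-to-fine structure comes entirely from the decay of the pairwise kernel.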

Why it matters: it reframes a class of interpretability results. Geometric structure that looks like the model "knowing" a taxonomy can be a downstream shadow of corpus statistics rather than evidence of a dedicated computation. The counterpoint the authors are careful to preserve: such organization may be useful for function — but it is not driven by it. This separates "the representation has structure" from "the model uses a structured mechanism," a distinction interpretability work often blurs.


— "Hierarchical Concept Geometry in Language Models Emerges from Word Co-occurrence", https://arxiv.org/abs/2605.23821

Original note title

hierarchical concept geometry in llms needs no dedicated mechanism it emerges from co-occurrence spectral structure