Where does hierarchical structure in language models come from?
Do LLMs build hierarchical concept geometry through dedicated mechanisms, or does it emerge naturally from word co-occurrence patterns in training data? Understanding the source matters for interpreting what representations actually reveal about model computation.
A recurring interpretability finding is that LLM representations encode hypernymy — the is-a relation between general and specific concepts — geometrically, with broad categories and their sub-categories arranged in nested, near-orthogonal structure. The tempting reading is functional: the model built a hierarchy mechanism because hierarchy is useful. This paper argues the opposite. Starting from the empirically verified assumption that words closer on the WordNet hypernym graph co-occur more often, it characterizes the spectrum of the embedding Gram matrix and shows that, under mild positivity and decay conditions on the co-occurrence kernel, the leading eigenvectors reproduce the taxonomy. Hierarchical concept geometry emerges from the spectral structure of pairwise word statistics; no hierarchy-specific functional mechanism is required.
The explanatory payoff is that this account is more predictive than the functional one. Instead of postulating hierarchical orthogonality from functional desiderata, it derives two testable predictions: the same geometry should appear outside LLMs, in plain word2vec embeddings, and it should carry a specific coarse-to-fine spectral organization. Both predictions are confirmed.
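To make the spectral claim concrete, here is a minimal toy sketch, not the paper's construction. It assumes a balanced binary taxonomy over eight hypothetical words and a co-occurrence kernel of the form exp(-alpha * tree distance); the kernel form, `ALPHA`, and every function name are illustrative assumptions. The kernel stands in for the embedding Gram matrix, and its eigenvectors come out ordered from the coarsest split of the tree to the finest.

```python
import numpy as np


def leaf_tree_distance(i: int, j: int) -> int:
    """Path length between leaves i and j of a balanced binary tree.

    Leaves are numbered left to right, so the bit length of i XOR j is the
    height of their lowest common ancestor above the leaf level.
    """
    return 2 * (i ^ j).bit_length()


def cooccurrence_kernel(depth: int, alpha: float) -> np.ndarray:
    """Toy co-occurrence kernel: positive, and decaying with tree distance.

    ASSUMPTION: the exponential form is chosen for illustration only.
    """
    n = 2 ** depth
    dist = np.array(
        [[leaf_tree_distance(i, j) for j in range(n)] for i in range(n)]
    )
    return np.exp(-alpha * dist)


def sorted_spectrum(kernel: np.ndarray):
    """Eigen-decomposition of a symmetric matrix, largest eigenvalue first."""
    eigvals, eigvecs = np.linalg.eigh(kernel)
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order]


DEPTH, ALPHA = 3, 0.5  # hypothetical settings: 8 leaf "words"
K = cooccurrence_kernel(DEPTH, ALPHA)
eigvals, eigvecs = sorted_spectrum(K)

# Eigenvector 0 is constant across all words. Eigenvector 1 separates the two
# top-level categories. Eigenvectors 2-3 split each category into its
# sub-categories. Eigenvectors 4-7 separate sibling words.
for rank in range(2 ** DEPTH):
    signs = "".join("+" if x > 1e-9 else "-" if x < -1e-9 else "0"
                    for x in eigvecs[:, rank])
    print(f"rank {rank}  eigenvalue {eigvals[rank]:.4f}  sign pattern {signs}")
```

In this toy, the coarse-to-fine ordering of eigenvalues follows from the kernel decaying with tree distance alone, which is the sense in which no hierarchy-specific mechanism is needed. Eigenvectors inside a repeated eigenvalue are only determined up to rotation within their eigenspace, so the printed sign patterns for ranks 2 to 7 can vary between linear algebra backends.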
Why it matters: it reframes a class of interpretability results. Geometric structure that looks like the model "knowing" a taxonomy can be a downstream shadow of corpus statistics rather than evidence of a dedicated computation. The authors are careful to preserve a counterpoint: such organization may be useful for function, but function is not what drives it. This separates "the representation has structure" from "the model uses a structured mechanism," a distinction interpretability work often blurs.
— "Hierarchical Concept Geometry in Language Models Emerges from Word Co-occurrence", https://arxiv.org/abs/2605.23821
Related concepts in this collection
- Do embedding eigenvectors organize taxonomy from coarse to fine?
Can we predict how embeddings encode taxonomic hierarchies by examining their spectral structure? This tests whether word co-occurrence statistics alone produce the observed hierarchical geometry in language models.
the specific spectral signature that this distributional mechanism predicts and produces
- Do standard analysis methods hide nonlinear features in neural networks?
Current representation analysis tools like PCA and linear probing may systematically miss complex nonlinear computations while over-reporting simple linear features. This raises questions about whether our interpretability methods are actually capturing what networks compute.
cautions that geometric structure detected by analysis methods need not be the computationally important structure; consonant with structure-without-mechanism
- How do language models organize features across processing layers?
Do neural networks arrange learned features into meaningful hierarchies as they process information? Understanding this structure could reveal how models build understanding from raw tokens to abstract concepts.
contrasts a mechanism-level account of feature hierarchy with this statistics-level account of concept geometry
- Does word frequency correlate with semantic abstraction?
Explores whether LLMs' preference for high-frequency language also pulls them toward more abstract, general meanings, and whether this shapes how they handle expert knowledge.
another WordNet-grounded result linking corpus statistics to the abstraction structure of representations
Original note title
hierarchical concept geometry in llms needs no dedicated mechanism it emerges from co-occurrence spectral structure