How do language models encode syntactic relations geometrically?
Do LLM embeddings use distance alone, or direction as well, to represent syntax? The answer bears on whether neural networks can spontaneously develop symbolic-compatible geometric structures.
The symbol-vector divide has been a core challenge in cognitive science since Smolensky (1987): syntactic trees are symbolic structures that seem incompatible with the vectorial representations of neural networks. The Structural Probe (Hewitt & Manning 2019) made partial progress: it showed that the existence of a syntactic link between two words is encoded in the distance between their embeddings. But whether the type and direction of syntactic relations are also represented remained unknown.
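For concreteness, the Structural Probe fits a single linear map B and trains the squared L2 distance between transformed embeddings to approximate the number of edges between the corresponding words in the gold dependency tree. A minimal PyTorch sketch (the parametrization follows the original paper; class and variable names are mine):

```python
import torch

class StructuralProbe(torch.nn.Module):
    """Hewitt & Manning (2019)-style probe: learn a linear map B so that
    squared L2 distance between transformed embeddings approximates the
    distance between words in the dependency tree."""

    def __init__(self, hidden_dim: int, probe_rank: int = 64):
        super().__init__()
        self.B = torch.nn.Parameter(torch.randn(hidden_dim, probe_rank) * 0.01)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (seq_len, hidden_dim) activations for one sentence.
        transformed = h @ self.B                        # (seq_len, rank)
        diffs = transformed.unsqueeze(1) - transformed.unsqueeze(0)
        return (diffs ** 2).sum(-1)                     # (seq_len, seq_len)

# Training (sketch): minimize L1 loss against gold tree distances,
# e.g. loss = (probe(h) - tree_dist).abs().mean()
```

Note that this probe is distance-only by construction: the squared norm of the transformed difference discards its orientation entirely, which is exactly the information the Polar Probe adds back.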
The Polar Probe answers this: syntactic relations are coded by the relative direction between nearby embeddings, not just their distance. Using both distance and direction (a polar coordinate system), the Polar Probe recovers syntactic relation types and directions with nearly 2x the accuracy of the distance-only Structural Probe.
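One way to picture such a probe (an illustrative parametrization, not necessarily the published one): keep the norm of the transformed difference vector as the radial coordinate for link existence, and read relation type and direction off its orientation, for instance by comparing against learned per-label directions.

```python
import torch
import torch.nn.functional as F

class PolarProbe(torch.nn.Module):
    """Illustrative polar probe (an assumed parametrization): the norm
    of the transformed difference vector plays the role of the radius
    (link existence), its orientation the role of the angle
    (relation type and direction)."""

    def __init__(self, hidden_dim: int, probe_rank: int, n_labels: int):
        super().__init__()
        self.B = torch.nn.Parameter(torch.randn(hidden_dim, probe_rank) * 0.01)
        # One learned direction per directed relation label
        # (e.g. nsubj head->dep and dep->head counted separately).
        self.label_dirs = torch.nn.Parameter(torch.randn(n_labels, probe_rank))

    def forward(self, h_i: torch.Tensor, h_j: torch.Tensor):
        diff = (h_i - h_j) @ self.B                # (..., rank)
        radius = diff.norm(dim=-1)                 # distance: is there a link?
        logits = F.normalize(diff, dim=-1) @ F.normalize(
            self.label_dirs, dim=-1).T             # direction: which relation?
        return radius, logits
```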
Three key findings:

1. Complete syntactic encoding. The polar coordinate system captures existence, type, AND direction of syntactic relations: the full specification of a dependency tree is encoded in the geometry of LLM activations.
2. Low-dimensional subspace. This encoding lives in a low-dimensional subspace of intermediate layers across many LLMs, and becomes increasingly precise in frontier models. This is not a brute-force representation but a compressed, structured one.
3. Nested consistency. Similar syntactic relations are coded similarly across nested levels of syntactic trees. The encoding is not ad hoc for each syntactic instance but systematic, a genuine coordinate system (sketched below).
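The nested-consistency claim is directly testable: if the encoding is a genuine coordinate system, transformed difference vectors for the same relation label should point the same way whether the relation sits near the root or deep inside a nested clause. A hedged sketch of such a check (assumed data layout, not the paper's exact metric):

```python
import torch
import torch.nn.functional as F

def nested_consistency(diff_vectors, labels, depths):
    """Per-label cosine similarity between shallow and deep instances
    of the same relation. diff_vectors: (n_pairs, rank) transformed
    head-dependent differences; labels: (n_pairs,) relation ids;
    depths: (n_pairs,) tree depth at which each relation occurs."""
    scores = {}
    for lab in labels.unique():
        mask = labels == lab
        cut = depths[mask].float().median()   # split this label by depth
        shallow = diff_vectors[mask & (depths.float() <= cut)]
        deep = diff_vectors[mask & (depths.float() > cut)]
        if len(shallow) and len(deep):
            # High mean cosine similarity across depth groups means the
            # relation's direction is consistent across nesting levels.
            sim = F.normalize(shallow, dim=-1) @ F.normalize(deep, dim=-1).T
            scores[int(lab)] = sim.mean().item()
    return scores
```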
Resolving the symbol-vector divide is significant: LLMs don't need explicit symbolic mechanisms to represent symbolic structures. They spontaneously learn a geometry that explicitly represents the central symbolic structures of linguistic theory. This doesn't mean LLMs "understand" syntax in a human sense, but it demonstrates that connectionist architectures can natively develop symbolic-compatible representations: the two paradigms are not incompatible.
This connects to "Do transformer static embeddings actually encode semantic meaning?" at a different structural level: static embeddings encode semantic features, while intermediate activations encode syntactic relations. Together they suggest LLM representations are far richer and more structured than the "statistical patterns" dismissal implies.
Source: Cognitive Models Latent
Related concepts in this collection
- Do transformer static embeddings actually encode semantic meaning? Explores whether the fixed word embeddings that enter transformer networks contain rich semantic information or serve only as shallow placeholders. This addresses a longstanding debate in philosophy of language about whether word meanings are stored or constructed. (Relation: semantic features in static embeddings complement syntactic features in intermediate activations.)
- Why do neural networks fail at compositional generalization? Explores whether the binding problem from neuroscience explains neural networks' inability to systematically generalize. The binding problem has three aspects (segregation, representation, and composition), each creating distinct failure modes in how networks handle structured information. (Relation: polar coordinate encoding is evidence against the strong version: systematic structure IS represented, even if binding problems remain at the compositional level.)
- Can neural networks learn compositional skills without symbolic mechanisms? Do neural networks need explicit symbolic architecture to compose learned concepts, or can scaling alone enable compositional generalization? This asks whether compositionality is an architectural feature or an emergent property of scale. (Relation: convergent finding; symbolic-like structure emerges without explicit symbolic mechanisms.)
- Do neural networks naturally break tasks into modular parts? Can standard neural networks decompose complex tasks into separate subroutines implemented in distinct subnetworks, or do they only memorize input-output patterns? Understanding whether compositionality emerges from gradient-based learning matters for interpretability and generalization. (Relation: modular structure emerges from training.)
Original note title: a polar coordinate system in llm activations encodes both type and direction of syntactic relations — resolving the symbol-vector divide