A polar coordinate system represents syntax in large language models

Paper · arXiv 2412.05571 · Published December 7, 2024

Originally formalized with symbolic representations, syntactic trees may also be effectively represented in the activations of large language models (LLMs). Indeed, a “Structural Probe” can find a subspace of neural activations in which syntactically related words are relatively close to one another. However, this syntactic code remains incomplete: the distance between the Structural Probe word embeddings can represent the existence, but not the type and direction, of syntactic relations. Here, we hypothesize that syntactic relations are, in fact, coded by the relative direction between nearby embeddings. To test this hypothesis, we introduce a “Polar Probe” trained to read syntactic relations from both the distance and the direction between word embeddings. Our approach reveals three main findings. First, our Polar Probe successfully recovers the type and direction of syntactic relations, outperforming the Structural Probe nearly twofold. Second, we confirm that this polar coordinate system exists in a low-dimensional subspace of the intermediate layers of many LLMs and becomes increasingly precise in the latest frontier models. Third, we demonstrate with a new benchmark that similar syntactic relations are coded similarly across the nested levels of syntactic trees. Overall, this work shows that LLMs spontaneously learn a geometry of neural activations that explicitly represents the main symbolic structures of linguistic theory.

Despite their conceptual soundness and alignment with human behavior (Robins, 2013), syntactic trees have long posed a core challenge in cognitive science (Smolensky, 1987): trees are symbolic representations, which can superficially appear incompatible with the vectorial representations of neural networks. This opposition between symbols and vectors has been a major obstacle to the unification of linguistic theories on the one hand, and neuroscience and connectionist AI on the other.

Recently, Hewitt and Manning (Hewitt and Manning, 2019) proposed an important step toward resolving this issue, by suggesting that the existence of a syntactic link between two words may be represented by the distance between their corresponding embeddings.
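The core mechanism of such a distance-based probe can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the hidden states and the probe matrix `B` below are random placeholders standing in for a trained projection over real LLM activations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: hidden states for a 5-word sentence from some LLM layer.
d_model, d_probe, n_words = 16, 4, 5
H = rng.standard_normal((n_words, d_model))

# A Structural-Probe-style linear map B projects activations into a subspace
# where squared L2 distance is trained to approximate tree distance between
# words. Here B is random, purely for illustration.
B = rng.standard_normal((d_model, d_probe))
proj = H @ B

def probe_distance(i, j):
    """Predicted (squared) syntactic distance between words i and j."""
    diff = proj[i] - proj[j]
    return float(diff @ diff)

dist = np.array([[probe_distance(i, j) for j in range(n_words)]
                 for i in range(n_words)])

# The distance is symmetric by construction: it can signal that two words
# are syntactically linked, but not which word governs which, nor the
# relation type.
assert np.allclose(dist, dist.T)
```

The symmetry assertion at the end makes concrete the limitation discussed next: a scalar distance carries no directional or categorical information.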

This Structural Probe, however, can only reveal one aspect of dependency trees: namely, the existence of syntactic relations between word pairs. Whether and how the direction and the type of syntactic relations are represented in language models remains unknown.
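By contrast, a direction-sensitive reader in the spirit of the Polar Probe could score relation types from the relative direction between projected embeddings. The sketch below is a hypothetical illustration under assumed names and random weights, not the paper's actual probe: it shows only how using the difference vector, rather than its norm, makes the readout sensitive to which word is the head.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical dimensions: a shared projection into a low-dimensional probe
# subspace, plus one direction per syntactic relation type.
d_model, d_probe, n_relations = 16, 4, 3
B = rng.standard_normal((d_model, d_probe))       # illustrative projection
W = rng.standard_normal((n_relations, d_probe))   # illustrative relation axes

def relation_logits(h_head, h_dep):
    """Score relation types from the direction head -> dependent."""
    diff = (h_dep - h_head) @ B          # relative vector in the probe subspace
    diff = diff / np.linalg.norm(diff)   # keep only its direction
    return W @ diff                      # one cosine-like score per relation

h1, h2 = rng.standard_normal(d_model), rng.standard_normal(d_model)

# Swapping head and dependent reverses the difference vector, so every
# relation score flips sign: the readout encodes direction, unlike a
# symmetric distance.
assert np.allclose(relation_logits(h1, h2), -relation_logits(h2, h1))
```

The sign flip under head/dependent swap is exactly the information a symmetric distance discards, which is why reading direction as well as distance can recover both the type and the orientation of a dependency.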