Language Understanding and Pragmatics

Do transformer static embeddings actually encode semantic meaning?

Explores whether the fixed word embeddings that enter transformer networks contain rich semantic information or serve only as shallow placeholders. This addresses a longstanding debate in philosophy of language about whether word meanings are stored or constructed.

Note · 2026-02-23 · sourced from Sentiment Semantics Toxic Detections

The transformer architecture creates two distinct representations for every word: a static token embedding (input to self-attention) and a contextualized embedding (output of self-attention). The static embedding is the invariant entry for each word in the model's vocabulary. The question is whether these static embeddings carry meaningful semantic information or are mere placeholders that get enriched only during self-attention.
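
A minimal sketch of the distinction, assuming the Hugging Face transformers library and the roberta-base checkpoint: the same token has one fixed row in the input embedding matrix, but a different contextualized vector in every sentence it appears in.

```python
import torch
from transformers import RobertaModel, RobertaTokenizer

tok = RobertaTokenizer.from_pretrained("roberta-base")
model = RobertaModel.from_pretrained("roberta-base").eval()

# Static embedding: one fixed row of the input embedding matrix.
# " bank" follows RoBERTa's leading-space token convention.
bank_id = tok.convert_tokens_to_ids(tok.tokenize(" bank"))[0]
static_vec = model.embeddings.word_embeddings.weight[bank_id].detach()  # (768,)

# Contextualized embedding: the self-attention output for the same token,
# which differs across sentences even though the static input is identical.
with torch.no_grad():
    for text in ["She sat on the river bank.", "He robbed the bank."]:
        enc = tok(text, return_tensors="pt")
        hidden = model(**enc).last_hidden_state[0]           # (seq_len, 768)
        pos = (enc.input_ids[0] == bank_id).nonzero()[0, 0]  # first occurrence
        ctx_vec = hidden[pos]
        sim = torch.cosine_similarity(static_vec, ctx_vec, dim=0)
        print(f"{text!r}: cos(static, contextual) = {sim.item():.3f}")
```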

The "meaning eliminativist" hypothesis — defended in psycholinguistics by Elman (2004) and philosophy by Rayo (2013) and Recanati (2003) — holds that static word meanings are redundant. Applied to LLMs, this would mean static embeddings store only morphological and syntactic cues, with semantic information introduced entirely at the self-attention layer. Given that embeddings have only 768 parameters per token in RoBERTa-base versus tens of millions in the attention and feed-forward layers, there is architectural reason to expect semantic information might be deferred.

The evidence rules this out. Clustering RoBERTa-base's ~50,000 static token embeddings into 200 clusters reveals sensitivity to five psycholinguistic measures (an illustrative version of the probe is sketched after the list):

  1. Valence — pleasantness of the concept (from the Mehrabian three-dimensional emotion model)
  2. Concreteness — whether the word names a perceptible entity or an abstract notion ("bicycle" rates 4.89; "justice" rates near the bottom of the scale)
  3. Iconicity — perceived resemblance between form and meaning (challenging the arbitrariness-of-the-sign thesis)
  4. Taboo — social transgression load of the term
  5. Age of acquisition — when the word is typically learned
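
An illustrative version of that clustering probe, not the source's exact pipeline: k-means over the static embedding matrix, then a check of whether the clusters separate on one of the norms. The file valence_norms.csv (columns word and rating) is a hypothetical stand-in for a published rating dataset.

```python
import pandas as pd
from sklearn.cluster import KMeans
from transformers import RobertaModel, RobertaTokenizer

tok = RobertaTokenizer.from_pretrained("roberta-base")
model = RobertaModel.from_pretrained("roberta-base")
emb = model.embeddings.word_embeddings.weight.detach().numpy()   # (~50k, 768)

# Cluster the whole static vocabulary, as in the probe described above.
labels = KMeans(n_clusters=200, n_init=10, random_state=0).fit_predict(emb)

def single_token_id(word):
    """Map a rated word to its space-prefixed RoBERTa token, if it is one piece."""
    pieces = tok.tokenize(" " + word)
    return tok.convert_tokens_to_ids(pieces[0]) if len(pieces) == 1 else None

norms = pd.read_csv("valence_norms.csv")                # hypothetical: word, rating
norms["token_id"] = norms["word"].map(single_token_id)
norms = norms.dropna(subset=["token_id"])
norms["cluster"] = labels[norms["token_id"].astype(int).to_numpy()]

# If clusters were blind to valence, per-cluster means would all hover near
# the global mean; systematic spread indicates the static space encodes it.
print(norms.groupby("cluster")["rating"].mean().describe())
```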

The iconicity finding is particularly striking because detecting it requires access to surface properties, semantic properties, and recognition of resemblance between them — all within the static embedding before any attention mechanism operates.

This means LLMs implement something analogous to a lexical store: each word has an entry containing genuine semantic information that is then modulated by context during self-attention. The parallel to the philosophy-of-language debate is direct: static embeddings are rich entries that get contextually adjusted, not minimal cores that get built from scratch each time.

The implication for mechanistic interpretability: semantic information is distributed across two levels — the token embedding layer and the contextualized layers — and analysis that focuses only on intermediate or final representations may miss what was already encoded at input.
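
A sketch of what that means in practice, under the same assumptions as above (the lists words and concreteness are hypothetical, standing in for a published set of rated, single-token words): fit the same linear probe on the embedding-layer output and on the final layer, and compare how much of the property each recovers.

```python
import torch
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_score
from transformers import RobertaModel, RobertaTokenizer

tok = RobertaTokenizer.from_pretrained("roberta-base")
model = RobertaModel.from_pretrained("roberta-base", output_hidden_states=True).eval()

def layer_vectors(word):
    """Embedding-layer and final-layer vectors for a single-token word in isolation."""
    enc = tok(" " + word, return_tensors="pt")
    with torch.no_grad():
        hs = model(**enc).hidden_states              # embeddings output + 12 layers
    return hs[0][0, 1].numpy(), hs[-1][0, 1].numpy() # position 1 skips <s>

# Hypothetical probe data: rated words and their human concreteness scores.
X0, X12 = zip(*(layer_vectors(w) for w in words))
for name, X in [("embedding layer", X0), ("final layer", X12)]:
    r2 = cross_val_score(RidgeCV(), list(X), concreteness, cv=5).mean()
    print(f"{name}: cross-validated R^2 = {r2:.3f}")
```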


Source: Sentiment Semantics Toxic Detections
