How does iconicity detection work within static embeddings before any attention?

This explores what happens *inside* a word's static embedding — the fixed vector a model holds for a word before attention mixes in context — and asks whether properties like iconicity (how much a word's sound resembles its meaning) are already baked into that vector.

The short answer the corpus gives: yes. Clustering analysis of RoBERTa's static embeddings shows they are sensitive to five psycholinguistic measures, including valence, concreteness, iconicity, and taboo. In other words, these vectors already behave like genuine lexical entries carrying semantic content, not blank slots waiting for attention to fill them (see "Do transformer static embeddings actually encode semantic meaning?"). "Detection" here isn't a separate mechanism scanning for iconicity; rather, words scoring high on iconicity land near each other in embedding space, so the property falls out of how the vectors are arranged.
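To make "the property falls out of how the vectors are arranged" concrete, here is a minimal sketch. It is not the clustering analysis the note describes; it swaps in a simpler nearest-neighbour check. The 2-d vectors and the iconicity ratings are invented for illustration. A real check would presumably use rows of a model's input embedding matrix and published human rating norms.

```python
import math

# Hypothetical toy data: invented 2-d vectors and invented iconicity ratings.
# Nothing here comes from RoBERTa or from any real rating norms.
EMBEDDINGS = {
    "buzz": [0.90, 0.10],
    "hiss": [0.80, 0.20],
    "splash": [0.85, 0.15],
    "idea": [0.10, 0.90],
    "tax": [0.20, 0.80],
    "policy": [0.15, 0.85],
}
ICONICITY = {"buzz": 6.0, "hiss": 5.5, "splash": 5.8,
             "idea": 1.5, "tax": 1.2, "policy": 1.0}


def cosine(u, v):
    """Cosine similarity between two 2-d vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def neighbours(word, k=2):
    """The k other words closest to `word`, most similar first."""
    others = [w for w in EMBEDDINGS if w != word]
    others.sort(key=lambda w: cosine(EMBEDDINGS[word], EMBEDDINGS[w]),
                reverse=True)
    return others[:k]


def predicted_rating(word, k=2):
    """Guess a word's rating from its neighbours alone (its own rating is never used)."""
    near = neighbours(word, k)
    return sum(ICONICITY[w] for w in near) / len(near)


for word in EMBEDDINGS:
    print(word, ICONICITY[word], round(predicted_rating(word), 2))
```

If a word's rating can be recovered from its neighbours' ratings, the property is readable from position alone, with no separate detector involved. In this toy set that holds by construction, so it only illustrates the logic and is not evidence for the claim.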

What makes this interesting is *how* that arrangement is structured. The geometry of embedding space isn't a flat soup of similarity; it has organized internal directions. The leading eigenvectors of embedding similarity (Gram) matrices carve the vocabulary coarse-to-fine, splitting broad taxonomic branches first and finer ones later, tracking the WordNet hypernym tree level by level (see "Do embedding eigenvectors organize taxonomy from coarse to fine?"). And models go further than mere proximity: syntactic relationships get encoded in a polar-coordinate-like geometry where both distance *and angle* carry meaning (see "How do language models encode syntactic relations geometrically?"). So a psycholinguistic property like iconicity riding on the same pre-attention geometry isn't an exotic claim. It's the same trick applied to a different axis of meaning.
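The coarse-to-fine ordering can be sketched on a toy example. The four vectors below are invented, with a two-level hierarchy built in by hand: one large-scale dimension separates animals from plants, and two smaller-scale dimensions separate items within each branch. The eigenvectors are found with plain power iteration plus deflation, chosen here only to keep the sketch dependency-free. This is an illustration under those assumptions, not the analysis from the note.

```python
import math

# Hypothetical toy "embeddings" (not model data), already mean-centred.
# Dimension 0: animal vs plant (large scale).
# Dimensions 1 and 2: differences inside each branch (smaller scales).
WORDS = ["dog", "cat", "oak", "rose"]
X = [
    [2.0, 1.0, 0.0],    # dog
    [2.0, -1.0, 0.0],   # cat
    [-2.0, 0.0, 0.5],   # oak
    [-2.0, 0.0, -0.5],  # rose
]


def gram(rows):
    """Gram matrix of pairwise dot products between word vectors."""
    return [[sum(a * b for a, b in zip(u, v)) for v in rows] for u in rows]


def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def top_eigenpair(m, iters=200):
    """Power iteration for the largest eigenvalue of a symmetric PSD matrix."""
    v = [float(i + 1) for i in range(len(m))]  # fixed, non-degenerate start
    for _ in range(iters):
        w = matvec(m, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    lam = sum(a * b for a, b in zip(v, matvec(m, v)))  # Rayleigh quotient
    return lam, v


def leading_eigenpairs(m, k):
    """Top-k eigenpairs, removing each found component before the next search."""
    m = [row[:] for row in m]
    pairs = []
    for _ in range(k):
        lam, v = top_eigenpair(m)
        pairs.append((lam, v))
        for i in range(len(m)):
            for j in range(len(m)):
                m[i][j] -= lam * v[i] * v[j]  # deflation
    return pairs


PAIRS = leading_eigenpairs(gram(X), 3)
for lam, vec in PAIRS:
    print(round(lam, 3), dict(zip(WORDS, (round(x, 2) for x in vec))))
```

In this toy case the first eigenvector gives dog and cat one sign and oak and rose the other, which is the broad split. The second separates dog from cat, and the third separates oak from rose. The signs themselves are arbitrary; only the grouping matters.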

One tension worth knowing: embeddings encode *association*, not *relevance*. The same geometry that captures iconicity also makes semantically close but functionally wrong words look similar, which is why embedding-only retrieval shines in demos but stumbles in production on underspecified queries (see "Do vector embeddings actually measure task relevance?"). Static embeddings are rich, but their richness is correlational. They register that a word *feels* a certain way without any guarantee that feeling is the one your task needs.
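A small sketch of association without relevance, under loud assumptions: the five-sentence corpus is invented, and raw co-occurrence count vectors stand in for learned embeddings. Because "hot" and "cold" appear in the same contexts, their vectors come out nearly identical even though swapping one for the other would reverse the meaning of a sentence.

```python
import math
from collections import Counter

# Hypothetical corpus, invented for illustration only.
CORPUS = [
    "the soup is hot today",
    "the soup is cold today",
    "the tea is hot now",
    "the tea is cold now",
    "the invoice is overdue now",
]


def context_vector(target):
    """Count the words that share a sentence with `target`."""
    counts = Counter()
    for sentence in CORPUS:
        tokens = sentence.split()
        if target in tokens:
            counts.update(t for t in tokens if t != target)
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


hot, cold, overdue = (context_vector(w) for w in ("hot", "cold", "overdue"))
HOT_COLD = cosine(hot, cold)
HOT_OVERDUE = cosine(hot, overdue)
print("hot vs cold:", round(HOT_COLD, 3))
print("hot vs overdue:", round(HOT_OVERDUE, 3))
```

A retrieval step that ranks purely by this similarity would treat "cold" as the best match for "hot". That is a reasonable answer to "what is associated with this word?" and the wrong answer to most tasks that actually need "hot".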

The deeper question this opens: how much of a model's apparent understanding is settled before attention even runs? If valence and iconicity live in the static layer, then self-attention isn't building meaning from scratch; it's recombining lexical content that's already there. That reframes attention itself, which has its own structural biases: it over-weights repeated and context-prominent tokens regardless of relevance (see "Does transformer attention architecture inherently favor repeated content?"). The picture that emerges across these notes: meaning is laid down in two stages, a static lexicon that already knows what words feel like, and a contextual layer that reweights it, sometimes helpfully and sometimes not.
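One simple way repetition can accumulate attention mass is visible in softmax arithmetic alone. The sketch below assumes a single attention head, made-up query-key scores, and no positional information, so identical tokens get identical scores. It is an illustration of that one effect, not a model of the full bias the note describes.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mass_by_token(tokens, score_of):
    """Total attention weight per distinct token, summed over its repeats."""
    weights = softmax([score_of[t] for t in tokens])
    mass = {}
    for tok, w in zip(tokens, weights):
        mass[tok] = mass.get(tok, 0.0) + w
    return mass


# Hypothetical scores: each copy of "filler" matches the query worse
# than the single "relevant" token does.
SCORES = {"relevant": 2.0, "filler": 1.0}

once = mass_by_token(["relevant", "filler"], SCORES)
repeated = mass_by_token(["relevant"] + ["filler"] * 5, SCORES)
print("once:", {k: round(v, 3) for k, v in once.items()})
print("repeated:", {k: round(v, 3) for k, v in repeated.items()})
```

With one copy of each token, the better-scoring token takes most of the weight. With five copies of the weaker token, the copies jointly outweigh it, even though no individual score changed.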

Sources 5 notes

Do transformer static embeddings actually encode semantic meaning?

Clustering analysis of RoBERTa embeddings reveals sensitivity to five psycholinguistic measures including valence, concreteness, iconicity, and taboo. This demonstrates that static embeddings function as genuine lexical entries containing semantic content before self-attention operates.

Do embedding eigenvectors organize taxonomy from coarse to fine?

Leading eigenvectors of embedding Gram matrices separate broad taxonomic branches first, then progressively finer sub-branches—a coarse-to-fine spectral order that tracks the WordNet hypernym tree level by level, confirming predictions from co-occurrence statistics.

How do language models encode syntactic relations geometrically?

The Polar Probe shows LLMs represent syntactic type and direction through both distance and angular position between embeddings, nearly doubling accuracy over distance-only methods. This demonstrates neural networks spontaneously learn structured, symbolic-compatible geometry.

Do vector embeddings actually measure task relevance?

Embeddings encode co-occurrence patterns, making semantically close but role-distinct concepts highly similar. This works in simple demos but fails in production where underspecified queries have many wrong-but-associated candidates.

Does transformer attention architecture inherently favor repeated content?

Transformer soft attention systematically over-weights repeated and context-prominent tokens regardless of relevance, creating a positive feedback loop that amplifies opinions and framing before RLHF acts. System 2 Attention—regenerating context to remove irrelevant material—can interrupt this mechanism.
