Do language models and humans respond to word frequency the same way?
Both LLMs and humans show stronger responses to high-frequency words. This raises a puzzle: if models mirror human neural patterns, what actually distinguishes model processing from human language processing?
The literature review in the Adam's Law paper surfaces an inconvenient symmetry. Desai et al. (2020) and Alexandrov et al. (2011) found that high-frequency words evoke stronger neural responses in human readers than low-frequency words during reading tasks. Heylen et al. (2008) found that high-frequency target words have higher semantic similarity to their nearest-neighbor words in distributional analyses: frequency drives perceived semantic similarity. Mohan and Weber (2019) document frequency effects on semantic retrieval. The frequency-comprehension link is therefore not an LLM-specific artifact; humans show it too, at the neural level.
This complicates the easy "LLMs are aliens" framing that often accompanies critiques like "Do LLMs compress concepts more aggressively than humans do?". At the level of statistical exposure to text, models and human readers occupy the same regime: both privilege the frequent. The convergence is no coincidence; both systems are exposed to the same statistical structure of language. The shape of natural language is not neutral, and that shape leans on frequency. Word frequency is a property of the linguistic environment, not just a property of how LLMs process that environment.
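How heavily the linguistic environment leans on frequency can be made concrete with a short sketch. Natural-language word frequencies are famously Zipf-like (rank-frequency roughly f(r) ∝ 1/r), so a small head of words dominates total exposure for any reader, human or model. The vocabulary size and exponent below are illustrative assumptions, not values from the paper:

```python
def zipf_weights(vocab_size: int, exponent: float = 1.0) -> list[float]:
    """Normalized Zipf-like probabilities for word ranks 1..vocab_size."""
    raw = [1.0 / (rank ** exponent) for rank in range(1, vocab_size + 1)]
    total = sum(raw)
    return [w / total for w in raw]

def head_share(vocab_size: int, head: int) -> float:
    """Fraction of total token exposure carried by the `head` most frequent words."""
    weights = zipf_weights(vocab_size)
    return sum(weights[:head])

if __name__ == "__main__":
    # Under a pure Zipf distribution over a 50,000-word vocabulary,
    # the 100 most frequent words carry close to half of all tokens
    # a reader ever encounters.
    print(f"top 100 of 50,000 words: {head_share(50_000, 100):.1%} of token exposure")
```

The point of the sketch is only that exposure is radically skewed toward the frequent head before any processing, human or machine, begins.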
But the symmetry is partial, and the asymmetry is what matters. Humans can override frequency through attention, context, and intention: a doctor reading a rare term in a clinical context can attend to it carefully despite its rarity; a poet can foreground low-frequency words deliberately. The override mechanism is what "Why do dialogue failures persist despite scaling language models?" indirectly identifies: humans are trained dialogically, with goal-relevant attention shaping comprehension, while LLMs are trained monologically, with no equivalent override channel. The model cannot bracket frequency when frequency is irrelevant to the current goal, because no current goal can take priority over the statistical prior. The frequency response is the same across human and machine; what humans have and the architecture lacks is the capacity not to be governed by it. This refines the alien framing: the divergence is not in the response but in the override.
Source: Natural Language Inference Paper: Adam's Law: Textual Frequency Law on Large Language Models
Related concepts in this collection
- "Do LLMs compress concepts more aggressively than humans do?": Do language models prioritize statistical compression over semantic nuance when forming conceptual representations, and how does this differ from human category formation? This matters because it may explain why LLMs fail at tasks requiring fine-grained distinctions. (Connection: alien framing complicated by shared frequency response.)
- "Why do dialogue failures persist despite scaling language models?": If LLMs get better at text tasks with more training data, why don't dialogue-specific problems improve the same way? The question explores whether dialogue failures are capability gaps or structural training mismatches. (Connection: training mode explains override-capability gap.)
Original note title
textual frequency in LLMs mirrors human neural frequency response — the linguistic surface is a shared statistical regime not just an LLM artifact