Can language models learn meaning from text patterns alone?
Explores whether training on form alone—predicting the next word from prior words—could ever give language models access to communicative intent and genuine semantic understanding.
Bender & Koller (2020) make a specific structural argument, not just an intuitive one. Meaning is defined as the relation M ⊆ E × I: pairs of natural language expressions and the communicative intents they can be used to evoke. Understanding an expression e means retrieving the intent i it was used to evoke. But communicative intents are about something outside of language, and form alone (marks on a page, pixels, bytes) is insufficient to recover them.
The reasoning: without access to a mechanism for hypothesizing and testing underlying communicative intents, reconstructing them from form alone is impossible. Language modeling predicts the next token given prior tokens — purely a form-to-form operation. The training signal provides no information about what intents the forms were used to evoke.
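To make the contrast concrete, here is a minimal sketch, assuming a toy bigram model and illustrative names (`Expression`, `Intent`, `Token`, the sample corpus) that are not from the paper: the standard the paper sets for understanding is a mapping from expressions to intents, while the only quantities the language-modeling objective ever touches are forms.

```python
import math
from collections import Counter
from typing import Callable

# The paper's standard for understanding: a mapping from expressions to
# communicative intents, where the intent is something outside the symbol
# system (a referent, a goal, a state of the world). This is the relation
# M ⊆ E × I read as retrieval of i given e.
Expression = str
Intent = object  # placeholder: nothing in the training data instantiates this
understand: Callable[[Expression], Intent]  # declared, never learnable from form alone

# What next-token prediction actually supervises: form-to-form statistics.
# A toy bigram model makes the structural point; every quantity below is
# computed from co-occurrences of tokens and nothing else.
Token = str

def train_bigram(corpus: list[list[Token]]) -> dict[tuple[Token, Token], float]:
    """Estimate P(next | prev) from token co-occurrence counts alone."""
    pair_counts = Counter((s[t - 1], s[t]) for s in corpus for t in range(1, len(s)))
    prev_counts = Counter(prev for prev, _ in pair_counts.elements())
    return {pair: count / prev_counts[pair[0]] for pair, count in pair_counts.items()}

def lm_loss(corpus: list[list[Token]], probs: dict[tuple[Token, Token], float]) -> float:
    """Negative log-likelihood of the observed forms. No term refers to an intent."""
    return -sum(
        math.log(probs.get((s[t - 1], s[t]), 1e-9))
        for s in corpus
        for t in range(1, len(s))
    )

corpus = [["the", "cat", "sat"], ["the", "cat", "ran"]]
probs = train_bigram(corpus)
print(round(lm_loss(corpus, probs), 3))  # 1.386: a finite loss, computed entirely from form
```

Scaling the model up changes how well the form-to-form mapping is estimated, not what kind of mapping it is; the loss never mentions an intent, which is the sense in which the limitation is independent of scale.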
Human language acquisition illustrates the point by contrast. What is critical for meaning acquisition is not just interaction but joint attention — situations where child and caregiver both attend to the same thing and are both aware of this fact. Learning meaning requires the ability to be aware of what another person is attending to and guess what they are intending to communicate. Intersubjectivity is not incidental to language learning; it is its mechanism.
Harnad's formulation of the symbol grounding problem makes the same point: a non-speaker of Chinese cannot learn the meanings of Chinese words from Chinese dictionary definitions alone. You need something outside the symbol system to anchor the symbols. Form-to-form prediction cannot provide this anchor.
Mutual understanding is structurally unavailable, even in conversational media. The form-only training constraint has a downstream consequence that holds even when an AI operates in a conversational channel: seeking mutual understanding with the user is structurally unavailable to an LLM, because mutual understanding requires the intersubjectivity that form-only training cannot provide. The communication is one-way even when it occurs on a medium designed for mediated social interaction. This reframes AI social-media posts as a specific genre: indirect discourse, a form of writing even when it appears in an interactive environment. The user reads the post and the medium formally supports reply, but the AI is not available for the second turn that would close a loop of mutual understanding, and was never going to be. The channel looks communicative; the content is monological writing deposited in a conversational shape.
This is distinct from the claim that LLMs "have no understanding." It is the more precise claim that the training mechanism — string prediction — is in principle incapable of providing the signal that meaning acquisition requires, regardless of scale.
Source: Linguistics, NLP, NLU
Related concepts in this collection
- Do LLMs develop the same kind of mind as humans?
  Explores whether LLMs and humans share the intersubjective linguistic training that shapes cognition, and whether that shared training produces equivalent forms of agency and reflexivity.
  Habermas framing of the same gap from a different angle: shared substrate, absent participatory mechanism.
- What makes linguistic agency impossible for language models?
  From an enactive perspective, does linguistic agency require embodied participation and real stakes that LLMs fundamentally lack? This matters because it challenges whether LLMs can truly engage in language or only generate text.
  The enactive cognitive science version of the same absence.
- Can models pass tests while missing the actual grammar?
  Do language models succeed on grammatical benchmarks by learning surface patterns rather than structural rules? This matters because correct outputs may hide reliance on shallow heuristics that fail on novel structures.
  What is learned from form alone: surface regularities, not structural competence.
Original note title
language models trained on form alone cannot acquire meaning because meaning requires joint attention and intersubjectivity