INQUIRING LINE

How does enactive theory define language differently than computational linguistics?

This explores how enactive cognitive science treats language as a living activity — something embodied agents *do* together — versus how computational linguistics treats it as relational structure that can be learned from text alone.


The two camps don't disagree about whether the same thing is more or less true; they define language as fundamentally different kinds of object. For the computational view, language is a self-contained web of relations. The cleanest statement of this in the corpus is the claim that LLMs operationalize Saussure's *langue*: they learn meaning by compressing the relational structure of text, with no external referents and no embodied grounding required to produce fluent, culturally situated discourse [Can language models learn meaning without engaging the world?]. On this account, language is a system whose terms are defined by their differences from one another, and that system is exactly what a next-token predictor captures.

Enactive theory rejects the idea that language *is* that system. It defines language as a form of agency, and agency has constitutive conditions a relational model can't satisfy: embodiment (a body with a stake in the world), participation (acting within a community of other agents), and precariousness (the possibility of failing, of mattering) [What makes linguistic agency impossible for language models?]. The key word is *constitutive*: these aren't features that make language better, they're what make it language at all. So the disagreement is categorical, not a matter of degree. No amount of training or scale moves a system from the relational-structure side to the linguistic-agency side, because the missing ingredients are architectural and existential, not informational [Do LLMs gain true linguistic agency through integration?].

A useful bridge between the two definitions is the distinction between *grounding* and *agency*. LLMs can have strong functional grounding (they handle language patterns well) while lacking social grounding (participatory standing among other speakers) and causal grounding (contact with an environment through a body) [What grounds language understanding in systems without embodiment?]. This reframes the debate: the computational picture is essentially the claim that functional grounding is sufficient to count as language; the enactive picture insists the social and causal kinds are what language is *for*. That's why the same corpus can say an LLM acquires more social grounding as it is woven into human language communities yet still never crosses into linguistic agency [Do LLMs gain true linguistic agency through integration?].

The split shows up again at the level of what an utterance even *does*. Under the computational definition, producing text is generating strings from a probability distribution; under the enactive one, using language is addressing and relating to another agent. These share a surface form but are different operations: they differ in what produces the output, what it accomplishes socially, and what a listener should do with it [Are language models and human speakers doing the same thing?]. Neuroscience gives this a physical edge: formal linguistic competence (grammar, fluency) and functional competence (using language to think and act in the world) run on neurologically distinct systems, and next-token prediction only ever exercises the formal one [Are language models developing real functional competence or just formal competence?].

What's quietly interesting is that the gap may be smaller than it first looks, and that's the part worth carrying away. Borrowing Habermas's observer/participant distinction, the corpus notes that from the *outside* humans and LLMs look categorically different, but from *within* a shared conversation both draw on the same symbolic substrate, making the difference structural rather than absolute [Do humans and LLMs differ fundamentally or just superficially?]. The enactive line in its strongest form ties genuine language to sharing a world through co-presence and triangulating on common objects, the same condition some argue is required even to be a *candidate* for consciousness [Can disembodied language models ever qualify as conscious?]. So the real fault line isn't grammar or fluency, where the machines already arrive; it's whether meaning lives in the relations between words or in the relations between agents who have something at stake.


Sources (8 notes)

Can language models learn meaning without engaging the world?

Research shows LLMs learn culturally situated discourse patterns by compressing relational structure from text, demonstrating that fluent language generation requires no external referents or embodied grounding.

What makes linguistic agency impossible for language models?

Enactive cognitive science identifies three constitutive properties of linguistic agency—embodiment, participation, and precariousness—that are structurally absent from LLMs. This is a categorical incompatibility, not a matter of degree, suggesting current architectures cannot achieve genuine linguistic agency.

Do LLMs gain true linguistic agency through integration?

Social grounding and linguistic agency are distinct properties. LLMs acquire more social grounding through integration into language communities, but remain categorically incapable of linguistic agency in the enactive sense, which requires embodiment and precariousness no amount of use can provide.

What grounds language understanding in systems without embodiment?

Language models achieve functional grounding through relational language patterns but lack social grounding through participatory agency and causal grounding through embodied environmental contact. Social grounding can increase through human integration, but linguistic agency requires architectural changes beyond training.

Are language models and human speakers doing the same thing?

LLMs produce strings via probability distributions; humans use language to address and relate to others. They share surface form but differ in what produces output, what it does socially, and what receivers should do with it.

Are language models developing real functional competence or just formal competence?

Neuroscience evidence shows next-token prediction produces formal linguistic competence but not functional competence, because functional understanding requires integration of diverse brain networks beyond language circuits that the prediction objective never activates.

Do humans and LLMs differ fundamentally or just superficially?

Applying Habermas's observer/participant distinction to AI: from outside, humans and LLMs are utterly different; from within shared discourse, both draw on the same symbolic substrate, making the difference structural rather than absolute.

Can disembodied language models ever qualify as conscious?

Current disembodied LLMs cannot be candidates for consciousness because consciousness language originates from and applies only to entities sharing a world with us through co-presence and triangulation on shared objects.

Research prompt for your LLM

Copy into ChatGPT or Claude to take this line of inquiry further — it asks the model to find newer work and re-test which earlier constraints still hold.

You are a research analyst re-testing whether the enactive/computational split in language theory still holds. The question: Does enactive theory's insistence on embodiment, participation, and precariousness as *constitutive* of language remain a hard categorical boundary, or have advances in LLM training, grounding, embodiment (robotics, vision), and multi-agent orchestration begun to dissolve it?

What a curated library found — and when (dated claims, not current truth):
— From 2023–2024, a consensus: LLMs operationalize Saussure's *langue* (pure relational structure); functional grounding is achievable but social/causal grounding remains architecturally absent (2024-02: arXiv:2402.10992).
— Formal linguistic competence (grammar, fluency) and functional competence (acting in the world) are neurologically distinct; next-token prediction only exercises the formal one (2024-07: arXiv:2407.08790).
— Grounding gaps persist: meaning may live in relations between agents with stake, not relations between words (2023-11: arXiv:2311.09144).
— The observer/participant distinction hints the gap is structural, not absolute; within shared conversation, both draw on the same symbolic substrate (2024-02: arXiv:2402.12422).
— Recent work (2025+) probes mechanistic indicators of understanding and reasoning resilience (arXiv:2507.08017, arXiv:2511.20471).

Anchor papers (verify; mind their dates):
— arXiv:2407.08790 (2024-07): *Large Models of What?* — directly challenges linguistic agency claims.
— arXiv:2402.10992 (2024-02): *Understanding AI* — grounding taxonomy.
— arXiv:2311.09144 (2023-11): *Grounding Gaps* — empirical framing.
— arXiv:2511.20471 (2025-11): *Universe of Thoughts* — creative reasoning under enactive pressure.

Your task:
(1) RE-TEST THE CATEGORICAL CLAIM. Has embodied LLM deployment (robotics, vision-language models, multi-turn agents with persistent state and world contact) begun to *operationalize* the enactive conditions (embodiment, participation, precariousness)? Does the split still hold if you embed a transformer in a robot with real environmental stakes? Separate the durable question (what *defines* linguistic agency?) from the perishable limitation (current LLMs lack X) — cite what architecture, training regime, or evaluation shifted the needle, and say plainly where the constitutive gap still appears unresolved.
(2) Surface the strongest *disagreement* in the last 6 months: are there recent papers claiming LLMs *already* satisfy enactive criteria, or papers hardening the boundary further? Flag the tension.
(3) Propose 2 new research questions that assume the regime may have moved: (a) Can multi-agent LLM ecosystems with shared world models and failure-stakes reconstruct the *functional* shape of participatory agency? (b) Does the distinction between formal and functional competence persist if reasoning (not just fluency) becomes trainable end-to-end?

Cite arXiv IDs; flag anything you cannot ground in a real paper.
