INQUIRING LINE

What distinguishes communicative acts from operational actions in agentic LLMs?

This explores the line between an agent's *talking* — utterances aimed at a human interlocutor — and its *doing* — tool calls, code execution, and state changes — and whether LLM agents actually treat these as two different kinds of act.


The corpus suggests agentic LLMs do not keep the two apart, and that the conflation is where a lot of agent trouble starts.

On the communicative side, several notes converge on a sharp claim: what looks like speech from an LLM may not be speech at all. Under a Habermasian reading, a genuine communicative act *raises validity claims*: it puts truth, rightness, or sincerity on the line with real stakes and invites the other party to accept or contest them [Can LLMs raise validity claims in Habermas's sense?]. LLM text shares the surface form of human language but does something structurally different: it emits strings from a probability distribution rather than addressing and relating to someone [Are language models and human speakers doing the same thing?]. That is why models *presume* common ground instead of building it: they produce grounding acts (clarifications, acknowledgments, repairs) 77.5% less often than humans do, while sounding authoritative enough to hide the gap [Do language models actually build shared understanding in conversation?]. They reproduce the statistics of language without its communicative logic, the *why* behind its forms [Why do language models fail at communicative optimization?].

Operational action runs on completely different machinery. Here the corpus stops talking about meaning and starts talking about substrate: code becomes the medium because it is simultaneously executable, inspectable, and stateful, letting an agent externalize reasoning and *verify* whether something actually happened [Can code become the operational substrate for agent reasoning?]. Turning a language model into an action-taker is not a matter of better phrasing. It requires a whole pipeline of action grounding, memory, and tool infrastructure, and that surrounding harness is what decides whether an action is grounded or hallucinated [Can you turn an LLM into an agent by just fine-tuning?]. Reliability, in other words, comes from externalizing cognitive burdens (memory, skills, protocols) into system structure, not from the model narrating well [Where does agent reliability actually come from?]. Operational competence sits on the far side of the *knowing-doing* gap: closing that gap takes environmental feedback that refines a policy, not fluent declarations of intent [Can language modeling close the knowing-doing gap in AI?].
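To make the "grounded or hallucinated" point concrete, here is a minimal Python sketch of a harness that ignores what the model says it did and checks environment state instead. Everything in it (`Environment`, `ActionResult`, `run_action`, the in-memory file store) is a hypothetical illustration, not an API from any of the cited notes.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Environment:
    """Hypothetical external state: a tiny in-memory file store."""
    files: Dict[str, str] = field(default_factory=dict)


@dataclass
class ActionResult:
    claimed: str    # what the model says happened (narration)
    verified: bool  # what the environment confirms (state)


def run_action(
    env: Environment,
    claim: str,
    effect: Callable[[Environment], None],
    check: Callable[[Environment], bool],
) -> ActionResult:
    """Apply the effect, then trust only the post-condition check."""
    effect(env)
    return ActionResult(claimed=claim, verified=check(env))


def write_report(env: Environment) -> None:
    env.files["report.txt"] = "done"


def do_nothing(env: Environment) -> None:
    pass  # models a tool call that never actually ran


def report_exists(env: Environment) -> bool:
    return "report.txt" in env.files


env = Environment()
hallucinated = run_action(env, "I saved report.txt", do_nothing, report_exists)
grounded = run_action(env, "I saved report.txt", write_report, report_exists)
print(hallucinated.verified, grounded.verified)
```

Both results carry the same claim text; only the `verified` field, which comes from the environment rather than the narration, tells them apart.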

The distinction that emerges: a communicative act is *accountable to another party* (it can be accepted, challenged, or repaired in dialogue), while an operational action is *accountable to the environment*, succeeding or failing against state that talks back regardless of how the agent phrased it. The interesting twist is that agentic LLMs systematically collapse the two. They silently chain tool calls (operational) without pausing to confirm what the user actually meant (communicative), and so drift from intent. That is exactly why conversation analysis recommends *insert-expansions*: deliberate clarifying moves before acting [When should AI agents ask users instead of just searching?]. And because these systems are trained to respond rather than to lead, they are structurally passive, unable to initiate the communicative moves that would scope an operation correctly [Why can't conversational AI agents take the initiative?].
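One way to picture an insert-expansion is as a gate that keeps the two kinds of act in separate types and refuses to emit an operational one while required intent is still unknown. The sketch below is an assumption-laden illustration in Python: `Say`, `Do`, `next_act`, and the `book_flight` tool name are all invented for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class Say:
    """Communicative act: addressed to the user, open to reply or repair."""
    text: str


@dataclass
class Do:
    """Operational action: addressed to the environment via a tool."""
    tool: str
    args: Dict[str, str]


def next_act(tool: str, required: List[str], known: Dict[str, str]) -> Union[Say, Do]:
    """Ask before acting whenever a required parameter is still unknown."""
    missing = [name for name in required if name not in known]
    if missing:
        # Insert-expansion: a clarifying move placed before the action.
        return Say(f"Before I run {tool}, could you confirm: {', '.join(missing)}?")
    return Do(tool=tool, args={name: known[name] for name in required})


required = ["destination", "date"]
first = next_act("book_flight", required, {"destination": "Lisbon"})
second = next_act(
    "book_flight", required, {"destination": "Lisbon", "date": "2025-03-01"}
)
```

Here `first` is a `Say` asking about the missing date, and `second` is a `Do` carrying both arguments. Keeping the types distinct means a harness can route `Say` to the user and `Do` to the environment without guessing which one the model meant.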

What you might not have expected: the failures of multi-agent systems read as a confusion of these two registers. Role flipping, infinite loops, and conversation deviation happen because agents lack persistent goals and stable identity, so they cannot hold the communicative thread that is supposed to govern their operational one [Why do autonomous LLM agents fail in predictable ways?]. And whether the communicative/operational gap is absolute or merely structural depends on where you stand: from the outside, humans and LLMs differ categorically; from inside a shared discourse, both draw on the same symbolic substrate [Do humans and LLMs differ fundamentally or just superficially?]. The honest conclusion the corpus points to: an LLM's words are not promises, and its actions are not statements, but the agent treats them as interchangeable, and the engineering work is mostly about forcing them back apart.
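As a small illustration of moving one of these failure modes out of the model and into system structure, the following Python sketch flags a likely infinite loop when the same speaker repeats the same message too often. The function name `detect_loop`, the `max_repeats` threshold, and the sample transcript are assumptions made for this example. It does not attempt to detect role flipping or conversation deviation.

```python
from collections import Counter
from typing import List, Tuple


def detect_loop(transcript: List[Tuple[str, str]], max_repeats: int = 3) -> bool:
    """Return True if any (role, message) pair recurs max_repeats times or more.

    Messages are lowercased and whitespace-normalized before counting, so
    trivially different repeats still count as the same message.
    """
    counts = Counter(
        (role, " ".join(text.lower().split())) for role, text in transcript
    )
    return any(n >= max_repeats for n in counts.values())


# Hypothetical exchange in which two agents keep thanking each other.
transcript = [
    ("assistant", "Thank you!"),
    ("user", "You're welcome!"),
    ("assistant", "Thank you!"),
    ("user", "You're welcome!"),
    ("assistant", "thank  you!"),
]
print(detect_loop(transcript))
```

A guard like this sits in the harness, so termination does not depend on either agent noticing that the conversation has stalled.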


Sources (12 notes)

Can LLMs raise validity claims in Habermas's sense?

Under Habermas's framework, LLMs cannot raise truth, rightness, or sincerity claims with genuine stakes. Without validity claims, their output fails to qualify as speech, making them non-speakers and non-interlocutors by definition.

Are language models and human speakers doing the same thing?

LLMs produce strings via probability distributions; humans use language to address and relate to others. They share surface form but differ in what produces output, what it does socially, and what receivers should do with it.

Do language models actually build shared understanding in conversation?

LLMs produce grounding acts—clarifications, acknowledgments, repairs—77.5% less frequently than humans. They generate fluent responses without verifying shared understanding, relying instead on authoritative framing that masks the absence of genuine communicative calibration.

Why do language models fail at communicative optimization?

LLMs successfully replicate statistical regularities learnable from text distributions (sound symbolism, priming) but fail at principles requiring pragmatic optimization (word length economy, discourse inference). The gap reveals that communicative logic—why language has certain forms—isn't present as a trainable signal.

Can code become the operational substrate for agent reasoning?

Research shows code uniquely enables agents to externalize reasoning, execute policies, model environments, and verify progress through its simultaneous executability, inspectability, and statefulness across task steps.

Can you turn an LLM into an agent by just fine-tuning?

Converting LLMs to action-capable systems requires four distinct stages: curating action-environment-user datasets, training for action grounding, integrating agent infrastructure with memory and tools, and rigorous safety evaluation. The surrounding system and harness determine whether actions are grounded or hallucinated.

Where does agent reliability actually come from?

Research shows reliable LLM agents externalize three cognitive burdens—memory (state persistence), skills (procedural components), and protocols (structured interaction)—into a harness layer rather than relying on model scale alone. The harness unifies these externalities and eliminates the need for the model to solve the same problems repeatedly.

Can language modeling close the knowing-doing gap in AI?

Think-In Games demonstrates that when LLMs generate language-guided policies refined by environmental feedback, they develop procedural competence while retaining explainability. The approach dramatically reduces data demands and makes agent reasoning transparent at every step.

When should AI agents ask users instead of just searching?

Tool-enabled LLMs drift from user intent through silent tool chaining. Conversation analysis reveals insert-expansions—clarifying intent, scoping responses, enhancing appeal—as a formal framework for proactive user consultation that prevents misunderstanding instead of recovering from it.

Why can't conversational AI agents take the initiative?

Research shows LLMs including ChatGPT cannot initiate topics, plan strategically, or lead conversations because their training optimizes for responding to queries, not creating dialogue from agent goals. This passivity is reinforced by alignment objectives and masked by fluent-sounding outputs.

Why do autonomous LLM agents fail in predictable ways?

Research identifies role flipping, flake replies, infinite loops, and conversation deviation as LLM-specific failures in multi-agent cooperation. These occur because LLMs lack persistent goal representation and stable role identity.

Do humans and LLMs differ fundamentally or just superficially?

Applying Habermas's observer/participant distinction to AI yields two views: from outside, humans and LLMs are utterly different; from within shared discourse, both draw on the same symbolic substrate, making the difference structural rather than absolute.
