Conversational AI Systems

Can command generation replace intent classification in dialogue systems?

Explores whether generating pragmatic commands in a DSL could outperform traditional intent classification for task-oriented dialogue, particularly regarding training data needs and scalability.

Note · 2026-03-30 · sourced from Tasks Planning

The dominant industrial approach to task-oriented dialogue uses intent-based NLU: classify each user message into a predefined intent, extract slot values, and pass these to a dialogue manager. This paper introduces a fundamental architectural shift: replace intent classification with command generation in a domain-specific language (DSL).

The distinction is between semantics and pragmatics. "While NLU systems output intents and entities representing the semantics of a message, DU outputs a sequence of commands representing the pragmatics of how the user wants to progress the conversation." Intent classification asks "what does the user mean?" — command generation asks "what does the user want to happen next?"
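The contrast is easiest to see in the output types. A minimal sketch (the command names `StartFlow`/`SetSlot` follow the paper's style, but the exact schema here is illustrative, not the paper's API):

```python
from dataclasses import dataclass

# Intent-based NLU: one label plus slot values for a single message.
@dataclass
class IntentResult:
    intent: str
    entities: dict

# Dialogue Understanding: a *sequence* of commands describing how the
# user wants to progress the conversation.
@dataclass
class StartFlow:
    flow: str

@dataclass
class SetSlot:
    slot: str
    value: str

# "Send 40 dollars to Joe. Actually, make it 50."
nlu_view = IntentResult(intent="transfer_money",
                        entities={"recipient": "Joe", "amount": "40"})
du_view = [StartFlow("transfer_money"),
           SetSlot("recipient", "Joe"),
           SetSlot("amount", "50")]  # the correction is already resolved
```

The semantic view freezes the first-mentioned amount; the pragmatic view emits the state change the user actually wants.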

Key advantages over intent-based approaches:

  1. Context-dependent by design. NLU interprets one message in isolation. Dialogue Understanding considers the full running transcript plus the assistant's business logic. Flow definitions and conversation state provide additional context for understanding.

  2. No training data required. Flow definitions (business logic as code) are all that developers specify. The LLM's in-context learning handles language understanding without annotated datasets — eliminating the expensive data collection that intent-based systems require.

  3. Scales without degradation. Intent taxonomies become unmanageable at hundreds of intents: "difficult to remember and reason about," error-prone to modify, context-insensitive. Command generation scales naturally because new flows add new possible commands without reclassifying existing ones.

  4. Handles repair natively. Corrections, digressions, interruptions, and cancellations are handled through conversation repair patterns. Developers specify only the "happy path" — repair is built into the architecture, not bolted on.

  5. Coreference resolution is implicit. By including the full conversation transcript in the LLM prompt, commands are generated with arguments already fully resolved. No separate coreference module needed.
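Advantages 1, 2, and 5 all come from the same mechanism: the prompt carries the business logic and the full transcript. A rough sketch of that prompt assembly, assuming a flat flow registry and a turn list (the template format is invented for illustration, not the paper's exact prompt):

```python
def build_prompt(flows, transcript):
    """Assemble an in-context-learning prompt from flow definitions
    (business logic) plus the running conversation. No annotated
    training data is involved; the LLM works from this context alone."""
    flow_block = "\n".join(f"- {name}: {desc}" for name, desc in flows.items())
    turn_block = "\n".join(f"{speaker}: {text}" for speaker, text in transcript)
    return (
        "Available flows:\n" + flow_block + "\n\n"
        "Conversation so far:\n" + turn_block + "\n\n"
        "Output the commands that progress the conversation:"
    )

prompt = build_prompt(
    {"transfer_money": "Send money to a contact",
     "block_card": "Block a lost or stolen card"},
    [("user", "I need to send some money"),
     ("assistant", "Who is the recipient?"),
     ("user", "Joe")],
)
```

Because the whole transcript is in the prompt, "Joe" in the last turn needs no separate coreference module, and adding a new flow means adding one line to the registry rather than retraining a classifier.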

The limitation of intent classification is precisely that it treats understanding as classification: "messages are 'understood' by assigning them to a predefined intent." But user utterances often don't correspond to specific tasks — "I lost my wallet" could map to replace card, block card, or freeze card. Command generation can express this ambiguity through a Clarify command, while intent classification forces a premature decision.
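A toy resolver makes the "I lost my wallet" case concrete: instead of forcing one label, it emits a `Clarify` command listing every flow the utterance could plausibly start. The keyword matching and flow names are stand-ins for the LLM's judgment, purely for illustration:

```python
def understand(utterance, candidate_flows):
    """Return ('StartFlow', name) when exactly one flow matches,
    otherwise ('Clarify', candidates) to defer the decision to the
    user rather than guessing."""
    text = utterance.lower()
    matches = [name for name, keywords in candidate_flows.items()
               if any(k in text for k in keywords)]
    if len(matches) == 1:
        return ("StartFlow", matches[0])
    return ("Clarify", matches)

flows = {
    "replace_card": ["replace", "wallet"],
    "block_card": ["block", "wallet"],
    "freeze_card": ["freeze", "wallet"],
}
print(understand("I lost my wallet", flows))
# → ('Clarify', ['replace_card', 'block_card', 'freeze_card'])
print(understand("please block my card", flows))
# → ('StartFlow', 'block_card')
```

An intent classifier has no analogous escape hatch: its output type is a single label, so the premature decision is baked into the interface.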

As argued in When should AI agents ask users instead of just searching?, the Clarify command in this architecture is the engineering implementation of conversation analysis's insert-expansion: the system recognizes ambiguity and initiates a sub-sequence to resolve it before proceeding. And as Why can't conversational AI agents take the initiative? observes, this architecture gives the agent a structured mechanism for initiative-taking within the bounds of defined business logic.


Source: Tasks Planning · Paper: Task-Oriented Dialogue with In-Context Learning


Dialogue understanding reframed: command generation replaces intent classification, and outputting pragmatics instead of semantics eliminates the training-data requirement.