Conversational AI Systems

Can command generation replace intent classification in dialogue systems?

Explores whether generating pragmatic commands in a DSL could outperform traditional intent classification for task-oriented dialogue, particularly regarding training data needs and scalability.

Note · 2026-03-30 · sourced from Tasks Planning

The dominant industrial approach to task-oriented dialogue uses intent-based NLU: classify each user message into a predefined intent, extract slot values, and pass these to a dialogue manager. This paper introduces a fundamental architectural shift: replace intent classification with command generation in a domain-specific language (DSL).

The distinction is between semantics and pragmatics. "While NLU systems output intents and entities representing the semantics of a message, DU outputs a sequence of commands representing the pragmatics of how the user wants to progress the conversation." Intent classification asks "what does the user mean?" — command generation asks "what does the user want to happen next?"
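The contrast is easiest to see in the output types. A minimal sketch (the command names `StartFlow`/`SetSlot` follow the paper's style, but the exact schema here is illustrative, not the paper's API):

```python
from dataclasses import dataclass

# Intent-based NLU: one label plus slot values for a single message.
@dataclass
class IntentResult:
    intent: str
    entities: dict

# Dialogue Understanding: a *sequence* of commands describing how the
# user wants to progress the conversation.
@dataclass
class StartFlow:
    flow: str

@dataclass
class SetSlot:
    slot: str
    value: str

# "Send 40 dollars to Joe. Actually, make it 50."
nlu_view = IntentResult(intent="transfer_money",
                        entities={"recipient": "Joe", "amount": "40"})
du_view = [StartFlow("transfer_money"),
           SetSlot("recipient", "Joe"),
           SetSlot("amount", "50")]  # the correction is already resolved
```

The semantic view freezes the first-mentioned amount; the pragmatic view emits the state change the user actually wants.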

Key advantages over intent-based approaches:

  1. Context-dependent by design. NLU interprets one message in isolation. Dialogue Understanding considers the full running transcript plus the assistant's business logic. Flow definitions and conversation state provide additional context for understanding.

  2. No training data required. Flow definitions (business logic as code) are all that developers specify. The LLM's in-context learning handles language understanding without annotated datasets — eliminating the expensive data collection that intent-based systems require.

  3. Scales without degradation. Intent taxonomies become unmanageable at hundreds of intents: "difficult to remember and reason about," error-prone to modify, context-insensitive. Command generation scales naturally because new flows add new possible commands without reclassifying existing ones.

  4. Handles repair natively. Corrections, digressions, interruptions, and cancellations are handled through conversation repair patterns. Developers specify only the "happy path" — repair is built into the architecture, not bolted on.

  5. Coreference resolution is implicit. By including the full conversation transcript in the LLM prompt, commands are generated with arguments already fully resolved. No separate coreference module needed.
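Advantages 1, 2, and 5 all come from the same mechanism: the prompt carries the business logic and the full transcript. A rough sketch of that prompt assembly, assuming a flat flow registry and a turn list (the template format is invented for illustration, not the paper's exact prompt):

```python
def build_prompt(flows, transcript):
    """Assemble an in-context-learning prompt from flow definitions
    (business logic) plus the running conversation. No annotated
    training data is involved; the LLM works from this context alone."""
    flow_block = "\n".join(f"- {name}: {desc}" for name, desc in flows.items())
    turn_block = "\n".join(f"{speaker}: {text}" for speaker, text in transcript)
    return (
        "Available flows:\n" + flow_block + "\n\n"
        "Conversation so far:\n" + turn_block + "\n\n"
        "Output the commands that progress the conversation:"
    )

prompt = build_prompt(
    {"transfer_money": "Send money to a contact",
     "block_card": "Block a lost or stolen card"},
    [("user", "I need to send some money"),
     ("assistant", "Who is the recipient?"),
     ("user", "Joe")],
)
```

Because the whole transcript is in the prompt, "Joe" in the last turn needs no separate coreference module, and adding a new flow means adding one line to the registry rather than retraining a classifier.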

The limitation of intent classification is precisely that it treats understanding as classification: "messages are 'understood' by assigning them to a predefined intent." But user utterances often don't correspond to specific tasks — "I lost my wallet" could map to replace card, block card, or freeze card. Command generation can express this ambiguity through a Clarify command, while intent classification forces a premature decision.
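A toy resolver makes the "I lost my wallet" case concrete: instead of forcing one label, it emits a `Clarify` command listing every flow the utterance could plausibly start. The keyword matching and flow names are stand-ins for the LLM's judgment, purely for illustration:

```python
def understand(utterance, candidate_flows):
    """Return ('StartFlow', name) when exactly one flow matches,
    otherwise ('Clarify', candidates) to defer the decision to the
    user rather than guessing."""
    text = utterance.lower()
    matches = [name for name, keywords in candidate_flows.items()
               if any(k in text for k in keywords)]
    if len(matches) == 1:
        return ("StartFlow", matches[0])
    return ("Clarify", matches)

flows = {
    "replace_card": ["replace", "wallet"],
    "block_card": ["block", "wallet"],
    "freeze_card": ["freeze", "wallet"],
}
print(understand("I lost my wallet", flows))
# → ('Clarify', ['replace_card', 'block_card', 'freeze_card'])
print(understand("please block my card", flows))
# → ('StartFlow', 'block_card')
```

An intent classifier has no analogous escape hatch: its output type is a single label, so the premature decision is baked into the interface.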

As argued in When should AI agents ask users instead of just searching?, the Clarify command in this architecture is the engineering implementation of conversation analysis's insert-expansion: the system recognizes ambiguity and initiates a sub-sequence to resolve it before proceeding. And as Why can't conversational AI agents take the initiative? observes, this architecture gives the agent a structured mechanism for initiative-taking within the bounds of defined business logic.


Source: Tasks Planning · Paper: Task-Oriented Dialogue with In-Context Learning


Dialogue understanding reframed: command generation replaces intent classification, and outputting pragmatics instead of semantics eliminates the training-data requirement.