When should proactive agents push toward their goals versus accommodate users?
Proactive dialogue agents face a tension between reaching their objectives efficiently and keeping users satisfied. This question explores whether these two aims can coexist or require constant negotiation.
Most proactive dialogue research assumes cooperative users — people who follow the agent's topic transitions willingly. I-Pro introduces a more realistic paradigm: the non-cooperative user, who talks about off-path topics when dissatisfied with the agent's choices.
The core tension: the twin objectives of reaching the goal topic quickly AND maintaining high user satisfaction are not always aligned, because topics close to the goal and topics the user prefers may differ. An agent that pushes aggressively toward the goal topic may alienate the user; an agent that only follows user preferences may never reach the goal.
The solution is a learned goal weight composed of four factors:
- Dialogue turn — how far into the conversation (early = more flexibility, late = more urgency)
- Goal completion difficulty — how distant the current topic is from the goal
- User satisfaction estimation — real-time tracking of user engagement
- Cooperative degree — how willing the user is to follow the agent's lead
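A minimal sketch of how the four factors might combine into a single goal weight. I-Pro learns this combination end-to-end, so the linear form, sigmoid squashing, coefficient values, and sign conventions below are all illustrative assumptions, not the paper's actual parameterization:

```python
import math

# Hypothetical coefficients -- I-Pro learns the combination, so these
# numbers, and the assumption that late turns, high difficulty, and a
# satisfied, cooperative user all argue for pushing, are illustrative only.
COEFFS = {"turn": 0.8, "difficulty": 0.6, "satisfaction": 0.7, "cooperation": 0.9}
BIAS = -1.5

def goal_weight(turn_frac, difficulty, satisfaction, cooperation):
    """Blend the four factors into a push-vs-accommodate weight in (0, 1).

    turn_frac    -- fraction of the dialogue elapsed (late -> more urgency)
    difficulty   -- normalized distance from the current topic to the goal topic
    satisfaction -- estimated user satisfaction on this turn
    cooperation  -- estimated willingness to follow the agent's lead
    """
    z = (COEFFS["turn"] * turn_frac
         + COEFFS["difficulty"] * difficulty
         + COEFFS["satisfaction"] * satisfaction
         + COEFFS["cooperation"] * cooperation
         + BIAS)
    return 1.0 / (1.0 + math.exp(-z))  # sigmoid keeps the weight in (0, 1)

def score_topic(goal_proximity, user_preference, w):
    """Rank a candidate next topic: w pushes toward the goal, 1 - w accommodates."""
    return w * goal_proximity + (1.0 - w) * user_preference
```

With this shape, a late-dialogue, satisfied, cooperative user yields a high weight (push toward the goal), while an early-dialogue, dissatisfied, uncooperative user yields a low weight (accommodate), matching the trade-off described above.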
This adds an important dimension to the passivity problem. Since "Why can't advanced AI models take initiative in conversation?" framed the gap, the research focus has been on making agents MORE proactive. But I-Pro shows that proactivity itself creates a new problem: when should the agent push toward its goal versus accommodate the user's preference? The answer is neither "always push" nor "always accommodate" — it's a dynamic trade-off that shifts throughout the conversation.
Where "How can proactive agents avoid feeling intrusive to users?" names the civility dimension, I-Pro provides a concrete mechanism for implementing it: the goal weight modulates how aggressively the agent pursues its objective based on user receptiveness.
Related concepts in this collection
- Why can't advanced AI models take initiative in conversation?
  Despite extraordinary capability in answering and reasoning, LLMs fundamentally cannot initiate, redirect, or guide exchanges. Understanding this gap—and whether it's fixable—matters for building AI that truly collaborates rather than merely responds.
  I-Pro shows the other side: proactivity without trade-off management creates new problems.
- How can proactive agents avoid feeling intrusive to users?
  Explores why proactive conversational agents often feel annoying rather than helpful, and what design dimensions could prevent them from violating user expectations and autonomy.
  I-Pro operationalizes the civility dimension through the learned goal weight.
- Does any single persuasion technique work for everyone?
  Can fixed persuasion strategies like appeals to authority or social proof be reliably applied across different people and situations, or do they require adaptation to individual traits and context?
  The cooperative-degree factor reflects exactly this user-specific receptiveness.
- Can conversations themselves personalize without user profiles?
  Can a conversational AI learn about user traits and adapt in real time by rewarding itself for asking insightful questions, rather than relying on pre-collected profiles or historical data?
  The curiosity reward speaks directly to the cooperativeness and satisfaction factors: by reducing uncertainty about the user's type in real time, the agent can better estimate which users are cooperative and which topics will satisfy them, enabling more accurate goal-weight computation.
- Why do language models respond passively instead of asking clarifying questions?
  Explores whether the reward signals used to train language models might actively discourage them from seeking clarification or taking initiative in conversations, and what alternative training approaches might enable more collaborative dialogue.
  The goal-satisfaction divergence is a direct consequence of next-turn reward optimization: maximizing immediate satisfaction may push the agent away from its goal, while aggressively pursuing the goal reduces immediate satisfaction. Multi-turn-aware rewards that account for the four-factor trade-off are required.
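The divergence above can be made concrete with a toy reward comparison: a purely next-turn reward scores only immediate satisfaction, while a trade-off-aware reward also credits progress toward the goal. The blended form and the `alpha` knob below are assumptions for illustration, not I-Pro's actual objective:

```python
def next_turn_reward(satisfaction_delta):
    # Pure next-turn optimization: only the immediate change in user
    # satisfaction counts, so a move toward the goal that mildly annoys
    # the user is always penalized.
    return satisfaction_delta

def trade_off_reward(satisfaction_delta, goal_distance_delta, alpha=0.5):
    # goal_distance_delta < 0 means the agent moved closer to the goal topic.
    # Blend immediate satisfaction with goal progress; alpha is an assumed
    # knob standing in for the learned four-factor trade-off.
    return alpha * satisfaction_delta - (1.0 - alpha) * goal_distance_delta
```

For a goal-ward topic shift that costs a little satisfaction (satisfaction_delta = -0.1, goal_distance_delta = -0.4), the next-turn reward is negative while the blended reward is positive: only the latter can ever prefer such a move.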