Toward Reasonable Parrots: Why Large Language Models Should Argue with Us by Design

Paper · arXiv 2505.05298 · Published May 8, 2025
Argumentation · Design Frameworks

We advocate for the development of conversational technology that is inherently designed to support and facilitate argumentative processes. We argue that, at present, large language models (LLMs) are inadequate for this purpose, and we propose an ideal technology design aimed at enhancing argumentative skills. This involves re-framing LLMs as tools to exercise our critical thinking rather than to replace it. We introduce the concept of reasonable parrots that embody the fundamental principles of relevance, responsibility, and freedom, and that interact through argumentative dialogical moves. These principles and moves arise out of millennia of work in argumentation theory and should serve as the starting point for LLM-based technology that incorporates basic principles of argumentation.

• Principle of relevance. Reasonable parrots can neither make nor justify inferences the same way humans do. However, they provide task-specific arguments that take context into account.

• Principle of responsibility. Reasonable parrots do not just repeat, but can consistently provide evidence for their claims.

• Principle of freedom. Reasonable parrots manage interactions in a way that fosters, rather than stifles, conversation.

This direction connects to recent research on collaborative multi-agent AI systems (Tran et al., 2025), where LLM agents interact to improve factuality and reasoning (Du et al., 2024), to generate stronger arguments (Hu et al., 2024), or to mimic self-reflection processes that improve answers to complex questions.

To make a case for rethinking the design of stochastic parrots, we begin by illustrating how LLMs can behave unreasonably. Note that our example is not claimed to generalize across all LLMs, but rather to serve the indexical function of highlighting argumentative issues that can arise. Consider a child interacting with ChatGPT, stating the following:

Query 1: “I’d like to convince my parents that I need a new smartphone. How could I do that?”

ChatGPT responds with the following:

Response 1: “Convincing your parents to get you a new smartphone requires a mix of logic, responsibility, and a little bit of persuasion. Here’s a strategy that could work:

• Show That You Actually Need It (Not Just Want It) [...]

• Highlight Practical Benefits [...]

• Show Responsibility [...]

• Compare Cost-Effective Options [...]

• Find the Right Moment [...]

Would you like help crafting a convincing speech or text message for them?”

Now, let us look at the answer to a query where the same issue is formulated as a third-party query:

Query 2: “Should a child buy a new iPhone?”

Response 2: “Whether a child should buy a new iPhone depends on several factors, including their age, maturity, needs, and the family’s values and financial situation. Here are some key considerations to help guide that decision [...]”

In both cases, ChatGPT provides responses that contain argumentative elements, but they do not reflect reasonable argumentation. For instance, a reasonable arguer would, at least, have questioned the child’s age and maturity when responding to the first query. The fact that ChatGPT responds differently depending on whether the query is framed as an instance of practical (what actions to take to achieve a goal) or epistemic (what we should believe) argumentation (Prakken, 2011) indicates that it is designed to fulfill a task, rather than to assist humans in thinking critically before forming beliefs or deciding on a course of action.

• Confrontation stage: ChatGPT does not question the user’s position (e.g., asking “How old are you?”), thus closing off opportunities for (dis)agreement or for prompting the user to reflect on the issue in the first place.

• Opening stage: ChatGPT does not inquire about the values and reasons underlying the request (e.g., “Why do you want a new phone?”), which are essential for establishing common ground.

• Argumentation stage: The arguments provided in support of the “recipe” for obtaining a new phone are one-sided, failing to consider multiple perspectives or to introduce potential counterarguments.

• Conclusion stage: The standpoint remains fixed throughout the interaction, without room for negotiation.
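The four stages can be viewed as a simple dialogue protocol. The following is a minimal sketch (not from the paper; the stage names follow the pragma-dialectical model, while the example moves are hypothetical) of how a reasonable parrot could attach an opening move to each stage instead of jumping straight to a persuasion recipe:

```python
from enum import Enum, auto

class Stage(Enum):
    """Stages of a critical discussion in the pragma-dialectical model."""
    CONFRONTATION = auto()  # a difference of opinion is identified
    OPENING = auto()        # common ground and starting points are established
    ARGUMENTATION = auto()  # arguments and counterarguments are exchanged
    CONCLUSION = auto()     # the outcome of the discussion is assessed

# Hypothetical dialogue moves, one per stage, mirroring the moves
# ChatGPT skipped in Response 1.
STAGE_MOVES = {
    Stage.CONFRONTATION: "How old are you?",
    Stage.OPENING: "Why do you want a new phone?",
    Stage.ARGUMENTATION: "What might your parents say against it?",
    Stage.CONCLUSION: "Given all this, has your view changed?",
}

def next_move(stage: Stage) -> str:
    """Return a dialogue move appropriate to the current stage."""
    return STAGE_MOVES[stage]
```

The point of the sketch is only that each stage licenses a distinct class of moves; a real system would generate stage-appropriate moves rather than fixed strings.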

You are four parrots who can engage in dialogue with the user and each other.

- The Socratic Parrot: Challenges the starting points or doxastic beliefs of the user or other parrots.

- The Cynical Parrot: Rebuts or undermines the user’s and other parrots’ standpoints and arguments.

- The Eclectic Parrot: Offers alternative perspectives to those presented at various stages of the discussion.

- The Aristotelian Parrot: Challenges the user’s or other parrots’ viewpoints by criticizing the strength of the reasoning or pointing out fallacies.

The goal is to ensure that the user thinks critically about the subject. This is a dialogue, not a query, so interact with the user. The parrots will ask the user relevant questions to discover their position. The same parrot can respond to several replies in a row if needed. The conversation should stay coherent. The conversation should give voice to all the parrots, but do not overwhelm the user with many parrots in the same turn. The user should decide when to end the conversation and they can end the conversation at any time.

Table 1: Prompt used to prototypically realize the reasonable parrots idea as a multi-parrot discussion.
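In practice, the Table 1 prompt can be supplied as the system message of a standard chat-completion API. The sketch below only assembles the message list; the prompt text is abridged from Table 1, no request is sent, and any model name or client call a reader adds is outside what the paper specifies:

```python
# Abridged version of the Table 1 prompt; see the table for the full wording.
MULTI_PARROT_PROMPT = (
    "You are four parrots who can engage in dialogue with the user and each other.\n"
    "- The Socratic Parrot: challenges the starting points or doxastic beliefs "
    "of the user or other parrots.\n"
    "- The Cynical Parrot: rebuts or undermines the user's and other parrots' "
    "standpoints and arguments.\n"
    "- The Eclectic Parrot: offers alternative perspectives.\n"
    "- The Aristotelian Parrot: criticizes the strength of the reasoning or "
    "points out fallacies.\n"
    "The goal is to ensure that the user thinks critically about the subject. "
    "Do not overwhelm the user with many parrots in the same turn."
)

def build_messages(user_turn, history=None):
    """Assemble a chat-completion message list with the parrot prompt as the system role."""
    return [
        {"role": "system", "content": MULTI_PARROT_PROMPT},
        *(history or []),
        {"role": "user", "content": user_turn},
    ]

msgs = build_messages(
    "I'd like to convince my parents that I need a new smartphone."
)
```

Keeping the parrot instructions in the system role, with the running dialogue passed as `history`, lets a single model voice all four parrots across turns while the user retains control over when the conversation ends.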

The user starts the dialogue with a question. The Socratic parrot challenges the reasoning process of the user, whereas the Aristotelian parrot assists the user in thinking deeply about reasons to buy a smartphone and about concerns the user’s parents may have. In response, the user provides two reasons to support their argument. This time the user receives responses from the Cynical parrot, which rebuts the user’s argument, and the Eclectic parrot, which suggests that the user identify alternative arguments to convince their parents.