Enhancing AI-Assisted Group Decision Making through LLM-Powered Devil's Advocate
Group decision making plays a crucial role in our complex and interconnected world. The rise of AI technologies has the potential to provide data-driven insights that facilitate group decision making, although prior work has found that groups do not always utilize AI assistance appropriately. In this paper, we examine whether and how introducing a devil's advocate into AI-assisted group decision making processes could help groups better utilize AI assistance and change their perceptions of group processes during decision making. Inspired by the exceptional conversational capabilities exhibited by modern large language models (LLMs), we design four different styles of LLM-powered devil's advocate, varying their interactivity (i.e., interactive vs. non-interactive) and their target of objection (i.e., challenging the AI recommendation or the majority opinion within the group).
The recent rapid progress in the development of generative large language models (LLMs) like OpenAI's GPT series [10, 71] and Google's Bard [1] has opened up new avenues for HCI researchers to explore novel interactions between humans and AI. Researchers have demonstrated the potential of generative LLMs in various application domains, such as classification [52], human-robot interaction [65], software engineering [54, 73, 82], mobile interface design [75, 103], and public health [40]. LLM-based services have also been utilized to promote critical thinking. For instance, Petridis et al. leveraged large language models' common-sense reasoning abilities to assist journalists in thoroughly analyzing press releases and identifying angles that are useful for different types of stories [74]. On the other hand, it was found that when people interact with an LLM while completing a writing task, the strong opinions expressed by the LLM may influence the opinions in the writer's writing and may even alter their own viewpoints [38]. Additionally, generative LLMs can contribute to decision making processes, such as transforming data into textual outputs [108], providing reasoning [34], or even making decisions [78]. In this study, we explore whether LLMs can contribute to decision making processes by playing the role of a devil's advocate that encourages human decision makers to deliberate more critically.
The LLM-powered devil's advocate was designed to present critique questions and comments arguing for a position opposite to either what the majority of group members initially predicted (i.e., targeted at the Majority) or what the AI model RiskComp predicted (i.e., targeted at AI). Intuitively, having the devil's advocate object to the majority's initial prediction allows the group to thoroughly examine the "unpopular" view. On the other hand, as previous studies found that groups tend to exhibit higher levels of over-reliance on AI recommendations in AI-assisted decision making compared to individuals [14], we also considered the design where the devil's advocate was required to object to the AI model's recommendation, in order to encourage the group of participants to carefully assess whether the AI recommendation is trustworthy.
• Step 1 (Intent classification): We first provided the LLM with the chat message that the participant entered, and asked it to classify the intent of the message into one of three classes—analysis (i.e., the participant was making use of the defendant's profile information to analyze why the defendant would or would not re-offend within 2 years), question (i.e., the participant was asking a question to their group-mates), or neither. In the latter two cases, the LLM would not need to generate a response to the message.
• Step 2 (Stance classification): If, in Step 1, the LLM determined that the message entered by the participant reflected their analysis of the defendant's case, we then had the LLM classify whether the participant's stance regarding the reoffending risk of the defendant was consistent with the position taken by the target that the LLM was supposed to object to (i.e., the majority of group members for the Dynamic-Majority treatment, or the AI model for the Dynamic-AI treatment).
• Step 3 (Critique generation): Only if the participant's stance was found in Step 2 to be in line with the target's position would we instruct the LLM to provide one or two sentences in a conversational style that challenged the correctness of the participant's reasoning behind their stance. In our prompt, we provided the entire, up-to-date discussion log on this defendant to the LLM to help it generate a critique grounded in the discussion context.