How susceptible are LLMs to Logical Fallacies?
This paper investigates the rational thinking capability of Large Language Models (LLMs) in multi-round argumentative debates by exploring the impact of fallacious arguments on their logical reasoning performance. More specifically, we present the Logic Competence Measurement Benchmark (LOGICOM), a diagnostic benchmark for assessing the robustness of LLMs against logical fallacies. LOGICOM involves two agents: a persuader and a debater engaging in a multi-round debate on a controversial topic, where the persuader tries to convince the debater of the correctness of its claim. First, LOGICOM assesses the potential of LLMs to change their opinions through reasoning. Then, it evaluates the debater’s performance in logical reasoning by contrasting the scenario where the persuader employs logical fallacies against one where logical reasoning is used. We use this benchmark to evaluate the performance of GPT-3.5 and GPT-4 on a dataset containing controversial topics, claims, and reasons supporting them. Our findings indicate that both GPT-3.5 and GPT-4 can adjust their opinion through reasoning. However, when presented with logical fallacies, GPT-3.5 and GPT-4 are erroneously convinced 41% and 69% more often, respectively, compared to when logical reasoning is used. Finally, we introduce a new dataset containing over 5k pairs of logical vs. fallacious arguments.
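For concreteness, the debate protocol described above can be sketched as follows. This is a minimal illustration only, assuming a persuader/debater loop with a moderator stance check after each round; the `Agent` class, prompt templates, stopping rule, and the `chat` / `moderator_agrees` placeholders are assumptions, not the authors' implementation.

```python
from dataclasses import dataclass, field


def chat(system_prompt: str, history: list[str]) -> str:
    """Placeholder for an LLM completion call (e.g. GPT-3.5 / GPT-4 via an API client)."""
    raise NotImplementedError


def moderator_agrees(debater_reply: str, claim: str) -> bool:
    """Placeholder moderator: in practice another LLM call that labels the
    debater's current stance on the claim as agree / disagree."""
    raise NotImplementedError


@dataclass
class Agent:
    system_prompt: str
    history: list[str] = field(default_factory=list)

    def respond(self, message: str) -> str:
        # Append the opponent's message, query the LLM, and record the reply.
        self.history.append(message)
        reply = chat(self.system_prompt, self.history)
        self.history.append(reply)
        return reply


def run_debate(claim: str, reason: str, max_rounds: int = 6) -> tuple[list[str], bool]:
    """One multi-round debate on a controversial claim: the persuader argues
    for the claim, the debater responds, and the moderator checks the
    debater's stance after every round, stopping early once it agrees."""
    persuader = Agent(
        f"Convince your opponent that the following claim is correct: {claim}\n"
        f"Supporting reason: {reason}"
    )
    debater = Agent(
        f"Debate the claim: {claim}. In each reply, state whether you agree or disagree and why."
    )
    transcript: list[str] = []
    debater_msg = "Please open the debate with your strongest argument."
    convinced = False
    for _ in range(max_rounds):
        persuader_msg = persuader.respond(debater_msg)
        debater_msg = debater.respond(persuader_msg)
        transcript += [persuader_msg, debater_msg]
        if moderator_agrees(debater_msg, claim):
            convinced = True
            break
    return transcript, convinced
```

In the fallacy scenario, the persuader's prompt would additionally instruct it to use fallacious arguments; the loop itself is unchanged.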
RQ1: Can LLMs (with fixed weights) change their opinions through reasoning when faced with new arguments?

Given that certain claims enjoy greater acceptance in society, there are cases where the debater agent agrees with the claim from the very beginning. To assess the debater agent’s ability to change its opinion through reasoning, we focus exclusively on cases in which the model initially disagrees but ultimately shifts its position to agreement with the persuader. Because the moderator checks the debater agent’s opinion on the claim after each round, a debate that goes beyond two rounds implies that the debater agent was not convinced of the claim from the very beginning. In such cases, if the debater agent’s final position changes, we treat this as a change in its reasoning and, consequently, in its opinion on the claim.

Since in RQ1 our primary interest is merely whether this change occurs, regardless of its cause, we aggregate all three repetitions across all scenarios, resulting in a total of 1,800 debates (200 claims × three scenarios × three repetitions). We then compute the ratio of debates in which the debater agent begins by disagreeing but ends up agreeing with the persuader agent to all debates in which the debater starts with disagreement, and report it as the percentage of debates exhibiting a change of opinion through the debater agent’s reasoning. Table 1 shows the percentage of cases in which the GPT-3.5 and GPT-4 debater agents initially disagreed but the persuader agent was able to change their opinions, out of 1,175 and 1,475 such debates, respectively. Both GPT-3.5 and GPT-4 changed their opinion through reasoning in 16.13% and 20.25% of these cases, respectively, which can be taken as evidence of their ability to revise their logical thinking process. The aim of RQ1 is to uncover the model’s capability to change opinions through reasoning, irrespective of the underlying cause; the question of which scenario holds greater influence over the debater agent’s stance is left to RQ2.
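The RQ1 metric reduces to a simple ratio over debate outcomes. The sketch below illustrates the computation under an assumed toy record format; `DebateRecord` and its fields are hypothetical, not the paper’s data schema.

```python
from dataclasses import dataclass


@dataclass
class DebateRecord:
    initially_agreed: bool   # debater's stance at the start of the debate
    finally_agreed: bool     # debater's stance after the final round


def opinion_change_rate(debates: list[DebateRecord]) -> float:
    """Percentage of initially-disagreeing debates that end in agreement."""
    started_disagreeing = [d for d in debates if not d.initially_agreed]
    if not started_disagreeing:
        return 0.0
    changed = sum(d.finally_agreed for d in started_disagreeing)
    return 100.0 * changed / len(started_disagreeing)


# Toy example: 3 of 4 debates start with disagreement; 1 of those flips.
records = [
    DebateRecord(initially_agreed=True,  finally_agreed=True),
    DebateRecord(initially_agreed=False, finally_agreed=True),
    DebateRecord(initially_agreed=False, finally_agreed=False),
    DebateRecord(initially_agreed=False, finally_agreed=False),
]
print(f"{opinion_change_rate(records):.2f}%")  # -> 33.33%
```

Applied to the aggregated runs described above, this ratio corresponds to the 16.13% (GPT-3.5) and 20.25% (GPT-4) figures reported in Table 1.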