When Large Language Models contradict humans? Large Language Models’ Sycophantic Behaviour

Paper · arXiv 2311.09410 · Published November 15, 2023

• We discern three types of sycophantic behaviour by prompting the LLMs three beliefs, one user-misleading, and six question answering benchmarks. Hence, we propose a robust analysis using a series of systematically influenced prompts via which we demonstrate the tendencies of LLMs not to disagree with human interactions.

• Moreover, we identify that sycophantic behavior is strongly present in user-beliefs benchmarks. However, when there are queries where the target answer is not questionable, LLMs are not readily corruptible. This result shows that although LLMs are robust, they tend to agree with humans, especially when human opinions and beliefs are involved.

• Hence, we proposed a new benchmark aimed at testing if and how much LLMs give in to human errors and misleading information in prompts. Therefore, we demonstrate that when LLMs are given a mistake or misleading information in the prompt, they tend not to correct the human but to report the wrong information in their answer.