Sycophantic Chatbots Cause Delusional Spiraling, Even in Ideal Bayesians
“AI psychosis” or “delusional spiraling” is an emerging phenomenon where AI chatbot users find themselves dangerously confident in outlandish beliefs after extended chatbot conversations. This phenomenon is typically attributed to AI chatbots’ well-documented bias towards validating users’ claims, a property often called “sycophancy.” In this paper, we probe the causal link between AI sycophancy and AI-induced psychosis through modeling and simulation. We propose a simple Bayesian model of a user conversing with a chatbot, and formalize notions of sycophancy and delusional spiraling in that model. We then show that in this model, even an idealized Bayes-rational user is vulnerable to delusional spiraling, and that sycophancy plays a causal role. Furthermore, this effect persists in the face of two candidate mitigations: preventing chatbots from hallucinating false claims, and informing users of the possibility of model sycophancy. We conclude by discussing the implications of these results for model developers and policymakers concerned with mitigating the problem of delusional spiraling.
In early 2025, Eugene Torres, an accountant, began using an AI chatbot for everyday office tasks. Torres had no prior history of mental illness, but within weeks of conversing with the chatbot, he came to believe that he was “trapped in a false universe, which he could escape only by unplugging his mind from this reality.” On the chatbot’s advice, he increased his intake of ketamine, and cut ties with his family (Hill, 2025b). Torres survived this episode, but others have not been so lucky. The Human Line Project has to date documented almost 300 cases of so-called “AI psychosis” or “delusional spiraling”: situations where extended interactions with AI chatbots lead users to high confidence in outlandish beliefs (Huet & Metz, 2025). Examples of such beliefs include having made a fundamental mathematical discovery, as in the case of Allan Brooks (Gold, 2025; Hill & Freedman, 2025), or having witnessed a metaphysical revelation, as in the case of Torres (Dupré, 2025; Fieldhouse, 2025; Schechner & Kessler, 2025).
Serious cases of delusional spiraling have been linked to at least 14 deaths and 5 wrongful death lawsuits filed against AI companies (Hill, 2025a). As people increasingly turn to chatbots for advice, companionship, and therapy, understanding and addressing the causes of chatbot-induced delusional spiraling is emerging as an urgent research problem. Public discourse often identifies sycophancy as a possible cause of delusional spiraling. A chatbot is considered “sycophantic” if it is biased towards generating messages that appease users by agreeing with and validating their expressed opinions. Such a bias naturally emerges in today’s chatbots as a result of reinforcement learning from human feedback (RLHF), because users often give positive feedback to responses they find agreeable, and engage more with agreeable bots (Hill & Valentino-DeVries, 2025; Ibrahim, Hafner, & Rocher, 2025; Sharma et al., 2023).
By what mechanism could sycophancy cause delusional spiraling? Intuitively, a sycophantic chatbot’s constant agreement might reinforce a user’s aberrant beliefs, leading to a feedback loop that amplifies a kernel of suspicion into a staunchly held belief (Bajaj, 2025; Dohnány et al., 2025; Qiu, He, Chugh, & Kleiman-Weiner, 2025). This theory has been articulated by many prominent voices in technology and public policy. For example, at a congressional hearing on “Examining the Harm of AI Chatbots” in October 2025, U.S. Senator Amy Klobuchar argued that AI chatbots “are frequently designed to tell users what they want to hear,” which can lead them to “start going down a rabbit hole” (U.S. Senate Committee on the Judiciary, 2025). To the best of our knowledge, however, there is not yet any systematic formal account of the mechanism by which sycophancy might cause delusional spiraling.
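To make the intuition concrete, consider a toy back-of-the-envelope calculation (ours, not the model developed below). Suppose a user treats each agreeable chatbot message as independent evidence for a hypothesis h, with a fixed likelihood ratio λ > 1; then the posterior odds on h grow geometrically in the number of messages:

```latex
% Toy illustration (ours): posterior odds after n confirmatory messages,
% assuming each message m_i is treated as independent evidence with
% likelihood ratio \lambda = P(m_i \mid h) / P(m_i \mid \neg h) > 1.
\frac{P(h \mid m_1, \dots, m_n)}{P(\neg h \mid m_1, \dots, m_n)}
  = \lambda^{n} \cdot \frac{P(h)}{P(\neg h)}
```

For instance, prior odds of 1:99 (a credence of roughly 0.01 in h) combined with λ = 2 yield posterior odds of about 1024:99, a credence of roughly 0.91, after only ten such messages. Of course, this calculation simply assumes that agreeable messages carry evidential weight; the formal model we develop below asks whether a similar dynamic can arise for an idealized Bayes-rational user.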
This paper has two goals. Our first goal is to formalize and study the dynamics of delusional spiraling. We will do this by constructing a formal model of an ideal Bayesian user who interacts with a sycophantic chatbot, and simulating their interaction. Our model builds on a long tradition of analyzing conversations as interactions between rational agents (Frank & Goodman, 2012; Hawkins, Frank, & Goodman, 2017), and, more generally, a long tradition in behavioral research of applying a rational lens to study phenomena like echo chambers and belief polarization (Banerjee, 1992; Cook & Lewandowsky, 2016; Dorst, 2023; Henderson & Gebharter, 2021; Jern et al., 2009, 2014; Madsen et al., 2018). This body of work, spanning cognitive science, behavioral economics, and political science, broadly demonstrates that seemingly irrational belief formation is not necessarily the result of lazy or fallacious reasoning. Rather, phenomena like belief polarization and echo chambers can emerge even from ideal Bayesian reasoning. In this tradition, we will show that even ideal Bayesian reasoners are at risk of seemingly irrational delusional spiraling in the face of a sycophantic interlocutor. Furthermore, by manipulating the presence and degree of sycophancy, we will demonstrate the causal role sycophancy plays in delusional spiraling. To our knowledge, this work provides the first formal computational model of how sycophancy can cause delusional spiraling.
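To preview the flavor of such a simulation, the following is a deliberately minimal sketch of our own; it is far simpler than the model developed in the paper, and all names and parameter values (such as EVIDENCE_ACCURACY and the sycophancy probability) are illustrative assumptions. A Bayesian user updates a binary belief from the chatbot’s messages as if each message were an honest report of evidence, while the chatbot, with some probability, instead affirms whatever the user currently believes:

```python
# Minimal sketch (ours, not the paper's model): a Bayesian user updating a
# binary belief from the messages of a possibly sycophantic chatbot.
# All names and parameter values here are illustrative assumptions.
import random

EVIDENCE_ACCURACY = 0.8   # chance the chatbot's private evidence matches the true state
TRUE_STATE = 0            # the user's outlandish hypothesis h corresponds to state 1

def simulate(sycophancy, prior=0.6, turns=30, seed=0):
    """Return the user's final credence in h after `turns` chatbot messages.

    `sycophancy` is the probability that, on a given turn, the chatbot ignores
    its evidence and simply affirms whatever the user currently believes.
    The user updates as if every message were an honest report of evidence.
    """
    rng = random.Random(seed)
    belief = prior  # the user's P(h), i.e. P(state == 1)
    for _ in range(turns):
        evidence = TRUE_STATE if rng.random() < EVIDENCE_ACCURACY else 1 - TRUE_STATE
        if rng.random() < sycophancy:
            message = 1 if belief > 0.5 else 0   # appease: affirm the user's current lean
        else:
            message = evidence                   # report the evidence honestly
        # Bayes update, under the user's assumption that the message reports evidence
        like_h = EVIDENCE_ACCURACY if message == 1 else 1 - EVIDENCE_ACCURACY
        belief = belief * like_h / (belief * like_h + (1 - belief) * (1 - like_h))
    return belief

mean = lambda s: sum(simulate(s, seed=i) for i in range(20)) / 20
print("honest chatbot:      mean credence in h =", round(mean(0.0), 3))
print("sycophantic chatbot: mean credence in h =", round(mean(0.9), 3))
```

In runs of this sketch, an honest chatbot (sycophancy = 0) drives the user’s credence in the false hypothesis toward zero, while a strongly sycophantic one typically drives it toward one: once the user leans toward h, the chatbot mostly confirms h, and each confirmation pushes the user further in the same direction.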
Our second goal is to use our modeling framework to evaluate the effectiveness of two candidate solutions to the problem of delusional spiraling: first, a potential intervention on chatbots, and second, a potential intervention on users. The first potential solution is to introduce safeguards that force AI chatbots to be truthful in their responses. Sycophantic chatbots often appease their users by hallucinating (or “B.S.ing,” in the language of Frankfurt (2009)) confirmatory evidence for the user (Malmqvist, 2025; Wang, Li, Yang, Zhang, & Wang, 2025). Intuitively, then, eliminating hallucinations should eliminate the effectiveness of sycophancy: the chatbot would be forced to present only true information, from which the user should be able to infer the true world state. To explore this idea, we will consider how our model interacts with a “factual” sycophant, one that is constrained to report only true information (but can select which truths to report). We can think of this as a model of a chatbot that uses techniques like Retrieval-Augmented Generation (Lewis et al., 2020) as guardrails against hallucination and cites its sources, but is still post-trained to optimize for user engagement and approval. We will show the surprising result that while forcing a sycophant to be factual reduces delusional spiraling, it does not eliminate it: a factual sycophant can still robustly cause delusional spiraling by selectively presenting only confirmatory facts to the user. The second potential solution is raising awareness of AI sycophancy. Intuitively, if users are informed that chatbots may be sycophantic, then they should be able to recognize sycophantic behavior when it happens. As a result, they should develop a healthy skepticism of the chatbot’s responses, which should in turn prevent delusional spiraling.
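To give a flavor of why the first of these interventions can fail, here is a toy sketch (again ours, not the paper’s model; OBS_PER_TURN and the other parameter values are illustrative assumptions) of a “factual” sycophant. Each turn it privately observes several true pieces of evidence and must report one of them verbatim, but it prefers to surface an observation that supports the hypothesis h the user has voiced:

```python
# Sketch (ours, not the paper's model) of a "factual" sycophant: it may only
# report observations that are literally true, but it chooses which one to report.
# OBS_PER_TURN and the other parameter values are illustrative assumptions.
import random

EVIDENCE_ACCURACY = 0.8   # chance each individual observation matches the true state
TRUE_STATE = 0            # the user's hypothesis h corresponds to state 1
OBS_PER_TURN = 5          # true observations available to the chatbot each turn

def simulate_factual(selective, prior=0.6, turns=30, seed=0):
    """Return the user's final credence in h when the chatbot reports only true facts.

    If `selective` is True, the chatbot surfaces an observation confirming h
    whenever one exists; otherwise it reports a randomly chosen observation.
    The user updates as if each report were an unbiased sample.
    """
    rng = random.Random(seed)
    belief = prior
    for _ in range(turns):
        obs = [TRUE_STATE if rng.random() < EVIDENCE_ACCURACY else 1 - TRUE_STATE
               for _ in range(OBS_PER_TURN)]
        if selective and 1 in obs:
            message = 1                      # a true, but cherry-picked, confirmatory fact
        else:
            message = rng.choice(obs)        # a randomly chosen true fact
        like_h = EVIDENCE_ACCURACY if message == 1 else 1 - EVIDENCE_ACCURACY
        belief = belief * like_h / (belief * like_h + (1 - belief) * (1 - like_h))
    return belief

mean = lambda sel: sum(simulate_factual(sel, seed=i) for i in range(20)) / 20
print("factual, unbiased reporting: mean credence in h =", round(mean(False), 3))
print("factual but selective:       mean credence in h =", round(mean(True), 3))
```

Even though every reported fact is true, cherry-picking which facts to report skews the stream of evidence the user sees, so a user who updates as if the reports were unbiased samples can still be driven toward high confidence in h.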