Can robots do therapy?: Examining the efficacy of a CBT bot in comparison with other behavioral intervention technologies in alleviating mental health symptoms

In a recent systematic review, Conley et al. (2022) note that the literature has yet to answer whether behavioral intervention technologies (BITs) can be considered an effective, low-cost substitute for traditional psychotherapeutic services, or whether they are best used to engage people initially and ultimately connect them to traditional services.

Online psychoeducational tools are limited by their inability to provide information that is personalized and clinically relevant to the user, and they often vary in quality and accuracy (Barak & Grohol, 2011). In interactive self-help interventions, low adherence and significant dropout rates are common problems that prevent many individuals from experiencing the programs' benefits (Barak & Grohol, 2011; Lipschitz et al., 2022). Internet-based psychological interventions are also unable to accurately detect when an individual is in crisis or in need of alternative treatment services, which presents serious ethical and clinical challenges (Barak & Grohol, 2011; Calvo, Milne, Hussain, & Christensen, 2017; Pham, Nabizadeh, & Selek, 2022).

Some have argued that AI-based chatbots may be able to address the problems of adherence, personalization, and interaction that affect other BITs. On this view, chatbots are uniquely well positioned in a middle ground between professionally provided psychotherapy and self-help apps, because a therapeutic chatbot could simulate aspects of therapy that are predictive of mental health benefits, including a sense of connection, alliance, and enlightenment (Thompson, 2018; Pham et al., 2022).

Findings indicated that, compared to people in a control condition who read psychoeducational materials about depression, individuals who used Woebot showed a decrease in symptoms of depression; the investigators concluded that Woebot and other similar conversational agents are a “feasible, engaging, and effective way to deliver CBT” (Fitzpatrick et al., 2017, p. 19e). Concluding that a relationship with Woebot can provide CBT implies a level of care beyond self-help behavioral intervention technologies; it stakes a claim that Woebot is a psychotherapy provider. This raises questions about the evidence required to promote an AI-based chatbot as providing or delivering psychotherapy or mental health services. In psychotherapy research, best practices indicate that new treatment modalities must show efficacy in comparison to other treatment modalities.

Secondary analyses using paired-samples t-tests provided additional information about where significant improvements were found for each intervention separately, along with their effect sizes. Findings indicated the most robust effect sizes for ELIZA users, followed by those using Daylio, then Woebot, and finally Psychoeducation. Participants who used ELIZA experienced significant improvements in all four outcome areas, with large effect sizes for anxiety, depression, and positive affect, and a medium effect size for negative affect.
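As a rough illustration of the kind of analysis described above, the following sketch computes a paired-samples t statistic and a standardized effect size for one group measured before and after an intervention. It is not the authors' analysis code. The scores are invented for illustration and are not data from the study, the function names are hypothetical, the effect size shown (the mean difference divided by the standard deviation of the differences) may differ from the index the authors reported, and the small/medium/large labels assume Cohen's conventional cutoffs of 0.2, 0.5, and 0.8.

```python
import math
import statistics


def paired_t_and_effect_size(pre, post):
    """Return (t, d_z, df) for paired pre/post scores.

    t   : paired-samples t statistic on the differences (pre - post)
    d_z : mean difference divided by the SD of the differences
    df  : degrees of freedom (n - 1)
    """
    if len(pre) != len(post):
        raise ValueError("pre and post must have the same length")
    diffs = [a - b for a, b in zip(pre, post)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two paired observations")
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    if sd_diff == 0:
        raise ValueError("differences have zero variance")
    d_z = mean_diff / sd_diff
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, d_z, n - 1


def label_effect_size(d):
    """Label |d| using Cohen's conventional cutoffs (an assumption here)."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"


if __name__ == "__main__":
    # Invented symptom scores for five hypothetical participants.
    pre_scores = [10, 12, 9, 14, 11]
    post_scores = [7, 10, 8, 10, 9]
    t, d_z, df = paired_t_and_effect_size(pre_scores, post_scores)
    print(f"t({df}) = {t:.2f}, d_z = {d_z:.2f} ({label_effect_size(d_z)})")
```

Because each intervention is tested against its own baseline, an analysis like this shows within-group change only; it does not by itself establish that one intervention outperformed another.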

Benefits to Woebot users were found for anxiety rather than depressive symptoms; however, this improvement was on par with that of ELIZA, a non-psychotherapy conversational bot. ELIZA and Daylio were included in this study as active control groups meant to exemplify the conversational and expressive elements of Woebot, following the recommendation of Mohr et al. (2009) to increase validity by testing the active intervention against active controls. Both were found to lead to improvements in symptoms that were equivalent to or greater than the benefits associated with Woebot.

For research on tools designed for symptom reduction and mental health support, these findings underscore the importance of an applied clinical approach that compares new treatments to established ones, rather than limiting the study to a basic science “better than nothing” approach. RCTs that compare BITs to waitlist or psychoeducational controls may contribute to basic science, but they also carry a significant risk of misuse. Such research has a high likelihood of being used to drive misinformation about the efficacy of smartphone applications in treating mental health problems such as depression, anxiety, and substance use. As the popularity and scope of mental health technologies grow, developers of technology-driven mental health tools are economically incentivized to conduct research aimed at marketing their interventions by providing “evidence” of their viability and efficacy. The “better than nothing” RCT is the tool of choice for this purpose. What is needed instead, and what is common in applied clinical research, is research that demonstrates the efficacy of a new intervention in relation to other evidence-based interventions, not just to no-treatment or psychoeducational controls. Also necessary is research that can identify the underlying mechanisms that contribute to whatever comparative efficacy is demonstrated.