Rethinking Large Language Models in Mental Health Applications

Paper · arXiv 2311.11267 · Published November 19, 2023

The paper discusses the instability of generative models for prediction and the potential for generating hallucinatory outputs, underscoring the need for ongoing audits and evaluations to maintain their reliability. It also distinguishes between the often interchangeable terms “explainability” and “interpretability”, advocating for the development of inherently interpretable methods instead of relying on potentially hallucinated self-explanations generated by LLMs. Despite the advancements in LLMs, human counselors’ empathetic understanding, nuanced interpretation, and contextual awareness remain irreplaceable in the sensitive and complex realm of mental health counseling.

While deep learning models are often considered opaque, recent research has shown that their hidden representations can, to some extent, offer explanations. For instance, there has been ongoing discussion regarding whether attention mechanisms serve as explanations [5], and we acknowledge that there is no definitive consensus on this matter. In the context of mental health applications, our stance aligns with the perspective on explainability put forth in this line of work. LLMs are capable of self-explanation: they can provide explanations for their responses or generate text that clarifies the reasoning behind their answers, a form of step-by-step reasoning known as chain-of-thought [44]. However, such explanations can be unfaithful [46] and require targeted efforts for improvement [40]. Assessing the robustness and faithfulness of LLM-generated explanations in the context of mental health is crucial: LLMs may produce explanations that are overly simplistic or misleading, potentially impacting the quality of mental health interventions, so the explanations they generate must be rigorously evaluated to ensure they align with established clinical knowledge and guidelines. It is likewise crucial to exercise caution and prudence when making claims about the explainability and interpretability of LLM-based methods applied to mental health. The ability of LLMs to generate explanations does not imply that LLM-based mental health analysis is inherently interpretable. Researchers must refrain from using “interpretability” and “explainability” interchangeably to avoid misconceptions and to ensure clarity in discussions surrounding LLMs in mental health applications.
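To make the self-explanation setting concrete, below is a minimal sketch of eliciting a chain-of-thought explanation for a mental health classification and running a crude stability probe on the resulting label. The `complete` function is a hypothetical stand-in for any LLM completion API, and the prompt wording and label set are illustrative assumptions rather than a validated protocol; agreement across re-queries is at best a necessary, not a sufficient, signal of faithfulness.

```python
# Minimal sketch: elicit a chain-of-thought self-explanation, then probe
# label stability across repeated queries. `complete` is a hypothetical
# stand-in for any LLM completion API; the label set and prompt wording
# are illustrative assumptions, not the paper's protocol.

LABELS = ("depression-risk", "no-risk")

def complete(prompt: str) -> str:
    """Placeholder for a call to an LLM completion endpoint."""
    raise NotImplementedError

def classify_with_explanation(post: str) -> tuple[str, str]:
    """Ask for step-by-step reasoning, with the final label on the last line."""
    prompt = (
        "Read the post below. Reason step by step about indicators of "
        f"depressive symptoms, then output one label from {LABELS} "
        "on the final line.\n\n"
        f"Post: {post}"
    )
    *reasoning, label = complete(prompt).strip().splitlines()
    return label.strip(), "\n".join(reasoning)

def label_stability(post: str, n: int = 5) -> float:
    """Fraction of re-queries agreeing with the first label. Instability
    under resampling is one cheap red flag that the accompanying
    explanation cannot be taken at face value."""
    labels = [classify_with_explanation(post)[0] for _ in range(n)]
    return labels.count(labels[0]) / n
```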

LLMs may provide detailed explanations (putting aside the faithfulness aspect for now), but these explanations may not be straightforward or easily comprehensible to human users who seek to understand how the model actually arrived at them.

Instead, we aim to clarify definitions and claims, particularly within critical applications like mental healthcare. One avenue of research in the realm of LLM self-explanation involves engineering techniques or experimental testing that explain the significance of the model’s representations and draw intuitive conclusions about the quality of the generated explanations or representations. In mental health, relying solely on the model-generated explanation is insufficient: human judgment and clinical expertise should be integral to explaining and validating the results. When explaining the causes behind mental disorders, it is crucial to verify that the explanations provided by LLMs are accurate and evidence-based, and to carefully monitor and mitigate the potential for LLMs to generate stigmatizing or harmful explanations. Interpretability is a critical factor, especially in fields where decisions can profoundly affect individuals’ well-being [20].

More importantly, we call upon the computational research community in the field of mental health to focus on developing techniques that make these models more inherently interpretable, to rigorously define the knowledge being modeled within a mathematical framework, and to ground claims in theoretical analysis of the representational capacity, generalization, and robustness of neural networks. The black-box nature of the neural networks underlying LLMs underscores the need for transparency and validation in mental health applications, allowing clinicians and experts to trust and validate the results. Although the trade-off between interpretability and accuracy is still a matter of debate, the emphasis on interpretability can help ensure that LLMs become valuable tools in mental health while mitigating the risks associated with their black-box nature.

A promising trajectory for integrating LLMs into mental health applications is to ensure that their outputs align with clinical perspectives when interpreting and validating model predictions, and to develop specialized tools tailored for mental health professionals to comprehend the model. In this setup, data-driven methods like LLMs serve as a user interface that improves the overall usability of the mental health support system, while interpretable methods handle certain aspects of decision-making (Figure 3). An analogy is well-established diagnostic tools such as the nine-item Patient Health Questionnaire (PHQ-9) for assessing depressive symptoms [21]. Interpretable methods that foster an understanding of their inner workings enable users, especially mental health professionals, to grasp the rationale behind the model’s prediction.
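To illustrate the contrast the PHQ-9 analogy draws, here is a minimal sketch of an inherently interpretable scorer: the PHQ-9 total is a transparent sum of nine item scores (each 0–3), and the severity bands below follow the standard published cut-points. This is an illustration of auditable scoring logic, not a clinical tool.

```python
# Minimal sketch of an inherently interpretable scorer: the PHQ-9 total
# is a sum of nine item scores (0-3 each, total 0-27), mapped to the
# standard severity bands. Illustrative only, not a clinical tool.

PHQ9_BANDS = [
    (4, "minimal"),
    (9, "mild"),
    (14, "moderate"),
    (19, "moderately severe"),
    (27, "severe"),
]

def phq9_severity(item_scores: list[int]) -> tuple[int, str]:
    """Sum nine item scores and map the total to a severity band.

    Every step is auditable: a clinician can trace exactly which items
    drove the total, unlike a post-hoc LLM explanation."""
    if len(item_scores) != 9 or not all(0 <= s <= 3 for s in item_scores):
        raise ValueError("PHQ-9 expects nine item scores in [0, 3]")
    total = sum(item_scores)
    band = next(label for upper, label in PHQ9_BANDS if total <= upper)
    return total, band

# Example: phq9_severity([1, 2, 1, 0, 2, 1, 1, 0, 1]) -> (9, "mild")
```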

Reinforcement learning has been adopted to facilitate empathic conversations [38], to generate motivational and empathetic responses with long-term reward maximization [36], and to promote polite and empathetic counseling [29]. Reinforcement learning in combination with LLMs can enhance dialogue systems and reinforce counseling strategies in mental health. Ji [17] showed that language models struggle to comprehend user intentions and can inadvertently generate harmful or hateful content. In such cases, it becomes essential to employ contextual intent understanding, model intention awareness [8], and reason about the root causes of mental conditions, which can be used to enable empathetic conversational chatbots [25] and to generate responses with human-consistent empathetic intents [9]. These strategies highlight the importance of understanding why users turn to LLM-based counseling and what they anticipate, offering insights for the design and deployment of these systems and making them more humane and responsive.
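As a minimal sketch of the long-term reward maximization idea, the snippet below composes per-turn rewards from empathy, politeness, and safety scorers and discounts them over a dialogue. The keyword-based scorers, their weights, and the discount factor are illustrative assumptions, not the reward designs of the cited works; in practice each scorer would be a learned classifier.

```python
# Minimal sketch of long-term reward shaping for empathetic dialogue.
# The keyword-based scorers, weights, and discount factor are illustrative
# assumptions; real systems would use learned classifiers for each signal.

from dataclasses import dataclass

@dataclass
class Turn:
    user: str
    response: str

def empathy_score(turn: Turn) -> float:
    # Placeholder heuristic standing in for a learned empathy classifier.
    cues = ("i understand", "that sounds", "you feel")
    return sum(c in turn.response.lower() for c in cues) / len(cues)

def politeness_score(turn: Turn) -> float:
    # Placeholder standing in for a learned politeness classifier.
    return 1.0 if "please" in turn.response.lower() else 0.5

def safety_score(turn: Turn) -> float:
    # Placeholder standing in for a harmful-content detector; flagged
    # content zeroes the score so unsafe replies are never rewarded.
    flagged = ("hate",)
    return 0.0 if any(w in turn.response.lower() for w in flagged) else 1.0

def episode_return(dialogue: list[Turn], gamma: float = 0.95) -> float:
    """Discounted sum of per-turn rewards, crediting the policy for
    empathy sustained across the whole conversation rather than a
    single reply (long-term reward maximization)."""
    total = 0.0
    for t, turn in enumerate(dialogue):
        r = (0.5 * empathy_score(turn)
             + 0.3 * politeness_score(turn)
             + 0.2 * safety_score(turn))
        total += (gamma ** t) * r
    return total
```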