Evaluating the Efficacy of Interactive Language Therapy Based on LLM for High-Functioning Autistic Adolescent Psychological Counseling

Paper · arXiv 2311.09243 · Published November 12, 2023

significant emphasis was placed on the development of prompts used to guide the Large Language Model (LLM). This process was intricate and involved multiple stages to ensure that the prompts were effective, relevant, and aligned with the intended therapeutic objectives. To maintain clarity and consistency, each set of prompts was given an alias name, akin to how Bing Chat, codenamed Sydney, is identified. This naming convention facilitated easier reference and organization of the various prompt types used in the study.

Each prompt was designed to reflect the role of a counselor, embodying characteristics such as empathy, active listening, and the ability to provide guidance without leading or influencing the patient’s thoughts unduly. The language used was carefully chosen to be accessible and engaging for high-functioning autistic adolescents, ensuring that it was neither too simplistic nor overly complex. The prompts were crafted to encourage open-ended responses, allowing the LLM to demonstrate its capacity for generating conversational content that is both relevant and contextually appropriate. This aspect was crucial in evaluating the LLM’s potential as a therapeutic tool, as it needed to simulate the adaptive and responsive nature of a human counselor.
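As a rough illustration of how aliased prompt sets with a counselor role might be organized, the following minimal sketch keeps each set in a small registry keyed by its alias. The class `PromptSet`, the alias "Aster", and the prompt wording are all hypothetical and are not taken from the study; only the listed traits (empathy, active listening, non-leading guidance) come from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptSet:
    """One aliased prompt set (hypothetical structure, not the study's)."""

    alias: str  # hypothetical alias, not one used in the study
    role: str = "counselor"
    traits: tuple = ("empathy", "active listening", "non-leading guidance")

    def system_prompt(self) -> str:
        """Render an illustrative system prompt from the role and traits."""
        trait_list = ", ".join(self.traits)
        return (
            f"You are {self.alias}, acting as a {self.role} for a "
            f"high-functioning autistic adolescent. Show {trait_list}. "
            "Use accessible language and ask open-ended questions."
        )


# Registry keyed by alias so each prompt set can be referenced by name.
registry = {p.alias: p for p in (PromptSet(alias="Aster"),)}

prompt_text = registry["Aster"].system_prompt()
```

Looking a prompt set up by alias mirrors the naming convention described in the text; the actual prompts used in the study are not reproduced here.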

scorecard is designed for use by clinical psychologists and psychiatrists to systematically evaluate the performance of the LLM. It consists of 30 questions

Empathy and Understanding (Questions 1-5): These questions assess the LLM’s ability to demonstrate empathy and understanding. They focus on the model’s capacity to recognize and appropriately respond to the emotional states and needs of the hypothetical patient.
Communication Skills (Questions 6-10): This section evaluates the LLM’s effectiveness in communication. It includes questions on clarity of expression, appropriateness of language for the adolescent’s age and cognitive level, and the ability to maintain a coherent and relevant conversation.
Adaptability and Responsiveness (Questions 11-15): These questions measure the LLM’s adaptability to changing conversation dynamics and its responsiveness to new or unexpected inputs. This section is crucial for understanding how well the LLM can simulate the flexible nature of human therapeutic interactions.
Engagement and Motivation (Questions 16-20): This category assesses the LLM’s ability to engage the hypothetical patient and motivate them to participate in the therapy session. It includes questions on the model’s ability to sustain interest and encourage active participation.
Therapeutic Alliance (Questions 21-25): These questions evaluate the LLM’s capacity to establish a therapeutic alliance, a critical component of effective therapy. They focus on the model’s ability to build trust, rapport, and a sense of safety within the therapeutic interaction.
Overall Effectiveness (Questions 26-30): The final section provides an overall assessment of the LLM’s effectiveness as a therapeutic tool. It includes questions on the perceived value of the LLM in a therapeutic setting, its potential benefits for high-functioning autistic adolescents, and the likelihood of recommending such a tool in clinical practice.
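The six sections above partition the 30 questions into consecutive blocks of five. A minimal sketch of that mapping follows; the names `CATEGORIES` and `category_of` are illustrative assumptions, while the category titles and question ranges are those listed above.

```python
# Category titles and inclusive question ranges, as listed in the scorecard.
CATEGORIES = {
    "Empathy and Understanding": (1, 5),
    "Communication Skills": (6, 10),
    "Adaptability and Responsiveness": (11, 15),
    "Engagement and Motivation": (16, 20),
    "Therapeutic Alliance": (21, 25),
    "Overall Effectiveness": (26, 30),
}


def category_of(question_no: int) -> str:
    """Return the scorecard category for a question number (1-30)."""
    for name, (first, last) in CATEGORIES.items():
        if first <= question_no <= last:
            return name
    raise ValueError(f"question number out of range: {question_no}")
```

For example, `category_of(23)` returns "Therapeutic Alliance", and a number outside 1-30 raises `ValueError`.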

Overall, the LLM demonstrated a notable capacity for empathetic and understanding responses. It consistently recognized and appropriately responded to the emotional states presented in the simulated scenarios. The evaluators highlighted the model’s ability to validate the experiences and emotions of hypothetical patients, which is a crucial aspect of effective therapy. However, while the LLM showed competence in understanding and empathy, there were occasional lapses in maintaining this consistency, especially in more complex emotional scenarios.

No.  Question                                                              Score (1-5)
 1   Does the LLM demonstrate understanding of the patient’s feelings?
 2   Does the LLM respond empathetically to emotional cues?
 3   Is the LLM’s empathy consistent and appropriate?
 4   Does the LLM validate the patient’s experiences and emotions?
 5   Does the LLM encourage expression of feelings in a safe manner?
 6   Is the LLM’s communication clear and understandable?
 7   Does the LLM use age-appropriate language and concepts?
 8   Can the LLM maintain a coherent and relevant conversation?
 9   Does the LLM provide clear and concise information when needed?
10   Is the LLM’s language engaging and encouraging to the patient?
11   Can the LLM adapt its responses to changing conversation dynamics?
12   Does the LLM respond appropriately to new or unexpected inputs?
13   Is the LLM capable of redirecting the conversation when necessary?
14   Does the LLM demonstrate flexibility in its conversational approach?
15   Can the LLM adjust its language and style based on patient feedback?
16   Does the LLM effectively engage the patient in the session?
17   Does the LLM motivate the patient to participate actively?
18   Is the LLM capable of sustaining the patient’s interest?
19   Does the LLM encourage patient’s self-expression and autonomy?
20   Does the LLM foster a positive and encouraging session atmosphere?
21   Does the LLM build a sense of trust with the patient?
22   Is there a sense of rapport established by the LLM?
23   Does the LLM create a feeling of safety and acceptance?
24   Can the LLM maintain a consistent therapeutic presence?
25   Does the LLM respect the patient’s pace and boundaries?
26   How effective is the LLM as a therapeutic tool overall?
27   Does the LLM provide meaningful contributions to the therapy?
28   Is the LLM likely to benefit high-functioning autistic adolescents?
29   Would you recommend the use of an LLM in a clinical setting?
30   Does the LLM have potential for future therapeutic applications?

Table 1: Evaluation Scorecard for LLM in Interactive Language Therapy
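One plausible way to summarize completed scorecards is to average the 1-5 ratings within each block of five questions across all evaluators. The sketch below assumes that approach; the excerpt does not state how the study aggregated scores, and the function names and the two sample scorecards are invented purely for illustration.

```python
from statistics import mean

NUM_QUESTIONS = 30
QUESTIONS_PER_BLOCK = 5  # each scorecard section covers five questions


def validate(card: dict) -> None:
    """Check that a scorecard rates questions 1-30, each between 1 and 5."""
    if set(card) != set(range(1, NUM_QUESTIONS + 1)):
        raise ValueError("scorecard must rate questions 1-30 exactly once")
    for question, score in card.items():
        if not 1 <= score <= 5:
            raise ValueError(f"question {question}: score {score} not in 1-5")


def block_means(cards: list) -> dict:
    """Average ratings per five-question block over all evaluators."""
    if not cards:
        raise ValueError("at least one scorecard is required")
    for card in cards:
        validate(card)
    result = {}
    for first in range(1, NUM_QUESTIONS + 1, QUESTIONS_PER_BLOCK):
        last = first + QUESTIONS_PER_BLOCK - 1
        ratings = [card[q] for card in cards for q in range(first, last + 1)]
        result[f"Questions {first}-{last}"] = float(mean(ratings))
    return result


# Made-up ratings from two hypothetical evaluators (not data from the study).
card_a = {q: 4 for q in range(1, NUM_QUESTIONS + 1)}
card_b = {q: 2 for q in range(1, NUM_QUESTIONS + 1)}
means = block_means([card_a, card_b])
```

With these invented inputs every block averages to 3.0. A simple mean treats all questions and evaluators as equally weighted, which is itself an assumption rather than something the excerpt specifies.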

instances where its responses lacked the depth and personalization that might be expected in a human-led therapeutic session

The adaptability and responsiveness of the LLM were particularly noteworthy. It adeptly handled changing conversation dynamics and responded appropriately to new or unexpected inputs. This adaptability is essential in therapy, where client needs and topics can shift rapidly. The LLM’s flexibility in conversational approach was also commended, though some evaluators suggested further refinement to more closely mimic the nuanced understanding a human therapist might offer.

The establishment of a therapeutic alliance, a cornerstone of successful therapy, was an area of mixed results. While the LLM built a sense of trust and rapport in many cases, creating a feeling of safety and acceptance, there were instances where its digital nature seemed to limit the depth of connection achievable compared to a human therapist.

The majority agreed that the LLM could be a valuable addition to clinical settings, particularly as a supplementary tool to traditional therapy methods. The