Discourse-Level Representations can Improve Prediction of Degree of Anxiety
The primary clinical manifestation of anxiety is worry-associated cognitive distortions, which are likely expressed at the discourse level of semantics.
Discourse patterns of causal explanations, among others, were used significantly more by those scoring high in anxiety.
One of the key characteristics of anxiety disorders is cognitive distortion (Muran and Motta, 1993; Maric et al., 2011), or illogical reasoning in dealing with life events (Kaplan et al., 2017). The primary window into such distortions is language, including a person’s explanatory style – the way they reason about the occurrence of events (Peterson, 1991).
Explanatory style may not be well represented by single words or words in context (i.e., lexical-level features). For example, consider the catastrophizing statement (i.e., worrying that a bad event will lead to an extreme outcome) “I’m sick. Now I’m going to miss my classes and fail them all.” (Hazlett-Stevens and Craske, 2003). To see that “fail them all” is catastrophizing the event “I’m sick” requires understanding that the latter is a causal explanation for the expected falling behind. This is discourse-level information – semantics at the level of complete clausal statements or relating statements to each other (discourse relations) (Pitler et al., 2008).
Here, we propose a language-based assessment of anxiety utilizing both lexical-level and discourse-level representations. We first compare models that leverage discourse-level representations alone. We then propose a dual lexical- and discourse-level (lexico-discourse) approach and evaluate whether combining both types of representations improves performance. Finally, we explore specific types of discourse relations that are thought to be associated with cognitive distortions, and examine their association with anxiety to illuminate what our lexico-discourse approach can pick up on at the level of discourse semantics.
Our contributions include: (1) proposal of a novel user-level language assessment model that integrates both discourse-level and lexical-level representations; (2) empirical exploration of different discourse-level and lexical-level contextual embeddings and their value for predicting the degree of anxiety as continuous values; (3) examination of the association between a person’s anxiety and their discourse relation usage, finding that causal explanations are the most insightful for prediction; and (4) to the best of our knowledge, the first model of anxiety from language specifically fit against a screening survey.
We evaluate four discourse relations relevant to anxiety. A causal explanation is a statement of why an event happened. Using the model of Son et al. (2018), which has an F1 of approximately 0.87 on social media, we computed the percentage of each user's messages that contain a causal explanation. Counterfactuals imagine what could have happened as an alternative to actual events. Using the model of Son et al. (2017), we calculate the proportion of each user's messages that communicate counterfactual thoughts. Finally, dissonance refers to situations in which one’s stated behavior or belief contradicts a prior belief; consonance is its opposite. We use the RoBERTa-based topic-independent classifier that evaluates whether a pair of messages constitutes dissonance (Varadarajan et al., 2022, 2023). Instead of assessing all pairs, we take temporally adjacent messages (maximum distance of 2) to reduce computation time.
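The two bookkeeping steps described here, a per-user proportion of messages flagged by a relation classifier and the pairing of temporally adjacent messages, can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the function names, the `(user_id, timestamp, text)` message layout, and the keyword-based `toy_has_because` detector are all hypothetical stand-ins for the trained models cited above. The sketch also assumes that "maximum distance of 2" means pairing each message with the next one and the one after that in time order, which is only one possible reading.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical message layout: (user_id, timestamp, text).
Message = Tuple[str, int, str]


def relation_rate_per_user(
    messages: Iterable[Message],
    contains_relation: Callable[[str], bool],
) -> Dict[str, float]:
    """Return, for each user, the fraction of their messages flagged by the classifier."""
    totals: Dict[str, int] = defaultdict(int)
    hits: Dict[str, int] = defaultdict(int)
    for user_id, _timestamp, text in messages:
        totals[user_id] += 1
        if contains_relation(text):
            hits[user_id] += 1
    return {user: hits[user] / totals[user] for user in totals}


def adjacent_pairs(texts: List[str], max_distance: int = 2) -> List[Tuple[str, str]]:
    """Pair each message with the following ones up to max_distance positions away.

    Assumes `texts` is already sorted by time for a single user. This yields
    O(n * max_distance) pairs instead of the O(n^2) pairs of an all-pairs comparison.
    """
    pairs: List[Tuple[str, str]] = []
    for i in range(len(texts)):
        for offset in range(1, max_distance + 1):
            if i + offset < len(texts):
                pairs.append((texts[i], texts[i + offset]))
    return pairs


def toy_has_because(text: str) -> bool:
    """Placeholder detector (assumption): a keyword check, NOT a trained model."""
    return "because" in text.lower()


example_messages: List[Message] = [
    ("user_a", 1, "I failed because I was sick."),
    ("user_a", 2, "Nice weather today."),
    ("user_b", 1, "I stayed home because of the rain."),
]
rates = relation_rate_per_user(example_messages, toy_has_because)
pairs = adjacent_pairs(["m1", "m2", "m3", "m4"], max_distance=2)
```

In practice, `contains_relation` would wrap a trained causal-explanation or counterfactual classifier, and each pair from `adjacent_pairs` would be passed to a pairwise dissonance classifier.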
We see that all discourse dimensions were related to the score, but causal explanations, often related to overgeneralization, showed the largest difference (e.g., “You know life is going to be permanently complicated when your in-laws start turning their backs on you like a domino effect.”).