Modeling Interpersonal Linguistic Coordination in Conversations using Word Mover's Distance
Linguistic coordination is a well-established phenomenon in spoken conversations and is often associated with positive social behaviors and outcomes. While there have been many attempts to measure lexical coordination or entrainment in the literature, only a few have explored coordination in syntactic or semantic space. In this work, we attempt to combine these different aspects of coordination into a single measure by leveraging distances in a neural word representation space. In particular, we adopt the recently proposed Word Mover’s Distance with word2vec embeddings and extend it to measure the dissimilarity in language used in multiple consecutive speaker turns. To validate our approach, we apply this measure to two case studies in the clinical psychology domain. We find that our proposed measure is correlated with the therapist’s empathy towards their patient in Motivational Interviewing and with affective behaviors in Couples Therapy. In both case studies, our proposed metric exhibits higher correlation than previously proposed measures. When applied to couples with relationship improvement, the proposed measure also decreases significantly over the course of therapy, indicating higher linguistic coordination.
When people engage in conversations in social settings, they tend to coordinate with each other and show similar behavior in various modalities. This tendency, known as entrainment or coordination, is exhibited through facial expressions [1], head motion [2], vocal patterns (vocal entrainment) [3, 4], and the use of language (linguistic coordination) [5]. Linguistic coordination is a well-established phenomenon in both spoken and written communication that has many collaborative benefits. It is often associated with a wide range of positive social behaviors and outcomes, such as task success in collaborative games [6, 7], building effective dialogues [8] and rapport [9], engagement in tutoring scenarios [10], and successful negotiation [11].
Understanding and quantifying linguistic coordination is beneficial for characterizing interpersonal behavior in psychotherapy and for monitoring the quality and efficacy of therapy [12, 13]. Another potential application lies in spoken dialog systems and conversational agents, where the system can learn to use linguistic coordination to communicate efficiently with the human user and create a common ground [7].
According to Pickering and Garrod’s model [5], linguistic coordination has several different components: lexical, syntactic, and semantic. Among these, lexical entrainment has arguably received the most attention, primarily in psycholinguistics [14, 15]. While it is a complex and multifaceted phenomenon, a number of studies have explored specific forms of lexical entrainment, such as linguistic style matching [16], similarity in the choice of high-frequency words [6], similarity in referring expressions [15], and similarity in style words [17]. Researchers in computational linguistics have also tried to quantitatively measure lexical entrainment in conversational settings.