Neural Topic Modeling of Psychotherapy Sessions

Paper · arXiv 2204.10189 · Published April 13, 2022

During the session, the dialogue between the patient and therapist is transcribed into pairs of turns. We take the full record of a single patient, or of a cohort of patients with the same condition. We either use the transcript as is for feature extraction, or first truncate it into segments based on timestamps or topic turns. Once we have the features, we fit the topic models to them. The end result of the topic modeling is a list of weighted topic words that tells us what the text block is concerned with. This knowledge is usually informative and interpretable, and thus important in psychotherapy.

inform whether the therapy is going in the right direction, whether the patient is going into a certain bad mental state, or whether the therapist should adjust
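The timestamp-based truncation mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the `Turn` record, the `segment_by_time` helper, the 60-second window, and the sample utterances are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, NamedTuple


class Turn(NamedTuple):
    """One transcribed utterance (hypothetical record layout)."""
    start_sec: float  # time offset from the start of the session
    speaker: str      # "patient" or "therapist"
    text: str


def segment_by_time(turns: List[Turn], window_sec: float) -> List[List[Turn]]:
    """Group turns into consecutive, non-overlapping time windows.

    Windows that contain no turns are skipped.
    """
    buckets: Dict[int, List[Turn]] = {}
    for turn in turns:
        index = int(turn.start_sec // window_sec)
        buckets.setdefault(index, []).append(turn)
    return [buckets[k] for k in sorted(buckets)]


# Made-up sample session, for illustration only.
session = [
    Turn(0.0, "therapist", "How have you been sleeping?"),
    Turn(12.5, "patient", "Not well, I keep waking up."),
    Turn(65.0, "therapist", "What goes through your mind then?"),
    Turn(80.0, "patient", "Mostly work."),
]
segments = segment_by_time(session, window_sec=60.0)
```

Each resulting segment could then be passed to feature extraction on its own; segmenting on topic turns instead would need a different splitting rule.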

Some topics can also be off-limits, such as those in suicidal conversations, so if such terms arise from the topic modeling (say, a dynamic topic model), they can be flagged for the doctor to notice.
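One simple way to implement such flagging is to check the weighted topic words against a watch list. The sketch below assumes a hypothetical `flag_topics` helper, a hypothetical watch list, and made-up topic weights; the paper does not specify how flagging is done.

```python
from typing import Dict, List, Set, Tuple

WeightedWords = List[Tuple[str, float]]


def flag_topics(
    topic_words: Dict[int, WeightedWords],
    watch_list: Set[str],
    min_weight: float = 0.0,
) -> Dict[int, WeightedWords]:
    """Return, per topic, the watch-list words whose weight is at least min_weight."""
    flagged: Dict[int, WeightedWords] = {}
    for topic_id, weighted in topic_words.items():
        hits = [(w, wt) for w, wt in weighted if w in watch_list and wt >= min_weight]
        if hits:
            flagged[topic_id] = hits
    return flagged


# Made-up topic model output: topic id -> list of (word, weight).
topics = {
    0: [("sleep", 0.30), ("work", 0.25), ("tired", 0.10)],
    1: [("hopeless", 0.40), ("alone", 0.20), ("family", 0.15)],
}
watch = {"hopeless", "alone"}  # hypothetical clinician-defined watch list
flagged = flag_topics(topics, watch, min_weight=0.25)
```

Here only topic 1 is flagged, and only for "hopeless", because "alone" falls below the assumed weight threshold.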

For example, if we have learned 10 topics, the topic score is a 10-dimensional vector, with each dimension corresponding to some notion of the likelihood that this turn belongs to that topic. Because we want to characterize the directional property of each turn with respect to a given topic, we compute the cosine similarity between the embedded topic vector and the embedded turn vector, instead of directly inferring the probability as in the traditional topic assignment problem (which would be more suitable if we merely wanted to find the most likely topic). In the results section, we present the temporal modeling of the Embedded Topic Model (ETM), but this analytic pipeline can in principle be applied to any learned topic model.
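The topic score described above can be sketched in a few lines. The example assumes that the turn and every topic have already been embedded in the same vector space; the helper names `cosine` and `topic_scores` and the toy 2-dimensional vectors are hypothetical, and how the embeddings are produced is outside the sketch.

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def topic_scores(
    turn_vec: Sequence[float], topic_vecs: Sequence[Sequence[float]]
) -> List[float]:
    """One cosine score per topic, so K topics give a K-dimensional score vector."""
    return [cosine(turn_vec, topic_vec) for topic_vec in topic_vecs]


# Toy embeddings: three topics and one turn.
topic_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
turn_embedding = [2.0, 0.0]
scores = topic_scores(turn_embedding, topic_embeddings)
```

Unlike a probability distribution over topics, these scores lie in [-1, 1] and need not sum to one, which matches the directional reading described in the text. Computing the vector for each turn in order gives a time series of topic scores across the session.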