Improving Document-Level Sentiment Analysis with User and Product Context

Paper · arXiv 2011.09210 · Published November 18, 2020

“Document-level sentiment analysis aims to predict sentiment polarity of text that often takes the form of product or service reviews. Tang et al. (2015) demonstrated that modelling the individual who has written the review, as well as the product being reviewed, is worthwhile for polarity prediction, and this has led to exploratory work on how best to combine review text with user/product information in a neural architecture (Chen et al., 2016; Ma et al., 2017; Dou, 2017; Long et al., 2018; Amplayo, 2019; Amplayo et al., 2018). A feature common amongst past studies is that user and product IDs are modelled as embedding vectors whose parameters are learned during training. We take this idea a step further and represent users and products using the text of all the reviews belonging to a single user or product – see Fig. 1 (left).

There are two reasons to incorporate review text into user/product modelling. Firstly, the reviews from a given user will reflect their word choices when conveying sentiment. For example, a typical user might use words such as fantastic or excellent with correspondingly high ratings but another user could use the same words sarcastically with a low rating. Similarly, a group of users writing a review of the same product may use the same or similar opinionated words to refer to that product. Secondly, learning meaningful user and product embeddings that are only updated by back propagation is difficult when a user or product only has a small number of reviews, whereas one may still be able to glean something useful from the text of even a small number of reviews.”
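The idea in the excerpt, representing a user or product by the text of all of its reviews rather than by an ID embedding alone, can be illustrated with a toy sketch. This is not the authors' architecture: the paper describes a neural model, whereas the sketch below swaps in normalised word counts as a stand-in encoder. The review data and the names `REVIEWS`, `group_texts`, and `text_representation` are hypothetical and exist only for illustration.

```python
from collections import Counter, defaultdict

# Made-up reviews: (user_id, product_id, text, rating).
REVIEWS = [
    ("u1", "p1", "fantastic phone excellent battery", 5),
    ("u1", "p2", "excellent screen", 5),
    ("u2", "p1", "fantastic another broken phone", 1),
]


def group_texts(reviews):
    """Collect the review texts belonging to each user and to each product."""
    by_user, by_product = defaultdict(list), defaultdict(list)
    for user, product, text, _rating in reviews:
        by_user[user].append(text)
        by_product[product].append(text)
    return by_user, by_product


def text_representation(texts):
    """Stand-in encoder (an assumption, not the paper's model):
    normalised word counts over all of an entity's reviews."""
    counts = Counter(word for t in texts for word in t.lower().split())
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


by_user, by_product = group_texts(REVIEWS)

# Each user and product is now described by its review text, not only by an ID.
user_repr = {u: text_representation(ts) for u, ts in by_user.items()}
product_repr = {p: text_representation(ts) for p, ts in by_product.items()}
```

Note that `u2`, who has written only one review, still receives a non-trivial representation derived from that review's words. This mirrors the second reason given in the excerpt: review text can carry a usable signal even when a user or product has too few reviews for an ID embedding to be learned well.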