Man vs machine – Detecting deception in online reviews


This study focused on three main research objectives: analyzing the methods used to identify deceptive online consumer reviews, evaluating insights provided by multi-method automated approaches based on individual and aggregated review data, and formulating a review interpretation framework for identifying deception.

The theoretical framework is based on two critical deception-related models: information manipulation theory and self-presentation theory. The findings confirm the interchangeable characteristics of the various automated text analysis methods in drawing insights about review characteristics and underline their significant complementary aspects. An integrative multi-method model that approaches the data at the individual and aggregate level provides more complex insights regarding the quantity and quality of review information, sentiment, cues about its relevance and contextual information, perceptual aspects, and cognitive material.

Online reviews influence 93% of individuals in their market decisions, which underlines the importance of peer-generated digital content in consumer decisions.

There are numerous instances in which we encounter deception in consumer reviews, including businesses incentivizing consumers to write about their brand and competing brands, as well as using modern digital entrepreneurs and online reputation management companies to manage this process (Choi et al., 2017; Ivanova & Scholz, 2017; Luca & Zervas, 2016; Petrescu et al., 2022; Sahut, Iandoli, & Teulon, 2021). Deceptive communication takes different forms, including automatically filtering out negative reviews, misleading aggregation algorithms, artificially written fake reviews, and incentivized consumer comments, which makes it difficult for consumers to evaluate this type of information (Dellarocas, 2006; Hu et al., 2011, 2012; Moon, Kim, & Iacobucci, 2021; Munzel, 2016; Plotkina, Munzel, & Pallud, 2020).

However, despite a significant number of research studies in business and data science on fake review detection, there is still no consensus on the efficacy of automated text classification methods and the best approaches to be employed by marketers and consumers, especially regarding the use of real-life vs. artificial reviews, valence, and procedure of analysis (Cardoso, Silva, & Almeida, 2018; Hajek & Sahut, 2022). Researchers note that semantic meaning, context, and sentiment information are essential in evaluating deceptive reviews, as well as review and reviewer characteristics (Hajek & Sahut, 2022; Heydari et al., 2015). Nevertheless, there is also a need for comprehensive theoretical models that focus on textual characteristics as indicators of review authenticity vs. deception (Banerjee & Chua, 2017; Petrescu et al., 2022). This is especially important considering that humans have a much lower accuracy of deception detection in online reviews than automated tools, even when primed with information on cues of fake online reviews (Plotkina, Munzel, & Pallud, 2020).

Specialists are also working on theoretical approaches, algorithms, and models to assess the degree of deception based on lexical and semantic characteristics, including review length, complexity, readability, subjectivity, valence, and sentiment (Banerjee & Chua, 2017; Mukherjee et al., 2012; Ott et al., 2012).
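As a rough illustration of the lexical characteristics listed above, the following sketch computes review length, a crude complexity measure (average sentence length), lexical diversity, and a lexicon-based valence score. The function name `lexical_features` and the two mini-lexicons are assumptions made for this example; they are not taken from the study, and a real analysis would rely on validated dictionaries or trained sentiment models.

```python
import re

# Hypothetical mini-lexicons, for illustration only (not from the study).
POSITIVE = {"great", "amazing", "love", "perfect", "excellent"}
NEGATIVE = {"bad", "terrible", "awful", "broken", "worst"}


def lexical_features(review: str) -> dict:
    """Compute simple surface features of a single review."""
    words = re.findall(r"[a-z']+", review.lower())
    # Split on sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", review) if s.strip()]
    n_words = len(words)
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return {
        # Review length in words.
        "length": n_words,
        # Crude complexity proxy: average words per sentence.
        "avg_sentence_length": n_words / len(sentences) if sentences else 0.0,
        # Share of distinct words (type-token ratio).
        "lexical_diversity": len(set(words)) / n_words if n_words else 0.0,
        # Net share of positive minus negative lexicon words.
        "valence": (pos - neg) / n_words if n_words else 0.0,
    }


if __name__ == "__main__":
    print(lexical_features("Great phone. Love it!"))
```

Features of this kind describe a single review; they would need to be compared across many reviews before any of them could be read as a cue to deception.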

Deception is a deliberate act performed by manipulating information to create or maintain a belief that the communicator knows to be false (DePaulo et al., 2003; Munzel, 2015; Peng et al., 2016; Xiao & Benbasat, 2011). In marketing communications, consumers have specific expectations regarding the characteristics and quality of the message and its credibility, which can be exploited through deceptive content (McCornack, 1992; McCornack et al., 1996, 2014).

When it comes to factors that can help identify deceptive reviews, previous studies have mentioned a lack of details, emotional exaggeration, and variability in valence, as well as review length (Luca & Zervas, 2016; Moon, Kim, & Iacobucci, 2021; Ott et al., 2013). Reviews can be differentiated based on numerous factors, such as comprehensibility, specificity, exaggeration, and negligence, as well as on syntactic elements like structure and format, writing style, and readability (Wu et al., 2020). Information Manipulation Theory (McCornack, 1992; McCornack et al., 1992, 2014) and Self-Presentation Theory (Banerjee & Chua, 2017; DePaulo et al., 2003) provide the most comprehensive and integrative models of deception identification in consumption communication used in both manual and automated consumer review analysis and are used as the bases for the theoretical framework explored and tested in our analysis.

4.1. Information Manipulation Theory

Information Manipulation Theory (IMT) states that consumers manipulate information simultaneously along different dimensions, which can be identified based on quantity (amount of information), quality (details), relation (relevance), and manner (communication style) (McCornack, 1992; McCornack & Levine, 1990; McCornack et al., 1996). This theory emphasizes that most everyday deceptive discourse combines several deceptive elements, including adjusting the amount of relevant information shared, incorporating false information, using irrelevant information, and employing a vague manner of communication (McCornack, 1992; McCornack & Levine, 1990).
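One way to picture the four IMT dimensions in an automated setting is to attach a simple text proxy to each. The sketch below is only an illustration under strong assumptions: the function `imt_proxies`, the `VAGUE_TERMS` list, and the caller-supplied `topic_terms` are all hypothetical, and the "quality" proxy counts concrete numeric details rather than measuring veracity, which cannot be read off the text alone.

```python
import re

# Hypothetical list of vague wording, for illustration only.
VAGUE_TERMS = {"something", "stuff", "thing", "things", "somehow", "maybe"}


def imt_proxies(review: str, topic_terms: set) -> dict:
    """Return one crude proxy per IMT dimension for a single review."""
    words = re.findall(r"[a-z0-9']+", review.lower())
    n = len(words)
    if n == 0:
        return {"quantity": 0, "quality": 0.0, "relation": 0.0, "manner": 0.0}
    # Tokens containing a digit stand in for concrete details.
    detail = sum(any(ch.isdigit() for ch in w) for w in words)
    # Words that match the product or service vocabulary supplied by the caller.
    on_topic = sum(w in topic_terms for w in words)
    # Vague wording as a stand-in for an unclear manner of communication.
    vague = sum(w in VAGUE_TERMS for w in words)
    return {
        "quantity": n,             # amount of information
        "quality": detail / n,     # share of concrete details
        "relation": on_topic / n,  # share of on-topic words (relevance)
        "manner": vague / n,       # share of vague words (style)
    }


if __name__ == "__main__":
    text = "The battery lasted 10 hours, maybe more stuff."
    print(imt_proxies(text, {"battery", "hours"}))
```

Keeping the four proxies separate, rather than collapsing them into one score, mirrors the theory's point that information can be manipulated along several dimensions at once.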

The authors found that alterations in the amount, veracity, relevance, and clarity of information affect perceived message deceptiveness (McCornack, 1992; McCornack & Levine, 1990). In this context, studies also found that consumers have a minimal capacity for detecting deception because of a significant truth bias (Levine, Park, & McCornack, 1999; Plotkina, Munzel, & Pallud, 2020).

An updated version of IMT (McCornack et al., 2014) presents a propositional, testable theory of deceptive discourse production. It enriches the theoretical framework and brings it up to date for modern digital communication by integrating elements of linguistics, cognitive neuroscience, speech production, and artificial intelligence. This new version focuses on individual intentional states, cognitive load, and information manipulation, including the intentional nature of deception, and is complemented by self-presentation theory.

4.2. Self-Presentation Theory

Self-presentation theory focuses on how individuals control the way they present themselves and try to shape others’ opinions, based on controlled information about themselves, other individuals, and events. From the self-presentation perspective, the authors assume that cues to deception are generally weak and that authentic messages differ from deceptive ones as a function of perceptual aspects (such as sensory information), contextual details (such as information related to location and time), and cognitive information.
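The three kinds of cues named here (perceptual, contextual, and cognitive) are often operationalized with category dictionaries. The sketch below shows the idea with a tiny made-up dictionary; `CUE_LEXICON` and `cue_rates` are hypothetical names, the word lists are not those used in the study, and because such cues are described as generally weak, the resulting rates are best read as descriptive rather than as a classifier.

```python
import re
from collections import Counter

# Hypothetical category dictionary, for illustration only.
CUE_LEXICON = {
    "perceptual": {"saw", "looked", "smelled", "tasted", "heard", "felt"},
    "contextual": {"yesterday", "morning", "evening", "week", "lobby", "airport"},
    "cognitive": {"think", "thought", "realized", "because", "expected"},
}


def cue_rates(review: str) -> dict:
    """Return the share of words in a review that fall into each cue category."""
    words = re.findall(r"[a-z']+", review.lower())
    counts = Counter()
    for word in words:
        for category, terms in CUE_LEXICON.items():
            if word in terms:
                counts[category] += 1
    total = len(words) or 1  # avoid division by zero for empty reviews
    return {category: counts[category] / total for category in CUE_LEXICON}


if __name__ == "__main__":
    print(cue_rates("Yesterday morning the lobby smelled fresh because staff cleaned"))
```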

Deceptive communicators are less forthcoming and provide fewer details and less information.

Consumers may also strategically build an appearance of naturalness to signal low effort to their audience in their interpersonal communication and self-presentation.

Consumers often engage in self-presentational tactics with the purpose of manipulating and misrepresenting themselves to achieve positive outcomes.