Truth or lie: Exploring the language of deception


We investigated whether linguistic features that differentiate true and false utterances in English (namely utterance length, concreteness, and particular parts of speech) are also present in the Polish language. We analyzed nearly 1,500 true and false statements, half of which were transcripts of spoken statements and the other half written statements. Our results show that false statements are less complex in terms of vocabulary, are more concise and concrete, and contain more positive words and fewer negative words. We found no significant differences between spoken and written lies. Using this data, we built classifiers to automatically distinguish true from false utterances, achieving an accuracy of 60%.

In writing, liars have more time to plan their utterances and can edit them. In some studies where only written statements were analyzed, this affected—among other things—their length: subjects used more words when lying than when they were telling the truth [13].

Parts of speech and over-generalizations. We tested whether liars distance themselves from their lies by using fewer first-person pronouns and more third-person pronouns.

Another way to distance oneself from lies is to use over-generalizations. We measured them by matching the words in each statement against word lists created specifically for this variable and counting how often they occurred. We hypothesized that false statements, both written and oral, would contain more over-generalizations than true statements.
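
The word-list matching described above can be sketched roughly as follows. This is a minimal illustration, not the study's implementation: the word list `OVERGENERALIZATIONS`, the helper names `tokenize` and `lexicon_rate`, and the sample sentence are all assumptions made for the example, and the study's actual lists are not reproduced here. Because Polish is heavily inflected, a real pipeline would probably match lemmas rather than surface forms; this sketch matches exact lowercase tokens only.

```python
import re

# Hypothetical, incomplete word list for illustration only
# (always, never, everyone, every, nobody, everywhere).
OVERGENERALIZATIONS = {"zawsze", "nigdy", "wszyscy", "każdy", "nikt", "wszędzie"}


def tokenize(text):
    """Lowercase the text and split it into word tokens (Unicode-aware)."""
    return re.findall(r"\w+", text.lower())


def lexicon_rate(text, lexicon):
    """Return (number of lexicon hits, hits per token) for one statement."""
    tokens = tokenize(text)
    if not tokens:
        return 0, 0.0
    hits = sum(1 for token in tokens if token in lexicon)
    return hits, hits / len(tokens)


# Made-up example statement, not taken from the study's data.
statement = "Zawsze wracam do domu o tej samej porze i nigdy się nie spóźniam."
hits, rate = lexicon_rate(statement, OVERGENERALIZATIONS)
print(f"hits={hits}, rate={rate:.3f}")
```

Dividing the hit count by the number of tokens keeps the measure comparable across statements of different lengths, which matters here because statement length itself is one of the features under study.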