Can Authorship Representation Learning Capture Stylistic Features?

Paper · arXiv 2308.11490 · Published August 22, 2023

“Knowing something about an author’s writing style is helpful in many applications, such as predicting who the author is, determining which passages of a document the author composed, rephrasing text in the style of another author, and generating new text in the style of a particular author. The trouble is that fully characterizing something as complex as writing style has proven too unwieldy to admit fine-grained human annotations, which leaves the possibility of directly learning explicit and interpretable representations of writing style practically beyond reach. Instead, research in this area has largely focused on specific stylistic attributes, such as formality, toxicity, politeness, gender, simplicity, and humor, which are more straightforward to annotate (Rao and Tetreault, 2018; Pavlopoulos et al., 2020; Madaan et al., 2020; Li et al., 2018; Jin et al., 2022). Unfortunately, the reliance on human labels and the narrow focus of such stylistic distinctions severely limit the utility of such representations in tasks related to authorship, such as those listed above.

...we propose targeted interventions to probe representations learned for the surrogate authorship prediction task. First, we explore masking content words at training time in §5, an operation intended to gauge the degree to which a representation relies on content. Then we explore automatic paraphrasing in §6, an operation intended to preserve meaning while modifying how statements are expressed. Finally, in §7 we explore the capacity of these representations to generalize to unseen tasks, specifically topic classification and coarse style prediction.

Taken together, and despite approaching the research question from various points of view, our experiments suggest that representations derived from the authorship prediction task are indeed substantially stylistic in nature. In other words, success at authorship prediction may in large part be explained by having successfully learned discriminative features of writing style. The broader implications of our findings are discussed in §8.”
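
To make the content-masking intervention mentioned in the excerpt more concrete, here is a minimal Python sketch. It is not the paper's procedure: the function name `mask_content_words`, the `[MASK]` placeholder, and the small hand-written `FUNCTION_WORDS` list are all assumptions made for illustration, and a real experiment would more plausibly use a part-of-speech tagger to decide which tokens count as content.

```python
import re

# Hypothetical, deliberately tiny function-word list (an assumption for this
# sketch; it is not taken from the paper).
FUNCTION_WORDS = frozenset({
    "a", "an", "the", "and", "or", "but", "if", "of", "in", "on", "at", "to",
    "for", "with", "by", "from", "as", "is", "are", "was", "were", "be",
    "been", "it", "this", "that", "these", "those", "i", "you", "he", "she",
    "we", "they", "not", "no", "so", "very", "do", "does", "did", "have",
    "has", "had",
})

# Words (with an optional apostrophe suffix), digit runs, or single
# punctuation marks.
TOKEN_RE = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)?|\d+|[^\w\s]")


def mask_content_words(text, mask_token="[MASK]"):
    """Replace every word or number that is not a known function word.

    Function words and punctuation are kept, since they carry much of the
    stylistic signal; everything else is treated as content and masked.
    """
    masked = []
    for token in TOKEN_RE.findall(text):
        is_word_or_number = token[0].isalnum()
        if is_word_or_number and token.lower() not in FUNCTION_WORDS:
            masked.append(mask_token)
        else:
            masked.append(token)
    return " ".join(masked)


if __name__ == "__main__":
    print(mask_content_words("The cat sat on the mat, and it was very happy!"))
    # prints: The [MASK] [MASK] on the [MASK] , and it was very [MASK] !
```

If a representation trained on text masked in this way still separates authors well, that would be consistent with the excerpt's claim that the learned features are substantially stylistic rather than topical.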