
Can statistical rarity measure whether stories are truly original?

Can we operationalize originality as statistical rarity in narrative feature space? This matters because copyright protection requires sufficient human creative control, which needs some way to be measured, but rarity is relative, context-dependent, and guarantees neither quality nor authorship.

Note · 2026-05-28 · sourced from Co Writing Collaboration

As AI seeps into writing, the question of what counts as original work shifts from how a story is written to how it is conceived. StoryScope proposes a concrete operationalization: represent each story as a vector of discourse-level narrative features and treat statistical rarity in that space as a proxy for originality. On this view, less common combinations of narrative decisions reflect the broader notion of originality invoked by creativity research (Torrance) and by copyright law, which requires a minimal degree of originality and, per recent U.S. Copyright Office guidance, sufficient human creative control. The empirical hook: human stories are, on average, rarer in narrative feature space, while the five AI models studied occupy a tight, well-separated cluster.
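The operationalization above can be sketched in a few lines. This is a minimal illustration, not StoryScope's actual measure: it assumes each story is a dictionary of categorical narrative features (the feature names `narrator`, `ending`, and `timeline` are hypothetical), that every reference story carries every feature, and that rarity is scored as mean surprisal of a story's feature values under a reference corpus.

```python
import math
from collections import Counter


def fit_feature_counts(reference):
    """Count how often each value of each feature occurs in the reference corpus."""
    counts = {}
    for story in reference:
        for feature, value in story.items():
            counts.setdefault(feature, Counter())[value] += 1
    return counts


def rarity(story, counts, n_reference, alpha=1.0):
    """Mean surprisal (in bits) of a story's feature values under the reference.

    Assumption: features are scored independently. Add-alpha smoothing keeps
    unseen values finite; one extra vocabulary slot is reserved for them.
    """
    total = 0.0
    for feature, value in story.items():
        seen = counts.get(feature, Counter())
        vocab = len(seen) + 1
        p = (seen[value] + alpha) / (n_reference + alpha * vocab)
        total += -math.log2(p)
    return total / len(story)


# Hypothetical toy reference corpus: three conventional stories, one unusual one.
reference = [
    {"narrator": "third_person", "ending": "resolved", "timeline": "linear"},
    {"narrator": "third_person", "ending": "resolved", "timeline": "linear"},
    {"narrator": "third_person", "ending": "resolved", "timeline": "linear"},
    {"narrator": "first_person", "ending": "open", "timeline": "nonlinear"},
]
counts = fit_feature_counts(reference)

common = {"narrator": "third_person", "ending": "resolved", "timeline": "linear"}
uncommon = {"narrator": "first_person", "ending": "open", "timeline": "nonlinear"}

common_score = rarity(common, counts, len(reference))
uncommon_score = rarity(uncommon, counts, len(reference))
print(f"common: {common_score:.2f} bits, uncommon: {uncommon_score:.2f} bits")
```

Scoring features independently is the simplest choice; it ignores the rarity of combinations of decisions, which is arguably the more interesting signal, so a density or nearest-neighbor estimate over the full vector may be closer to what the construct intends.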

This is appealing because it converts a contested legal-aesthetic concept into something measurable and model-agnostic. Rarity does not depend on surface style (which survives the humanization edit) and aligns with the intuition that originality is about making uncommon choices, not novel word combinations. It also gives the copyright question an operational handle: a work's position in narrative-decision space could index how much distinctive human conception it carries.

Why it stays a question: rarity-as-originality is a proxy with sharp limits. Rarity is defined relative to a reference distribution, so it drifts as both human and AI writing change. Rare is also not the same as good or protectable; an incoherent story can be statistically rare. Conflating "uncommon in feature space" with "originally authored by a human" risks both false positives (idiosyncratic AI output) and false negatives (a human writing in a popular convention). The construct is a useful, falsifiable starting point for measuring conception rather than execution, but whether it should bear legal or evaluative weight is exactly what it leaves open.
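The dependence on a reference distribution can be made concrete with a toy example. The sketch below is an assumption-laden illustration, not the paper's method: it scores a story by the surprisal of its exact feature combination, using hypothetical `(narrator, timeline)` tuples, and compares the same story against two invented reference corpora.

```python
import math
from collections import Counter


def combination_rarity(story, reference, alpha=1.0):
    """Surprisal (in bits) of a story's exact feature combination in a reference corpus."""
    counts = Counter(reference)
    # Add-alpha smoothing, with one extra slot reserved for unseen combinations.
    p = (counts[story] + alpha) / (len(reference) + alpha * (len(counts) + 1))
    return -math.log2(p)


# Hypothetical (narrator, timeline) combinations.
story = ("second_person", "nonlinear")
conventional = ("third_person", "linear")

# Two invented reference corpora: before and after the unusual form becomes popular.
reference_then = [conventional] * 9 + [story] * 1
reference_now = [conventional] * 5 + [story] * 5

score_then = combination_rarity(story, reference_then)
score_now = combination_rarity(story, reference_now)
print(f"then: {score_then:.2f} bits, now: {score_now:.2f} bits")
```

The story itself is unchanged between the two calls; only the reference corpus moves, and the score falls with it. Any use of such a score would therefore need to state which corpus, and which time period, it was computed against.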


— "StoryScope: Investigating idiosyncrasies in AI fiction", https://arxiv.org/abs/2604.03136

Original note title

originality can be operationalized as statistical rarity in a feature space of narrative decisions