AI Social Psychology Language Understanding and Reasoning

Can AI stories be detected without analyzing writing style?

Explores whether discourse-level narrative structures like character agency and plot organization reveal AI authorship independently of surface stylistic cues, and whether such structural features resist the kind of fine-tuning that defeats style-based detection.

Note · 2026-05-28 · sourced from Co Writing Collaboration

Most AI-text detection rides on surface signatures: word choice, syntactic structure, the overused em-dash, "delve," "tapestry." These cues are discriminative but fragile: GPT 5.4 cut em-dash usage, and fine-tuning to mimic human style drops detection on creative writing from 97% to 3%. StoryScope asks a different question: can AI stories be told apart without stylistic signals, using only discourse-level narrative choices like character agency and chronological structure? Across a parallel corpus of 10,272 prompts (each answered by a human and five LLMs, 61,608 stories of ~5,000 words), narrative features alone reach 93.2% macro-F1 for human-vs-AI detection, retaining over 97% of the performance of models that also include stylistic cues.
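The headline number is a macro-F1, which averages per-class F1 without weighting by class size. That matters when each prompt has one human story against five LLM stories. The sketch below is a minimal, from-scratch illustration of the metric, not StoryScope's evaluation code; the function name `macro_f1` and the toy labels are assumptions invented for this example.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); define it as 0 when the class never appears
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data (hypothetical): one human story and five AI stories for one prompt.
y_true = ["human"] + ["ai"] * 5

# A degenerate detector that labels everything "ai".
y_always_ai = ["ai"] * 6

accuracy = sum(t == p for t, p in zip(y_true, y_always_ai)) / len(y_true)
print(f"accuracy: {accuracy:.3f}")
print(f"macro-F1: {macro_f1(y_true, y_always_ai):.3f}")
```

With a 1:5 class ratio, the always-"ai" detector scores high on accuracy but poorly on macro-F1, because its F1 on the human class is zero. A high macro-F1 therefore implies the detector separates both classes, not just the majority one.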

The consequential part is the durability argument. Surface style is a post-hoc edit away from concealment; discourse-level narrative structure is not. Changing whether a protagonist's choices are morally ambiguous, or whether a plot runs on a single tidy track versus a nonlinear one with flashbacks, requires structural rewrites rather than find-and-replace. So the features that survive humanization are precisely the ones tied to how a story is conceived, not how its sentences are dressed.

Why it matters: this reframes AI detection from a stylometric arms race into a structural one, and it relocates the question of authorship. If models keep closing the surface-style gap while their narrative choices stay distinct, then detection — and, downstream, the legal question of originality — should attach to discourse structure. The counterpoint is that narrative features are themselves learnable targets; nothing prevents future training from diversifying discourse-level choices, which would erode this signal too, just more slowly than style erodes.


— "StoryScope: Investigating idiosyncrasies in AI fiction", https://arxiv.org/abs/2604.03136

Original note title

ai fiction is distinguishable by discourse-level narrative choices not surface style which resists humanization