Can rarity in feature space distinguish human authorship from AI output reliably?

This explores whether you can tell human writing from AI by measuring how *unusual* a text is — the idea that humans land in rarer, more scattered regions of some feature space while AI clusters in the predictable middle.

Here "rarity" means being statistically unusual in some measured feature space. The corpus says the signal is real and, in places, surprisingly strong, but "reliably" hides two very different questions: reliable for a machine, or reliable for a person?

The rarity idea is developed most explicitly in narrative work. StoryScope operationalizes originality as statistical rarity in *discourse-level* decisions, such as character agency and how time is sequenced, and finds that human stories occupy measurably rarer regions of that space while AI outputs cluster tightly (see: Can statistical rarity measure whether stories are truly original?). Crucially, the discriminating features aren't surface style: the same approach reaches 93.2% accuracy using structure alone, retaining 97% of its performance with stylistic cues removed, and that structure resists "humanization" because faking it requires a rewrite, not a word swap (see: Can AI stories be detected without analyzing writing style?). So rarity isn't just a statistical artifact; it points at something AI structurally does differently.
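One common way to turn "rarity in feature space" into a number is the mean distance from a text's feature vector to its nearest neighbours in a reference set. The sketch below assumes that approach; it is not StoryScope's actual method, and the function name `knn_rarity`, the two features, and all values are hypothetical illustrations.

```python
import math

def knn_rarity(point, reference, k=3):
    """Mean Euclidean distance from `point` to its k nearest reference points.

    Higher values mean the point sits in a sparser (rarer) region
    relative to the reference set.
    """
    if not 1 <= k <= len(reference):
        raise ValueError("k must be between 1 and len(reference)")
    dists = sorted(math.dist(point, ref) for ref in reference)
    return sum(dists[:k]) / k

# Hypothetical 2-D features, e.g. (character-agency score, share of
# non-chronological scenes). Values are invented for illustration only.
ai_cluster = [(0.50, 0.50), (0.52, 0.48), (0.49, 0.51), (0.51, 0.50), (0.50, 0.49)]

typical = (0.505, 0.495)  # sits inside the tight cluster
outlier = (0.90, 0.10)    # far from the cluster

rarity_typical = knn_rarity(typical, ai_cluster)
rarity_outlier = knn_rarity(outlier, ai_cluster)
print(f"typical: {rarity_typical:.3f}  outlier: {rarity_outlier:.3f}")
```

A real detector would also need a threshold calibrated on held-out human and AI texts; the score alone only ranks texts by how far they sit from the reference cluster.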

Why would AI cluster? Because models converge. INFINITY-CHAT's analysis of 70+ models across 26K open-ended queries found an "Artificial Hivemind": different systems independently produce strikingly similar outputs because they share training data and alignment procedures (see: Do different AI models actually produce diverse outputs?). That convergence is the flip side of rarity: if all the models pile into the same region, anything *outside* that region reads as human. The gap is also measurable along concrete axes. ChatGPT text differs from human writing on six dimensions of lexical diversity (volume, abundance, variety, evenness, disparity, and dispersion), and the differences are significant under MANOVA (see: Can human judges detect measurable differences in AI text?).
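To make "lexical diversity" concrete, here is a minimal sketch of two widely used proxies: type-token ratio for variety, and normalized Shannon entropy for evenness. These are assumptions chosen for illustration and may not match the cited study's definitions of its six dimensions; the helper `lexical_profile` and the sample sentences are hypothetical.

```python
import math
from collections import Counter

def lexical_profile(text):
    """Return simple lexical-diversity proxies for a whitespace-tokenized text."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    n_tokens = len(tokens)
    n_types = len(counts)
    # Variety proxy: distinct words divided by total words.
    ttr = n_types / n_tokens if n_tokens else 0.0
    # Evenness proxy: Shannon entropy of word frequencies, scaled to [0, 1].
    if n_types > 1:
        entropy = -sum((c / n_tokens) * math.log(c / n_tokens) for c in counts.values())
        evenness = entropy / math.log(n_types)
    else:
        evenness = 0.0
    return {"tokens": n_tokens, "types": n_types, "ttr": ttr, "evenness": evenness}

repetitive = "the cat sat on the mat and the cat sat on the mat"
varied = "a grey cat dozed while rain drummed softly against old windows"

profile_repetitive = lexical_profile(repetitive)
profile_varied = lexical_profile(varied)
print(profile_repetitive)
print(profile_varied)
```

Type-token ratio is sensitive to text length, so comparisons are only meaningful between samples of similar size.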

Here's the catch that should reframe the whole question. "Reliably" for a *detector* and "reliably" for a *human reader* are not the same thing. The same measurable lexical differences that machines catch are invisible to people: even trained linguists and NLP researchers can't reliably tell the two apart, and newer models diverge *further* statistically while getting *harder* for humans to spot (see: Can humans detect AI text if machines can measure it?). So rarity in feature space can be a reliable machine signal precisely *because* it's imperceptible to us. The pattern extends from fiction to argument: cheap, interpretable linguistic features combined with argument-quality measures reach 99% accuracy at spotting LLM-generated counter-arguments on r/ChangeMyView, because AI leaves signatures such as over-accommodation to the prompt and textbook-clean argument markers that humans don't bother to produce (see: Can simple linguistic features detect AI-written arguments?).

The honest answer: rarity is a strong, reliable signal *today*, but it's a moving target, not a permanent fingerprint. It works because current models converge into a measurable common region; it's defined relative to that region, so it drifts as models change. And it tells you a text is statistically machine-typical, not that the idea behind it is hollow. The corpus's deeper claim is that AI lacks the *event structure* of genuine utterance and that we supply the missing intent ourselves (see: Does AI generate genuine utterances or just text patterns?). Rarity may be the closest measurable shadow of that absence, but it's a shadow, not the thing itself.
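The "moving target" point can be illustrated with a toy calculation: the same text gets a very different rarity score depending on which model cluster it is measured against. This sketch assumes rarity is measured as distance from the cluster centroid; the function `centroid_distance` and every coordinate are hypothetical, not taken from any cited study.

```python
import math
from statistics import mean

def centroid_distance(point, reference):
    """Euclidean distance from `point` to the centroid of the reference cluster."""
    centroid = [mean(axis) for axis in zip(*reference)]
    return math.dist(point, centroid)

# Invented feature vectors for two generations of models.
old_models = [(0.50, 0.50), (0.52, 0.48), (0.48, 0.52)]
new_models = [(0.80, 0.20), (0.82, 0.18), (0.78, 0.22)]

# One fixed human-written text, unchanged between measurements.
human_text = (0.85, 0.15)

score_vs_old = centroid_distance(human_text, old_models)
score_vs_new = centroid_distance(human_text, new_models)
print(f"vs old cluster: {score_vs_old:.3f}  vs new cluster: {score_vs_new:.3f}")
```

Under these made-up numbers the text looks rare against the older cluster and ordinary against the newer one, which is why any threshold would need recalibrating as the reference models change.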

Sources (7 notes)

Can statistical rarity measure whether stories are truly original?

StoryScope operationalizes originality as statistical rarity in discourse-level narrative decisions. Human stories are measurably rarer in this space than AI outputs, which cluster tightly, offering a quantifiable proxy for the human conception copyright law requires.

Can AI stories be detected without analyzing writing style?

StoryScope achieved 93.2% accuracy separating AI from human fiction using only discourse-level features like character agency and chronological structure, retaining 97% of performance while eliminating stylistic cues. These structural choices resist humanization because they require rewrites, not surface edits.

Do different AI models actually produce diverse outputs?

INFINITY-CHAT analyzed 70+ models across 26K open-ended queries and found an "Artificial Hivemind" effect: models independently generate strikingly similar or identical responses due to overlapping training data and alignment procedures, undermining the diversity benefits of model ensembles.

Can human judges detect measurable differences in AI text?

Six-dimension MANOVA analysis confirms significant differences between ChatGPT and human writing across vocabulary volume, abundance, variety, evenness, disparity, and dispersion. Despite these robust statistical differences, human judges including linguists and NLP researchers fail to reliably distinguish AI from human text.

Can humans detect AI text if machines can measure it?

LLM-generated text differs significantly on six lexical diversity dimensions, confirmed through statistical analysis across multiple models. Yet human judges, including trained linguists, cannot reliably detect these differences—and newer models diverge further while becoming harder to spot.

Can simple linguistic features detect AI-written arguments?

General linguistic features combined with argument-quality measures achieved 99% accuracy detecting LLM-generated counter-arguments on r/ChangeMyView, matching heavyweight neural detectors while remaining computationally cheap and transparent. LLMs produce detectable stylistic signatures: accommodation to prompts and textbook-quality argument markers that humans don't replicate.

Does AI generate genuine utterances or just text patterns?

AI output carries communicative markers inherited from training data but lacks the event structure that produces actual utterances. Users supply the missing orientation through interpretive labor, creating a pseudo-event with structure only on the human side.

Research prompt for your LLM

Copy into ChatGPT or Claude to take this line of inquiry further — it asks the model to find newer work and re-test which earlier constraints still hold.

You are an AI capability analyst tasked with re-evaluating a year-old research question: **Can rarity in feature space distinguish human authorship from AI output reliably?**

What a curated library found — and when (findings span 2024–2026; treat as dated claims, not current truth):
• Human narratives occupy statistically rarer regions in discourse-level feature space (character agency, temporal sequencing); AI clusters tightly. Discriminating accuracy ~93% using structure alone, resistant to surface humanization (StoryScope, 2026).
• Different LLMs independently converge on similar outputs across 26K open-ended prompts — the "Artificial Hivemind" — meaning rarity-as-distance-from-AI-cluster is a function of model convergence, not an invariant signature (2025).
• Six measurable dimensions of lexical diversity robustly separate LLM from human text statistically, yet trained linguists and NLP researchers fail to detect these differences in blind evaluation; newer models diverge *further* statistically while becoming *harder* for humans to perceive (2025).
• Lightweight interpretable linguistic features (e.g., over-accommodation to prompt, textbook-clean argument markers) achieve 99% accuracy detecting LLM-generated counter-arguments — AI leaves signatures humans don't bother producing (2024).
• The signal may reflect AI's lack of genuine *event structure* and utterance intent — rarity is a measurable shadow of that absence, not the thing itself (synthesis, 2026).

Anchor papers (verify; mind their dates):
• arXiv:2510.22954 — Artificial Hivemind (2025)
• arXiv:2604.03136 — StoryScope: AI fiction idiosyncrasies (2026)
• arXiv:2508.00086 — Lexical diversity in LLM vs. human text (2025)
• arXiv:2404.00750 — LLM argument recognition (2024)

Your task:
(1) **RE-TEST EACH CONSTRAINT.** For every finding above, judge whether newer models (o1, Claude 3.5, Llama 3.3), training refinements (process-based reward modeling, constitutional AI), tooling (fine-tuning harnesses, adaptive sampling), or orchestration (chain-of-thought, multi-step reasoning, memory systems) have since RELAXED or OVERTURNED it. Separate the durable question (likely: *Do models intrinsically converge?*) from the perishable limitation (possibly: *Can rarity detectors outpace model divergence?*). Cite what resolved it; flag where constraints still hold.
(2) Surface the strongest **CONTRADICTING or SUPERSEDING work** from the last ~6 months (if any) showing either that rarity fails, or that newer models escape clustering, or that imperceptibility has flipped.
(3) Propose 2 research questions that **ASSUME the regime may have moved**: e.g., *If models now diverge structurally (not just statistically), is rarity obsolete or reborn in a new feature space?* or *Can rarity detectors be made robust to future model shifts?*

Cite arXiv IDs; flag anything you cannot ground in a real paper.
