Can prompted or fine-tuned models generate genuine narrative ambiguity?
This explores whether prompting or fine-tuning can get an LLM to produce real narrative ambiguity — open-ended, morally unresolved storytelling — or whether the architecture pushes models toward tidy, over-explained plots no matter how you steer them.
Whether prompting or fine-tuning can get a model to write genuine narrative ambiguity is an open question, and the corpus suggests the obstacle is deeper than craft. The most direct evidence is a feature-level analysis of AI versus human fiction: across 304 narrative features (reduced to 30 core signals) and all five major models tested, AI stories systematically over-explain their themes, favor tidy single-track plots, and avoid moral ambiguity, while human stories lean on temporal complexity and nonlinear structure (see "Do AI stories explain their themes more than human stories do?"). So the default behavior is the opposite of ambiguity: the model resolves what a human writer would leave hanging.
The interesting question is whether prompting fixes that. Two notes say the ceiling is real. Prompt optimization can reorganize and surface what a model already learned, but it cannot inject capacities the training distribution lacks; there is a hard ceiling no prompt strategy climbs past (see "Can prompt optimization teach models knowledge they lack?"). And models tend to override their context with strong parametric priors: when training-time associations are confident, textual instructions alone don't redirect the output, and only intervention in the model's internal representations does (see "Why do language models ignore information in their context?"). If 'resolve the plot, explain the theme' is baked in as a prior, asking nicely for ambiguity may not be enough.
There's a subtle twist, though. Ambiguity isn't only an authoring choice; it's structurally close to how these models already work. Shanahan's 20-questions regeneration test shows an LLM doesn't commit to one character or object; it holds a superposition of consistent possibilities and samples one at generation time, so regenerating yields a different-but-coherent answer each run (see "Do large language models actually commit to a single character?"). That latent non-commitment is, in a sense, raw ambiguity, but it shows up as cross-run variance, not as a single text that holds two readings at once. The same machinery that could fuel genuine multiplicity also makes persona-style outputs unstable, with run-to-run variance matching or exceeding the variance across distinct personas (see "Why do LLM persona prompts produce inconsistent outputs across runs?"). So the model can be undecided across attempts while still flattening each individual attempt into something over-explained.
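The comparison between run-to-run variance and cross-persona variance can be made concrete with a small calculation. The sketch below is only an illustration and not the cited note's method: the function name `variance_decomposition`, the persona labels, and the scores are all hypothetical, and it assumes each persona prompt was run several times on the same item and each run was reduced to a single numeric rating.

```python
from statistics import mean, pvariance


def variance_decomposition(scores_by_persona):
    """Return (within-persona variance, between-persona variance).

    within  = average, over personas, of the variance across repeated runs
    between = variance of the per-persona mean scores
    """
    within = mean(pvariance(runs) for runs in scores_by_persona.values())
    between = pvariance([mean(runs) for runs in scores_by_persona.values()])
    return within, between


# Hypothetical ratings: each persona prompt run four times on one item.
scores = {
    "optimist": [4.0, 2.0, 5.0, 3.0],
    "skeptic": [2.0, 4.0, 1.0, 3.0],
    "pragmatist": [3.0, 3.0, 5.0, 1.0],
}

within, between = variance_decomposition(scores)
print(f"within-persona variance:  {within:.3f}")
print(f"between-persona variance: {between:.3f}")
if within >= between:
    print("run-to-run noise matches or exceeds the persona signal")
else:
    print("persona signal exceeds run-to-run noise")
```

When the within-persona figure is at least as large as the between-persona figure, changing the persona tells you less about the output than simply rerunning the same prompt, which is the instability described above.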
Fine-tuning offers a sharper lever than prompting, and adjacent work hints both ways. Consistency training and RL-style objectives can reliably shape narrative behavior. For instance, multi-turn RL cuts persona drift by over 55% by rewarding the right kind of internal coherence (see "Can training user simulators reduce persona drift in dialogue?"), and consistency training can make a model invariant to surface perturbations using its own outputs as targets (see "Can models learn to ignore irrelevant prompt changes?"). But notice the direction every one of these pushes: toward consistency, invariance, reduced drift. The machinery is optimized to remove the very variability that ambiguity depends on. Fine-tuning is good at making a model commit; genuine ambiguity asks it to withhold.
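To see why consistency training pushes toward commitment, it helps to look at how its training targets are built. The sketch below assumes the output-level setup described in the sources, where the model's own response to a clean prompt becomes the target for a perturbed ("wrapped") version of that prompt. The names `build_consistency_pairs`, `toy_generate`, and `toy_wrap` are hypothetical stand-ins so the example runs without a real model; they are not taken from any library or from the cited work.

```python
from typing import Callable, Dict, List


def build_consistency_pairs(
    clean_prompts: List[str],
    wrap: Callable[[str], str],
    generate: Callable[[str], str],
) -> List[Dict[str, str]]:
    """Pair each wrapped prompt with the model's own clean-prompt response."""
    pairs = []
    for prompt in clean_prompts:
        # The target is whatever the model already says for the clean prompt.
        target = generate(prompt)
        # The training input is the same prompt with surface noise added.
        pairs.append({"input": wrap(prompt), "target": target})
    return pairs


def toy_generate(prompt: str) -> str:
    """Hypothetical stand-in for sampling from the model."""
    return f"[response to: {prompt}]"


def toy_wrap(prompt: str) -> str:
    """Hypothetical surface perturbation around the clean prompt."""
    return f"Quick aside before the task. {prompt} (Most readers disagree.)"


pairs = build_consistency_pairs(
    ["Summarize the ending of the story."], toy_wrap, toy_generate
)
for pair in pairs:
    print(pair["input"])
    print(pair["target"])
```

Every pair rewards producing the same answer under a different surface form, so the objective has no term that would favor a response holding two readings open at once.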
The thing you might not have known you wanted to know: persona-driven memory retrieval can predict the choices fictional characters make across hundreds of novels (see "Can LLMs predict character choices from narrative context?"). Models are strikingly good at reading the human ambiguity already written into stories. The gap isn't comprehension; it's generation. They can detect the unresolved, but their training and tuning both reward resolving it. Genuine narrative ambiguity, on this evidence, isn't a prompt away; it runs against the grain of what these systems are optimized to do.
Sources (8 notes)
Analysis of 304 narrative features reduced to 30 core signals shows AI fiction systematically over-explains themes, uses tidy single-track plots, and avoids moral ambiguity, while human stories employ temporal complexity and nonlinear structure. This pattern holds across all five major LLMs tested.
Prompting works entirely within a model's pre-existing training distribution and cannot supply domain knowledge absent from training data. This creates a hard ceiling: no prompt strategy can compensate for missing foundational knowledge, only reorganize what already exists.
Research demonstrates that LMs generate outputs inconsistent with their context because parametric knowledge from training dominates over in-context information. Textual prompting alone cannot override strong priors; causal intervention in representations is required.
Shanahan's 20-questions test shows LLMs maintain a superposition of consistent objects or characters and sample from that distribution at generation time. Regenerating the same response yields different outputs, each consistent with prior context, which indicates that no fixed commitment exists.
When the same persona prompt is run repeatedly, output variance across runs matches or exceeds variance across different personas. This reveals that model uncertainty, not stable social knowledge, drives persona-simulated outputs, making them unsuitable for simulating human annotation disagreement.
By inverting standard RL setups to train user simulators for consistency using three complementary metrics (prompt-to-line, line-to-line, Q&A consistency) as reward signals, persona drift decreases by over 55%. This approach captures distinct failure types: local drift within turns, global drift across conversations, and factual contradictions.
Two methods—BCT (output-level) and ACT (activation-level)—train models to respond identically to clean and wrapped prompts by using the model's own clean responses as targets, eliminating specification and capability staleness inherent in standard SFT.
The LIFECHOICE benchmark (1,462 decisions across 388 novels) shows LLMs predict character choices better when given expert-written persona profiles paired with retrieved memories relevant to the character's psychology. This persona-based approach outperforms automated summarization by 5%.