DOC: Improving Long Story Coherence With Detailed Outline Control

Paper · arXiv 2212.10077 · Published December 20, 2022

“Recent advancements in natural language generation systems have fueled increased interest in long-form text generation, in which texts may span thousands of words or more. Compared to tasks with shorter outputs, long-form generation involves meaningfully different challenges. It is nontrivial to maintain overarching coherence, or even basic relevance to an initial premise or plan. Even the most advanced language models to date, such as GPT4 (OpenAI, 2023), still cite long context as a major direction for further improvement, and require structured planning to generate text longer than a few hundred words.

In this work, we focus on long-form story generation, which is representative of the major difficulties in long text generation. Only recently have prior efforts even attempted to generate stories of comparable length to human-authored “short stories” (Re3, Yang et al. (2022)). Compared to humans, state-of-the-art story generation systems like Re3 still fall short in numerous areas: common failure modes include insufficient high-level planning resulting in local fluency amid global incoherence, or deviating from said planning even when it exists. To bridge some of this gap, we propose the Detailed Outline Control (DOC) framework. While reusing the high-level planning-drafting-revision structure of Re3, DOC improves long-range plot coherence via two complementary approaches. First, our detailed outliner refines a brief initial outline into a more detailed, hierarchical one (Figure 1 left). As motivation, a human author might also iteratively refine and expand a brief initial outline before drafting a long document, using the outline to guide a coherent plot rather than improvising plot points on the fly. Accordingly, our detailed outliner employs a structured prompting procedure to create a detailed outline with length scalable according to the desired scope of generation. Individual outline items are associated with a setting and characters, and are carefully filtered for relevance and coherence in context.

Second, our detailed controller maintains faithfulness to our detailed outline by controlling passage generation based on corresponding outline items (Figure 1 right). Because our detailed outline imposes many overlapping soft constraints, the detailed controller must exert sufficient control strength to enforce them. The detailed controller must also accommodate flexible natural language inputs and be computationally efficient when generating with state-of-the-art large language models. We implement the detailed controller as an OPT-350m-based controller according to FUDGE (Yang and Klein, 2021), designing a contrastive training procedure that aligns summaries to passage prefixes. In particular, we construct fluent hard negatives to encourage lengthy outputs to be not only initially on topic, but relevant throughout.

Compared to the original Re3, the previous state of the art in long-form story generation, using DOC achieves dramatically higher plot coherence (22.5% absolute gain), outline relevance (28.2%), and even interestingness (20.7%) in pairwise human evaluations (Section 4). Our ablations indicate that both the detailed outliner and detailed controller are critical (Section 5.1). We also demonstrate that DOC can generate stories in collaboration with humans, interacting at a high-level planning stage rather than passage-by-passage as in many prior works (Coenen et al., 2021; Lee et al., 2022), and is overwhelmingly preferred over the original Re3 in this setting (Section 4.1).”
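The two components described in the excerpt can be illustrated with a minimal sketch. It is not the authors' implementation. The class `OutlineItem`, the function `fudge_reweight`, the `strength` parameter, and all numeric scores are hypothetical names and values chosen for illustration. The sketch assumes only what the excerpt states (a hierarchical outline whose items carry a setting and characters) plus the FUDGE decoding rule of Yang and Klein (2021), in which each candidate token's base log-probability is combined with a controller's log-probability that the continuation satisfies the desired attribute, here relevance to the current outline item.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass
class OutlineItem:
    """One node of a hierarchical outline (hypothetical structure)."""
    text: str
    setting: str = ""
    characters: List[str] = field(default_factory=list)
    children: List["OutlineItem"] = field(default_factory=list)

    def leaves(self) -> List["OutlineItem"]:
        """Return the most detailed items in reading order (depth-first)."""
        if not self.children:
            return [self]
        return [leaf for child in self.children for leaf in child.leaves()]


def log_softmax(scores: List[float]) -> List[float]:
    """Normalize raw scores into log-probabilities, numerically stably."""
    peak = max(scores)
    log_z = peak + math.log(sum(math.exp(s - peak) for s in scores))
    return [s - log_z for s in scores]


def fudge_reweight(base_logprobs: List[float],
                   controller_logprobs: List[float],
                   strength: float = 1.0) -> List[float]:
    """FUDGE-style reweighting over a shared list of candidate tokens.

    base_logprobs[i]       : log P(token_i | prefix) from the base LM.
    controller_logprobs[i] : log P(on-outline | prefix + token_i) from a
                             smaller controller model.
    strength               : assumed scalar that scales the controller term.
    """
    combined = [b + strength * c
                for b, c in zip(base_logprobs, controller_logprobs)]
    return log_softmax(combined)


# Toy outline: two top-level items, the first refined into two sub-items.
outline = OutlineItem("Story", children=[
    OutlineItem("Mara finds the map", children=[
        OutlineItem("She searches the attic", setting="attic",
                    characters=["Mara"]),
        OutlineItem("She decodes the map", setting="library",
                    characters=["Mara", "Jon"]),
    ]),
    OutlineItem("Mara sails north", setting="harbor", characters=["Mara"]),
])

# Toy next-token candidates with made-up scores.
tokens = ["dragon", "attic", "taxes"]
base = log_softmax([2.0, 1.0, 0.0])            # base LM prefers "dragon"
relevance = [math.log(0.2), math.log(0.9), math.log(0.1)]
controlled = fudge_reweight(base, relevance)
best_token = tokens[controlled.index(max(controlled))]
```

In this toy example the base model alone would pick "dragon", while the reweighted distribution picks "attic", the candidate the controller scores as most relevant to the outline item. Raising `strength` pushes the choice further toward the controller, which is one way to picture the "control strength" the excerpt refers to.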