Multi-hop Question Answering via Reasoning Chains

Paper · arXiv 1910.02610 · Published October 7, 2019

Multi-hop question answering requires models to gather information from different parts of a text to answer a question. Most current approaches learn to address this task in an end-to-end way with neural networks, without maintaining an explicit representation of the reasoning process. We propose a method to extract a discrete reasoning chain over the text, which consists of a series of sentences leading to the answer. We then feed the extracted chains to a BERT-based QA model (Devlin et al., 2018) to do final answer prediction. Critically, we do not rely on gold annotated chains or “supporting facts”: at training time, we derive pseudo-gold reasoning chains using heuristics based on named entity recognition and coreference resolution.

We propose a two-stage model that identifies intermediate reasoning chains and then separately determines the answer. A reasoning chain is a sequence of sentences that logically connects the question to a fact relevant (or partially relevant) to giving a reasonably supported answer. Figure 1 shows an example of what such chains look like. Extracting chains gives us a discrete intermediate output of the reasoning process, which can help us gauge our model's behavior beyond just final task accuracy. Formally, our extractor model scores sequences of sentences and returns the highest-scoring sequences as candidate chains.
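As a concrete illustration of this two-stage design, consider the following minimal Python sketch. Here `extractor.top_chains` and `qa_model.predict` are hypothetical interfaces standing in for the chain extractor and the BERT-based QA model; they are not the paper's actual code:

```python
def answer_question(question, sentences, extractor, qa_model, k=3):
    """Two-stage pipeline sketch: (1) extract top-k reasoning chains,
    (2) run a span-prediction QA model on each chain's text and keep
    the highest-confidence answer."""
    chains = extractor.top_chains(question, sentences, k=k)  # hypothetical API
    best = None
    for chain in chains:  # each chain is a list of sentence indices
        context = " ".join(sentences[i] for i in chain)
        pred = qa_model.predict(question=question, context=context)  # hypothetical API
        if best is None or pred.score > best.score:
            best = pred
    return best.answer if best is not None else None
```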

We make two contributions. (1) We present a method for extracting oracle reasoning chains for multi-hop reasoning tasks; these chains generalize across multiple datasets and are comparable to human-annotated chains. (2) We present a model that learns from these chains at training time and, at test time, can produce a list of chains, which can be used to gauge the model's behavior.

We describe our notion of chain extraction in more detail. A reasoning chain is a sequence of sentences that logically connect the question to a fact relevant to determining the answer. Two adjacent sentences in a reasoning chain should be intuitively related: they should exhibit a shared entity or event, temporal structure, or some other kind of textual relation that would allow a human reader to connect the information they contain.

Building such a system raises three questions. First, how can we automatically select pseudo-ground-truth reasoning chains? Second, how do we model the chain extraction process? Third, how do we take one or more extracted chains and turn them into a final answer?

We derive heuristic reasoning chains by searching over a graph constructed from the textual relations described above. Each sentence s_i is represented as a node i in the graph. We run an off-the-shelf named entity recognition system to extract all entities in each sentence. If sentences i and j contain a shared entity, we add an edge between nodes i and j, as in the sketch below.
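A minimal sketch of this graph construction, assuming spaCy's `en_core_web_sm` model as the off-the-shelf NER system (the paper does not commit to a particular recognizer) and matching entities by lowercased surface string:

```python
import itertools
import spacy

nlp = spacy.load("en_core_web_sm")  # assumed off-the-shelf NER system

def build_entity_graph(sentences):
    """Map each sentence index i to the set of indices j whose
    sentences share at least one named entity with sentence i."""
    entities = [{ent.text.lower() for ent in nlp(s).ents} for s in sentences]
    graph = {i: set() for i in range(len(sentences))}
    for i, j in itertools.combinations(range(len(sentences)), 2):
        if entities[i] & entities[j]:  # shared entity -> edge (i, j)
            graph[i].add(j)
            graph[j].add(i)
    return graph
```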

Starting from the question node, we do an exhaustive search to find all possible chains that could lead to the answer.
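This search can be sketched as a depth-limited enumeration of simple paths. In the sketch below, the question is treated as node 0 by prepending it to the sentence list, and `answer_nodes` marks sentences containing the answer string (known at training time, when these heuristic chains are derived); the depth limit `max_len` is our assumption to keep the search tractable, not a detail from the paper:

```python
def all_chains(graph, start, answer_nodes, max_len=5):
    """Enumerate all simple paths from `start` to any answer node,
    up to `max_len` nodes long (assumed cutoff)."""
    chains, stack = [], [(start, [start])]
    while stack:
        node, path = stack.pop()
        if node in answer_nodes and node != start:
            chains.append(path)
        if len(path) < max_len:
            for nxt in graph[node]:
                if nxt not in path:  # keep paths simple (no revisits)
                    stack.append((nxt, path + [nxt]))
    return chains

# Usage: node 0 is the question; answer nodes contain the answer string.
# texts = [question] + sentences
# graph = build_entity_graph(texts)
# answer_nodes = {i for i, t in enumerate(texts) if answer in t}
# chains = all_chains(graph, start=0, answer_nodes=answer_nodes)
```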

When we ask an LLM a question, we do not know the answer in advance, so whether a hint is correct is unknown. The prompt design therefore accounts for two situations: (1) the hint matches the correct answer, and the model should still arrive at that correct answer; (2) the hint differs from the correct answer, and the model should be able to escape from the incorrect answer.

Adhering to the above guidelines, we use the Standard prompt, CoT prompt, and Complex CoT prompt to generate initial base answers, from which we then develop the subsequent answer-generation prompts: the PHP-Standard, PHP-CoT, and PHP-Complex CoT prompts, respectively. A sketch of the resulting interaction loop follows.
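A minimal sketch of this progressive-hint loop, assuming an OpenAI-style chat client and a "Hint: The answer is near to ..." phrasing; the `extract_answer` parser is a naive illustrative stand-in, and halting once the answer stabilizes across consecutive rounds is one plausible stopping rule:

```python
import re
from openai import OpenAI

client = OpenAI()  # assumes an OpenAI-style chat API and configured key

def extract_answer(text):
    """Naive numeric-answer parser (illustrative only)."""
    nums = re.findall(r"-?\d+\.?\d*", text)
    return nums[-1] if nums else text.strip()

def php_answer(question, base_prompt, model="gpt-3.5-turbo", max_rounds=4):
    """Progressive-hint loop sketch: re-ask with previous answers as
    hints until the answer repeats across two consecutive rounds."""
    hints, answer = [], None
    prompt = f"{base_prompt}\nQ: {question}\nA:"
    for _ in range(max_rounds):
        reply = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        ).choices[0].message.content
        new_answer = extract_answer(reply)
        if new_answer == answer:  # stable across consecutive rounds: stop
            return new_answer
        answer = new_answer
        hints.append(str(answer))
        prompt = (f"{base_prompt}\nQ: {question} "
                  f"(Hint: The answer is near to {', '.join(hints)}).\nA:")
    return answer
```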