Enhancing Dialogue Generation via Dynamic Graph Knowledge Aggregation

Paper · arXiv 2306.16195 · Published June 28, 2023

“Recent years have seen a surge of interest in developing chatbots that leverage large-scale knowledge (Tang et al., 2022a). As a highly expressive data format, Knowledge Graphs (e.g., ConceptNet and DBpedia), which encode world facts, are considered a key ingredient in building effective dialogue generation systems (Zhou et al., 2018). To incorporate graph-structured knowledge, a range of Graph Neural Networks (GNNs), such as Graph Attention Networks (GATs) (Velickovic et al., 2017; Brody et al., 2021) and Graph Convolutional Networks (GCNs) (Kipf and Welling, 2016), have been proposed to learn representations of the topological structure of a knowledge graph via message passing between entities. In open-domain dialogue generation, these GNNs are further embedded into generative frameworks to feed graph knowledge features into language models (LMs).
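As a rough illustration of the message passing mentioned above, here is a minimal NumPy sketch of a single GAT-style layer (a toy simplification of our own, not the authors' implementation): each entity node projects its features and then aggregates its neighbours' projections, weighted by attention scores.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def gat_layer(H, adj, W, a):
    """One simplified GAT layer: each node aggregates neighbour
    features weighted by attention scores (toy sketch)."""
    Z = H @ W                        # project node features
    n = Z.shape[0]
    out = np.zeros_like(Z)
    for i in range(n):
        nbrs = [j for j in range(n) if adj[i, j]]
        # attention logit per neighbour: a^T [z_i ; z_j]
        logits = np.array([a @ np.concatenate([Z[i], Z[j]]) for j in nbrs])
        alpha = softmax(logits)
        out[i] = sum(w * Z[j] for w, j in zip(alpha, nbrs))
    return out
```

Stacking such layers lets entity representations absorb information from multi-hop neighbourhoods of the knowledge graph.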

Despite prior success in leveraging graph knowledge with graph neural networks (GNNs) (Zhou et al., 2018; Zhang et al., 2020), current generative frameworks are still hindered by the representation gap between the hidden spaces of LMs and GNNs, which poses significant challenges for exploiting graph knowledge during text decoding. As illustrated in Figure 1, prior works using GNNs (Zhu et al., 2017; Ghazvininejad et al., 2018; Zhou et al., 2018; Zhang et al., 2020) tend to fuse graph features by transforming them into text form and then feeding them into the language model, which acts as a “copy” mechanism. In other words, these networks run as a pipeline in which graph knowledge is first transformed into additional text, sidestepping the difficulty of encoding heterogeneous graph features with the language model. However, these separate encoding stages lead the networks to learn suboptimal representations of graph knowledge, resulting in information loss. With large-scale pretrained models such as GPT-2 (Radford et al., 2019), BART (Lewis et al., 2020) and T5 (Raffel et al., 2020) being widely adopted in recent advances in dialogue generation, the drawbacks arising from the incompatibility between GNNs and LMs become more severe, preventing chatbot systems from leveraging graph-structured data effectively.
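The pipeline-style “copy” mechanism criticised above can be sketched as follows. This is a hypothetical illustration (function names and the `[SEP]` token are our own assumptions, not taken from the paper) of linearising knowledge triples into a text prefix before handing everything to the LM:

```python
def linearize_triples(triples):
    """Flatten (head, relation, tail) triples into plain text,
    as pipeline approaches do before feeding the LM (toy sketch)."""
    return " ".join(f"{h} {r} {t} ." for h, r, t in triples)

def build_lm_input(triples, dialogue_history):
    """Prepend the linearised graph knowledge to the dialogue
    history, so the LM only ever sees text."""
    return linearize_triples(triples) + " [SEP] " + dialogue_history
```

The structural information of the graph (which entities share neighbours, relation types, multi-hop paths) is largely flattened away by this step, which is the information loss the paragraph above describes.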

In contrast to existing works (Zhu et al., 2017; Ghazvininejad et al., 2018; Zhou et al., 2018; Zhang et al., 2020), which incorporate graph knowledge with conventional GNNs (and thus suffer from inadequate representation learning), we propose to involve the language model in both text and graph knowledge incorporation at every step, by hierarchically aggregating knowledge on a dynamic pseudo graph. During knowledge aggregation, knowledge triples are reorganised as shown in Figure 1 (b): pseudo nodes are created to learn conceptual representations from the original knowledge triples. Conceptual semantics are forced to coalesce into the pseudo nodes and are finally merged into a condensed feature vector that fills the semantic gap in the encoded text features. Our approach to incorporating text and graph knowledge features can be adapted to any language model with an encoder-decoder architecture.”
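As a loose sketch of this idea (our own simplification under stated assumptions, not the paper's actual architecture), one could form a pseudo node per triple by pooling its (head, relation, tail) embeddings, then condense all pseudo nodes by attention against the encoded text features into a single vector that is fused back into the text representation:

```python
import numpy as np

def aggregate_pseudo_graph(triple_embs, text_feat):
    """Hedged sketch of hierarchical aggregation: pseudo nodes
    pool each triple's (h, r, t) embeddings; an attention pool,
    keyed on the text features, condenses them into one vector
    that is added to the encoded text representation."""
    pseudo = triple_embs.mean(axis=1)           # (num_triples, d) pseudo nodes
    scores = pseudo @ text_feat                 # relevance to dialogue context
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()                        # attention weights over pseudo nodes
    condensed = alpha @ pseudo                  # single condensed knowledge vector
    return text_feat + condensed                # fuse into the text features
```

Because the output lives in the same space as the text features, such a fused vector can be consumed directly by an encoder-decoder LM without a separate graph-to-text stage.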