Exploring Large Language Models for Knowledge Graph Completion
A KG is generally a multi-relational graph with entities as nodes and relations as edges. Each edge is represented as a triple (head entity, relation, tail entity), abbreviated as (h, r, t), signifying the relationship between two entities, for instance, (Steve Jobs, founded, Apple Inc.). Despite their effectiveness, knowledge graphs remain incomplete. This issue gives rise to the task of knowledge graph completion, which aims to assess the plausibility of triples that are not present in a knowledge graph. A significant amount of research has been dedicated to knowledge graph completion. One prevalent method is knowledge graph embedding (Wang et al., 2017). However, most knowledge graph embedding models rely solely on the structural information of observed triples, which makes them vulnerable to the sparsity of knowledge graphs.
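To make the embedding-based baseline concrete, the following is a minimal sketch of a translational scoring function in the style of TransE (Bordes et al., 2013), one common knowledge graph embedding approach: a triple (h, r, t) is deemed plausible when the tail embedding lies close to head + relation in vector space. The entity names, embedding dimension, and random vectors here are illustrative assumptions, not trained parameters.

```python
import numpy as np

# Hypothetical embedding tables; in practice these are learned from
# the observed triples of the knowledge graph.
rng = np.random.default_rng(0)
dim = 50
entities = {
    "Steve Jobs": rng.normal(size=dim),
    "Apple Inc.": rng.normal(size=dim),
}
relations = {"founded": rng.normal(size=dim)}

def transe_score(h: str, r: str, t: str) -> float:
    """TransE-style plausibility: higher (less negative) is more plausible."""
    return -float(np.linalg.norm(entities[h] + relations[r] - entities[t]))

print(transe_score("Steve Jobs", "founded", "Apple Inc."))
```

Because scores like this depend only on embeddings learned from observed triples, entities with few edges receive poorly constrained vectors, which is the sparsity problem noted above.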
In this study, we propose a novel method for knowledge graph completion using large language models. Specifically, we treat entities, relations, and triples as textual sequences and model knowledge graph completion as a sequence-to-sequence problem. We perform instruction tuning with open LLMs (LLaMA (Touvron et al., 2023) and ChatGLM (Du et al., 2022)) on these sequences to predict the plausibility of a triple or a candidate entity/relation.
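As a rough illustration of this formulation, the sketch below verbalizes a triple into an instruction-tuning example suitable for an open LLM such as LLaMA or ChatGLM. The prompt wording, field names, and the corrupted negative example are assumptions made for illustration, not the exact templates used in our experiments.

```python
def triple_to_example(h: str, r: str, t: str, label: bool) -> dict:
    """Turn a (h, r, t) triple into an instruction-tuning record."""
    instruction = (
        "Given a triple from a knowledge graph, answer whether it is "
        "a true fact. Respond with 'Yes' or 'No'."
    )
    return {
        "instruction": instruction,
        "input": f"({h}, {r}, {t})",
        "output": "Yes" if label else "No",
    }

# A positive triple from the graph and a corrupted negative triple,
# the usual way negatives are constructed for triple classification.
examples = [
    triple_to_example("Steve Jobs", "founded", "Apple Inc.", True),
    triple_to_example("Steve Jobs", "founded", "Microsoft", False),
]
for ex in examples:
    print(ex)
```

Framing completion this way lets the model draw on textual knowledge acquired during pre-training, rather than on graph structure alone.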