An Automatic Graph Construction Framework based on Large Language Models for Recommendation

Paper · arXiv 2412.18241 · Published December 24, 2024
Knowledge Graphs

Graph neural networks (GNNs) have emerged as state-of-the-art methods to learn from graph-structured data for recommendation. However, most existing GNN-based recommendation methods focus on the optimization of model structures and learning strategies based on pre-defined graphs, neglecting the importance of the graph construction stage. Earlier works for graph construction usually rely on specific rules or crowdsourcing, which are either too simplistic or too labor-intensive. Recent works start to utilize large language models (LLMs) to automate the graph construction, in view of their abundant open-world knowledge and remarkable reasoning capabilities. Nevertheless, they generally suffer from two limitations: (1) the invisibility of global view (e.g., overlooking contextual information) and (2) construction inefficiency. To this end, we introduce AutoGraph, an automatic graph construction framework based on LLMs for recommendation. Specifically, we first use LLMs to infer the user preference and item knowledge, which are encoded as semantic vectors. Next, we employ vector quantization to extract the latent factors from the semantic vectors. The latent factors are then incorporated as extra nodes to link the user/item nodes, resulting in a graph with in-depth global-view semantics. We further design metapath-based message aggregation to effectively aggregate the semantic and collaborative information.

Nowadays, large language models (LLMs) emerge as a promising solution for automatic graph construction in view of their vast amount of open-world knowledge, as well as their remarkable language understanding and reasoning capabilities [80]. Recent attempts have proposed various innovative prompt-based techniques, e.g., chain-of-thought prompting [87], multi-turn conversation [72], and proxy code generation [3], to estimate the relational linkage between nodes for graph construction. Although these works automate the graph construction stage with the help of LLMs and largely save the intensive human labor, they still suffer from the following two limitations, especially when faced with the large-scale user/item volumes in industrial recommender systems.

Firstly, existing LLM-based graph construction methods fail to capture the high-quality topological structure among nodes due to the invisibility of global view. A reasonable assessment of node connections should comprehensively consider the global view of the entire dataset, including but not limited to node attributes, graph schema, and contextual information [15, 80]. For example, as depicted in Figure 1(a), suppose we have three item nodes with features: 𝑖1 = {𝑓1, 𝑓2, 𝑓3, 𝑓8}, 𝑖2 = {𝑓1, 𝑓2, 𝑓3, 𝑓5}, and 𝑖3 = {𝑓1, 𝑓6, 𝑓7, 𝑓8}. With a simple local-view pairwise comparison, item 𝑖1 seems more similar to item 𝑖2, since they share more overlapping features. But once we acquire the global information that 𝑓8 is a rare feature with fairly low frequency while the others are commonly frequent ones, the association between 𝑖1 and 𝑖3 will be significantly enhanced and the connection between 𝑖1 and 𝑖2 could in turn be reduced. Nevertheless, as shown in Figure 1(b), due to the context window limitation of LLMs and the large-scale users/items in RSs, it is hard to incorporate all the important information into the prompt, which ends up truncated and incomplete. Therefore, the information utilized by these methods can only be partial, never global. Such local-view information can thereby lead to inferior topological graph structures.
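The 𝑖1/𝑖2/𝑖3 example above can be made concrete with a small sketch: a local-view similarity that counts shared features versus a global-view similarity that weights each shared feature by its rarity (IDF-style weighting, used here purely as an illustrative stand-in for global statistics; the corpus frequencies are assumed values, not from the paper).

```python
import math

# Assumed corpus statistics for illustration: document frequency of each
# feature among 1000 items. f8 is rare; all other features are common.
N_ITEMS = 1000
doc_freq = {"f1": 900, "f2": 850, "f3": 880, "f5": 800,
            "f6": 820, "f7": 870, "f8": 10}

items = {
    "i1": {"f1", "f2", "f3", "f8"},
    "i2": {"f1", "f2", "f3", "f5"},
    "i3": {"f1", "f6", "f7", "f8"},
}

def overlap(a, b):
    """Local-view similarity: raw count of shared features."""
    return len(items[a] & items[b])

def idf_overlap(a, b):
    """Global-view similarity: shared features weighted by rarity (IDF)."""
    return sum(math.log(N_ITEMS / doc_freq[f]) for f in items[a] & items[b])

local_12, local_13 = overlap("i1", "i2"), overlap("i1", "i3")
global_12, global_13 = idf_overlap("i1", "i2"), idf_overlap("i1", "i3")
```

Locally, 𝑖1–𝑖2 wins (three shared features versus two); globally, the single rare feature 𝑓8 dominates, flipping the ranking in favor of 𝑖1–𝑖3.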

Secondly, existing works generally suffer from the construction inefficiency issue due to the massive invocations of LLMs. While LLMs provide support for in-depth semantic analysis and complex topological structure mining, their intrinsically expensive inference cost poses a significant challenge to the efficiency of graph construction algorithms. Most works instruct LLMs to infer the similarity scores between nodes in a pairwise manner [58], resulting in a time complexity of 𝑂(𝑁²), which is impractical for real-world scenarios where the number of users/items 𝑁 can easily reach the million or even billion level [43]. Although several works propose downsampling [61] or heuristic pre-filtering [87] to reduce the number of LLM calls, they generally sacrifice graph quality and thereby introduce noise into the constructed graph. Therefore, it is crucial to design an efficient yet effective LLM-automated graph construction method for large-scale industrial applications. To this end, we propose AutoGraph, an automatic graph construction framework based on LLMs for recommendation.

AutoGraph consists of two stages: quantization-based graph construction and graph-enhanced recommendation. In the quantization-based graph construction stage, we first leverage LLMs to infer the user preference and item knowledge, which are encoded as semantic vectors. Such a pointwise invocation manner (i.e., invoking LLMs for each single user/item separately) improves the efficiency by reducing the calls of LLMs to 𝑂(𝑁) complexity. Then we propose latent factor extraction for users and items based on vector quantization techniques. By incorporating the latent factors as extra nodes, we build a graph with a global view of in-depth semantics, providing more comprehensive and informative insights through the topological structure. In the graph-enhanced recommendation stage, we propose metapath-based message propagation to aggregate the semantic and collaborative information effectively on the constructed graph, resulting in the graph-enhanced user/item representations. These representations can be integrated into arbitrary recommender systems for enhancement.
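The latent factor extraction step can be sketched as multi-level residual quantization: each level assigns the nearest codebook entry, subtracts it, and passes the remainder to the next level, so a semantic vector maps to a short tuple of code indices that serve as latent factor nodes. This is a minimal, hedged sketch: the toy codebooks below are assumed values for illustration, whereas the paper learns its codebooks from the LLM-encoded semantic vectors.

```python
def nearest(vec, codebook):
    """Index of the codebook entry closest (squared L2) to vec."""
    dists = [sum((v - c) ** 2 for v, c in zip(vec, code)) for code in codebook]
    return min(range(len(codebook)), key=dists.__getitem__)

def residual_quantize(vec, codebooks):
    """Multi-level residual quantization: at each level, pick the nearest
    code, subtract it, and quantize the remainder with the next codebook.
    The returned index tuple plays the role of the latent factor nodes
    linked to a user/item in the constructed graph."""
    residual = list(vec)
    indices = []
    for codebook in codebooks:
        idx = nearest(residual, codebook)
        indices.append(idx)
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return tuple(indices)

# Toy two-level codebooks (illustrative assumptions, not learned values).
level1 = [[0.0, 0.0], [1.0, 1.0]]
level2 = [[0.0, 0.1], [0.1, 0.0]]

# A semantic vector near [1, 1] is assigned code 1 at level 1, and its
# small residual is assigned a code at level 2.
codes = residual_quantize([1.05, 1.02], [level1, level2])
```

Users or items that receive the same code index at some level end up linked to the same latent factor node, which is what injects global-view semantic structure into the graph with only 𝑂(𝑁) LLM calls.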

To the best of our knowledge, we are the first to introduce vector quantization for graph construction based on LLMs in recommendation, which addresses the two key limitations of existing methods, i.e., invisibility of global view and inefficiency.

3.3.1 Metapath-based Message Propagation. We aim to acquire the graph-enhanced user and item representations for the downstream recommendation tasks. To handle the heterogeneous and complex topological structure, we first define a set of metapaths to guide the message propagation on the graph. As depicted in Figure 2, the metapath set P consists of the following four types:

• Item-User Interaction Path 𝑖 → 𝑢 means the collaborative information flow from an interacted item to the target user.
• User Semantic Path 𝑢 → 𝑞 → 𝑢 indicates the semantic knowledge propagation between a pair of similar users who share the same latent factor node.
• User-Item Interaction Path 𝑢 → 𝑖 means the collaborative information flow from the target user to the interacted item.
• Item Semantic Path 𝑖 → 𝑞 → 𝑖 indicates the semantic knowledge propagation between a pair of similar items that share the same latent factor node.
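The user-side half of this scheme can be sketched as follows: messages from interacted items (𝑖 → 𝑢) and from users sharing the same latent factor (𝑢 → 𝑞 → 𝑢) are pooled and fused with the user's own embedding. Mean pooling, uniform fusion weights, and all variable names here are illustrative assumptions, not the paper's exact aggregation operator.

```python
def mean(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def aggregate_user(u, emb, clicked_items, users_of_factor, factor_of_user):
    """Combine the two user-side metapaths:
       i -> u      : collaborative messages from the user's interacted items
       u -> q -> u : semantic messages from users sharing u's latent factor q
    and fuse them with the user's own embedding (uniform weights)."""
    item_msg = mean([emb[i] for i in clicked_items[u]])
    peers = [v for v in users_of_factor[factor_of_user[u]] if v != u]
    sem_msg = mean([emb[v] for v in peers]) if peers else emb[u]
    return mean([emb[u], item_msg, sem_msg])

# Toy embeddings and graph structure (assumed for illustration).
emb = {"u1": [1.0, 0.0], "u2": [0.0, 1.0], "i1": [0.5, 0.5], "i2": [0.2, 0.8]}
clicked = {"u1": ["i1", "i2"]}
users_of_factor = {"q1": ["u1", "u2"]}
factor_of_user = {"u1": "q1"}

h_u1 = aggregate_user("u1", emb, clicked, users_of_factor, factor_of_user)
```

The item-side metapaths (𝑢 → 𝑖 and 𝑖 → 𝑞 → 𝑖) would be aggregated symmetrically to produce the graph-enhanced item representations.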

In this section, we investigate the impact of different configurations and hyperparameters in AutoGraph including the contribution of different metapaths to the graph, the strategy for generating semantic factors, and the number of codebook levels. The experiments are mainly conducted on the MovieLens-1M and Amazon-Books datasets with YT-DNN (abbr. of YouTubeDNN) and MIND as backbone models. Due to the page limitation, we further provide the visualization and case study of learned latent factors in Appendix F.

4.4.1 Impact of different metapaths. As shown in Table 4, we remove different metapaths from the subgraphs to evaluate their respective efficacy, i.e., the user semantic path (𝑢 → 𝑞 → 𝑢), the item semantic path (𝑖 → 𝑞 → 𝑖), and the interaction paths (𝑢 → 𝑖 and 𝑖 → 𝑢). Moreover, N/A means removing all the metapaths, i.e., the vanilla backbone model.

Removing different metapaths serves as the ablation study w.r.t. the item/user representation enhancement brought by AutoGraph. We can observe that removing each metapath of AutoGraph generally results in performance degradation, while these variants still outperform the vanilla backbone models. This demonstrates the efficacy of each metapath proposed in our AutoGraph framework. Moreover, the semantic paths contribute more to the performance improvement than the interaction paths, highlighting the importance of the in-depth semantics from LLMs and the global-view information brought by quantization-based latent factors.

AutoGraph generally outperforms existing graph-enhanced baseline methods by a significant margin. These methods either employ specific rules (i.e., click as linkage for LightGCN, and entity extraction for CCGNN), which fail to model the complex semantic signals, or focus on local-view pairwise comparison (i.e., TopoI2I), which is unable to capture the high-quality topological structure provided by the global view. In comparison, based on the latent factors from LLM knowledge, AutoGraph not only captures in-depth semantics in multiple facets, but also equips the graph with a global view, leading to superior performance.