Graph-enhanced Large Language Models in Asynchronous Plan Reasoning

Paper · arXiv 2402.02805 · Published February 5, 2024
Tasks: Planning · Knowledge Graphs

Planning is a fundamental property of human intelligence. Reasoning about asynchronous plans is challenging since it requires both sequential and parallel planning to optimize time costs. Can large language models (LLMs) succeed at this task? Here, we present the first large-scale study investigating this question. We find that a representative set of closed- and open-source LLMs, including GPT-4 and LLaMA-2, perform poorly on our benchmark AsyncHow when not supplied with illustrations of the task-solving process. We propose a novel technique called Plan Like a Graph (PLaG) that combines graphs with natural language prompts and achieves state-of-the-art results. We show that although PLaG can boost model performance, LLMs still suffer from drastic degradation as task complexity increases, highlighting the limits of using LLMs to simulate digital devices.

In our work, we propose a novel prompting technique, Plan Like a Graph (PLaG, Figure 2). Taking inspiration from Discourse Representation Theory (Wolf et al., 2004) and related work on graphical prompt representations (Fatemi et al., 2023; Wang et al., 2023), PLaG includes a graph representation in the prompt: we give models k-shot illustrations with graphs describing the task and instruct them either to reason over a given graph (i.e., explicit graph) or to generate a graph themselves and then reason over it (i.e., Build a Graph/BaG; Wang et al., 2023). In the latter setting, models first produce a graph representation of the naturalistic question and then use that information to solve the task.
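To make the explicit-graph variant concrete, here is a minimal sketch of how a plan might be serialized into a graph-augmented prompt. The function name, prompt wording, and the breakfast example are our own illustrative assumptions, not the paper's exact template.

```python
# Illustrative PLaG-style "explicit graph" prompt builder (hypothetical
# template; the paper's actual prompt format may differ).

def build_plag_prompt(task, durations, constraints):
    """Serialize steps, durations, and ordering constraints as an
    edge list, and append the asynchronous-planning question."""
    times = ", ".join(f"{s}: {t} min" for s, t in durations.items())
    edges = ", ".join(f"{a} -> {b}" for a, b in constraints)
    return (
        f"Task: {task}\n"
        f"Steps and durations: {times}\n"
        f"Ordering constraints (dependency graph): {edges}\n"
        "Using the graph above, compute the shortest total time needed, "
        "given that independent steps may run in parallel."
    )

prompt = build_plag_prompt(
    "make breakfast",
    {"boil water": 5, "toast bread": 3, "brew tea": 4},
    [("boil water", "brew tea")],
)
print(prompt)
```

In the BaG variant, the edge list would be omitted and the model would instead be instructed to construct such a graph itself before answering.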

Planning is an important property of human intelligence (Sternberg, 1984; Colom et al., 2010), and it is also vital in many downstream tasks such as developing autonomous robotic agents (Huang et al., 2022a; Shinn et al., 2023). While symbolic processors have been historically used for handling plan design (Fikes & Nilsson, 1971; McDermott, 2000), LLMs have recently emerged as a relevant complementary approach (Ahn et al., 2022; Dagan et al., 2023; Song et al., 2023). Although LLMs generate reasonable elementary planning steps when informed with appropriate guidance (Huang et al., 2022b; Yuan et al., 2023), they cannot combine those units effectively and develop optimal plans without external processors (Silver et al., 2022; Dagan et al., 2023; Yang et al., 2023). This might be an issue if LLMs are deployed for related tasks.

This work explores the reasoning ability of LLMs in naturalistic asynchronous planning, which we define as complex planning tasks involving both sequential and parallel actions. Given a set of steps for a task, the time required for each step, and step ordering constraints, we ask whether LLMs can compute the shortest possible time needed for an optimal plan for the task (Figure 1).
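Under the common assumption of unlimited parallelism, the gold answer to such a question is the length of the longest (critical) path through the step-dependency DAG. The following sketch computes it with Python's standard-library topological sorter; the step names and durations are hypothetical.

```python
# Minimal sketch: optimal asynchronous plan duration = longest path through
# the dependency DAG, assuming independent steps can run fully in parallel.
from graphlib import TopologicalSorter

def optimal_time(durations, constraints):
    """durations: step -> time cost; constraints: (before, after) pairs."""
    preds = {step: set() for step in durations}
    for before, after in constraints:
        preds[after].add(before)
    finish = {}
    # Visit steps in topological order so predecessors are finished first.
    for step in TopologicalSorter(preds).static_order():
        start = max((finish[p] for p in preds[step]), default=0)
        finish[step] = start + durations[step]
    return max(finish.values())

# "toast bread" overlaps with the water/tea chain, so the optimal
# time is 5 + 4 = 9, not the sequential total of 12.
print(optimal_time(
    {"boil water": 5, "toast bread": 3, "brew tea": 4},
    [("boil water", "brew tea")],
))  # -> 9
```

Evaluating an LLM on such a task then reduces to checking whether its stated total matches this critical-path value.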