On the Roles of LLMs in Planning: Embedding LLMs into Planning Graphs

Tasks: Planning, Knowledge Graphs

We aim to further study the planning capability of LLMs by investigating the roles they can play in off-the-shelf planning frameworks. To do this, we investigate the effectiveness of embedding LLMs into one well-known framework, graph-based planning, and propose a novel LLM-based planning framework in which LLMs are embedded at two levels of planning graphs, i.e., the mutually exclusive (mutex) constraint generation level and the constraint solving level.

Plan synthesis aims to generate a course of actions or policies that transitions given initial states to goal states, provided domain models that can be designed by experts or learnt from training data (Aineto et al., 2019) or from interactions with the world.
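For concreteness, such a domain model is commonly expressed in a STRIPS-style form: each action has preconditions, add effects, and delete effects over sets of propositions. The following is a minimal sketch of this representation; the domain, action names, and propositions are illustrative, not taken from the paper.

```python
from dataclasses import dataclass

# Minimal STRIPS-style action model: preconditions, add effects, delete effects.
@dataclass(frozen=True)
class Action:
    name: str
    pre: frozenset
    add: frozenset
    dele: frozenset

def applicable(state, action):
    """An action is applicable iff all its preconditions hold in the state."""
    return action.pre <= state

def progress(state, action):
    """Progress a state: remove the delete effects, then add the add effects."""
    return (state - action.dele) | action.add

# Toy logistics domain: move a package from A to B with a truck.
load = Action("load", frozenset({"pkg-at-A", "truck-at-A"}),
              frozenset({"pkg-in-truck"}), frozenset({"pkg-at-A"}))
drive = Action("drive-A-B", frozenset({"truck-at-A"}),
               frozenset({"truck-at-B"}), frozenset({"truck-at-A"}))
unload = Action("unload", frozenset({"pkg-in-truck", "truck-at-B"}),
                frozenset({"pkg-at-B"}), frozenset({"pkg-in-truck"}))

state = frozenset({"pkg-at-A", "truck-at-A"})
for act in (load, drive, unload):
    assert applicable(state, act)
    state = progress(state, act)
assert "pkg-at-B" in state  # the goal holds in the final state
```

Plan synthesis then amounts to finding such an applicable action sequence (or policy) automatically, given only the domain model, initial state, and goals.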

An interesting result shown by Valmeekam et al. (2023b) is that when the solution generated by LLMs, which is generally incorrect, is taken as a seed plan to be repaired by an off-the-shelf planner, e.g., LPG (Gerevini & Serina, 2002), a significant reduction in search steps is observed compared to giving the planner an empty plan as the seed. This indicates that LLMs can indeed provide some helpful information (e.g., in some sense a heuristic) for planning, even though they cannot solve planning problems on their own.
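LPG's actual repair procedure is a stochastic local search over partial plans, which is not reproduced here. As a toy illustration of why a mostly-correct seed plan helps, the sketch below simulates a seed plan and localizes its flawed steps: a repair-based planner only needs to fix those steps rather than search from scratch. The action encoding and example facts are assumptions for illustration.

```python
def plan_flaws(state, plan, goal):
    """Simulate a seed plan, skipping inapplicable actions.
    Each action is a (preconditions, add_effects, delete_effects) triple
    over frozensets of facts. Returns the indices of flawed steps and
    whether the goal holds at the end."""
    flaws = []
    for i, (pre, add, dele) in enumerate(plan):
        if pre <= state:
            state = (state - dele) | add
        else:
            flaws.append(i)  # this step needs repair
    return flaws, goal <= state

# A seed plan with one flawed step in the middle.
a_good = (frozenset({"p"}), frozenset({"q"}), frozenset())
a_bad  = (frozenset({"x"}), frozenset({"y"}), frozenset())  # precondition never holds
a_goal = (frozenset({"q"}), frozenset({"g"}), frozenset())

flaws, ok = plan_flaws(frozenset({"p"}), [a_good, a_bad, a_goal], frozenset({"g"}))
# flaws == [1]: only the second step needs repair; the rest of the seed is reusable.
```

In this spirit, an LLM-generated plan that is wrong in a few places still prunes most of the planner's search space, consistent with the reported reduction in search steps.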

For example, as shown in Figure 1(a), there could be many actions in “action-level 1” after expanding the planning graph based on “state-level 0” with Graphplan (Blum & Furst, 1997). We expect that LLMs will help select only a few promising actions in each action level, e.g., {a1, a2, a3} in “action-level 1”. In Figure 1(b), when backtracking from “state-level K”, which contains the goals, there could be many candidate sets of actions to be explored (e.g., “action set 1”, “action set 2”, “action set 3”), where the actions in each candidate set are not mutually exclusive (mutex) with each other. It is particularly time-consuming to search all valid candidate sets subject to the mutex constraints in each action level. We expect that LLMs are capable of helping select only a few candidate sets to be backtracked, e.g., only “action set 1” is selected by the LLMs.
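The expansion side of this idea can be sketched as follows. Classic Graphplan adds every action whose preconditions all appear in the previous proposition level; the LLM's job would be to keep only a few promising ones. The `llm_select` stub below stands in for actually prompting an LLM (its goal-overlap scoring is a hypothetical placeholder, not the paper's method), and the action definitions are illustrative.

```python
def expand_action_level(props, actions):
    """Classic Graphplan expansion: every action whose preconditions
    are all present in the previous proposition (state) level."""
    return [a for a in actions if a["pre"] <= props]

def llm_select(candidates, goals, k=3):
    """Stand-in for LLM-based action selection. A real implementation would
    prompt an LLM with the candidates and goals; here a simple heuristic
    (overlap of add effects with the goals) imitates that role."""
    return sorted(candidates, key=lambda a: -len(a["add"] & goals))[:k]

actions = [
    {"name": "a1", "pre": frozenset({"s0"}), "add": frozenset({"g1"})},
    {"name": "a2", "pre": frozenset({"s0"}), "add": frozenset({"g2"})},
    {"name": "a3", "pre": frozenset({"s0"}), "add": frozenset({"p1"})},
    {"name": "a4", "pre": frozenset({"s1"}), "add": frozenset({"g1"})},
]
level1 = expand_action_level(frozenset({"s0"}), actions)          # a1, a2, a3
pruned = llm_select(level1, goals=frozenset({"g1", "g2"}), k=2)   # a1, a2
```

Pruning each action level this way shrinks both the graph to expand and the number of candidate action sets the backward search must later consider.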

Specifically, in LLMs4Plan, for each action-level i, we first automatically generate prompts based on the propositions in state-level i − 1, the goals, and the domain model when using LLMs to select actions to expand the planning graph at action-level i. After that, when backtracking from the last state-level of the expanded planning graph, i.e., once the last state-level contains the goals, we automatically generate prompts based on the propositions in state-level j (from which backtracking is conducted), the propositions in state-level 0, the mutex constraints in action-level j, and the domain model. We embed these two components into one of the well-known off-the-shelf graph planners, Graphplan. We study the effect of adding or removing each of the two components in Graphplan to assess the significance of the roles LLMs play in the graph planning framework.
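The two prompt-generation components described above can be sketched as simple template builders. The paper's exact templates are not reproduced here, so the wording and field layout below are assumptions; only the inputs (state-level propositions, goals, initial propositions, mutex constraints, domain model) follow the description.

```python
def expansion_prompt(prev_props, goals, domain_desc):
    """Prompt for the action-selection component at action-level i, built from
    the propositions of state-level i-1, the goals, and the domain model."""
    return ("Domain model:\n" + domain_desc + "\n"
            "Propositions in the current state level: "
            + ", ".join(sorted(prev_props)) + "\n"
            "Goals: " + ", ".join(sorted(goals)) + "\n"
            "Select a few promising actions for the next action level.")

def backtrack_prompt(props_j, init_props, mutexes, domain_desc):
    """Prompt for the backtracking component at state-level j, built from
    state-level j, state-level 0, the mutex constraints of action-level j,
    and the domain model."""
    mutex_text = "; ".join(f"{a} # {b}" for a, b in sorted(mutexes))
    return ("Domain model:\n" + domain_desc + "\n"
            "Propositions at the current state level: "
            + ", ".join(sorted(props_j)) + "\n"
            "Initial propositions (state-level 0): "
            + ", ".join(sorted(init_props)) + "\n"
            "Mutex constraints between actions: " + mutex_text + "\n"
            "Select a few non-mutex candidate action sets to backtrack through.")
```

Because the two builders are independent, either component can be switched on or off inside Graphplan, which is exactly the ablation the study performs.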

Through this study, we provide new clues for how to deeply embed LLMs into off-the-shelf planning frameworks: first identify the critical steps (generally the time-consuming ones) in a specific planning framework, and then design proper prompt-generation procedures to be embedded into it. We verify that relying solely on LLMs for planning is far from a good option, while leveraging LLMs to help handle critical steps in the graph planning framework is promising.