How to Correctly do Semantic Backpropagation on Language-based Agentic Systems
Owing to the strength of Large Language Models (LLMs) across a wide array of tasks, agentic systems typically have most of their key components rely on querying LLMs. As a result, communication between the components of such systems is handled in free-form natural language (Zhuge et al., 2023). However, while relying on LLMs partially alleviates the engineering burden of building such systems, designing agentic systems remains nontrivial.
Task. In the LIAR dataset (Wang, 2017), the task is to decide whether a political statement is a lie or not. Each sample in the dataset consists of five attributes, i.e., (i) the statement, (ii) the political party of the speaker, (iii) the job title of the speaker, (iv) the state from which the speaker comes, and (v) the source from which the statement was released. This five-attribute structure leads to an intuitive decomposition of the problem, where each component of an agentic system analyzes one attribute and the analyses are then merged (Figure 2, b). This intuitive decomposition is desirable here because it admits a relatively naive yet practically plausible agentic system architecture and prompts, allowing us to focus our evaluation on the performance of the optimizers. We use the binary classification version of LIAR, as done by Pryzant et al. (2023). Here, the prefix of the response (required to be either “Yes” or “No”) is used to determine how the agentic system has classified a query.
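To make the sample structure and the prefix-based label extraction concrete, the following is a minimal sketch; the dataclass fields and the `parse_label` helper are illustrative assumptions, not code from the paper, and the mapping of “Yes” to the lie class is likewise an assumption:

```python
# Hypothetical representation of a LIAR sample and a prefix-based label
# parser; names and the "Yes" -> lie mapping are assumptions for illustration.
from dataclasses import dataclass


@dataclass
class LiarSample:
    statement: str  # (i) the political statement to classify
    party: str      # (ii) political party of the speaker
    job_title: str  # (iii) job title of the speaker
    state: str      # (iv) state from which the speaker comes
    source: str     # (v) source from which the statement was released


def parse_label(response: str) -> bool:
    """Return True if the response prefix is "Yes", False if "No".

    The response is required to begin with "Yes" or "No"; anything else
    is treated as a parse failure here.
    """
    text = response.strip()
    if text.startswith("Yes"):
        return True
    if text.startswith("No"):
        return False
    raise ValueError(f"unparseable response prefix: {text[:10]!r}")
```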
Experiment Design. Following the aforementioned decomposition strategy, we optimize a graph of 13 variables, of which six are optimizable parameters. These six optimizable parameters serve as instructions for an LLM. Five are initialized to guide the LLM in analyzing specific attributes of a sample, while the last parameter instructs the LLM to formulate a final answer based on the previous analyses. See Figure 2 for the visualization of the initial graph.
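A rough sketch of how such a graph could be assembled is shown below. The exact composition of the 13 variables is an assumption here (one input, five attribute-analysis instructions with their five analysis outputs, one answer-formulation instruction, and one final answer); only the counts, 13 variables with six optimizable instruction parameters, come from the text:

```python
# Assumed 13-variable graph layout: 6 optimizable instruction parameters
# (five attribute analyzers + one answer formulator) and 7 non-optimizable
# variables (the input sample, five analyses, the final answer).
ATTRIBUTES = ["statement", "party", "job_title", "state", "source"]


def build_graph():
    nodes = [{"name": "sample", "optimizable": False}]  # input variable
    for attr in ATTRIBUTES:
        # Optimizable instruction guiding the LLM's analysis of this attribute.
        nodes.append({"name": f"instruction_{attr}", "optimizable": True})
        # Non-optimizable intermediate: the LLM's analysis of the attribute.
        nodes.append({"name": f"analysis_{attr}", "optimizable": False})
    # Optimizable instruction for formulating the final answer.
    nodes.append({"name": "instruction_final", "optimizable": True})
    # Non-optimizable output variable holding the "Yes"/"No" answer.
    nodes.append({"name": "final_answer", "optimizable": False})
    return nodes
```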
We compare our optimization method with four variants: (1) optimizing without semantic gradients by removing the feedback (see Implementation 1) as an input to the parameter update function; (2) optimizing one parameter only (running this variant six times with a different parameter each time and reporting the average); (3) optimizing with semantic gradients computed without conditioning on the neighborhood (i.e., as in Equation (3)), emulating TextGrad in our implementation; and (4) optimizing without the update gate introduced in Section 3.2, where the update gate accepts a parameter update only if it performs better on a validation set.
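The update gate mentioned in variant (4) can be sketched in a few lines; the `evaluate` callback and its higher-is-better scalar score are assumptions for illustration:

```python
# Minimal sketch of a validation-based update gate: a candidate parameter
# update is accepted only if it scores strictly better than the current
# parameters. evaluate(p) is a hypothetical callback returning a scalar
# validation score (higher is better).
def update_gate(params, candidate, evaluate):
    """Return `candidate` if it outperforms `params` on validation, else keep `params`."""
    return candidate if evaluate(candidate) > evaluate(params) else params
```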
In this work, we tackled the challenge of optimizing language-based agentic systems by introducing semantic gradients and semantic backpropagation. These concepts generalize existing credit assignment methods, such as reverse-mode automatic differentiation and TextGrad, by incorporating neighborhood conditioning to compute directional information which can be leveraged to improve each optimizable component of the system. This framework enabled us to propose semantic gradient descent, effectively solving the Graph-based Agentic System Optimization (GASO) problem.
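The credit-assignment scheme summarized above can be sketched as a backward pass over the graph; the dictionary-based graph encoding and the `llm` callback are hypothetical stand-ins for the paper's formulation, and the prompt wording is illustrative only:

```python
# Simplified sketch of semantic backpropagation: textual feedback at the
# output is propagated backwards, and each variable's semantic gradient is
# computed conditioned on its neighborhood (the sibling inputs of the
# consuming node), rather than in isolation.
def semantic_backprop(nodes, edges, values, output, feedback, llm):
    """nodes: variable names in topological order; edges: {child: [parents]};
    values: {node: text}; llm(prompt) -> text (a stand-in for an LLM call)."""
    grads = {output: feedback}
    for node in reversed(nodes):
        if node not in grads:
            continue
        for parent in edges.get(node, []):
            # Neighborhood conditioning: sibling inputs of `parent` at `node`
            # are included in the prompt, unlike unconditioned variants.
            siblings = [values[p] for p in edges[node] if p != parent]
            grads[parent] = llm(
                f"feedback={grads[node]!r} input={values[parent]!r} "
                f"siblings={siblings!r}: state how this input should change."
            )
    return grads
```

The resulting semantic gradients of the optimizable parameters can then be fed to an update function (gated as in Section 3.2) to complete one step of semantic gradient descent.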