Deep Language Networks: Joint Prompt Training of Stacked LLMs using Variational Inference

Paper · arXiv 2306.12509 · Published June 21, 2023

“Particularly, we view LLMs as language layers in a Deep Language Network (DLN). The learnable parameters of each layer are the associated natural language prompts, and the LLM at a given layer receives as input the output of the LLM at the previous layer, as in a traditional deep network. Each layer uses a template to organize the conditioning information into a string before calling the LLM (see Fig. 1). This layering induces a learnable decomposition of the task into a series of smaller sub-tasks, each of which is more easily solvable by an LLM. This shares similarities with recent works that chain LLM calls [44, 8, 56]. In this work, we move towards integrating learnable components in the pipeline rather than relying on handcrafted rules: each layer is associated with prompt (language) parameters that are learned so as to maximize the performance of the final objective.”
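The quoted architecture can be sketched in a few lines of Python. This is a hedged illustration, not the authors' implementation: `call_llm` is a hypothetical stand-in for a real LLM API (stubbed here so the example runs self-contained), and the class names, templates, and prompts are assumptions chosen only to show how each layer holds a learnable prompt string and feeds its output to the next layer.

```python
def call_llm(prompt_text: str) -> str:
    # Stub for a real LLM call; a real implementation would query a model API.
    return f"<llm-output for: {prompt_text[:40]}...>"


class LanguageLayer:
    """One DLN layer: a learnable natural-language prompt plus a fixed template."""

    def __init__(self, prompt: str, template: str):
        self.prompt = prompt      # the layer's learnable parameter (a string)
        self.template = template  # organizes conditioning info into one string

    def forward(self, x: str) -> str:
        # Format prompt + input into a single string, then call the LLM.
        return call_llm(self.template.format(prompt=self.prompt, x=x))


class DLN:
    """Stack of language layers; layer i's output is layer i+1's input."""

    def __init__(self, layers):
        self.layers = layers

    def forward(self, x: str) -> str:
        for layer in self.layers:
            x = layer.forward(x)
        return x


# Hypothetical two-layer network decomposing a task into sub-tasks.
dln = DLN([
    LanguageLayer("Summarize the key facts.", "{prompt}\n\nInput: {x}"),
    LanguageLayer("Answer based on the facts.", "{prompt}\n\nFacts: {x}"),
])
print(dln.forward("Some task input"))
```

In the paper's setting, training would adjust each layer's `prompt` string (via variational inference over the intermediate outputs) rather than numeric weights; the stub above only shows the forward pass.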