Self-Adapting Language Models

Paper · arXiv 2506.10943 · Published June 12, 2025

Given a new input, the model produces a self-edit—a generation that may restructure the information in different ways, specify optimization hyperparameters, or invoke tools for data augmentation and gradient-based updates. Through supervised finetuning (SFT), these self-edits result in persistent weight updates, enabling lasting adaptation. To train the model to produce effective self-edits, we use a reinforcement learning loop in which the downstream performance of the updated model serves as the reward signal. Unlike prior approaches that rely on separate adaptation modules or auxiliary networks, SEAL directly uses the model’s generation to parameterize and control its own adaptation process. Experiments on knowledge incorporation and few-shot generalization show that SEAL is a promising step toward language models capable of self-directed adaptation in response to new data.
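To make the training signal concrete, the following minimal Python sketch mirrors the two nested updates described above: an inner supervised-finetuning update applied from a self-edit, and an outer reinforcement step that rewards self-edits by the updated model’s downstream performance. All helpers here (`generate_self_edit`, `sft_update`, `evaluate`, `reinforce`) are hypothetical stubs for exposition, not an API from the paper.

```python
import copy

# Hypothetical stubs standing in for a real LLM and trainer; only the
# control flow mirrors the loop described in the text.
def generate_self_edit(model, context):
    # The model writes a natural-language self-edit for this context.
    return f"Finetune on these restatements of the input: {context}"

def sft_update(model, self_edit):
    # Inner loop: apply the self-edit as a supervised finetuning update.
    return model  # stub: a real version would change the weights

def evaluate(model, eval_queries):
    # Downstream performance of the updated model (e.g., QA accuracy).
    return 0.0  # stub

def reinforce(model, experience):
    # Outer loop: update the self-edit policy from (context, edit, reward)
    # triples, e.g., by finetuning on the highest-reward self-edits.
    return model  # stub

def seal_round(model, tasks, n_candidates=4):
    experience = []
    for context, eval_queries in tasks:
        for _ in range(n_candidates):
            self_edit = generate_self_edit(model, context)
            updated = sft_update(copy.deepcopy(model), self_edit)
            reward = evaluate(updated, eval_queries)  # reward signal
            experience.append((context, self_edit, reward))
    return reinforce(model, experience)
```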

As an analogy, consider a human student preparing for the final exam of a machine learning class. Many students rely on notes derived from the lecture content, textbooks, or information available on the internet. Rather than studying the raw content, assimilating and rewriting it in the form of notes often improves students’ ability to understand the material and answer exam questions. This practice of reinterpreting and augmenting external knowledge into a more digestible form is not limited to exam preparation; it appears throughout human learning across tasks. Furthermore, different humans assimilate information in different ways: some condense it into a visual diagram, some into text, and some rely more on concrete mathematical descriptions.

Such assimilation, restructuring, or rewriting of data as part of the learning process contrasts with how large language models (LLMs) are typically trained and deployed. Given a new task, current LLMs consume and learn from the task data “as-is” via finetuning or in-context learning [9, 10, 11, 12]. However, such data may not be in an optimal format (or volume) for learning, and current approaches do not enable models to develop bespoke strategies for transforming and learning from their training data.

As a step towards scalable and efficient adaptation of language models, we propose equipping LLMs with the ability to generate their own training data and finetuning directives for utilizing such data. In particular, we introduce a reinforcement learning algorithm that trains LLMs to generate “self-edits”—natural-language instructions that specify the data and, optionally, the optimization hyperparameters for updating the model’s weights (see Figure 1). We refer to such models as Self-Adapting LLMs (SEAL).
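As a concrete illustration, a parsed self-edit might bundle synthetic training data with optional optimization directives, as in the hypothetical example below; the field names and values are assumptions for exposition, since the paper specifies self-edits in natural language and their contents vary by task.

```python
# Hypothetical parsed self-edit; the schema is illustrative, not the paper's.
self_edit = {
    # Restructured training data derived from the new input, e.g. a passage
    # rewritten as standalone statements suitable for finetuning.
    "training_data": [
        "The passage introduces method X for task Y.",
        "Method X outperforms the baseline under condition Z.",
    ],
    # Optional optimization hyperparameters for the weight update.
    "optimization": {"learning_rate": 1e-5, "epochs": 2},
}
```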

We evaluate SEAL on two applications. We first consider the task of integrating new factual knowledge into an LLM. Rather than finetuning directly on the passage text, we finetune on synthetic data generated by the SEAL model. Our results show that, following reinforcement learning (RL) training, finetuning on self-generated synthetic data improves question-answering performance on the no-passage-in-context variant of SQuAD [13] from 33.5% to 47.0%. Notably, self-generated data from SEAL outperforms synthetic data generated by GPT-4.1.
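A minimal sketch of this knowledge-incorporation setup follows, assuming a hypothetical `generate` text interface and stub trainers; the prompt wording is illustrative rather than the paper’s exact prompt.

```python
# Hypothetical stubs; a real version would call an LLM and a finetuner.
def generate(model, prompt):
    return "Statement A.\nStatement B."

def sft_update(model, examples):
    return model  # stub: would finetune on the examples

def closed_book_accuracy(model, qa_pairs):
    return 0.0  # stub: QA accuracy with no passage in context

def incorporate(model, passage, qa_pairs):
    # Ask the model itself for synthetic finetuning data derived from
    # the passage, e.g. implications restated as standalone sentences.
    prompt = ("List several implications of the following passage as "
              f"standalone statements, one per line.\n\nPassage:\n{passage}")
    synthetic = generate(model, prompt).splitlines()
    # Finetune on the self-generated statements, not the raw passage text.
    updated = sft_update(model, synthetic)
    # Evaluate closed-book: questions answered with no passage in context.
    return closed_book_accuracy(updated, qa_pairs)
```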

We further evaluate SEAL on few-shot learning on a simplified subset of the ARC-AGI benchmark [14], where the model leverages a set of tools to autonomously select both synthetic data augmentations and optimization hyperparameters (e.g., learning rate, training epochs, selective loss computation over token types). Our experiments demonstrate that automatically selecting and configuring these tools with SEAL improves performance over both standard in-context learning (ICL) and self-editing without the RL training that teaches the model to use the tools effectively.
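For intuition, a self-edit in this setting can be thought of as a small tool configuration like the hypothetical one below; the specific tool names and values are assumptions for illustration, not the paper’s exact vocabulary.

```python
# Illustrative self-edit for the few-shot (ARC-style) setting; tool names
# and values are assumptions, not the paper's exact options.
arc_self_edit = {
    # Which data-augmentation tools to invoke on the demonstration pairs.
    "augmentations": ["rotations", "flips", "repeat_examples"],
    # Optimization hyperparameters for the gradient-based update.
    "optimization": {
        "learning_rate": 1e-4,
        "epochs": 2,
        # Selective loss computation: compute loss only on output tokens.
        "loss_tokens": "outputs_only",
    },
}
```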