Adaptation of Agentic AI
Cutting-edge agentic AI systems are built on foundation models that can be adapted to plan, reason, and interact with external tools to perform increasingly complex and specialized tasks. As these systems grow in capability and scope, adaptation becomes a central mechanism for improving performance, reliability, and generalization. In this paper, we unify the rapidly expanding research landscape into a systematic framework that spans both agent adaptation and tool adaptation. We further decompose these into tool-execution-signaled and agent-output-signaled forms of agent adaptation, as well as agent-agnostic and agent-supervised forms of tool adaptation. We show that this framework clarifies the design space of adaptation strategies in agentic AI, makes their trade-offs explicit, and provides practical guidance for selecting or switching among strategies during system design. We then review representative approaches in each category, analyze their strengths and limitations, and highlight key open challenges and future opportunities. Overall, this paper aims to offer a conceptual foundation and practical roadmap for researchers and practitioners seeking to build more capable, efficient, and reliable agentic AI systems. We also maintain a repository that organizes relevant papers and codebases for agentic AI adaptation.
The first dimension, which we term Agent Adaptation, focuses on modifying the agent’s internal parameters, representations, or behavioral policies to better align with task requirements. This includes both traditional fine-tuning approaches [17] and modern reinforcement learning methods that leverage environment feedback [18, 19]. The second dimension, Tool Adaptation, shifts the optimization target from the agent to its external tools (e.g., retrievers, planners, memory modules, and specialized models), enabling frozen agents to benefit from an adaptive operational environment [20, 10, 21]. Within these two broad paradigms, we further identify four distinct adaptation strategies, forming a comprehensive taxonomy that organizes the rapidly evolving landscape of agentic AI research:
• A1: Tool Execution Signaled Agent Adaptation (§3.2.1, §4.1): The agent is optimized using verifiable outcomes produced by external tools it invokes. This paradigm captures settings where correctness signals arise directly from tool execution, such as code sandbox results, retrieval relevance scores, or API call outcomes.
• A2: Agent Output Signaled Agent Adaptation (§3.2.2, §4.2): The agent is optimized using evaluations of its own outputs (e.g., final answers, plans, or reasoning traces), possibly after incorporating tool results. This paradigm includes both tool-free outcome-based learning and tool-augmented adaptation driven by answer correctness or preference scores.
• T1: Agent-Agnostic Tool Adaptation (§3.2.3, §5.1): Tools are trained independently of the frozen agent. These tools include retrievers, domain-specific models, and other pretrained components that can be used as plug-and-play modules orchestrated by the frozen agent.
• T2: Agent-Supervised Tool Adaptation (§3.2.4, §5.2): The agent remains fixed while its tools are adapted using signals derived from the agent’s outputs. This paradigm includes reward-driven retriever tuning, adaptive rerankers, search subagents, and memory-update modules trained to better support the frozen agent.
3.2 Four Adaptation Paradigms of Agentic AI
Building on the mathematical notation introduced earlier, we now present the four adaptation paradigms proposed in this paper, which together form a unified framework for classifying existing approaches to agentic AI adaptation. In this framework, adaptation is first categorized by the optimization target, namely the agent or the tool. For agent adaptation, we further differentiate paradigms based on the type of optimization signal used, which may originate from tool-execution feedback (A1) or from evaluations of the agent’s own final output (A2). For tool adaptation, the distinction instead concerns whether the adaptation process involves the agent: tools may be optimized independently of any agent (T1) or adapted under the supervision of a fixed agent (T2). Taken together, these considerations give rise to four paradigms, A1, A2, T1, and T2, which collectively characterize the principal modes of adaptation explored in agentic AI research.
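Concretely, the two axes (what is optimized, and where the adaptation signal comes from) can be read as a simple decision rule. The following minimal Python sketch is purely illustrative; the classify helper and its string-valued arguments are expository devices, not part of the formal framework.

from enum import Enum

class Paradigm(Enum):
    A1 = "tool-execution-signaled agent adaptation"
    A2 = "agent-output-signaled agent adaptation"
    T1 = "agent-agnostic tool adaptation"
    T2 = "agent-supervised tool adaptation"

def classify(target: str, signal: str) -> Paradigm:
    """Map (optimization target, signal source) to one of the four paradigms."""
    if target == "agent":
        # Signal from tool execution (A1) vs. from the agent's own final output (A2).
        return Paradigm.A1 if signal == "tool_execution" else Paradigm.A2
    # Tools trained independently of any agent (T1) vs. supervised by a frozen agent (T2).
    return Paradigm.T1 if signal == "independent" else Paradigm.T2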
Retrieval-Augmented Generation (RAG) Setting. In the RAG setting, the agent receives a query and performs a retrieval action to obtain relevant documents from a database. Formally, the agent produces a retrieval query a, the retriever returns a set of documents y, and the agent synthesizes these documents together with the original query to generate a final answer o.
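The following minimal sketch makes this notation concrete. The agent and retriever objects and their method names are assumed interfaces for exposition, not the API of any specific library.

def rag_step(agent, retriever, query: str, k: int = 5) -> dict:
    """One RAG interaction: query -> retrieval action a -> documents y -> final answer o."""
    a = agent.generate_search_query(query)   # retrieval action a
    y = retriever.search(a, top_k=k)         # retrieved documents y
    o = agent.answer(query, context=y)       # final answer o synthesized from the query and y
    return {"action": a, "documents": y, "answer": o}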
• A1 example. DeepRetrieval [19] optimizes the agent using feedback signals computed directly from retrieval quality. After the agent generates a retrieval query a, the retriever returns documents y, and metrics such as recall or nDCG computed from y serve as the reward for updating the agent (see the reward sketches after this list). Since the adaptation signal depends solely on the tool-execution outcome, this represents the A1 paradigm.
• A2 example. Search-R1 [47] follows the full RAG pipeline, where the agent first retrieves documents and then integrates them into its context to produce a final answer o. The adaptation signal is computed from the correctness of this final answer, measured by exact-match accuracy against the ground-truth answer. Because the optimization is guided by the agent’s final output rather than the retrieval result alone, this falls under the A2 paradigm.
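The sketches below illustrate the two reward signals just described, simplified for exposition: recall of the retrieved set for A1 and SQuAD-style exact match for A2. They are not claimed to reproduce the exact reward designs of DeepRetrieval or Search-R1.

import re
import string

def a1_reward(retrieved_ids: list[str], relevant_ids: set[str]) -> float:
    """A1: tool-execution-signaled reward, here recall of the retrieved documents."""
    if not relevant_ids:
        return 0.0
    hits = sum(1 for doc_id in retrieved_ids if doc_id in relevant_ids)
    return hits / len(relevant_ids)

def _normalize(text: str) -> str:
    """Lowercase, drop punctuation and articles, collapse whitespace (common EM normalization)."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def a2_reward(final_answer: str, gold_answer: str) -> float:
    """A2: agent-output-signaled reward, here exact match of the final answer."""
    return 1.0 if _normalize(final_answer) == _normalize(gold_answer) else 0.0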