Agentic and Multi-Agent Systems

How do agentic AI systems decompose into adaptation paradigms?

What are the core dimensions that distinguish different approaches to adapting agents and tools in agentic systems? Understanding this taxonomy could clarify which adaptation strategy fits which problem.

Note · 2026-02-23 · sourced from Agents

The adaptation landscape for agentic AI systems is cleaner than it appears. Two binary dimensions — what gets optimized (agent or tool) and what provides the signal (tool execution or agent output) — generate four paradigms that cover the principal modes of adaptation:

A1: Tool Execution Signaled Agent Adaptation — The agent is optimized using feedback from external tool execution. When the agent generates a retrieval query and the retriever returns documents, metrics like recall or nDCG computed from retrieval results directly reward the agent. Example: DeepRetrieval optimizes the agent's query generation using retrieval quality scores.
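
As a concrete sketch of where the A1 reward comes from, here is a minimal recall@k reward in Python. The `retrieve` callable and document ids are illustrative placeholders, not DeepRetrieval's actual API; nDCG would slot in the same way.

```python
# A1 sketch: the agent's generated query is scored by tool-execution
# metrics. `retrieve` is a hypothetical stand-in for the real retriever.
from typing import Callable, List, Set


def recall_at_k(retrieved: List[str], relevant: Set[str], k: int = 10) -> float:
    """Fraction of relevant documents found in the top-k results."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / len(relevant)


def a1_reward(query: str,
              retrieve: Callable[[str], List[str]],
              relevant: Set[str]) -> float:
    """Reward the agent's query by the quality of what the tool returns."""
    return recall_at_k(retrieve(query), relevant)
```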

A2: Agent Output Signaled Agent Adaptation — The agent is optimized using evaluation of its final output after incorporating tool results. The full pipeline runs (retrieve → integrate → answer), and the answer's correctness drives the reward signal. Example: Search-R1 rewards based on exact match of the final answer, not the retrieval quality.
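
A corresponding A2 sketch, assuming a two-stage retrieve-then-generate pipeline: the tool runs, but only the final answer is scored. The `retrieve` and `generate` callables and the normalization are placeholders in the spirit of Search-R1's exact-match reward, not its actual code.

```python
# A2 sketch: run the full pipeline, score only the final answer.
from typing import Callable, List


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace before exact-match comparison."""
    return " ".join(text.lower().strip().split())


def a2_reward(question: str,
              retrieve: Callable[[str], List[str]],
              generate: Callable[[str, List[str]], str],
              gold_answer: str) -> float:
    """Reward 1.0 iff the final answer exactly matches the gold answer."""
    docs = retrieve(question)          # tool runs, but is not scored
    answer = generate(question, docs)  # only this output drives the reward
    return 1.0 if normalize(answer) == normalize(gold_answer) else 0.0
```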

T1: Agent-Agnostic Tool Adaptation — Tools are trained independently of any specific agent. Retrievers, domain-specific models, and pretrained components function as plug-and-play modules. The agent remains frozen; the tool improves on its own.
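
One common T1 recipe, sketched under the assumption of a bi-encoder retriever trained with in-batch negatives (InfoNCE): the loss involves only queries and passages, never an agent, so the trained encoder can be dropped into any agent afterwards.

```python
# T1 sketch: agent-agnostic retriever training step. No agent appears in
# the objective; the embeddings come from the retriever's own encoders.
import torch
import torch.nn.functional as F


def contrastive_step(query_emb: torch.Tensor,
                     passage_emb: torch.Tensor,
                     temperature: float = 0.05) -> torch.Tensor:
    """In-batch-negative InfoNCE: row i's positive passage is passage i."""
    q = F.normalize(query_emb, dim=-1)
    p = F.normalize(passage_emb, dim=-1)
    scores = q @ p.T / temperature                       # (B, B) similarities
    labels = torch.arange(scores.size(0), device=scores.device)  # diagonal
    return F.cross_entropy(scores, labels)
```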

T2: Agent-Supervised Tool Adaptation — Tools are adapted using signals derived from the frozen agent's outputs. Reward-driven retriever tuning, adaptive rerankers, and memory-update modules all fall here — the agent defines what "good" means for the tool.
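
A hedged T2 sketch in which a frozen agent labels retrieval candidates for reranker training; `frozen_agent` and `is_correct` are hypothetical stand-ins, since the note names the pattern but not a specific implementation.

```python
# T2 sketch: the frozen agent defines "good" for the tool. Each candidate
# passage is labeled by whether the agent answers correctly given it; the
# labels then supervise the reranker or retriever.
from typing import Callable, List, Tuple


def label_passages_with_agent(question: str,
                              passages: List[str],
                              frozen_agent: Callable[[str, str], str],
                              is_correct: Callable[[str], bool]
                              ) -> List[Tuple[str, float]]:
    """Produce (passage, reward) training pairs; the agent stays frozen."""
    labeled = []
    for passage in passages:
        answer = frozen_agent(question, passage)  # no gradient into the agent
        labeled.append((passage, 1.0 if is_correct(answer) else 0.0))
    return labeled  # targets for tuning the tool, not the agent
```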

The taxonomy is practically useful because it maps directly to implementation decisions. A1 vs A2 determines where the loss function sits: at the tool boundary or at the output boundary. T1 vs T2 determines whether tool improvement requires an agent in the loop. Since the companion note "How do knowledge injection methods trade off flexibility and cost?" provides a parallel taxonomy for knowledge injection, the two frameworks are complementary: one classifies what gets injected, the other classifies how the system adapts.
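
Those two implementation questions can be made explicit in code. The following sketch (the names are mine, not from the note) encodes the 2x2 grid: the first axis fixes what gets optimized, the second fixes what emits the training signal.

```python
from enum import Enum


class Target(Enum):   # what gets optimized
    AGENT = "agent"
    TOOL = "tool"


class Signal(Enum):   # what provides the training signal
    TOOL_EXECUTION = "tool execution"  # retrieval metrics, exit codes, ...
    AGENT_OUTPUT = "agent output"      # final-answer correctness, ...


# The 2x2 grid. For tools, "tool execution" is read here as "no agent in
# the loop" (T1 trains on the tool's own objective), while "agent output"
# means a frozen agent supplies the labels (T2).
PARADIGM = {
    (Target.AGENT, Signal.TOOL_EXECUTION): "A1",
    (Target.AGENT, Signal.AGENT_OUTPUT):   "A2",
    (Target.TOOL,  Signal.TOOL_EXECUTION): "T1",
    (Target.TOOL,  Signal.AGENT_OUTPUT):   "T2",
}

print(PARADIGM[(Target.AGENT, Signal.TOOL_EXECUTION)])  # -> A1
```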

The RAG setting illustrates the A1/A2 distinction clearly: A1 optimizes the agent to write better queries (retrieval quality as reward), while A2 optimizes the agent to produce better final answers (answer correctness as reward). These are different objectives and can pull in different directions — a query that retrieves the best documents is not necessarily the query that produces the best final answer when the agent has limited context integration ability.
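
A toy, hard-coded illustration of that divergence (the numbers are invented, not from the source): the query with perfect recall earns zero A2 reward because the agent fails to integrate the larger context, while a weaker retrieval query yields the correct final answer.

```python
# Toy demo: A1 (recall) and A2 (exact-match) rewards pulling apart.
RELEVANT = {"d1", "d2", "d3"}

retrieved = {"query_a": ["d1", "d2", "d3"],  # perfect recall
             "query_b": ["d1", "d9"]}        # partial recall

# Pretend agent behavior: the larger context overwhelms its integration.
final_answer_correct = {"query_a": False, "query_b": True}

for q in ("query_a", "query_b"):
    a1 = len(set(retrieved[q]) & RELEVANT) / len(RELEVANT)  # recall reward
    a2 = 1.0 if final_answer_correct[q] else 0.0            # EM reward
    print(f"{q}: A1 reward={a1:.2f}, A2 reward={a2:.2f}")
# query_a: A1 reward=1.00, A2 reward=0.00
# query_b: A1 reward=0.33, A2 reward=1.00
```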



Original note title: agentic AI adaptation decomposes into four paradigms along two dimensions — agent versus tool optimization target and execution-signaled versus output-signaled feedback