Agents Are Not Enough

Paper · arXiv 2412.16241 · Published December 19, 2024
Agents · Social Theory · Society

By exploring past incarnations of agents, we can understand what has been done previously, what worked, and, more importantly, what did not pan out and why. This understanding lets us examine what distinguishes the current focus on agents. While generative AI is appealing, this technology alone is insufficient to make new generations of agents more successful. To make the current wave of agents effective and sustainable, we envision an ecosystem that includes not only agents but also Sims, which represent user preferences and behaviors, and Assistants, which interact directly with the user and coordinate the execution of user tasks with the help of the agents.
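As a rough illustration of this division of labor, the sketch below models the three roles as minimal Python classes: the Assistant talks to the user, consults a Sim for preferences, and delegates execution to an Agent. All class, method, and field names here are illustrative assumptions, not an interface proposed by the paper.

```python
# Minimal sketch of the envisioned ecosystem: an Assistant interacts with the
# user, consults a Sim (the user's preferences and behaviors), and delegates
# task execution to Agents. Names and fields are illustrative assumptions.

class Sim:
    """Represents a user's preferences and typical behaviors."""
    def __init__(self, preferences: dict):
        self.preferences = preferences

    def preference(self, key: str, default=None):
        return self.preferences.get(key, default)


class Agent:
    """Executes a specific kind of task autonomously."""
    def __init__(self, name: str):
        self.name = name

    def execute(self, task: str, constraints: dict) -> str:
        # A real agent would act in the world; here we only describe the action.
        return f"{self.name} executed '{task}' under {constraints}"


class Assistant:
    """Interacts with the user and coordinates agents on their behalf."""
    def __init__(self, sim: Sim, agents: dict):
        self.sim = sim
        self.agents = agents

    def handle(self, task: str, kind: str) -> str:
        # Personalize the task using the Sim before delegating to an Agent.
        constraints = {"budget": self.sim.preference("budget")}
        return self.agents[kind].execute(task, constraints)


# Usage: the Assistant applies the user's preferences, then delegates.
sim = Sim({"budget": 100})
assistant = Assistant(sim, {"shopping": Agent("ShoppingAgent")})
print(assistant.handle("buy running shoes", "shopping"))
```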

Agents can range from simple systems, such as thermostats that adjust ambient temperature based on sensor readings, to complex systems, such as autonomous vehicles navigating through traffic. Key characteristics of agents include autonomy, programmability, reactivity, and proactiveness. The area of agentic AI covers AI systems designed to operate with a high degree of autonomy, making decisions and taking actions independently of human intervention.
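To make the thermostat example concrete, here is a minimal sketch of such a simple reactive agent: it perceives a temperature reading and responds by switching heating on or off. The class name, thresholds, and action labels are hypothetical illustrations, not taken from the paper.

```python
# A minimal reactive agent: a thermostat that reads a sensor value and acts on
# it directly. Thresholds, names, and actions are illustrative assumptions.

class ThermostatAgent:
    def __init__(self, target: float, tolerance: float = 0.5):
        self.target = target
        self.tolerance = tolerance

    def act(self, current_temp: float) -> str:
        # React to the sensed environment without any internal world model.
        if current_temp < self.target - self.tolerance:
            return "heat_on"
        if current_temp > self.target + self.tolerance:
            return "heat_off"
        return "hold"

agent = ThermostatAgent(target=21.0)
print(agent.act(19.2))  # -> "heat_on"
```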

Agents, by definition, remove agency from the user in order to do things on the user’s behalf and save them time and effort. However, that trade-off may be brought into question if relinquishing control does not generate sufficient user value. Agents may also make mistakes, require intervention or supervision, or be limited to performing only simple tasks. These shortcomings are evident from the agentic research and development efforts of the past decade or so. Agents, often referred to as operators, skills, apps, extensions, and plugins, have been widely available through integrations into computers, smartphones, speakers, wearables, and automobiles. However, their utility has been severely limited [1, 6]. Beyond the limited applications, these agents exhibit persistent shortcomings that are not addressed simply by building more capable systems. Here, we briefly review why this is the case and what we can do about it.

Early AI Agents The idea of AI agents dates back to the 1950s with symbolic AI. Early examples, such as the General Problem Solver (GPS), aimed to replicate human problem-solving using symbolic reasoning. However, these agents struggled with real-world complexity due to their dependence on predefined rules and lack of adaptability [4].

Expert Systems In the 1980s, expert systems like MYCIN and DENDRAL emerged, utilizing domain-specific knowledge for decision-making. While effective in narrow domains, these systems were brittle and unable to generalize beyond their programmed expertise. The extensive manual knowledge engineering required made them impractical for broader applications [5].

Reactive Agents The 1990s introduced reactive agents [8], which responded to environmental stimuli without internal models. Rodney Brooks’ subsumption architecture [3] exemplified this approach, emphasizing real-time interaction over complex reasoning. However, reactive agents lacked the ability to plan or learn from past experiences, limiting their utility in dynamic environments.
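As a loose illustration of the subsumption idea (not Brooks’ original formulation), the sketch below arranges behaviors in priority-ordered layers, where the first layer that triggers on the current percept determines the action; the layer and percept names are assumptions made for this example.

```python
# Illustrative sketch of a subsumption-style reactive agent: ordered behavior
# layers, where the first layer that fires on the current percept wins.
# Layer names and percept fields are hypothetical.

def avoid_obstacle(percept):
    if percept.get("obstacle_ahead"):
        return "turn_left"
    return None  # defer to lower-priority layers

def wander(percept):
    return "move_forward"

LAYERS = [avoid_obstacle, wander]  # higher-priority layers first

def decide(percept):
    for layer in LAYERS:
        action = layer(percept)
        if action is not None:
            return action
    return "idle"

print(decide({"obstacle_ahead": True}))   # -> "turn_left"
print(decide({"obstacle_ahead": False}))  # -> "move_forward"
```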

Multi-Agent Systems Multi-agent systems (MAS) [10] brought the concept of multiple interacting agents, each with specific roles. While MAS showed promise in distributed problem-solving, they faced challenges in coordination, communication, and scalability. Managing interactions among agents often led to inefficiencies and unpredictable behaviors.

Cognitive Architectures Cognitive architectures like SOAR and ACT-R [7] aimed to model human cognition, integrating perception, memory, and reasoning. Despite their sophisticated designs, these architectures struggled with scalability and real-time performance. Their complexity often resulted in high computational costs and limited practical applications.

(1) Value generation. An agent is meant to provide autonomous execution of tasks on a user’s behalf, but this comes with costs and risks. For instance, if the user needs to intervene or clarify frequently, that may defeat the purpose of the agent. The user may also face a trade-off between privacy and utility when using the agent. In short, unless the user realizes enough value from an agent, they may not be willing to use it. Here, value can be understood as the difference between the perceived benefit and the perceived cost (e.g., time, privacy) of using an agent.

(2) Adaptable personalization. Every user and every situation is different when it comes to executing the task. An agent that cannot adapt to the user or their context may be of limited use. For instance, what if performing an online transaction on a user’s behalf calls for resetting a password? The agent will need to be capable of doing this, but more importantly, depending on the task, the situation, and knowledge about the user, the agent could proceed with this subtask on its own or seek the user’s input.

(3) Trustworthiness. The more capable an agent is, the more the user will need to be able to trust it. Letting agents perform bank transactions, personal communications, and important decision-making tasks will call for stronger scrutiny of, and well-placed trust in, those agents. This trust will not be built overnight; rather, agents will have to earn it gradually through increased accuracy and transparency. We still do not have broad acceptance of automatically generated emails. Having AI-based agents perform more than content generation will require much more familiarity with and trust in those systems.

(4) Social acceptability. We envision a future where agents can do many tasks on a user’s behalf, including shopping, scheduling, and negotiating. However, to have these done at scale and across diverse populations, cultures, and customs, we need wide social acceptability of agent-based interactions and transactions. This may take a long time to materialize. For instance, while paying bills online offers many advantages to individuals, service providers, and the environment, and many in the developed world are accustomed to it, there is still a significant fraction of the world where this is not common practice, for various reasons.

(5) Standardization. Developing and deploying agents is, and will continue to be, decentralized, which is desirable for a sustained ecosystem around agents. However, this will also pose new challenges regarding the compatibility, reliability, and security of those agents. Therefore, we will need efforts to standardize how agents are deployed, connected (in the case of multi-agent frameworks), and served. Consider this similar to developing a networking protocol or an app store.
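To give a flavor of what such standardization could involve, the sketch below shows a hypothetical agent manifest of the kind an agent registry or interconnection protocol might require, along with a trivial validation check; every field name and value is an assumption for illustration only, not a proposed standard.

```python
# Hypothetical agent manifest: an example of the metadata a standardized
# deployment or discovery mechanism might require from an agent.
# All field names and values are illustrative assumptions.

AGENT_MANIFEST = {
    "name": "shopping-agent",
    "version": "1.0.0",
    "capabilities": ["search_products", "place_order"],
    "interfaces": {"protocol": "https", "endpoint": "/tasks"},
    "permissions": ["payment", "user_profile:read"],
    "trust": {"publisher": "example-vendor", "signed": True},
}

def validate_manifest(manifest: dict) -> bool:
    """Check that the minimal required fields are present."""
    required = {"name", "version", "capabilities", "interfaces", "permissions"}
    return required.issubset(manifest)

print(validate_manifest(AGENT_MANIFEST))  # -> True
```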