LLM Agents
Related topics:
- A Survey of Self-Evolving Agents: On Path to Artificial Super Intelligence: Large Language Models (LLMs) have demonstrated remarkable capabilities across diverse tasks but remain fundamentally static, unable to adapt their internal parameters to novel tasks, evolving knowledg…
- Adaptation of Agentic AI: Cutting-edge agentic AI systems are built on foundation models that can be adapted to plan, reason, and interact with external tools to perform increasingly complex and specialized tasks. As these sys…
- Agent Workflow Memory: Despite the potential of language model-based agents to solve real-world tasks such as web navigation, current methods still struggle with long-horizon tasks with complex action trajectories. In contr…
- Agentic Reasoning for Large Language Models: Reasoning is a fundamental cognitive process underlying inference, problem-solving, and decision-making. While large language models (LLMs) demonstrate strong reasoning capabilities in close…
- Agents Are Not Enough: By exploring past incarnations of agents, we can understand what has been done previously, what worked, and more importantly, what did not pan out and why. This understanding lets us examine what d…
- Automated Design of Agentic Systems: The history of machine learning teaches us that hand-designed solutions are eventually replaced by learned solutions. We formulate a new research area, Automated Design of Agentic Systems (ADAS), whic…
- CAMEL: Communicative Agents for "Mind" Exploration of Large Language Model Society: The rapid advancement of chat-based language models has led to remarkable progress in complex task-solving. However, their success heavily relies on human input to guide the conversation, which can be…
- DERA: Enhancing Large Language Model Completions with Dialog-Enabled Resolving Agents: DERA is a paradigm made possible by the increased conversational abilities of LLMs, namely GPT-4. It provides a simple, interpretable forum for models to communicate feedback and iteratively improve …
- Dynamic LLM-Agent Network: An LLM-agent Collaboration Framework with Agent Team Optimization: …strategic team of agents communicating in a dynamic interaction architecture based on the task query. Specifically, we build a framework named Dynamic LLM-Agent Network (DyLAN) for LLM-agent collabora…
- Equipping agents for the real world with Agent Skills: As model capabilities improve, we can now build general-purpose agents that interact with full-fledged computing environments. Claude Code, for example, can accomplish complex tasks across domains us…
- From Web Search towards Agentic Deep Research: Incentivizing Search with Reasoning Agents: However, traditional keyword-based search engines are increasingly inadequate for handling complex, multi-step information needs. Our position is that Large Language Models (LLMs), endowed with reason…
- Generative Agent Simulations of 1,000 People: We present a novel agent architecture that simulates the attitudes and behaviors of 1,052 real individuals—applying large language models to qualitative interviews about their lives, then measuring ho…
- LIMI: Less is More for Agency: We define “Agency” as the emergent capacity of AI systems to function as autonomous agents—actively discovering problems, formulating hypotheses, and executing solutions through self-directed engageme…
- Language Agents as Optimizable Graphs: Various human-designed prompt engineering techniques have been proposed to improve problem solvers based on Large Language Models (LLMs), yielding many disparate code bases. We unify these approaches …
- Language as a Cognitive Tool to Imagine Goals in Curiosity-Driven Exploration: Developmental machine learning studies how artificial agents can model the way children learn open-ended repertoires of skills. Such agents need to create and represent goals, select wh…
- Large Language Model-Brained GUI Agents: A Survey: The advent of Large Language Models (LLMs), particularly multimodal models, has ushered in a new era of GUI automation. They have demonstrated exceptional capabilities in natural language understandin…
- Large Language Model-based Data Science Agent: A Survey: The rapid advancement of Large Language Models (LLMs) has driven novel applications across diverse domains, with LLM-based agents emerging as a crucial area of exploration. This survey presents a comp…
- MCP-Zero: Proactive Toolchain Construction for LLM Agents from Scratch: Function-calling has enabled large language models (LLMs) to act as tool-using agents, but injecting thousands of tool schemas into the prompt is costly and error-prone. We introduce MCP-Zero, a proac…
- MLE-STAR: Machine Learning Engineering Agent via Search and Targeted Refinement: Agents based on large language models (LLMs) for machine learning engineering (MLE) can automatically implement ML models via code generation. However, existing approaches to build such agents often r…
- Octopus v2: On-device language model for super agent: Current on-device models for function calling face issues with latency and accuracy. Our research presents a new method that empowers an on-device model with 2 billion parameters to surpass the perfor…
- Octopus v4: Graph of language models: This paper introduces a novel approach that employs functional tokens to integrate multiple open-source models, each optimized for particular tasks. Our newly developed Octopus v4 model leverages func…
- Openagents: An Open Platform For Language Agents In The Wild: We present OpenAgents, an open platform for using and hosting language agents in the wild of everyday life. OpenAgents includes three agents: (1) Data Agent for data analysis with Python/SQL and data …
- Position: LLMs Can't Plan, But Can Help Planning in LLM-Modulo Frameworks: Large Language Models (LLMs), essentially n-gram models on steroids which have been pre-trained on web-scale language corpora (or, effectively, our collective consciousness), have caught the imaginati…
- Solving a Million-Step LLM Task with Zero Errors: LLMs have achieved remarkable breakthroughs in reasoning, insights, and tool use, but chaining these abilities into extended processes at the scale of those routinely executed by humans, organizations…
- Survey on Evaluation of LLM-based Agents: This paper provides the first comprehensive survey of evaluation methodologies for these increasingly capable agents. We systematically analyze evaluation benchmarks and frameworks across four critica…
- TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks: To measure the progress of these LLM agents’ performance on performing real-world professional tasks, in this paper we introduce TheAgentCompany, an extensible benchmark for evaluating AI agents that …
- Towards Agentic RAG with Deep Reasoning: A Survey of RAG-Reasoning Systems in LLMs: This survey synthesizes both strands under a unified reasoning-retrieval perspective. We first map how advanced reasoning optimizes each stage of RAG (Reasoning-Enhanced RAG). Then, we show how retri…
- Turn Every Application into an Agent: Towards Efficient Human-Agent-Computer Interaction with API-First LLM-Based Agents: However, these agents often suffer from high…
- UserBench: An Interactive Gym Environment for User-Centric Agents: Large Language Models (LLMs)-based agents have made impressive progress in reasoning and tool use, enabling them to solve complex tasks. However, their ability to proactively collaborate with users, e…
- Voyager: An Open-Ended Embodied Agent with Large Language Models: We introduce VOYAGER, the first LLM-powered embodied lifelong learning agent in Minecraft that continuously explores the world, acquires diverse skills, and makes novel discoveries without human inter…
- When AIs Judge AIs: The Rise of Agent-as-a-Judge Evaluation for LLMs: As large language models (LLMs) grow in capability and autonomy, evaluating their outputs—especially in open-ended and complex tasks—has become a critical bottleneck. A new paradigm is emerging: usin…