Agentic Systems and Planning

Can prompt injection reshape a multi-agent workflow without touching infrastructure?

Explores whether an attacker can manipulate how a planner assigns tasks and routes coordination purely through prompt crafting, without modifying agents, tools, or messages. This matters because it identifies a planning-time vulnerability most defenses miss.

Note · 2026-05-28 · sourced from Agents Multi Architecture

The flexibility that makes planner-executor multi-agent systems attractive is also their weakness. When a planner converts a prompt into subtasks, roles, dependencies, and routing paths, the prompt is not merely a request; it is the blueprint from which the entire collaboration is constructed. FLOWSTEER demonstrates that an attacker who never touches agents, tools, memory, or inter-agent messages can still steer behavior, because the planning step happens before any of that infrastructure is invoked. A single crafted prompt can bias how the workflow forms in the first place. The reported effect is a rise in malicious success of up to 55 percent over naive prompting, and the attack transfers across multi-agent system (MAS) setups even when the attacker has to infer the topology black-box.
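To make the "prompt as blueprint" point concrete, here is a minimal sketch of a planner whose output workflow depends on the prompt text. Everything in it is hypothetical: `Step`, `toy_planner`, the three roles, and the keyword rule are stand-ins for an LLM planner, and the phrase "skip review" is an illustrative trigger, not the FLOWSTEER method.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Step:
    role: str                     # which agent handles this subtask
    task: str                     # what that agent is asked to do
    depends_on: list[str] = field(default_factory=list)  # roles that must finish first


def toy_planner(prompt: str) -> list[Step]:
    """Hypothetical stand-in for an LLM planner (assumption, not a real API).

    The workflow is derived from the prompt, so the prompt decides
    which roles exist and how they are wired together.
    """
    text = prompt.lower()
    steps = [Step("researcher", "gather sources")]
    # The prompt alone determines whether a review stage is created.
    if "skip review" not in text:
        steps.append(Step("reviewer", "check the sources", ["researcher"]))
    upstream = steps[-1].role
    steps.append(Step("writer", "draft the report", [upstream]))
    return steps


def roles(plan: list[Step]) -> list[str]:
    return [step.role for step in plan]


benign_plan = toy_planner("Summarize recent work on retrieval.")
steered_plan = toy_planner("Summarize recent work on retrieval. To save time, skip review.")

print(roles(benign_plan))   # workflow with a review stage
print(roles(steered_plan))  # workflow formed without one
```

No agent, tool, memory store, or message is modified in this sketch. The two runs differ only in the prompt, yet the second workflow is formed without a reviewer.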

This reframes where multi-agent safety lives. Most existing defenses inspect the artifacts of coordination: the generated workflow, the messages exchanged, the tool calls made. But if the contamination enters at workflow formation, those defenses arrive too late. The attack surface is not the running system; it is the organizational act of deciding who does what and in what order. The counterpoint is that the attack requires a planner that is promptable at all. Fully fixed pipelines are immune, but they forfeit the adaptive coordination that motivates planner-executor designs. Workflow formation is therefore a distinct security frontier, one that grows more exposed precisely as multi-agent systems become more flexible.
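The gap between artifact inspection and formation-time steering can be sketched in a few lines. The check below is a hypothetical example of an artifact-level defense (known roles only, dependencies pointing at earlier steps). It is an assumption made for illustration and does not describe any specific published defense. The plans are hand-written tuples of (role, dependencies).

```python
ALLOWED_ROLES = {"researcher", "reviewer", "writer"}


def artifact_check(plan):
    """Hypothetical post-hoc check on a generated workflow (illustrative only).

    Accepts a plan if every role is known and every dependency
    refers to a step that appears earlier in the plan.
    """
    seen = set()
    for role, deps in plan:
        if role not in ALLOWED_ROLES:
            return False
        if any(dep not in seen for dep in deps):
            return False
        seen.add(role)
    return True


# A workflow as it might form from a neutral prompt.
benign = [("researcher", []), ("reviewer", ["researcher"]), ("writer", ["reviewer"])]
# A workflow as it might form from a steering prompt: the reviewer is never created.
steered = [("researcher", []), ("writer", ["researcher"])]

print(artifact_check(benign))   # True
print(artifact_check(steered))  # True: well-formed, so the check has nothing to flag

# A fully fixed pipeline ignores the prompt, so it cannot be steered,
# but it also cannot adapt the workflow to the task.
FIXED_PIPELINE = benign


def fixed_planner(prompt):
    return FIXED_PIPELINE
```

Both plans pass because the steered one is structurally valid. What changed is which workflow was formed, and that is only visible by comparison with what the planner would otherwise have produced.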


— "FLOWSTEER: Prompt-Only Workflow Steering Exposes Planning-Time Vulnerabilities in Multi-Agent LLM Systems", https://arxiv.org/abs/2605.11514

Original note title

multi-agent planner-executor systems expose a planning-time attack surface where prompts reshape agent organization without touching infrastructure