AI Security

AI agent delegation: defensive architecture patterns

An agent orchestrating others multiplies attack paths. Four delegation patterns and their defensive properties.

Aroua Biri

Most 2026 production AI agents are multi-agent systems in disguise: one main agent orchestrating others to split work. Claude Agent SDK with sub-agents, LangGraph supervisor, AutoGen team, CrewAI crew — all materialize a delegation pattern. How you structure delegation directly impacts attack-surface exposure.

Here are the 4 main patterns and their defensive properties.

Pattern 1 — Hierarchical delegation (supervisor / workers)

A central supervisor receives the request, plans, delegates to specialized workers, aggregates results, decides.

`` supervisor (sensitive capabilities) / \ \ worker_1 worker_2 worker_3 (reading) (analysis) (synthesis) ``

Defensive properties

  • + Capabilities concentrated on the supervisor → reduced external attack surface.
  • + Workers can be stateless and disposable.
  • + Centralized audit naturally (all decisions go through supervisor).
  • Supervisor is a single point of failure.
  • Higher latency (round-trips).

When to use

The recommended default pattern for agents acting on the IS in 2026. Materialized by default in Claude Agent SDK and LangGraph.

Pattern 2 — Peer-to-peer delegation (mesh)

Multiple agents collaborate without hierarchy, passing messages.

`` agent_A ⇄ agent_B ⇅ ⇅ agent_C ⇄ agent_D ``

Defensive properties

  • + No single point of failure.
  • + Resilient to single-agent crashes.
  • Collusion risk — injected instructions propagating.
  • Self-amplifying loop risk between agents.
  • Hard audit (no central observer).
  • Hard-to-reason threat model.

When to use

Rarely, with much caution. Suits research more than production. If in prod, mandate: strict timeout, off-band centralized audit log, automatic loop detection.

Pattern 3 — Sequential pipeline

Agents chain linearly, each taking the previous one's output.

`` agent_input → agent_process → agent_validate → agent_output ``

Defensive properties

  • + Each step clear and auditable.
  • + Gates between steps (structured validation).
  • + Allows trust compartmentalization (step 1 reads untrusted content, transforms to structured, downstream steps only see structured).
  • Cumulative latency.
  • A silent error in one step propagates.

When to use

Excellent for deterministic workflows with clear roles per step: ingestion → extraction → validation → action. Recommended when external inputs (emails, docs, web) must be "cleaned" before touching the action layer.

Pattern 4 — Planner / Executor with third-party validator

Variant of supervisor where decision is explicitly separated from execution, with a third agent as guardrail.

`` planner (decides the plan) ↓ validator (checks plan against policy) ↓ executor (executes validated plan, no interpretation freedom) ``

Defensive properties

  • + Explicit separation between reasoning and execution.
  • + Validator can be non-AI (deterministic rules) or LLM with dedicated security prompt.
  • + Executor does only what's in the validated plan — no improvisation.
  • + Excellent auditability: validated plan replayable.
  • Heavy implementation.
  • 3-step minimum latency.

When to use

Agents making high-impact decisions (finance, legal, HR). Reference pattern in regulated environments.

Synthetic comparison

| Pattern | Auditability | Resilience | Collusion risk | Cost | |---|---|---|---|---| | Supervisor / workers | High | Medium | Low | Medium | | Mesh | Low | High | High | Low | | Pipeline | High | Low | Low | Low | | Planner / Validator / Executor | Very high | Medium | Very low | High |

What stays valid regardless of pattern

1. Separate identities per agent

Each agent has its own token / identity at the capability broker level. No shared credential. If agent C is compromised, its permissions are cut without touching A and B.

2. Typed messages, not free text

When agents pass information, format must be structured (JSON with schema), not free text. Limits inter-agent prompt injection.

3. Cross-referenced audit log

Different agents' logs share:

  • A trace_id tracking the user request end-to-end.
  • A span_id per step (OpenTelemetry inheritance).
  • Reference to the caller agent.

That's what lets you trace, in an incident, the original instruction across the chain.

4. Inter-agent scenario tests

Red-teaming a MAS must include scenarios simulating a compromised agent and verifying compromise doesn't propagate. Not just per-agent isolated testing.

Architecture is a security choice

Many teams pick their orchestration pattern for purely functional reasons ("LangGraph is more flexible", "AutoGen allows free agent conversation"). It's also a choice determining residual security posture. Consider it from design, not after the first incident.

A related topic on your side?

20 minutes to scope it together. No commercial pitch.

Book a Calendly call