Cost runaway: bounding an AI agent that goes off-budget

A team I worked with in Q1 2026 burned €23,000 of Claude tokens in 4 hours on a weekend. An agent stuck in a loop, triggered by a badly worded stop condition. No one watching. Detected Monday morning through an email sent to the Anthropic account admin.

Not an extreme case. In 14 WeeSec engagements since October 2025, I've seen two similar incidents. Cost runaway has become a serious operational risk. It deserves guardrails on par with technical security.

Three runaway mechanisms

1. Agent loop

The agent calls a tool, doesn't get the expected result, retries. And retries. No stop condition. On long-running streamed sub-tasks, the agent can loop for hours unnoticed.

Typical causes:

Tool returns an unexpected format: the agent thinks it failed.
Vaguely-worded stop condition ("until you're satisfied").
Planning logic bug.

2. Multi-agent amplification

Take a system with a planner and workers. For each sub-task, the planner asks another agent for "validation". The validator, under prompt injection, demands "more information". The planner calls more workers, whose output needs validating in turn. The loop grows exponentially.

3. User abuse

A user (or an attacker impersonating one) launches massively parallel requests, or asks the agent to "analyze everything" on a huge corpus. Cost explodes with no malicious intent on the agent's side: the user forces it.

Defenses by layer

Layer 1 — LLM provider limits

All major providers (Anthropic, OpenAI, Google, Mistral) offer rate limits and spending caps at the account level. Configure them when you open the account:

Monthly spending cap: (annual budget / 12) × 1.3, i.e. one twelfth of the annual budget plus a 30% margin.
Threshold notifications (50%, 75%, 90%, 100% of cap).
Rate limit on requests per minute.

This is the minimum safety belt. Many teams skip it because they don't want to be "blocked". Calibrated correctly, it avoids catastrophe without daily friction.
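The cap and its notification thresholds are simple arithmetic, sketched below. The function names and the 60,000 EUR annual budget are illustrative assumptions, not provider settings; the resulting values still have to be entered in each provider's console.

```python
from typing import List, Sequence


def monthly_cap(annual_budget_eur: float, margin: float = 0.30) -> float:
    """One twelfth of the annual budget, plus a safety margin (30% by default)."""
    return annual_budget_eur / 12 * (1 + margin)


def alert_thresholds(
    cap_eur: float, levels: Sequence[float] = (0.50, 0.75, 0.90, 1.00)
) -> List[float]:
    """Spend amounts at which a notification should fire."""
    return [round(cap_eur * level, 2) for level in levels]


# Hypothetical annual budget of 60,000 EUR.
cap = monthly_cap(60_000)
print(f"monthly cap: {cap:.2f} EUR")
print("alert at:", alert_thresholds(cap))
```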

Layer 2 — Per-session and per-user limits

App-level:

Tokens / session: strict limit. When a session hits it, shut down cleanly and raise an alert.
Tokens / user / day: protects against compromised-account abuse.
Tokens / endpoint: differentiate cheap (simple chat) from expensive (research mode with multi-tool).
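A minimal in-memory sketch of these three limits follows. The class, the endpoint names and every number are hypothetical; a real deployment would keep the counters in a shared store rather than in process memory.

```python
from datetime import date
from typing import Dict, Optional, Tuple


class TokenLimitExceeded(Exception):
    """Raised when a session or user quota would be exceeded."""


# Hypothetical limits, in tokens. Cheap and expensive endpoints differ.
SESSION_LIMITS: Dict[str, int] = {"chat": 20_000, "research": 200_000}
USER_DAILY_LIMIT = 500_000


class UsageLimiter:
    def __init__(self) -> None:
        self._per_session: Dict[str, int] = {}
        self._per_user_day: Dict[Tuple[str, date], int] = {}

    def record(
        self,
        session_id: str,
        user_id: str,
        endpoint: str,
        tokens: int,
        today: Optional[date] = None,
    ) -> None:
        """Add `tokens` to the counters, or raise without recording them."""
        today = today or date.today()
        session_total = self._per_session.get(session_id, 0) + tokens
        user_total = self._per_user_day.get((user_id, today), 0) + tokens
        if session_total > SESSION_LIMITS[endpoint]:
            raise TokenLimitExceeded(f"session {session_id} exceeded its limit")
        if user_total > USER_DAILY_LIMIT:
            raise TokenLimitExceeded(f"user {user_id} exceeded the daily limit")
        self._per_session[session_id] = session_total
        self._per_user_day[(user_id, today)] = user_total
```

The caller catches TokenLimitExceeded, closes the session cleanly and sends the alert.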

Layer 3 — Per-agent limits

For autonomous background agents:

Max steps per task: maximum number of tool calls before the task is aborted. Typical: 30-100.
Max cumulative tokens: independent of step count. Covers ballooning contexts.
Wall-clock timeout: hard ceiling in seconds.

On limit hit: clean exit, context logging, escalation to human or clear notification to user.
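The three limits can be combined in one budget object that is checked before every step. This is a sketch under assumptions: AgentBudget, run_agent and the default values are hypothetical, and `step` stands in for whatever performs one tool call and reports the tokens it used.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Tuple


class BudgetExceeded(Exception):
    """Raised when the agent must stop because a hard limit was reached."""


@dataclass
class AgentBudget:
    max_steps: int = 50          # tool calls allowed per task
    max_tokens: int = 200_000    # cumulative tokens, independent of step count
    max_seconds: float = 600.0   # wall-clock ceiling
    clock: Callable[[], float] = time.monotonic  # injectable for tests
    steps: int = 0
    tokens: int = 0
    started_at: float = field(init=False, default=0.0)

    def __post_init__(self) -> None:
        self.started_at = self.clock()

    def check(self) -> None:
        """Raise before a step runs if any limit is already exhausted."""
        if self.steps >= self.max_steps:
            raise BudgetExceeded("step limit reached")
        if self.tokens >= self.max_tokens:
            raise BudgetExceeded("token limit reached")
        if self.clock() - self.started_at >= self.max_seconds:
            raise BudgetExceeded("wall-clock timeout")

    def record(self, tokens_used: int) -> None:
        self.steps += 1
        self.tokens += tokens_used


def run_agent(step: Callable[[], Tuple[bool, int]], budget: AgentBudget) -> str:
    """Run `step` until it reports completion or the budget stops it."""
    try:
        while True:
            budget.check()
            done, tokens_used = step()
            budget.record(tokens_used)
            if done:
                return "completed"
    except BudgetExceeded as exc:
        # Clean exit: log context and escalate here instead of retrying.
        return f"stopped: {exc}"
```

Checking before each step means a looping agent stops at the limit regardless of what its own stop condition says.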

Layer 4 — Statistical anomaly detection

Beyond hard limits:

Track consumption per session, per user, per agent version.
Detect outliers (a user consuming 10x their average within 1h).
Alert on sessions exceeding statistical bounds.

That's what catches "slow" runaways: not a frantic loop but an anomalous grind over hours.
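The simplest version of the outlier rule compares current consumption to the same user's own baseline. The sketch below assumes hourly token totals are already collected; the function name, the 10x factor as a default and the minimum sample count are illustrative choices.

```python
from statistics import mean
from typing import Sequence


def is_outlier(
    hourly_history: Sequence[float],
    current_hour: float,
    factor: float = 10.0,
    min_samples: int = 5,
) -> bool:
    """Flag consumption above `factor` times the user's own hourly average.

    With too little history there is no meaningful baseline, so nothing is
    flagged; hard limits have to cover that case.
    """
    if len(hourly_history) < min_samples:
        return False
    baseline = mean(hourly_history)
    return baseline > 0 and current_hour > factor * baseline


history = [1_000, 1_200, 900, 1_100, 800, 1_000]  # hypothetical tokens per hour
print(is_outlier(history, 12_000))
print(is_outlier(history, 3_000))
```

A mean is easy to reason about but is itself skewed by past spikes; a median or percentile baseline is a reasonable alternative.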

Layer 5 — Budget kill-switch

A global feature flag: "disable all non-essential agents". Useful when a cost spike has been detected but not yet investigated. Being able to trigger it in 30 seconds can save a five-figure overage.
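A sketch of such a flag, assuming a single process. The KillSwitch class and the agent names are hypothetical; in production the flag would live in a shared feature-flag service or configuration store so that every worker sees it.

```python
import threading
from typing import Iterable


class KillSwitch:
    """Global flag: once tripped, only agents listed as essential may run."""

    def __init__(self, essential_agents: Iterable[str]) -> None:
        self._essential = frozenset(essential_agents)
        self._tripped = threading.Event()  # thread-safe boolean

    def trip(self) -> None:
        self._tripped.set()

    def reset(self) -> None:
        self._tripped.clear()

    def allows(self, agent_name: str) -> bool:
        """Call this before starting an agent and before each of its steps."""
        return not self._tripped.is_set() or agent_name in self._essential


# Hypothetical agent names.
switch = KillSwitch(essential_agents={"support-triage"})
switch.trip()
print(switch.allows("support-triage"), switch.allows("research-crawler"))
```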

Patterns to code

Graceful degradation

When an agent hits 80% of its token limit, switch to a cheaper model:

Claude Opus → Claude Sonnet → Claude Haiku.
Visible to user ("switching to short-summary mode").
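One way to express the downgrade is a pure function from budget usage to model tier. The tier labels below are placeholders, not API model identifiers, and the second threshold at 90% is an assumption added here so that the third tier is reachable; only the 80% trigger comes from the text above.

```python
from typing import Tuple

# Placeholder tier labels, most expensive first.
MODEL_TIERS: Tuple[str, ...] = ("opus", "sonnet", "haiku")
# Percent of the token limit at which to step down one tier.
# 80 is from the text; 90 is an assumed second step.
DOWNGRADE_AT_PERCENT: Tuple[int, ...] = (80, 90)


def select_tier(tokens_used: int, token_limit: int) -> str:
    """Return the tier to use for the next call, given budget consumption."""
    # Integer comparison avoids floating-point edge cases at the thresholds.
    steps_down = sum(
        tokens_used * 100 >= pct * token_limit for pct in DOWNGRADE_AT_PERCENT
    )
    return MODEL_TIERS[steps_down]


print(select_tier(10_000, 100_000))
print(select_tier(80_000, 100_000))
print(select_tier(95_000, 100_000))
```

Whenever the returned tier changes, the application shows the user the mode-switch notice.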

Cost-aware planning

For agents with explicit planner, embed cost as a constraint:

"You have a 50,000-token budget for this task."
The model adapts investigation depth.

Used in some 2026 frameworks (notably custom LangGraph setups). Not magic, but useful.
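In practice this means regenerating the budget sentence at each planning step with the remaining amount. A minimal sketch, with a hypothetical helper name and wording; the model is only being told about the budget, so hard limits still need to be enforced in code.

```python
def budget_instruction(total_tokens: int, used_tokens: int) -> str:
    """Build the budget sentence to prepend to the planner's prompt."""
    remaining = max(total_tokens - used_tokens, 0)
    return (
        f"You have a {total_tokens:,}-token budget for this task; "
        f"{remaining:,} tokens remain. Adapt the depth of your investigation "
        "and stop with your best answer before the budget runs out."
    )


print(budget_instruction(50_000, 12_000))
```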

Aggressive caching

Cache expensive tool outputs (web search, RAG queries) for minutes or hours. If the agent asks for the same information again within a session, it gets a cache hit instead of a new call. Low memory cost, potentially big token savings on iterating agents.
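A small time-to-live cache is enough to illustrate the idea. TTLCache is a hypothetical helper, not a library class, and it has no size bound or eviction, which a long-lived process would need.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Keep tool outputs for `ttl_seconds`, then call the tool again."""

    def __init__(
        self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable for tests
        self._store: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_call(self, key: Hashable, fn: Callable[[], Any]) -> Any:
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # cache hit: no tool call
        value = fn()  # cache miss or expired entry: pay for the call once
        self._store[key] = (now, value)
        return value
```

The key should include the tool name and its normalized arguments, for example ("web_search", "q=agent cost limits"), so that two different queries never share an entry.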

Math to remember

Before shipping an agent, do this exercise:

Average per-session cost: X tokens × Y € per token = Z €.
Sessions / day expected: N.
Expected daily cost: N × Z €.
Catastrophic daily cost: N × Z × 100 € (an undetected runaway over 24h, taken as 100x normal consumption).

If "catastrophic daily" exceeds your pain threshold, you need concrete guardrails, not just good intentions.
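The exercise takes a few lines to script. All inputs below are hypothetical: 40,000 tokens per session, a blended price of 10 € per million tokens and 500 sessions per day.

```python
from typing import Tuple


def daily_costs(
    tokens_per_session: float,
    eur_per_token: float,
    sessions_per_day: int,
    runaway_factor: float = 100.0,
) -> Tuple[float, float, float]:
    """Return (per-session cost, expected daily cost, catastrophic daily cost)."""
    per_session = tokens_per_session * eur_per_token  # Z = X * Y
    expected = sessions_per_day * per_session         # N * Z
    catastrophic = expected * runaway_factor          # N * Z * 100
    return per_session, expected, catastrophic


per_session, expected, catastrophic = daily_costs(
    tokens_per_session=40_000,
    eur_per_token=10 / 1_000_000,
    sessions_per_day=500,
)
print(f"per session: {per_session:.2f} EUR")
print(f"expected daily: {expected:.2f} EUR")
print(f"catastrophic daily: {catastrophic:.2f} EUR")
```

With these numbers the session costs roughly €0.40, a normal day €200 and a runaway day €20,000, the same order of magnitude as the incident in the introduction.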

Day-1 dashboard indicators

Tokens consumed (day, week, month) vs budget.
Session-cost distribution (spot outliers).
Top 10 most expensive sessions today.
Average per-session cost trend (alert on slow drift).

Without visible indicators, runaway happens silently.
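Two of these indicators need nothing more than the standard library. The sketch assumes per-session costs are already aggregated into a dictionary; the function names and sample figures are hypothetical.

```python
import heapq
from typing import Dict, List, Tuple


def top_sessions(session_costs: Dict[str, float], n: int = 10) -> List[Tuple[str, float]]:
    """The n most expensive sessions, most expensive first."""
    return heapq.nlargest(n, session_costs.items(), key=lambda item: item[1])


def budget_used_percent(spent_eur: float, budget_eur: float) -> float:
    """Share of the period's budget already consumed, in percent."""
    return spent_eur / budget_eur * 100


# Hypothetical costs for today's sessions, in EUR.
costs_today = {"s-101": 0.42, "s-102": 37.80, "s-103": 0.38, "s-104": 1.05}
print(top_sessions(costs_today, n=2))
print(f"{budget_used_percent(sum(costs_today.values()), 200.0):.1f}% of daily budget")
```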

The cautionary contrast

The team that burned €23,000 had:

No per-session app-side limit.
No Anthropic spending cap.
No real-time cost dashboard.
An email alert was configured, but on the billing account rather than an operational one, and its recipient was off for the weekend.

Four weak points. None individually catastrophic. Together: €23,000. FinOps defense-in-depth is exactly that: no single point, many reasonable points.
