AI agent tools: the risk matrix to fill in

Giving an agent access to 12 tools means creating 12 error vectors. The matrix that separates acceptable from prohibited.

Aroua Biri

A 2026 AI agent typically has 5 to 20 tools at hand: email reads, CRM writes, web search, code execution, SQL queries, payments, social posts. Each added tool multiplies threat-model complexity. The right scope isn't "every useful tool"; it's "tools whose value justifies the residual risk".

The 3 axes

Axis 1 — Impact of a wrong action

| Level | Definition |
|---|---|
| Low | User can reverse easily (draft, internal note) |
| Medium | Visible but recoverable (email to colleague, ticket edit) |
| High | Visible and hard to recover (customer email, payment, deletion, public post) |
| Critical | Irreversible with external impact (bank transfer, signed contract, regulatory filing) |

Axis 2 — Input manipulability

| Level | Definition |
|---|---|
| Low | Input only from authenticated user |
| Medium | Input from semi-trusted sources (internal docs, CRM) |
| High | Input influenceable by an outsider (incoming email, public forms, indexed web) |

Axis 3 — Action visibility

| Level | Definition |
|---|---|
| High | User sees what will happen (clear UI, confirmation) |
| Medium | Action logged and consultable later |
| Low | Silent, or logged but not read in practice |

Combination rule

Classify each tool by the highest-risk cell it can reach:

  • Green (allowed): Impact Low/Medium + visibility High + manipulability Low/Medium.
  • Orange (allowed with confirmation): Impact High + visibility High. Explicit confirmation before each action.
  • Red (avoid): Impact Critical, or Impact High + manipulability High + visibility Medium/Low.
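The three rules above can be sketched as a small function. This is a minimal illustration, not a reference implementation: the enum and function names are hypothetical, and the "unclassified" fallback is an assumption for combinations the rules do not name (for instance Impact Low with manipulability High), which are sent to manual review rather than guessed.

```python
from enum import IntEnum


class Impact(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


class Level(IntEnum):
    """Shared scale for the manipulability and visibility axes."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def classify(impact: Impact, manipulability: Level, visibility: Level) -> str:
    """Return 'green', 'orange', 'red' or 'unclassified' for one tool."""
    # Red is checked first so that no later rule can soften it.
    if impact == Impact.CRITICAL:
        return "red"
    if (impact == Impact.HIGH and manipulability == Level.HIGH
            and visibility != Level.HIGH):
        return "red"
    # Orange: high impact, but the user sees and confirms each action.
    if impact == Impact.HIGH and visibility == Level.HIGH:
        return "orange"
    # Green: low or medium impact, high visibility, no outsider-controlled input.
    if (impact <= Impact.MEDIUM and visibility == Level.HIGH
            and manipulability <= Level.MEDIUM):
        return "green"
    # Assumption: anything the three rules do not name goes to manual review.
    return "unclassified"
```

Checking red before orange reflects the idea that a confirmation step should never be able to downgrade a prohibited action.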

Applied to a sales-assist agent

| Tool | Impact | Manipulability | Visibility | Classification |
|---|---|---|---|---|
| Read incoming emails | Low | High | High | Green |
| Summarize Slack convo | Low | Medium | High | Green |
| Create CRM note | Low | Medium | High | Green |
| Update deal status | Medium | Low | High | Green |
| Email a customer | High | Medium | High | Orange |
| Schedule meeting | High | Medium | High | Orange |
| Edit offer price | Critical | Low | Medium | Red |
| Sign contract | Critical | — | — | Red |

4 tools shipped without confirmation, 2 with confirmation, 2 human-only.
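One way to enforce this split at runtime is a gate in front of every tool call. The sketch below copies the classifications from the table above; the tool identifiers, the `gate` function and the deny-by-default behaviour for unlisted tools are assumptions made for illustration.

```python
# Classifications copied from the sales-assist table; identifiers are hypothetical.
MATRIX = {
    "read_incoming_emails": "green",
    "summarize_slack_convo": "green",
    "create_crm_note": "green",
    "update_deal_status": "green",
    "email_customer": "orange",
    "schedule_meeting": "orange",
    "edit_offer_price": "red",
    "sign_contract": "red",
}


def gate(tool: str, user_confirmed: bool = False) -> str:
    """Decide what the runtime does with a tool call."""
    # Assumption: a tool missing from the matrix is treated as red.
    color = MATRIX.get(tool, "red")
    if color == "green":
        return "execute"
    if color == "orange":
        # Orange requires an explicit confirmation before each action.
        return "execute" if user_confirmed else "ask_confirmation"
    # Red stays human-only, whatever the confirmation state.
    return "refuse"
```

For example, `gate("email_customer")` returns `"ask_confirmation"`, while `gate("edit_offer_price", user_confirmed=True)` still returns `"refuse"`.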

Classic mistakes

Marking orange what should be red

"Edit offer price" with user confirmation stays red — users can be talked into agreeing by a persuasive agent (2025-2026 HCI work on over-reliance). Human confirmation isn't sufficient alone for critical actions.

Confusing "log" with "audit"

"But I have logs" doesn't upgrade Low to Medium visibility. Logging isn't auditing. For Medium visibility you need: accessible dashboards, someone who reads them regularly, alerts on anomalies.

Under-rating manipulability

Many teams default internal tools to Low manipulability. If a tool reads data produced by another tool that reads external content, end-to-end manipulability is High.
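End-to-end manipulability can be estimated by propagating the worst rating along the data flow. The sketch below assumes each tool has a direct rating (`own`) and a list of upstream tools whose output it reads (`reads_from`); both structures and all tool names are hypothetical.

```python
LEVELS = {"low": 1, "medium": 2, "high": 3}


def effective_manipulability(
    tool: str,
    own: dict[str, str],
    reads_from: dict[str, list[str]],
    _seen: frozenset = frozenset(),
) -> str:
    """Worst rating among the tool itself and everything upstream of it."""
    if tool in _seen:
        # Cycle guard: this tool is already on the current path.
        return own[tool]
    seen = _seen | {tool}
    candidates = [own[tool]] + [
        effective_manipulability(upstream, own, reads_from, seen)
        for upstream in reads_from.get(tool, [])
    ]
    return max(candidates, key=LEVELS.__getitem__)


# Hypothetical chain: deal_summary reads crm_notes, which reads email_reader.
own = {"email_reader": "high", "crm_notes": "medium", "deal_summary": "low"}
reads_from = {"crm_notes": ["email_reader"], "deal_summary": ["crm_notes"]}
```

Here `deal_summary` is rated Low on its own, but `effective_manipulability("deal_summary", own, reads_from)` returns `"high"` because external email content reaches it through the CRM notes.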

Forgetting cascading-error cost

"Send email" alone is orange. With "list all customers" added, the combined bulk-mistake risk emerges: 5,000 inappropriate emails. The combination's effective risk exceeds the sum.

Not a frozen matrix

Re-evaluate at every new agent version and every tool addition, and at least quarterly. Without that discipline, the perimeter drifts.
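These three triggers are easy to check in CI or a scheduled job. The sketch below is one possible shape; the function name and parameters are hypothetical, and treating "quarterly" as 92 days is an assumption.

```python
from datetime import date, timedelta


def review_due(
    last_review: date,
    today: date,
    tools_at_review: set[str],
    tools_now: set[str],
    version_at_review: str,
    version_now: str,
    max_age: timedelta = timedelta(days=92),  # assumption: one quarter
) -> list[str]:
    """Return the reasons a matrix re-evaluation is due (empty if none)."""
    reasons = []
    if version_now != version_at_review:
        reasons.append("new agent version")
    if tools_now - tools_at_review:
        # At least one tool exists now that was not in the reviewed matrix.
        reasons.append("tool added")
    if today - last_review > max_age:
        reasons.append("quarterly review overdue")
    return reasons
```

An empty list means the matrix is still current; any non-empty result can fail the build or open a ticket.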

Deliverable

For a production agent:

  • Up-to-date matrix, dated and versioned.
  • Per orange cell: screenshot of confirmation flow.
  • Per red cell: explicit "human-only" note.

Expect ISO 42001 and AI Act auditors to ask for these items from August 2, 2026, the date the AI Act's general application begins.
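The deliverable checklist above can be verified mechanically if the matrix is kept as structured data. The following is a minimal sketch: the class names, field names and sample entries are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class MatrixEntry:
    tool: str
    color: str  # "green", "orange" or "red"
    confirmation_screenshot: Optional[str] = None  # file path, expected for orange
    human_only_note: Optional[str] = None  # expected for red


@dataclass
class RiskMatrix:
    version: str
    reviewed_on: date
    entries: list[MatrixEntry]


def missing_evidence(matrix: RiskMatrix) -> list[str]:
    """List every orange or red entry that lacks its supporting evidence."""
    problems = []
    for entry in matrix.entries:
        if entry.color == "orange" and not entry.confirmation_screenshot:
            problems.append(f"{entry.tool}: no confirmation-flow screenshot")
        if entry.color == "red" and not entry.human_only_note:
            problems.append(f"{entry.tool}: no human-only note")
    return problems


# Hypothetical matrix: the orange entry is missing its screenshot.
matrix = RiskMatrix(
    version="1.3",
    reviewed_on=date(2026, 3, 1),
    entries=[
        MatrixEntry("create_crm_note", "green"),
        MatrixEntry("email_customer", "orange"),
        MatrixEntry("sign_contract", "red", human_only_note="Account executive only"),
    ],
)
```

Running `missing_evidence(matrix)` on this sample returns a single finding for `email_customer`, which is the kind of gap worth closing before an audit.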
