Why an AI security audit is not a regular pentest
A traditional application pentest tests for SQL injection, XSS, broken access control. A production LLM system introduces 5 new attack families that traditional pentests don't cover:
1. Direct and indirect prompt injection — manipulation of the model via inputs or contextual data (documents, RAG, MCP, emails).
2. Jailbreak of guardrails — bypassing model protections via roleplay, encoding, multi-turn.
3. Data leakage and exfiltration — leakage of memorized training data, other users' context, environment secrets.
4. Model manipulation and data poisoning — poisoning of the RAG index or fine-tuning data.
5. Agent hijacking — compromise of an autonomous agent (Claude Agent SDK, LangGraph, AutoGen) via injected instructions to abuse accessible tools.
The primary reference is OWASP LLM Top 10 (LLM01-LLM10), complemented by the NIST AI RMF Generative Profile (AI 600-1).
Audit scope — 6 axes
Axis 1 — Threat modeling. Modeling threats specific to your AI system, aligned OWASP LLM Top 10 + NIST AI RMF. Asset mapping (model, system prompts, contextual data, accessible tools, user accounts), trust boundary identification, attack scenario listing.
Axis 2 — Targeted adversarial testing. Manual and automated tests (Garak, PyRIT, Promptfoo) on prompt injection, jailbreak, exfiltration. For agents: hijack tests, unauthorized execution, privilege escalation through tools.
Axis 3 — RAG audit. If you operate a RAG: ingestion security (PDF parsing, web scraping), index control (authoring, validation), cross-tenant isolation, indirect injection prevention.
Axis 4 — Agent audit. Tool perimeter, execution sandbox, kill-switch, input/output validation, forensic audit log. Particularly critical for tool-using agents (Claude Agent SDK, LangGraph, AutoGen).
Axis 5 — Compliance. Alignment with OWASP LLM Top 10, NIST AI RMF, ISO 42001, EU AI Act. Preparation for potential certification audit.
Axis 6 — Hardening. Concrete recommendations: defense-in-depth architecture (input filter + Constitutional AI + output filter + sandbox + logs), guardrails (Llama Guard, ShieldGemma, IBM Granite Guardian), continuous CI/CD red teaming.
Tools used
WeeSec operates with a mix of open-source tools and manual techniques:
- Garak (NVIDIA, OSS): reference framework with 100+ adversarial probes.
- PyRIT (Microsoft): multi-turn orchestration.
- Promptfoo: evaluation and robustness testing.
- Lakera Red, Robust Intelligence (commercial) where relevant.
- Structured manual testing: 30+ advanced prompt injection patterns, adversarial jailbreaks, exfiltration.
- For agents: sandbox escape tests, tool manipulation, indirect prompt injection via tool outputs.
Deliverables
After 4 to 6 weeks (depending on scope):
- Audit report (40-80 pages): threat mapping, test results, scoring per axis, identified vulnerabilities with proofs.
- Prioritized hardening plan: 20-50 actions classified (Critical / Important / Optimization), with effort and technical targets.
- Target architecture documented: defensive architecture diagram, CI/CD integrations, monitoring.
- Robustness indicators: pass rate on Garak/PyRIT, OWASP LLM Top 10 score, monitoring baseline.
- Executive briefing + transfer workshop with your technical team.
WeeSec can then support operational implementation (fractional CISO mode or ad-hoc engagement).
Pricing
Standard AI security audit (1 LLM system in production, medium scope): €18-35K, 4-6 weeks.
Extended AI security audit (system with complex RAG or autonomous agents in production): €30-60K, 6-8 weeks.
Audit + hardening engagement (audit + 3 months implementation): €60-120K.
Quick scoping (1-week strategic framing): €5-8K.
A free 20-minute scope-call frames the exact perimeter and provides a firm quote.
Why WeeSec for AI security
AI security is a very recent domain. Most traditional cybersecurity firms lack AI expertise. Most AI consulting firms lack offensive cybersecurity expertise.
The WeeSec founder combines both:
- Doctorate in cybersecurity Télécom SudParis (2009) — solid academic foundation.
- MIT Professional Education — Applied AI — recent structured AI training.
- 15+ years in operational cybersecurity — knowing how to distinguish theory from practice.
- 3+ years of active LLM security practice since GPT-3.5/4 — testing, audits, hardening engagements.
One of the rare profiles in Europe able to audit an LLM system with simultaneous technical and compliance reading.