How to Pass an AI Security Audit:
What Auditors Are Actually Checking in 2026

AI security audits have moved from theory to routine. SOC 2 Type II is a non-negotiable procurement baseline. ISO 42001 is coming fast. The CISA Five Eyes guidance published May 1, 2026 set the first formal security expectations for agentic AI. Here is what auditors are looking at, and what evidence you need to have in hand.

DOMAIN 01
AI Inventory & Governance
Shadow AI · Agent registry · Policy documentation
DOMAIN 02
Access & Identity Controls
NHI management · Least privilege · Credential rotation
DOMAIN 03
Input/Output Inspection
Prompt injection · Data leakage · Policy enforcement
DOMAIN 04
Audit Trails & Logging
Decision-chain logs · Retention · Forensic readiness
DOMAIN 05
Incident Response
AI-specific playbooks · Tested kill switch · Breach notification

In 2024, an AI security audit was a novelty. In 2026, it is a procurement requirement. Enterprise buyers want SOC 2 Type II reports before they start evaluating vendors. Defense contractors are being assessed against CMMC 2.0, where the 110 NIST SP 800-171 controls apply to any AI that interacts with CUI – assessors now look specifically at AI access paths, audit trails and encryption during CMMC scoping. ISO 42001 – the first international standard for AI management systems – is being requested by regulators in the EU, UK and APAC. And the CISA Five Eyes guidance, released May 1, 2026 by CISA, NSA, the Australian ASD, Canada's CCCS, New Zealand's NCSC and the UK's NCSC, set the first formal security expectations for agentic AI deployments.

The paradigm shift: In 2026, auditors do not ask whether controls exist. They ask whether controls work, and they want proof. Policies, promises, and architecture drawings are no longer enough. Auditors want logs, test results, access reviews, and incident reports. If it isn't in a system of record, it didn't happen.

Below is a summary of what auditors are actually asking for – the specific evidence requests, the most common gaps, and the preparation that differentiates passing organizations from those with findings.

The Five Frameworks That Matter in 2026

Most organizations need to address multiple frameworks at the same time. An automated crosswalk (a mapping of controls from one framework to another) lets you address SOC 2, ISO 42001 and EU AI Act requirements with one control architecture: NIST AI RMF functions map to ISO 42001 clauses and to EU AI Act articles. Maintaining one evidence base that covers all three is much more efficient than running them as separate compliance programs.
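A crosswalk can start as a simple lookup table. The Python sketch below is illustrative only: the internal control names are hypothetical, and the framework references are the ones this article cites for its audit domains, not an authoritative mapping.

```python
from collections import defaultdict

# Hypothetical internal controls, each mapped to (framework, reference) pairs.
# The references mirror those cited per audit domain in this article.
CROSSWALK = {
    "ai-inventory": [("SOC 2", "CC1.2"), ("SOC 2", "CC9.2"), ("ISO 42001", "Clauses 4-6")],
    "access-review": [("SOC 2", "CC6.1"), ("SOC 2", "CC6.2"), ("ISO 42001", "Annex A.6")],
    "interaction-logging": [("SOC 2", "CC7.2"), ("SOC 2", "CC7.3"), ("NIST AI RMF", "MEASURE")],
}

def evidence_reuse(crosswalk):
    """Invert the crosswalk: framework -> list of (reference, internal control)."""
    by_framework = defaultdict(list)
    for control, refs in crosswalk.items():
        for framework, reference in refs:
            by_framework[framework].append((reference, control))
    return dict(by_framework)

for framework, refs in evidence_reuse(CROSSWALK).items():
    print(framework, refs)
```

Evidence collected once for an internal control can then be filed against every framework reference listed for it.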

SOC 2 Type II
US · Attestation
In 2026, auditors review CC9.2 (risk mitigation) for model behavior over time: drift monitoring, training data provenance, and prompt/guardrail updates treated as controlled changes. AI API keys and embeddings stores should appear in access reviews alongside production databases.
Non-negotiable procurement baseline
ISO 42001
International · Certifiable
Published in 2023. The first globally certifiable AI management system standard. Certification audits must be carried out by bodies that meet BS ISO/IEC 42006:2025. Requires a formal risk register, documented roles and an impact assessment for AI systems. Cross-references NIST AI RMF.
Requested by EU/UK/APAC regulators
NIST AI RMF
US · Voluntary guidance
GOVERN, MAP, MEASURE, MANAGE framework. GOVERN v1 introduced specific agentic AI controls in January 2026. It is increasingly used as a benchmark by auditors when no other framework is applicable – the four functions map to operational audit domains.
De facto baseline when no other applies
CMMC 2.0
US Defense · Required
CMMC 2.0 does not name AI, but any AI that processes CUI is fully in scope for all 110 NIST SP 800-171 controls. Practice areas AC (Access Control), AU (Audit), SC (System and Communications) and SI (System Integrity) are directly applicable to AI agents. Third party assessors (C3PAOs) are now asking about AI access paths to CUI, audit trail linkage, and whether AI tool use is in the System Security Plan.
Active CMMC assessments in 2026
EU AI Act
EU · Regulation
High-risk AI system requirements under Article 9 (risk management), Article 12 (record-keeping) and Article 13 (transparency) apply to in-scope systems. Disclosure obligations for AI-generated content apply under Article 50 (numbered Article 52 in earlier drafts). High-risk systems also require technical documentation and a conformity assessment.
Enforcement active for high-risk systems
CISA Five Eyes (May 2026)
Joint · Agentic AI
Published May 1, 2026 by CISA + NSA + ASD + CCCS + NCSC. "Careful Adoption of Agentic AI Services" is the first joint guidance for autonomous AI agents. Five risk categories are defined: privilege escalation, design/configuration error, behavioral mismatch, cascading failure, and accountability obfuscation. Used as a direct reference in new audit programs.
New May 2026 · Referenced in audit programs
Typical AI security audit sequence — what happens in each phase
PHASE 1 · SCOPING (Weeks 1–2)
AI system inventory · Define in-scope AI tools · Map to frameworks · Identify data classification
PHASE 2 · EVIDENCE REQUESTS (Weeks 2–4)
Document requests sent · Policies · Logs · Reports · Access reviews · Configs · Test results · Playbooks
PHASE 3 · TESTING (most failures happen here)
Control walkthroughs · Auditor tests each control · Samples logs · Interviews · Penetration test review
PHASE 4 · REPORTING (Weeks 6–8)
Findings documented · Exceptions listed · Management response · Report issued

The evidence collected in Phase 2 is tested in Phase 3. Anything that exists only verbally or in policy documentation, without system-of-record evidence, fails here.

Auditors in 2026 will want to see that AI startups have security and risk controls in place that address model behavior over time, not just the infrastructure. If you cannot prove model drift monitoring, training data provenance, and retraining change management, you will get a finding.

— SecureSlate SOC 2 Controls Guide, January 2026

The Five Audit Domains — Evidence by Evidence

What follows is what auditors actually request in each domain: the evidence items, the most common gaps, and an insider note on what separates a pass from a finding, all based on SOC 2 AI audit guidance, ISO 42001 certification requirements, CMMC 2.0 AI-specific controls, and the CISA Five Eyes agentic AI guidance.

01
DOMAIN
Highest Auditor Time Investment · CC1.2, CC9.2, ISO 42001 Clauses 4–6
AI Inventory & Governance
Shadow AI · Agent registry · Policy documentation · Risk register
Most common gap: no formal registry
Pass requirement: complete, current, system-of-record
Evidence auditors request
  • 📋Complete AI system inventory — every tool, agent, and model in use, with owner, purpose, data classification, and last review date
  • 📋AI acceptable use policy (documented, signed, dated — not just drafted)
  • 📋Risk register for all in-scope AI systems, with identified risks and mitigation status
  • 📋Evidence of shadow AI discovery process — how unauthorized AI tools are detected
  • 📋AI system impact assessments for high-risk use cases (required by EU AI Act Article 9 and ISO 42001 Clauses 6 and 8)
  • 📋Vendor AI risk assessments for third-party AI services
  • 📋Board or executive-level AI governance oversight documentation
Most common failures
  • Inventory lives in a spreadsheet owned by one person: not a system of record, and not updated when a new tool is deployed
  • AI policy drafted but never approved, signed or distributed – auditors request proof of employee acknowledgement
  • Risk register covers infrastructure risks but not AI-specific risks (model drift, prompt injection, data poisoning)
  • Shadow AI informally detected – no process, no tooling, no scan results to show
  • AI vendor agreements do not have data processing terms for AI inputs – GDPR and HIPAA risk
Insider Note
Auditors now ask: "How did you find out about this AI tool?" If the answer is "someone told us", you don't have a discovery process. If the answer is "we ran a network scan on [date] and here are the results", you pass.
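The inventory fields listed above can be checked mechanically before an auditor does it by hand. A minimal Python sketch, assuming a hypothetical record layout and a 90-day review interval (both are assumptions, not requirements taken from any framework):

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class AISystemRecord:
    name: str
    owner: str
    purpose: str
    data_classification: str
    last_review: date
    discovered_via: str  # e.g. "network scan 2026-03-15", not "someone told us"

def inventory_gaps(records, today, max_age_days=90):
    """Return (system name, reason) pairs an auditor would likely question."""
    gaps = []
    for r in records:
        if not r.owner:
            gaps.append((r.name, "no owner"))
        if (today - r.last_review).days > max_age_days:
            gaps.append((r.name, "review overdue"))
        if not r.discovered_via:
            gaps.append((r.name, "no discovery evidence"))
    return gaps

# Hypothetical sample data.
records = [
    AISystemRecord("support-chatbot", "j.doe", "customer support",
                   "confidential", date(2026, 4, 1), "network scan 2026-03-15"),
    AISystemRecord("code-assistant", "", "developer tooling",
                   "internal", date(2025, 9, 1), ""),
]
gaps = inventory_gaps(records, today=date(2026, 5, 15))
for name, reason in gaps:
    print(f"{name}: {reason}")
```

Running a check like this on a schedule, and keeping its output, is one way to turn the inventory into dated evidence rather than a static document.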
02
DOMAIN
Access Control · CC6.1, CC6.2, ISO 42001 Annex A.6, CMMC AC.1.001
Access & Identity Controls
NHI management · Least privilege · Credential rotation · Agent identity
New in 2026: AI API keys must appear in access reviews
Pass requirement: quarterly access reviews with sign-off evidence
Evidence auditors request
  • 🔑Access review reports with reviewer name, date and a per-user/per-agent approve/remove decision – quarterly is usual
  • 🔑AI API keys and embeddings store access included in the access review (new 2026 expectation from SOC 2 auditors)
  • 🔑Timestamped evidence that the AI agent credentials of terminated employees were revoked
  • 🔑Least privilege documentation: declared scope of each agent vs. actual permissions
  • 🔑AI API key rotation logs: when was the last AI API key rotated, who rotated it, and what is the rotation policy?
  • 🔑MFA enforcement on all AI platform administrative accounts
  • 🔑Unique per-agent identity for agentic deployments (CISA Five Eyes requirement)
Most common failures
  • Access reviews performed but not retained with the reviewer's sign-off – the review happened, but there is no evidence to give the auditor
  • AI API keys are not part of quarterly access reviews – most review procedures were designed before AI was deployed and never updated
  • Shared service accounts for AI agents – per-agent least privilege cannot be demonstrated without per-agent identity
  • There is a credential rotation policy, but no logs that it was executed — policy without evidence of execution fails
  • 70% of agents over-privileged against declared task scope (Teleport 2026) – auditors now ask for a comparison between declared and actual permissions
Insider Note
The most frequent finding: access reviews are done but the reviewer's name and sign-off date are not recorded in the evidence. Auditors need reviewer accountability – a time-stamped record of who approved what, and when.
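The declared-versus-actual comparison that auditors ask for is a set difference per agent. A minimal Python sketch, with hypothetical agent names and permission strings:

```python
def excess_permissions(declared, actual):
    """Map each agent to the permissions it holds beyond its declared scope.

    An agent with no declared scope at all has every permission flagged.
    """
    report = {}
    for agent, granted in actual.items():
        extra = set(granted) - set(declared.get(agent, ()))
        if extra:
            report[agent] = sorted(extra)
    return report

# Hypothetical sample data: declared task scope vs. permissions actually granted.
declared = {
    "invoice-agent": {"read:invoices"},
    "triage-agent": {"read:tickets", "write:tickets"},
}
actual = {
    "invoice-agent": {"read:invoices", "write:payments"},
    "triage-agent": {"read:tickets"},
}
report = excess_permissions(declared, actual)
print(report)
```

This only works when each agent has its own identity; with a shared service account there is no per-agent entry in `actual` to compare against.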
03
DOMAIN
LLM Security · CC6.6, CC6.8, OWASP LLM01/02/06, EU AI Act Art.15
Input/Output Inspection & Policy Enforcement
Prompt injection controls · PII protection · Output policy · Guardrail evidence
New in 2026: auditors ask for adversarial test results
Pass requirement: documented controls + evidence of operation
Evidence auditors request
  • 🛡️AI input inspection controls – what is checked before prompts are sent to the LLM
  • 🛡️Documentation of PHI/PII handling for AI inputs and outputs: PHI/PII detection, redaction or blocking
  • 🛡️Adversarial test results (prompt injection testing, jailbreak testing) – auditors ask for these now, especially under CC9.2 and EU AI Act Article 15
  • 🛡️Data retention policy for AI inputs and outputs: how long AI conversations are logged and who can access them
  • 🛡️Operational logs showing the inspection layer is running, not just configured
Most common failures
  • There is guardrail documentation, but no operational logs showing that the controls are running in production – a configured control that is not running is not a control
  • No adversarial test results to hand over when the auditor requests them. Auditors are now specifically asking for these, especially for AI systems processing sensitive data
  • PII handling policy does not cover AI prompt inputs – employees pasting PHI into AI tools is a HIPAA gap that DLP does not address
  • Output policy enforced in the system prompt only – if the system prompt can be extracted or bypassed, the whole policy fails; auditors ask how the policy is enforced at the infrastructure layer and not just at the model layer
  • No evidence of regular testing of policy effectiveness – "our guardrails block bad outputs" without test results
Insider Note
Auditors are beginning to differentiate between model-layer guardrails (system prompts) and infrastructure-layer guardrails (inspection before/after the LLM). Infrastructure-layer controls are much more defensible – they sit outside the model, so jailbreaking the model does not remove them.
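An infrastructure-layer control inspects the prompt before it reaches the model and records what it found. The Python sketch below is deliberately simplified: the two regular expressions are illustrative assumptions and nowhere near sufficient for real PII/PHI detection.

```python
import re

# Simplified example patterns; production detectors need far broader coverage.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def inspect_prompt(prompt):
    """Redact matches and return (clean_prompt, findings) for the operational log."""
    findings = []
    clean = prompt
    for label, pattern in PATTERNS.items():
        clean, count = pattern.subn(f"[{label.upper()} REDACTED]", clean)
        if count:
            findings.append({"type": label, "count": count})
    return clean, findings

clean, findings = inspect_prompt("Patient 123-45-6789, contact jane@example.com")
print(clean)
print(findings)
```

The `findings` list is the part that matters for the audit: written to a log on every request, it is the evidence that the control is running and not merely configured.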
04
DOMAIN
Logging · CC7.2, CC7.3, EU AI Act Art.12, NIST AI RMF MEASURE
Audit Trails & Logging
Decision-chain logs · Retention policy · Forensic capability · Anomaly alerts
Most underprepared domain across AI deployments
Pass requirement: 90-day minimum, identity-linked, searchable
Evidence auditors request
  • 📊Sample AI interaction logs: agent identity, input, output, tool calls, policy evaluation result, timestamp – auditors may sample individual requests
  • 📊Log retention policy and evidence of retention: 90 days minimum in regulated environments, 12 months in HIPAA
  • 📊Log integrity controls: logs must be tamper-evident; if an attacker can delete logs, the logs are not audit-grade
  • 📊Alerting on anomalous AI behaviour – what triggers the alert, who gets it, how long does it take to respond
  • 📊Model monitoring evidence – drift detection thresholds, last drift review, retraining change management records (SOC 2 CC9.2)
  • 📊Logs are searchable and can be used for incident investigation, not just stored
Most common failures
  • Logs record actions but not the decision context – "agent called file_read" without any information about what triggered it, which policy was evaluated and what the agent received in return
  • Logs are present but not associated with an agent identity – if several agents use the same service account, logs are forensically worthless
  • No log integrity controls – when logs sit in a system where the application can modify or delete them, an attacker who compromises the application can alter the logs
  • Model monitoring policy exists but there is no evidence it is followed – drift thresholds are defined, but the last review was 9 months ago
  • Alerts are set but never tested. Auditors ask: "When was the last time this alert fired? What was the response?" No answer = no evidence the alert works
Insider Note
Auditors sample logs in the walkthrough. They select a date range and ask you to pull all AI interactions for that period, then check for completeness, identity linkage, policy evaluation, and the ability to answer "what did this specific agent do on this date?" If you cannot answer that in under 5 minutes, you have a finding.
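One common way to make logs tamper-evident is a hash chain, where each record's hash covers the previous record's hash. The Python sketch below is an in-memory illustration under assumed field names (`agent_id`, `ts`, `tool`, `trigger`, `policy_result` are hypothetical), not a production log store.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash preceding the first record

def append_entry(log, entry):
    """Append an entry whose hash covers its content and the previous hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    body = json.dumps(entry, sort_keys=True)
    digest = hashlib.sha256((prev_hash + body).encode()).hexdigest()
    log.append({"entry": entry, "prev": prev_hash, "hash": digest})

def verify_chain(log):
    """Return True if the chain is intact; edits, mid-chain deletions
    and reordering all break it."""
    prev_hash = GENESIS
    for record in log:
        body = json.dumps(record["entry"], sort_keys=True)
        expected = hashlib.sha256((prev_hash + body).encode()).hexdigest()
        if record["prev"] != prev_hash or record["hash"] != expected:
            return False
        prev_hash = record["hash"]
    return True

def actions_for(log, agent_id, day):
    """Answer 'what did this agent do on this date?' (ISO 8601 timestamps assumed)."""
    return [r["entry"] for r in log
            if r["entry"]["agent_id"] == agent_id and r["entry"]["ts"].startswith(day)]

log = []
append_entry(log, {"agent_id": "invoice-agent", "ts": "2026-05-02T10:15:00Z",
                   "tool": "file_read", "trigger": "user request",
                   "policy_result": "allow"})
append_entry(log, {"agent_id": "triage-agent", "ts": "2026-05-02T11:00:00Z",
                   "tool": "ticket_update", "trigger": "scheduled run",
                   "policy_result": "allow"})
print(verify_chain(log))
print(actions_for(log, "invoice-agent", "2026-05-02"))
```

A chain like this does not detect truncation from the end of the log unless the latest hash is also recorded somewhere the application cannot write, which is why storage separation still matters.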
05
DOMAIN
Incident Response · CC7.4, CC7.5, ISO 42001 Clause 10, CISA Five Eyes
Incident Response for AI Systems
AI-specific playbooks · Kill switch testing · Breach notification · Post-incident reviews
Most common gap: generic IR plan with no AI-specific procedures
Pass requirement: tested, AI-specific, with documented drill results
Evidence auditors request
  • 🚨AI incident response playbook for rogue agents, data exfiltration by AI, prompt injection and model compromise
  • 🚨Kill switch test results – proof that you can shut down a particular AI agent within a given time period (CISA Five Eyes: this has to be demonstrated, not just documented)
  • 🚨AI-specific tabletop exercise records: date, participants, scenario, findings, remediation actions
  • 🚨AI data breach notification procedure: when, who, what to report
  • 🚨Records of post-incident review of any incident involving AI: what happened, what went wrong, what was fixed
  • 🚨AI escalation path — who is alerted when an AI security alert fires and what are the SLAs
Most common failures
  • Generic incident response plan without AI-specific procedures – "refer to general IR plan" is not accepted for AI systems that access sensitive data
  • Kill switch in policy but never tested – 60% of organizations cannot terminate a misbehaving agent (Kiteworks 2026) – auditors want to see test results, not descriptions
  • Tabletop exercises not recorded with scenario, findings, and remediation – verbal "we did a tabletop" without records does not meet auditor requests
  • Breach notification procedure does not cover disclosure of AI-generated output (EU AI Act Article 50, numbered Article 52 in earlier drafts) or treat AI-mediated data exposure as a separate notification trigger
  • No post-incident review in the file – if you have no AI incidents, auditors will ask how you know, which leads back to logging gaps
Insider Note
According to CISA Five Eyes guidance (May 2026) kill-switch capability must be demonstrated and not just documented. Auditors aligned to the guidance are now asking for live demonstration or test records – not just policy language describing the capability.
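A kill switch drill produces usable evidence when it records who was stopped, when, and how long it took. The Python sketch below assumes two hypothetical callables, `stop_agent` and `is_running`, that your own platform would have to provide; the 30-second deadline is likewise an assumption, not a figure from the guidance.

```python
import time
from datetime import datetime, timezone

def run_kill_switch_drill(agent_id, stop_agent, is_running,
                          deadline_s=30.0, poll_s=0.01):
    """Invoke the kill switch and record how long the agent takes to stop."""
    started = time.monotonic()
    stop_agent(agent_id)
    # Poll until the agent reports stopped or the deadline passes.
    while is_running(agent_id) and time.monotonic() - started < deadline_s:
        time.sleep(poll_s)
    elapsed = time.monotonic() - started
    return {
        "agent_id": agent_id,
        "tested_at": datetime.now(timezone.utc).isoformat(),
        "elapsed_s": round(elapsed, 3),
        "deadline_s": deadline_s,
        "passed": not is_running(agent_id),
    }

# In-memory stand-in for a real agent runtime, for illustration only.
running = {"invoice-agent"}
record = run_kill_switch_drill("invoice-agent", running.discard,
                               lambda agent: agent in running)
print(record)
```

The returned record, stored with the drill's scenario and participants, is the kind of test result auditors ask for in place of policy language.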

Interactive Readiness Self-Assessment

Go through the checklist before your audit. Check off an item only if you can verify it with evidence: not a verbal "yes", but a document, log or system record that you can produce on demand. Your score indicates where to concentrate your preparation.

AI Audit Readiness Checklist
Mark only the items you can prove – not the items you are going to have ready
A full list of all AI systems is in a formal record (not a personal spreadsheet).
D01
AI acceptable use policy formally approved, signed and distributed with employee acknowledgement records
D01
Risk register contains AI-related risks (prompt injection, data leakage, model drift) and mitigation status
D01
Shadow AI discovery process is documented and periodic scan results can be shown to an auditor
D01
Quarterly access review for AI API keys and embeddings stores, with reviewer name, date, and per-item decisions
D02
AI agent credentials of terminated employees revoked within 24 hours, with timestamped tickets as proof
D02
AI agents have a unique identity for each agent, no shared service accounts between agents
D02
Credential rotation logs show that AI API keys were rotated on schedule
D02
Input inspection controls are documented and logs show they are in production
D03
PII/PHI handling policy applies to AI prompt inputs, not just data flows
D03
Adversarial test results for AI systems (prompt injection, jailbreak testing) from the last 12 months
D03
The output policy is enforced at the infrastructure level – not only through the system prompt instructions
D03
AI interaction logs contain agent identity, inputs, outputs, tool calls and policy evaluation results
D04
Log retention is in compliance with the minimum requirements (90 days standard, 12 months for HIPAA) and is documented in a policy
D04
Logs are tamper-evident – stored in a system where the AI application cannot overwrite or delete them
D04
The drift monitoring thresholds are defined and checked on a regular basis and the last check is recorded
D04
There is an AI incident response playbook for rogue agent, prompt injection, and data exfiltration.
D05
Kill switch tested in the last 12 months (documented test results showing agent terminated without cascade)
D05
Tabletop exercise on AI, within the last 12 months, with scenario, participants, results and remediation
D05
Breach notification procedure includes AI-related data exposure as a separate trigger with regulatory disclosure time limit
D05
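If you track the checklist outside this page, the score reduces to simple arithmetic per domain. A minimal Python sketch, assuming the four-items-per-domain split used in the checklist above; the input numbers are hypothetical.

```python
# Checklist items per domain, matching the D01-D05 tags above.
ITEMS_PER_DOMAIN = {"D01": 4, "D02": 4, "D03": 4, "D04": 4, "D05": 4}

def readiness(proven):
    """proven: domain -> number of items backed by evidence you can produce.

    Returns (total items proven, weakest domain). Ties go to the earlier domain.
    """
    capped = {d: min(proven.get(d, 0), n) for d, n in ITEMS_PER_DOMAIN.items()}
    ratios = {d: capped[d] / n for d, n in ITEMS_PER_DOMAIN.items()}
    weakest = min(ratios, key=ratios.get)
    return sum(capped.values()), weakest

total, weakest = readiness({"D01": 4, "D02": 3, "D03": 1, "D04": 2, "D05": 0})
print(f"{total}/20 proven, start with {weakest}")
```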
The Polygraf Evidence Package

Polygraf automatically produces three of the most commonly requested audit evidence artifacts: structured AI interaction logs (agent identity, policy evaluation results and full decision-chain context), PII inspection records (each instance of sensitive data detected and blocked at the input/output boundary), and policy enforcement reports (guardrail operations over the audit period). These artifacts align with Domain 3 and Domain 4 evidence requests and are available on demand, ready to hand to an auditor.

Polygraf AI

Audit-Ready AI Security Infrastructure

Polygraf produces the audit evidence your security program requires – interaction logs, PII inspection records, and policy enforcement reports – and enforces the controls that produce those records. Sub-100ms. On-prem. No data leaves your environment.

Air-gap ready · HIPAA · SOC 2
Deploys in under an hour
