In a 2026 enterprise survey, 88% of organizations had an AI security incident in the last year and 82% of executives thought their policies were already covering them. That six-point difference is where the risk is. This is the practical, auditable list of 25 controls that closes the gap – mapped to OWASP, NIST and MITRE ATLAS.
There are no lack of AI security frameworks in 2026. OWASP has the LLM Top 10 and the Agentic Top 10. NIST has the AI RMF, the Generative AI Profile, and an adversarial-ML taxonomy. There is MITRE ATLAS, Google's SAIF, ISO/IEC 42001, and CISA's joint guidance with the Five Eyes. For a security team that actually wants to do something on a Monday morning, this plethora is its own problem – what CISOs are increasingly calling "compliance chaos." The frameworks tell you what good looks like, but don't hand you a checklist.
So we made one. Here are 25 concrete AI security controls, grouped in 5 categories that follow the lifecycle of AI risk – from knowing what you have to protecting the data to securing the model and agents to monitoring to governance. Each is mapped to the framework it covers, so the checklist is also your compliance evidence. Tick them off as you go – the tracker at the top follows you. Start with the ones marked DO NOW — they close the largest gaps fastest.
You don't need to have all 25 live before you deploy AI — but you should know where you are on each.80/20 principle : NIST control catalogs are over a thousand items, but a focused subset addresses most of the real risk. The nine controls we've flagged DO NOW are that subset — the highest-leverage moves. Everything is mapped to a real framework (OWASP LLM/Agentic Top 10, NIST AI RMF, MITRE ATLAS), so finishing the list also gives you audit evidence.
List every AI model, agent, copilot and tool connected to an MCP and what data it can access, which tools it can call and what permissions it has. Manage it as a software asset. This is the first control, everything else depends on it.
Sanctioned tools are only part of the picture. Run continuous discovery for unapproved AI use across endpoints, browsers, and SaaS – the embedded copilots and consumer tools employees use without supervision – a one-time audit is stale in a week
Know what data is sensitive (PII, PHI, financial, source code, secrets) before it can flow into an AI system. Data classification is the prerequisite for any policy that treats different data differently.
Document where data enters, what the model does with it, where the outputs go and what downstream systems the AI can reach. Data-flow mapping is the process of turning an inventory into a risk picture.
Keep track of the third-party models, datasets, libraries and plugins your AI relies on. A poisoned model from a public repository or a vulnerable dependency is a supply chain attack – you can't defend what you haven't cataloged.
Detect sensitive data (PII, PHI, secrets, source code) in prompts before they are fed to a model and redact/block it. 10% of employee prompts contain sensitive corporate data and input-side inspection is the highest-value data control.
Sensitive-data leakage from screen output, treat model output as untrusted before rendering or downstream, improper output is a known exfiltration path. Never render raw model output into a privileged context unescaped.
Train and fine-tune with tokenization, anonymization or synthetic data. Models can memorize and later output training data – raw PII or secrets in a training set is a leak in the making in an output.
Restrict access to the retrieval corpora and vector stores of a model so that the model can only retrieve what the requesting user is allowed to see. An over-permissioned RAG index leaks data via a seemingly innocent query.
Encrypt prompts, outputs, embeddings and stored model artifacts. A prompt in transit to an AI service is a data transmission like any other and must be strongly encrypted end to end.
The most abused AI attack is not some weird thing. Prompt injection is the #1 OWASP LLM risk and it works because a large language model processes instructions and data on the same channel with no enforced separation – unlike a database where you have parameterized queries that cleanly separate the two. A sentence hidden in a retrieved document, a webpage, or a code comment can change an agent's behavior without malware and without stolen credentials. The Drift AI supply-chain incident spread across more than 700 organizations from one injection. Controls 11–15 exist because you can't fully "validate away" this class of attack – you have to constrain what a compromised model can do.
Layer input filtering, instruction/data separation and an AI gateway to filter injection patterns before they are injected into the model context. No single defense is perfect, so combine them and pair with least-privilege so that a successful injection has a limited blast radius.
Examine every agent's tool access and data reach to its function only – nothing more. With agents 10x over-permissioned, this is the control that turns a successful injection from a breach into a contained event. A support agent should not have access to HR or finance data.
Consider every model and agent as an authenticated identity with its own credentials and access reviews – not as an anonymous process. New guidance is emerging that points to OAuth 2.0 and SPIFFE/SPIRE for agent identity and authorization.
Separate the model and the tool-execution environment from internal systems and networks, so that a compromised model cannot pivot. Limit outbound connectivity and the APIs the model can call.
Add human approval steps before AI system makes a critical irreversible action (transferring money, changing access, external communication). No human oversight means that model errors and injections become real-world harm.
Log everything: prompts, outputs, tool calls, in a tamper-proof audit trail owned by the security team, not the application owner. More than 50% of AI deployments today do not log anything – and that is the forensic evidence you will need in case of an incident.
Look out for unusual patterns: jailbreak spikes, weird retrieval, output weirdness and model drift. AI systems are living systems – what was safe at deployment can drift into risk in production.
Run adversarial testing (prompt injection, jailbreaks, data-extraction, agent tool-misuse) on a monthly or quarterly basis using scenarios from MITRE ATLAS. Static checks are not enough to protect systems that evolve.
Integrate automated adversarial testing and model scanning into your DevSecOps pipeline so that prompts, models and agents are re-tested every time the configuration changes – just like unit tests or dependency scans.
Expand your IR playbook to AI incident types (prompt injection, data leakage, model poisoning, agent hijack), pre-authorize containment (gateway isolation) and know your regulatory reporting clocks. Most IR runbooks were not written for no-signature breach.
Establish clear ownership and decision rights for AI risk (a committee of security, legal and risk) and set the organization's risk tolerance. NIST is clear: govern first and infuse it. Without it, technical controls are unaccountable.
What tools are allowed, what data is forbidden and what use cases are allowed – and then justify the policy with a technical control. A policy that nobody can enforce is documentation, not security; a QSA or regulator can tell the difference.
Crosswalk your controls to the regimes you are subject to (EU AI Act, sector rules, state laws) so that one set of controls provides evidence for many obligations. The EU AI Act high risk obligations come into effect August 2026.
Vet AI vendors' data handling, contracts not to train on your data, and confirmation that you are still able to meet your obligations even if a third party processes the data. You own the risk no matter who runs the model.
Teach people what is safe to feed into AI, what tools are approved and how to identify AI-specific risks. The human is still a critical weakness: trust in AI outputs and careless data input lead to incidents that no technical control alone can catch.
"Frameworks don't tell you what good looks like, they don't tell you what to do on Monday morning. The teams that get AI security right don't implement a thousand controls, they implement the right twenty-five, and they enforce them where the AI runs, not in a policy document."
— Polygraf AI, on operationalizing AI securityUnsure which of the 25 controls to focus on? Polygraf's AI Risk Calculator models your organization's exposure to breach, regulatory and litigation risk – and shows you which obligations apply – so you can sequence your checklist by what actually reduces your risk the most.
Polygraf AI's Behavioral Control Plane directly implements or enhances a large portion of these 25 controls – which are all in the highest-leverage "do now" set. It offers shadow-AI discovery and inventory (controls 1–2), real-time prompt inspection and redaction plus output filtering (6–7), data classification at the AI boundary (3), enforcement that enables least-privilege and injection containment (11–12), immutable interaction logging (16) and the enforcement layer that turns an acceptable-use policy into a real control (22). It is on-premise, zero data egress, sub-100ms latency, so the controls are on live AI traffic without adding a new exposure point. It does not replace your governance program or your red-team – it is the technical layer that makes the data-facing controls real where the AI runs.
Polygraf AI protects the most important controls on this list – shadow-AI discovery, prompt inspection and redaction, output filtering, least-privilege enforcement, and immutable logging – in-line, where your AI runs. On-premise, sub-100ms, zero data egress.
At Polygraf, we envision a future where AI augments human capabilities without compromising safety, privacy, or ethical standards. Trust in our commitment to building this future with you.
© 2026 Polygraf AI. All rights reserved.
Your download will start now.
Please provide information below and we will send you a link to download the white paper.