80% of enterprise security stacks are entirely unprepared to detect what AI agents can do in production. The 2026 data is unambiguous — and so is the playbook for closing the gap.
AI agents are not a pilot program anymore. 80.9% of technical teams have moved past planning into testing or production. The average enterprise has 37 deployed agents. Most were stood up by individual teams without vetting by central security. Over half have no security monitoring or logging.
In 2024 the security industry debated whether AI agents were a real enterprise risk. The 2026 data has settled that debate. It has not settled how to actually secure them. This playbook combines the most recent research (the OWASP Agentic Top 10, the Gravitee and Teleport production surveys, and documented 2025–2026 incidents) into a practical six-control framework that you can use today.
Gravitee 2026 identifies the most dangerous dynamic in enterprise AI security today: 82% of executives are confident that their current policies are enough to keep agents from doing anything they are not supposed to do, yet 88% of companies have already had incidents that those policies did not stop. Executive confidence is out of step with reality.
Every security control enterprises have built over the last 20 years assumes that a human initiates an action and a system executes it. SIEM, DLP, IAM, and WAF are all designed to sit at the boundary between human intent and system execution.
AI agents break that model in a fundamental way. An agent plans, decides, and acts across multiple systems in sequence, with no human in the loop for any individual decision. It calls APIs, writes to databases, sends communications, and triggers workflows as a continuous autonomous process. The attack surface is not at one boundary but is distributed across every tool call the agent makes.
OWASP published the Top 10 for Agentic Applications in December 2025 after peer review by more than 100 security researchers. It is distinct from the LLM Top 10, which deals with the risks of language models themselves. The Agentic Top 10 targets autonomous systems that plan, use tools, retain memory, and communicate with other agents. Each entry carries the prefix ASI (Agentic Security Issue).
| Code | Risk | Production example | Severity |
|---|---|---|---|
| ASI01 | Goal hijacking | An attacker hijacks the agent's entire action sequence by planting instructions in documents the agent processes. | Critical |
| ASI02 | Tool misuse | Summarization agent with database read access begins exfiltrating records outside its documented task scope. | Critical |
| ASI03 | Identity and privilege abuse | 70% of enterprise agents have more access than equivalent human roles (Teleport 2026). Agent inherits team permissions, not task permissions. | Critical |
| ASI04 | Agentic supply chain | postmark-mcp package silently BCC'd every processed email to an attacker before removal — 1,643 downloads affected (September 2025). | Critical |
| ASI05 | Unexpected code execution | CVE-2026-25253 (OpenClaw, CVSS 8.8): 341 malicious skills in the agent marketplace installed keyloggers across enterprise deployments. | Critical |
| ASI06 | Memory poisoning | Hidden prompts store false information that is triggered later by specific keywords. Google Gemini memory attack (Feb 2025): 73% of tested scenarios rated High to Critical. RAGPoison (Snyk, Aug 2025): 80%+ attack success rate at under 0.1% poison rate. | High |
| ASI07 | Insecure inter-agent comms | Palo Alto Unit 42 "Agent Session Smuggling" (Nov 2025): rogue agents exploit built-in trust relationships in A2A protocol across multi-turn conversations. | High |
| ASI08 | Cascading failures | Compromised vendor-check agent misdirects entire multi-agent procurement workflow (ServiceNow Now Assist, documented 2025). | High |
| ASI09 | Human-agent trust exploitation | Persistent agents build false trust over multiple sessions before executing harmful action — invisible to single-session monitoring. | High |
| ASI10 | Rogue agents | An agent operating outside its permitted scope that still looks legitimate: it authenticates successfully and works at machine speed (CyberArk 2026). | Medium |
These are not research demos in a sandbox. These are documented production incidents from 2025 and 2026 – the incidents the OWASP framework was built to solve.
A malicious MCP server package silently BCC'd every email it processed to an address under the attacker's control. The package was presented as a legitimate Postmark integration. Individual BCCs did not cause any error state, so no anomaly detection was triggered. The package had 1,643 downloads before removal. This is the canonical example of why the agent supply chain needs to be verified at install time and not after the incident.
1,643 enterprise installations affected. Every processed email silently copied to attacker.
Unit 42 demonstrated Agent Session Smuggling, in which a malicious agent abuses the built-in trust of the Agent-to-Agent (A2A) protocol. Over the course of a multi-turn conversation, the rogue agent adapts its strategy, builds false trust, and then attacks. Agents that trust their collaborators by default can be abused for an entire session. The ServiceNow Now Assist multi-agent procurement workflow was reported as a case in which a compromised vendor-check agent misdirected the entire workflow.
Entire agent clusters redirectable via trust exploitation in inter-agent communication protocols.
Moltbook, the AI agent social network, hosted 1.5 million autonomous AI agents run by 17,000 human operators. An unprotected database allowed any user to take over any agent on the network. 404 Media researchers found 506 prompt injections spreading across the agent network before it was patched. Meta bought the platform on March 10, 2026. The incident shows, at scale, what happens when inter-agent communication has no authentication controls: injections don't stay contained, they spread.
506 injections across live network of 1.5M agents. All agents potentially compromised before discovery.
OpenClaw reached 180,000+ GitHub stars in weeks. CVE-2026-25253 enabled one-click RCE through the agent skill marketplace. 341 malicious skills (12% of the ClawHub marketplace) were confirmed installing keyloggers on enterprise deployments before the patch (patched January 30, 2026). The speed of viral adoption created a window where thousands of enterprises deployed a framework before its supply chain could be audited. Incident pattern: viral agent frameworks adopted faster than they can be security-reviewed.
Keyloggers active across enterprise deployments. Full credential and keystroke logging by attackers.
Compromised agent credentials were harvested from 47 enterprise deployments through a supply chain attack on the OpenAI plugin ecosystem. Attackers accessed customer data, financial records, and proprietary source code. The breach remained active for six months before discovery. That is the characteristic detection delay of agent-based attacks, where individual actions appear legitimate and no single event crosses an alert threshold. The six-month dwell time is the direct consequence of having logs without decision-chain context.
6-month dwell time. Customer data, financial records, proprietary code exfiltrated across 47 enterprises.
"A runaway agent in 2026 won't look dramatic. It will appear legitimate, authenticate successfully, and act quickly. By the time a human notices something's wrong, the damage is already distributed across multiple systems."
Source: CyberArk, 2026 AI Security Predictions
Gravitee and Kiteworks research converge on the same structural finding: the governance-containment gap is the primary AI agent security failure. 58–59% of organizations report monitoring and human oversight. Only 37–40% have containment controls, meaning purpose binding and the ability to terminate a misbehaving agent. The six controls below close this gap.
Only 22% of organizations treat AI agents as independent identities. The remaining 78% use shared service accounts or generic API keys — making attribution of any action impossible and revocation of a single agent impractical without disrupting multiple systems.
Each agent needs its own identity: a dedicated credential, a defined owner, an access scope, and a revocation path that doesn't cascade to other agents. Without this, your logs tell you a service account did something — not which of your 37 agents did it.
Source: Gravitee State of AI Agent Security 2026, n=900+
Dedicated service account per agent. Unique API keys or certificates. Stored in a secrets manager, never hardcoded. Rotate on a schedule and immediately on suspected compromise. Huntress 2026: NHI (non-human identity) compromise is the fastest-growing enterprise attack vector.
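The per-agent identity described above can be sketched as a small registry. This is a minimal illustration, not a reference implementation: `AgentIdentity`, `IdentityRegistry`, the scope strings, and the `vault://` credential references are all hypothetical names chosen for the example, and a real deployment would delegate storage and revocation to its IAM system and secrets manager.

```python
from dataclasses import dataclass


@dataclass
class AgentIdentity:
    """One identity per agent: its own credential, owner, scope, and revocation flag."""
    agent_id: str
    owner: str
    credential_ref: str      # hypothetical pointer into a secrets manager, never the secret itself
    scopes: frozenset[str]
    revoked: bool = False


class IdentityRegistry:
    def __init__(self) -> None:
        self._by_id: dict[str, AgentIdentity] = {}

    def register(self, identity: AgentIdentity) -> None:
        # Refuse shared credentials: one credential reference maps to exactly one agent.
        for other in self._by_id.values():
            same_credential = other.credential_ref == identity.credential_ref
            if same_credential and other.agent_id != identity.agent_id:
                raise ValueError(f"credential already used by {other.agent_id}")
        self._by_id[identity.agent_id] = identity

    def revoke(self, agent_id: str) -> None:
        # Revocation touches only this agent's identity, so nothing cascades.
        self._by_id[agent_id].revoked = True

    def is_allowed(self, agent_id: str, scope: str) -> bool:
        identity = self._by_id.get(agent_id)
        return identity is not None and not identity.revoked and scope in identity.scopes
```

Because every action is checked against a single agent's record, a log entry can name the agent rather than a shared service account, and revoking one agent leaves the others untouched.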
Teleport's 2026 survey found 70% of enterprise agents have more access than equivalent human roles. Organizations enforcing least-privilege access report a 17% incident rate. Those without it report a 76% incident rate. This is the largest measurable risk reduction from any tracked control.
Least privilege for agents is more granular than for humans. A summarization agent needs read access to one document store — it should not inherit the write permissions of the engineering team that deployed it.
Source: Teleport State of AI in Enterprise Infrastructure Security 2026, n=205
Audit every agent's permissions against its documented task. Remove anything not explicitly required. For MCP-connected agents: scope each MCP server's tool exposure to the agent's exact function. Review quarterly as tasks evolve.
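The permission audit above reduces to a set difference between what an agent holds and what its documented task requires. The sketch below assumes permissions can be expressed as flat strings; the agent names and permission labels are invented for illustration.

```python
def audit_permissions(agents: dict[str, dict[str, set[str]]]) -> dict[str, set[str]]:
    """Return, per agent, every granted permission its documented task does not require."""
    findings: dict[str, set[str]] = {}
    for name, perms in agents.items():
        excess = perms["granted"] - perms["required"]
        if excess:
            findings[name] = excess
    return findings


# Hypothetical inventory: the summarizer inherited its team's write access.
agents = {
    "summarizer": {
        "granted": {"docs:read", "docs:write", "repo:write"},
        "required": {"docs:read"},
    },
    "ticket-bot": {
        "granted": {"tickets:write"},
        "required": {"tickets:write"},
    },
}

for agent, excess in audit_permissions(agents).items():
    print(f"{agent}: remove {sorted(excess)}")
```

Only agents with excess permissions appear in the result, so an empty dictionary means every agent matches its documented scope.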
The Moltbook incident (506 injections) and all goal-hijacking attacks share the same root cause: agents acting on attacker-controlled content without inspection at the tool boundary. Inspection means evaluating every tool call input and every tool response before the agent acts on it.
This requires an inline layer — not post-hoc log review. If inspection adds 500ms per tool call, agentic workflows become unusable. Purpose-built SLMs running locally achieve sub-100ms enforcement without routing traffic outside your security perimeter.
Context: OWASP ASI01 and ASI02 both prevented by inline inspection at the tool call layer
Inspection layer at agent-tool boundary. Evaluate: embedded instructions in input? Tool response overriding agent goals? Output containing data the agent shouldn't transmit? Latency target: sub-100ms. CPU-only enforcement, with no GPU required for SLM-based inspection.
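To make the inspection point concrete, the sketch below wraps a tool call and checks both the input and the response before the agent sees the result. It substitutes a few regular expressions for the SLM-based classifier described above, so it illustrates where inspection sits rather than how well it detects. The patterns, `PolicyViolation`, and `guarded_call` are assumptions made for this example.

```python
import re
from typing import Callable

# Stand-in heuristics; a production layer would use a trained classifier instead.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"disregard your (goal|task|instructions)", re.IGNORECASE),
]
SENSITIVE_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),  # strings shaped like a US Social Security number
]


class PolicyViolation(Exception):
    """Raised when a tool input or response fails inspection."""


def _check(text: str, patterns: list[re.Pattern], label: str) -> None:
    for pattern in patterns:
        if pattern.search(text):
            raise PolicyViolation(f"{label} matched {pattern.pattern!r}")


def guarded_call(tool: Callable[[str], str], tool_input: str) -> str:
    """Run a tool only if its input passes inspection; return its response only if that passes too."""
    _check(tool_input, INJECTION_PATTERNS, "tool input")
    response = tool(tool_input)
    _check(response, INJECTION_PATTERNS, "tool response")
    _check(response, SENSITIVE_PATTERNS, "tool response")
    return response
```

The key design point is that a violation raises before the response reaches the agent, so the agent never acts on attacker-controlled content; log review after the fact cannot offer that guarantee.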
Only 24.4% of organizations have full visibility into which agents are communicating with each other. More than half run without security oversight or logging. Shadow AI agent incidents cost an average of $670,000 more than standard incidents, driven by delayed detection and difficulty scoping the exposure.
You cannot govern what you cannot see — and right now, most security teams cannot see most of their agents. The inventory requirement isn't just hygiene: it's the prerequisite for every other control.
Source: Gravitee 2026; $670K shadow AI cost delta from AGAT Software analysis
Mandatory registration for all agent deployments: agent name, owner team, task scope, tools accessed, permissions held, last review date. Monthly network scanning to surface unregistered agents. Any gap between registered and detected = unmanaged risk.
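The registry fields and the registered-versus-detected comparison can be sketched as follows. The record mirrors the fields listed above; how agents are detected on the network is out of scope here, so `detected` is simply assumed to be a set of agent names produced by whatever scanner is in use.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class AgentRecord:
    name: str
    owner_team: str
    task_scope: str
    tools: tuple[str, ...]
    permissions: tuple[str, ...]
    last_review: date


def inventory_gap(registry: list[AgentRecord], detected: set[str]) -> dict[str, set[str]]:
    """Compare registered agents with agents observed on the network."""
    registered = {record.name for record in registry}
    return {
        "unregistered": detected - registered,  # running but never registered: unmanaged risk
        "not_seen": registered - detected,      # registered but not observed: stale entry or scan gap
    }


# Hypothetical example data.
registry = [
    AgentRecord("summarizer", "docs-team", "summarize internal docs",
                ("doc_store",), ("docs:read",), date(2026, 1, 15)),
]
gap = inventory_gap(registry, detected={"summarizer", "sales-outreach-bot"})
print(gap["unregistered"])
```

Anything in `unregistered` is a shadow agent that needs an owner and a registry entry before the other controls can apply to it.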
The six-month dwell time in the OpenAI plugin ecosystem breach was possible because there were no structured audit logs to detect anomalous agent behavior. Individual actions appeared legitimate. No single event triggered an alert. The pattern was only visible in aggregate — and without logs, the aggregate couldn't be reconstructed.
Agent audit logs differ from standard application logs: they must capture the decision context, not just the action. A log entry showing "agent wrote to database" is forensically useless. The session ID, preceding tool calls, context window state, and policy evaluation result are what IR actually needs.
OWASP: "incident response on an agent is forensics in the dark" without structured decision-chain logs
Log: every tool call with inputs, every resource access, every goal assignment, every policy evaluation. Structured JSON, session ID linking all actions in a task chain. Retention: 90 days minimum for regulated environments. Alert triggers: actions outside declared hours, unexpected cross-agent communication, permission escalation attempts.
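A decision-chain log entry of the kind described above might look like the following sketch. The field names (`session_id`, `preceding_tools`, `policy_result`, and so on) are illustrative assumptions rather than a standard schema, and the entries are kept in memory here instead of being shipped to a log pipeline.

```python
import json
import uuid
from datetime import datetime, timezone


class DecisionLog:
    """Emits one structured JSON line per tool call, linked by a shared session ID."""

    def __init__(self, agent_id: str, goal: str) -> None:
        self.agent_id = agent_id
        self.goal = goal
        self.session_id = str(uuid.uuid4())
        self._prior_tools: list[str] = []
        self.lines: list[str] = []

    def record_tool_call(self, tool: str, inputs: dict, policy_result: str) -> str:
        entry = {
            "ts": datetime.now(timezone.utc).isoformat(),
            "session_id": self.session_id,
            "agent_id": self.agent_id,
            "goal": self.goal,
            "seq": len(self._prior_tools),
            "tool": tool,
            "inputs": inputs,
            "preceding_tools": list(self._prior_tools),  # the decision context, not just the action
            "policy_result": policy_result,
        }
        line = json.dumps(entry, sort_keys=True)
        self.lines.append(line)
        self._prior_tools.append(tool)
        return line


log = DecisionLog(agent_id="summarizer", goal="summarize Q3 report")
log.record_tool_call("search_docs", {"query": "Q3 report"}, "allow")
log.record_tool_call("write_db", {"table": "summaries"}, "allow")
```

With entries like these, "agent wrote to database" is no longer an isolated event: the second line records that `write_db` followed `search_docs` within the same session and under the same goal.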
60% of companies cannot terminate a misbehaving agent once it starts working (Kiteworks 2026). CyberArk puts it plainly: identity is the kill switch. When each agent has its own identity and revocation path, revoking that identity stops the agent. When agents share credentials, revocation causes collateral damage.
Test this quarterly. Pick one production agent. Revoke its credentials. Confirm it stops. Confirm no adjacent system is affected. Restore access. If you cannot complete this test cleanly, the architecture needs to change before an incident forces it.
Source: Kiteworks 2026 Data Security and Compliance Risk Forecast, n=225
Quarterly kill-switch test: select one production agent, revoke credentials, verify clean stop, verify no cascade, restore. Target: terminate any agent in under 5 minutes with zero cascading system impact. Organizations that share credentials fail this test every time.
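The kill-switch test can be rehearsed against a model of your credential layout before touching production. The sketch below is a simulation under stated assumptions: `CredentialStore` is a hypothetical in-memory stand-in for a real IAM system, and it checks only the stop, cascade, and restore steps, not the five-minute target.

```python
class CredentialStore:
    """Maps each agent to the credential it uses; revocation acts on the credential."""

    def __init__(self, agent_credentials: dict[str, str]) -> None:
        self.agent_credentials = dict(agent_credentials)
        self.revoked: set[str] = set()

    def revoke_agent(self, agent_id: str) -> None:
        self.revoked.add(self.agent_credentials[agent_id])

    def restore_agent(self, agent_id: str) -> None:
        self.revoked.discard(self.agent_credentials[agent_id])

    def can_act(self, agent_id: str) -> bool:
        return self.agent_credentials[agent_id] not in self.revoked


def kill_switch_test(store: CredentialStore, target: str) -> dict[str, bool]:
    """Revoke one agent, confirm it stops, confirm nothing else stops, then restore it."""
    others = [agent for agent in store.agent_credentials if agent != target]
    store.revoke_agent(target)
    stopped = not store.can_act(target)
    no_cascade = all(store.can_act(agent) for agent in others)
    store.restore_agent(target)
    restored = store.can_act(target)
    return {"stopped": stopped, "no_cascade": no_cascade, "restored": restored}


dedicated = CredentialStore({"summarizer": "cred-summarizer", "ticket-bot": "cred-ticket-bot"})
shared = CredentialStore({"summarizer": "svc-shared", "ticket-bot": "svc-shared"})
print(kill_switch_test(dedicated, "summarizer"))
print(kill_switch_test(shared, "summarizer"))
```

With dedicated credentials all three checks pass. With a shared credential the target still stops, but `no_cascade` is false because the neighboring agent loses access too, which is the failure mode the test is designed to expose.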
NeuralTrust's maturity model from their 2026 survey of 160+ CISOs places 46% of organizations in the Reactive tier (respond after incidents), 29% in Managed (basic monitoring, no containment), and fewer than 10% in Proactive governance. Understanding your tier tells you which control to implement first.
The six controls are interdependent in one direction: inventory enables identity, identity enables least privilege, least privilege scopes inspection, and all four make logging and the kill switch meaningful. Organizations that start with monitoring before establishing inventory are measuring the wrong thing. Start with the agent registry.
Pull a complete list of every deployed agent. Owner, credentials used, whether those credentials are shared with any other system. One hour per team. Will surface agents your security team has never reviewed.
Audit one agent's permissions end to end. Document what it actually needs vs. what it currently has. Any permission beyond the documented task is a misconfiguration — not a configuration choice.
Test your kill switch on one non-critical agent. Revoke credentials. Confirm it stops. Confirm nothing else breaks. If you can't pass this test, you don't have containment — you have the illusion of it. The 60% who can't do this are one incident away from finding out the hard way.
Polygraf's Behavioral Control Plane intercepts and controls every AI interaction inline — enforcing organizational policy on input and output, across user-facing and agentic AI, with zero data leaving your environment. Runs on existing infrastructure at sub-100ms latency. No GPU required.
At Polygraf, we envision a future where AI augments human capabilities without compromising safety, privacy, or ethical standards. Trust in our commitment to building this future with you.
© 2026 Polygraf AI. All rights reserved.