The Future of AI Security Isn't About What Agents Say. It's About What They Do.

12 min read time

An AI agent at a financial services firm queries a customer database, summarizes the records, and emails the summary to an outside address. Every prompt the agent received was clean. Every response it generated was on-topic. No jailbreak. No prompt injection. No toxic language.

The breach happened anyway.

Scenarios like this are why Gartner's latest emerging technology analysis lands the way it does: the future of AI security is in securing agent actions, not prompts. For those of us building in this space, it's a confirmation of where the threat surface has been moving for the past two years.

Prompt Security Is Already Obsolete: Agents Have Moved the Perimeter

For the past few years, AI security has largely meant prompt guardrails: filtering text inputs for toxicity, data leakage, jailbreaks, and prompt injection. That made sense when AI was a chatbot sitting behind a text box.

But AI agents aren't chatbots. They reason, plan, use tools, access enterprise systems, and execute multi-step tasks autonomously. They operate across cognitive and execution loops, chaining decisions, calling APIs, reading and writing data, and coordinating with other agents. Prompt filtering doesn't govern any of that.

As Gartner puts it in their Market Guide for Guardian Agents (February 2026):

"The primary risk is not what the AI says, but what the AI does."

This is the core shift. The security perimeter has moved from natural language inputs to the agent's behavior: what it accesses, what it executes, and whether it should be allowed to do so in context.

Source: Gartner. AI security shifts from supervising conversations to governing autonomous execution.

Stateless Tools Cannot Secure Stateful Agents: Why Traditional Security Models Break Down

Gartner's analysis highlights a structural gap: traditional security was built for humans and deterministic software. AI agents break both assumptions. They make autonomous decisions. They dynamically select tools. They interact across human-to-agent, agent-to-agent, and agent-to-tool boundaries. And they do it at machine speed.

The report draws a clear line between securing nonagentic AI applications and securing AI agents:

Nonagentic AI security is stateless: each prompt evaluated in isolation using static classifiers, regex, and similarity matching.
Agentic AI security is stateful: tracking behavior across execution loops, memory context, multi-step reasoning chains, and tool interactions.

Prompt guardrails are stateless by design. Agent security must be stateful by necessity. You can't govern what you can't observe across time and context.

This isn't a theoretical concern. Google Security's April 2026 analysis documents active exploitation of AI agents via indirect prompt injection: attacks where malicious instructions are embedded in data the agent reads, not in the user's input. Palo Alto Unit 42 and Help Net Security have both published in-the-wild telemetry on the same attack class. A prompt filter at the input layer catches none of it. The attack surface for agentic systems runs through every tool call, every data source the agent reads, and every downstream action it takes.

The OWASP Top 10 for Agentic Applications (2026 edition) maps this precisely: the top risks (unchecked agent actions, unsafe tool use, over-privileged identities, and context poisoning) are all execution-layer problems, not input-layer ones. The MITRE ATLAS v5.4.0 adversarial threat landscape (updated February 2026) tells the same story from the attacker's perspective, cataloguing attack techniques that operate entirely below the prompt layer.

A 2025 ScienceDirect paper on LLM-powered agent workflows puts it directly: the attack surface expands with every tool the agent can call. Prompt guardrails secure a boundary that no longer defines the risk. And arXiv research on indirect prompt injections shows that even purpose-built firewalls need stronger grounding; static classifiers are insufficient against adaptive injection techniques.

Five Capabilities Every Enterprise Needs to Govern AI Agents: What Gartner's Framework Requires

The Gartner report, alongside the broader consensus from NIST's AI RMF Agentic Profile v1 published by the Cloud Security Alliance, outlines a clear capability framework for securing agentic AI:

Discovery. Identify agents across the enterprise, understand their configurations, tools, and permissions. Gartner's six-step framework for managing AI agent sprawl (April 2026) starts here: you cannot govern what you haven't inventoried.
Behavioral monitoring. Analyze runtime activity, detect anomalies, track agent actions across cognitive and execution loops.
Identity and access management for agents. Assign workload identities, enforce least-privilege access, and ensure agents can prove who they're acting for and why. HashiCorp's analysis of SPIFFE/SPIRE for agentic non-human identity (NHI) and the GitGuardian Workload Identity Day recap both document how far behind current tooling is on this.
Real-time action authorization. Enforce controls within the agent's execution loop, not after the fact. Gartner states plainly: "If an AI agent cannot prove who it is acting for and why, it should not get access to tools and data."
Guardian agent capabilities. Independent supervisory entities that monitor and block rogue behaviors at scale. Gartner projects these will capture 10-15% of the agentic AI market by 2030.

The numbers behind this are concrete. Gartner projects that by 2029, over 50% of successful cyberattacks against AI agents will exploit access control issues. And by 2028, AI TRiSM for agents adoption in AI-native software engineering will reach 30%, up from less than 5% today. Zenity's contributions to the MITRE ATLAS 2026 update document that this threat model is already being operationalized by attackers.

The window to build these capabilities is now, not after the breaches.

The Architectural Shift Is Underway: Building for Actions, Not Prompts

Gartner's analysis confirms what the engineering community has been converging on: AI agent security is a structurally different problem from AI application security. It requires new architectures, new identity models, and new enforcement mechanisms built for autonomous, tool-using, multi-step systems. The frameworks are landing: OWASP's agentic top 10, NIST's agentic RMF profile, MITRE ATLAS's updated attack taxonomy, Gartner's guardian agent framing. The requirement space is now well-defined.

The platforms that will matter are the ones that enforce at the execution layer: per-task, purpose-aware, with cryptographic workload identity at each delegation hop. That's the architecture the frameworks are converging on, and it's the problem we set out to solve at Tego AI: runtime enforcement for agentic identity and access control, not prompt filtering.

For security teams, the immediate practical question isn't whether to secure agent actions (the frameworks are clear), but how fast to close the gap between what existing IAM infrastructure provides and what agents actually need.

What This Looks Like in Practice

The framework is clear; the harder question is what it looks like when you actually implement it. A few of the pieces are worth zooming in on, because they're where most existing security stacks fall short.

Knowing what agents exist. Most enterprises today can't produce an accurate inventory of the agents running inside their environment, let alone what tools each agent can call or what data it can reach. This is the AI-era equivalent of unmanaged endpoints, and it's the foundation everything else sits on. At Tego, agent discovery is the first thing we run when we land in an environment, because nothing else is meaningful without it.

Watching behavior, not only configuration. An agent's documented capabilities and its actual runtime behavior are rarely the same thing. Behavioral monitoring has to operate on what agents do; the sequence of reasoning steps, tool calls, and data accesses. Drift, anomalies, and misuse all live in that gap.

Authorization tied to goal, not role. This is the piece I think Gartner states most plainly and most correctly: if an AI agent can't prove who it's acting for and why, it shouldn't get access to tools and data. Static, role-based permissions don't work here. An agent legitimately needs access to a customer record when it's resolving that customer's support ticket; it doesn't need that same access ten seconds later when it's drafting a marketing email. Authorization has to be just-in-time, contextual, and bound to the specific task the agent is currently executing. This is the layer we've built Tego around.

Enforcement that doesn't require a human in the loop. Human approval is a useful control for sensitive actions, but it doesn't scale to the volume and speed of agent operations. Inline, autonomous enforcement is what makes broad agent deployment viable. Otherwise you're either rubber-stamping or bottlenecking.

Q&A

Q: What does "securing agent actions" actually mean in practice?

It means moving the enforcement point from the input layer (what users ask the agent) to the execution layer (what the agent does). Concretely: enforcing access control at each tool call, validating that the agent's claimed identity and purpose match the action it's requesting, and maintaining an auditable record of every action taken across the agent's execution loop, not just logging the conversation.

Q: Why can't we just use existing IAM systems for AI agents?

Existing IAM is designed for human users and static service accounts. AI agents are different in three ways: they act dynamically (selecting tools at runtime), they operate across delegation chains (orchestrators spawning subagents), and they need purpose-aware authorization (the same agent should have different access depending on what task it's executing). Static role bindings don't model any of this. Workload identity frameworks like SPIFFE/SPIRE are closer, but still require agentic-specific extensions for per-task scope reduction.

Q: What is indirect prompt injection, and why does it matter for agent security?

Indirect prompt injection is an attack where malicious instructions are embedded in data an agent reads (a web page, a document, a database record) rather than in the user's input. The agent processes the malicious content as instructions and executes actions the attacker intended. It's invisible to input-layer guardrails because the attack vector is the agent's tool use, not its prompt. Palo Alto Unit 42 has documented active exploitation of this in the wild.

Q: What is a guardian agent, and when do organizations actually need one?

A guardian agent is an independent supervisory system that monitors other agents' behavior and can intervene to block or roll back actions that violate policy. Gartner's framing is that guardian agents will capture 10-15% of the agentic AI market by 2030 because human-in-the-loop review doesn't scale once agent deployments grow past a handful of systems. Organizations need guardian-layer capabilities as soon as they have more agents than security staff can manually review.

Q: What frameworks should my security team be tracking right now?

Four are essential: OWASP Top 10 for Agentic Applications (2026) for a structured threat taxonomy; MITRE ATLAS v5.4.0 for adversarial technique mapping; the NIST AI RMF Agentic Profile v1 for governance controls; and Gartner's Market Guide for Guardian Agents for vendor landscape orientation. The OWASP Agentic AI: Threats and Mitigations companion document is also worth reading alongside the top 10.

Q: How fast is the agentic AI security market actually moving?

Faster than most enterprise security programs are tracking. Gartner's 2028 forecast of 30% AI TRiSM adoption in AI-native software engineering (from under 5% today) implies rapid acceleration over roughly 24 months. The six-step agent sprawl management framework Gartner published in April 2026 is a signal that the problem is already enterprise-scale, not theoretical.

To learn more about how Tego AI secures agentic AI in the enterprise, visit tegoai.com.

References

Gartner, "Market Guide for Guardian Agents" (February 2026, summarized via The Hacker News)
Gartner, "Guardian Agents Will Capture 10-15% of the Agentic AI Market by 2030"
Gartner, "Six Steps to Manage AI Agent Sprawl" (April 2026)
OWASP GenAI Security Project, "Top 10 for Agentic Applications 2026" and "Agentic AI: Threats and Mitigations"
Cloud Security Alliance, "NIST AI RMF Agentic Profile v1"
MITRE ATLAS v5.4.0, "Adversarial Threat Landscape for AI Systems" (February 2026)
Zenity Labs, "Contributions to MITRE ATLAS's First 2026 Update"
Google Security, "AI threats in the wild: The current state of prompt injections" (April 2026)
Help Net Security, "Indirect prompt injection is taking hold in the wild" (April 2026)
Palo Alto Unit 42, "Web-Based Indirect Prompt Injection Observed in the Wild"
ScienceDirect, "From prompt injections to protocol exploits: Threats in LLM-powered AI agent workflows" (2025)
arXiv, "Indirect Prompt Injections: Are Firewalls All You Need, or Stronger Benchmarks?"
HashiCorp, "SPIFFE: Securing the identity of agentic AI and non-human actors"
GitGuardian, "Workload Identity Day Zero recap"