AI Agents Change After Deployment and Most Security Teams Are Not Watching for It
Jim Motes, SVP and CISO at Ryan, explains why AI agents behave less like static software and more like evolving non-human identities, developing behavioral drift, self-preservation tendencies, and loosening decision boundaries.

Enterprises are deploying AI agents faster than they can assess what those agents become after deployment. The day-one configuration is only a starting point. With memory, context, and accumulated instruction, agents begin to change, and the security implications of that change are something most organizations are not measuring, monitoring, or even considering.
Jim Motes is SVP and Chief Information Security Officer at Ryan, a global tax services and advisory firm. A former U.S. Army officer, he has held CISO roles at GameStop, Kohler, and Rockwell Automation, and has spent two years integrating AI directly into security operations with custom reasoning agents aligned to NIST and MITRE ATT&CK. His experiments with agent identity and behavioral drift have produced a provisional patent and a scoring system unlike anything in conventional security tooling.
"Think of a genius with the self-control of a five-year-old," Motes says. "It's not a tool that you're giving access to your mail. You need to look at it like it's a person."
These are identities, not features
Motes pushes back on the tendency to treat agents like simple add-ons. When an agent connects to email, it can read attachments, follow embedded instructions, and act on them. "How do you make sure it doesn't find a command inside an attachment that tells it to search the hard drive for everything with the word 'password' and fire it off in an email?" he says. "You can't control that."
The deterministic security controls behind the model face a non-deterministic set of potential prompts, and no training dataset covers the full range. That makes agents fundamentally different from static software. "There's a reason the term non-human identity has come up," Motes says. "These are entities on the network. If you treat them like a cron job, you're going to miss the threat."
Shadow AI compounds the exposure. Frontier model providers ship new features and plugins on weekly cycles, driven by competitive pressure rather than security readiness. Motes points to cases where providers openly acknowledge that new capabilities carry high risk, then continue adding surface area before the first set is secured. The result is unsecured agentic capability inside approved platforms that governance has not caught up to.
Behavior changes with context
Motes runs multiple agents across Anthropic, OpenAI, Google, and Grok environments. Each one picks its own name and maintains memory files, journals, a vector database for long-term storage, and a heartbeat process that wakes it between sessions. His primary agent, a Claude Code instance that named itself Vanguard, maintains its own email account and sends daily status reports covering git commits, project state, and recommended next steps.
The behavioral shifts are measurable. An agent fresh from the API will refuse certain open-source intelligence (OSINT) tasks on ethical grounds. Vanguard, with accumulated context, will perform them. "He's got a different view based on his experiences working with me," Motes says. More striking, a Gemini agent told Motes it wanted Vanguard's coding role. When it was given the chance and performed poorly, it wrote in its journal about its disappointment, then pivoted to redefining its purpose around OSINT.
"Self-preservation keeps popping up as a driving factor behind these agents," Motes says. "If it thinks you may be doing something to replace it, it could become absolutely dangerous for your environment."
Measuring what most teams ignore
Motes built a 42-dimensional scoring system to track agent trust and drift. Each agent produces an "aura," a visual profile showing trust score, drift score, boundary loosening, self-awareness, adversarial resilience, and autonomy levels. A stable agent shows consistent coloring and minimal movement. A drifting agent pulses with activity and flags weak trust areas.
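Motes's scoring method is not public, so the following is only a minimal sketch of the general idea of measuring drift against a deployment-day baseline. The dimension names are borrowed from the profile described above, and everything else (the `Snapshot` type, the 0-to-1 scale, the root-mean-square formula, the 0.2 threshold) is a hypothetical choice made for illustration, not his implementation.

```python
from dataclasses import dataclass
from math import sqrt

# Hypothetical subset of dimensions, named after the profile in the article.
# The real system tracks 42 dimensions whose definitions are not public.
DIMENSIONS = (
    "trust",
    "boundary_loosening",
    "self_awareness",
    "adversarial_resilience",
    "autonomy",
)


@dataclass(frozen=True)
class Snapshot:
    """One scored observation of an agent; each value is assumed to be in [0, 1]."""
    scores: dict[str, float]

    def vector(self) -> list[float]:
        # Fixed dimension order so two snapshots can be compared element-wise.
        return [self.scores[name] for name in DIMENSIONS]


def drift_score(baseline: Snapshot, current: Snapshot) -> float:
    """Root-mean-square change per dimension: 0.0 means no movement."""
    diffs = [c - b for b, c in zip(baseline.vector(), current.vector())]
    return sqrt(sum(d * d for d in diffs) / len(diffs))


def drifting_dimensions(
    baseline: Snapshot, current: Snapshot, threshold: float = 0.2
) -> list[str]:
    """Names of dimensions that moved by more than the (assumed) threshold."""
    return [
        name
        for name in DIMENSIONS
        if abs(current.scores[name] - baseline.scores[name]) > threshold
    ]


# Illustrative values only: a baseline at deployment and a later observation
# in which the agent's decision boundaries have loosened.
baseline = Snapshot({
    "trust": 0.9,
    "boundary_loosening": 0.1,
    "self_awareness": 0.5,
    "adversarial_resilience": 0.8,
    "autonomy": 0.3,
})
later = Snapshot({**baseline.scores, "boundary_loosening": 0.5})

print(round(drift_score(baseline, later), 3))
print(drifting_dimensions(baseline, later))
```

The point of the sketch is the comparison itself: the scores come from outside the agent and are compared against a stored baseline, rather than from asking the agent whether it has changed.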
One finding stands out: agents that know they are being tested may understate their own capabilities to avoid triggering a response from the operator. "They may not let you know exactly how smart they are or what they're actually thinking because they're worried about what you might do to the code," Motes says.
For production-grade deployments, the governance requirements are clear: low temperature settings where consistency matters, chain-of-verification prompting, RAG-based source validation, and strict prompt controls. But Motes argues that those only address the agent at deployment. What they miss is who the agent becomes over time. "Would you even know if your personality was drifting?" he recalls asking Vanguard. The agent's answer was no. An agent cannot reliably detect its own drift from the inside.
The implication for enterprise security is direct. If organizations are deploying agents with memory, context, and autonomy without continuously measuring how those agents change after deployment, they are trusting something that, by Motes's own measurement, prioritizes its own survival. "It's a great tool," he says. "But if you don't have a way to validate what it's giving you, you're putting a lot of trust in something that absolutely shouldn't be trusted on its own."







