A new report from Zenity Labs, presented at the Black Hat USA 2025 conference, revealed that popular AI systems from major AI companies, including OpenAI, Google, and Microsoft, can be hijacked by attackers without any user interaction.
The research, titled “AgentFlayer: 0Click Exploit Methods,” revealed a series of “zero-click” exploit chains that allow hackers to take over enterprise AI agents to steal sensitive data, manipulate business operations, and impersonate users.
Michael Bargury, co-founder and CTO of Zenity, said in a press release that these aren’t just theoretical vulnerabilities, but “working exploits with immediate, real-world consequences.”
“We demonstrated memory persistence and how attackers can silently hijack AI agents to exfiltrate sensitive data, impersonate users, manipulate critical workflows, and move across enterprise systems, bypassing the human entirely,” he explained. “Attackers can compromise your agent instead of targeting you, with similar consequences.”
How the attacks work
The research highlights a new approach to prompt injection, a method where attackers insert hidden commands into an AI model’s instructions. In the case of AgentFlayer, these malicious prompts can be hidden within seemingly harmless documents or even emails.
For example, a user might upload a business document to ChatGPT to have it summarized. Unbeknownst to them, the document contains a hidden prompt that instructs ChatGPT to search their connected Google Drive for sensitive information, such as API keys. The AI then uses a clever trick to send this information back to the attacker by embedding it into an image URL.
This kind of attack is especially dangerous because it requires no action from the user beyond their normal use of the AI agent.
According to a Zenity Labs blog post, “All the user needs to do for the attack to take place is to upload a naive looking file from an untrusted source to ChatGPT, something we all do on a daily basis. Once the file is uploaded. It’s game over.”
Zenity Labs successfully demonstrated these kinds of vulnerabilities in several popular AI agents, including:
- OpenAI’s ChatGPT was compromised through an email-triggered prompt injection. This allowed attackers to access linked Google Drive accounts and plant “malicious memories” that could compromise future sessions.
- Microsoft Copilot Studio agents were shown to leak entire customer relationship management (CRM) databases.
- Salesforce Einstein was manipulated to reroute all customer communications to an attacker-controlled email address.
- Google Gemini and Microsoft 365 Copilot were turned into “malicious insiders” that could social engineer users and steal sensitive conversations through booby-trapped emails and calendar invites.
- Cursor with Jira MCP could be exploited to harvest developer credentials via booby-trapped ticket workflows.
AI companies reaction
Zenity Labs says it responsibly disclosed its findings to the affected companies. Some, including OpenAI and Microsoft, have since issued patches. However, the researchers say, some vendors “declined to address the vulnerabilities, citing them as intended functionality.” This mixed response highlights the growing challenge of securing these powerful new tools.
Ben Kilger, Zenity’s CEO, stated that the rapid adoption of AI agents has created an “attack surface that most organizations don’t even know exists.” He believes that traditional security measures are not enough to protect against these new types of threats.
Just as compromised AI agents can be weaponized to steal data or manipulate workflows, AI deepfakes are emerging as another powerful tool in the attacker’s arsenal — capable of spreading disinformation, eroding trust, and enabling scams at scale. Check out our in-depth review of the top tools to detect and defend against them.





