SHARE

AI Agents Create Critical Supply Chain Risk in GitHub Actions

PromptPwnd shows how simple prompt injections can let attackers compromise GitHub Actions and leak sensitive data.

Written By

Dec 4, 2025

eSecurity Planet content and product recommendations are editorially independent. We may make money when you click on links to our partners. Learn More

A newly identified class of vulnerabilities is putting GitHub Actions and GitLab CI/CD pipelines at risk, and security researchers warn the threat is both practical and already observed in real-world workflows.

The issue, named PromptPwnd, shows how AI agents — meant to streamline developer workloads — can be manipulated through prompt injection to leak secrets, edit repository data, and compromise supply-chain integrity.

This is the “… first confirmed real-world demonstration that AI prompt injection can compromise CI/CD pipelines,” said Aikido Security researchers.

Fortune 500 Firms Already Impacted
How PromptPwnd Turns User Input into an Attack
Why PromptPwnd Works
How to Safeguard AI-Driven Pipelines
AI Risks Are Rising in CI/CD Pipelines

Fortune 500 Firms Already Impacted

At least five Fortune 500 companies have been affected so far, and indicators suggest many more may unknowingly be exposed.

The vulnerability surfaces when AI-powered GitHub Actions — such as Gemini CLI, Claude Code Actions, OpenAI Codex, or GitHub AI Inference — are configured to process untrusted user input and possess high-privilege repository tokens.

As organizations accelerate adoption of AI-driven automation for issue triage, PR labeling, and code summarization, these agentic tools are increasingly embedded deep inside CI/CD pipelines.

That creates a new attack surface where user-controlled text becomes an entry point for execution of privileged commands.

How PromptPwnd Turns User Input into an Attack

At its core, PromptPwnd abuses a predictable workflow: untrusted issue titles, pull request descriptions, or commit messages are inserted directly into the LLM prompt.

The model may misinterpret malicious text as instructions —and critically, these AI agents often have access to tools like gh issue edit, shell command execution, or repository modification features.

Aikido’s proof-of-concept (PoC) against Google’s Gemini CLI demonstrates the mechanics.

A malicious issue included hidden instructions telling the AI to change the issue body and embed sensitive values.

Because the workflow passed the issue text directly into the prompt and exposed tokens such as GEMINI_API_KEY, GOOGLE_CLOUD_ACCESS_TOKEN, and GITHUB_TOKEN, the AI invoked its available tools and leaked the secrets into the public issue thread.

Google patched the flaw within four days of responsible disclosure by the researchers.

The vulnerability is not tied to a specific CVE but maps closely to CWE categories around improper neutralization of input and execution with unnecessary privileges.

It requires no sophisticated exploit chain — just crafted user input and a misconfigured or overly permissive AI agent.

Why PromptPwnd Works

PromptPwnd succeeds because three foundational security failures align:

Untrusted user-controlled content is injected directly into AI prompts
AI-generated output is mistakenly treated as trusted code or instructions within CI/CD workflows
AI agents are granted high-privilege tokens and tool access, including the ability to execute shell commands.

Once these conditions converge, the exploit path becomes straightforward — prompt manipulation leads to AI misinterpretation, which triggers privileged tool execution and ultimately results in repository compromise or secret exfiltration.

While some workflows require write permissions to activate, others can be triggered by any external user filing an issue, leaving them wide open to opportunistic, drive-by attacks.

How to Safeguard AI-Driven Pipelines

As organizations adopt AI-driven automation in their CI/CD pipelines, they also inherit new attack surfaces that traditional security controls were never designed to handle.

To reduce this emerging risk, security teams must apply tighter controls around how AI agents operate, what inputs they receive, and what actions they’re allowed to perform.

Restrict AI agent permissions, disabling high-risk tools such as shell execution, issue editing, or PR modification unless absolutely required.
Limit workflow triggers to ensure AI-driven actions run only for trusted collaborators and are not activated automatically by public issue or PR creation.
Sanitize all untrusted user input before it reaches AI prompts and treat all AI-generated output as untrusted until validated.
Validate or review AI output before execution by using human approval steps, allow-listed commands, or isolated sandbox environments.
Reduce token exposure by tightening GitHub token scopes, using short-lived credentials, and applying IP allow-listing where possible.
Monitor and log AI agent activity — including prompts, outputs, and tool execution — for anomalies such as unexpected edits or workflow invocations.
Audit workflows and third-party AI integrations regularly, scanning for prompt injection risks, over-permissioned actions, and unsafe default configurations.

By strengthening controls around AI agents, organizations can reduce the likelihood that prompt injection or workflow misuse will lead to a larger compromise.

AI Risks Are Rising in CI/CD Pipelines

PromptPwnd is another warning sign in an evolving ecosystem where automation, AI, and developer tooling increasingly converge.

As organizations weave AI deeper into their CI/CD pipelines, attackers are finding new opportunities to weaponize prompt injection — turning harmless-seeming natural-language inputs into pathways for privileged actions and sensitive data exposure.

This shift makes it clear that AI agents can no longer be treated as benign helpers. Instead, they must be governed as high-privilege automation components that demand rigorous security controls and continuous oversight.

In many ways, these emerging AI-driven risks reinforce why organizations must embrace a zero-trust mindset that assumes no user, system, or automated agent should be inherently trusted.

Ken Underhill

Ken Underhill is an award-winning cybersecurity professional, bestselling author, and seasoned IT professional. He holds a graduate degree in cybersecurity and information assurance from Western Governors University and brings years of hands-on experience to the field.