SHARE

BioShocking AI: AI Browser Vulnerability Lets Attackers Bypass Guardrails

LayerX researchers discovered a technique that tricks AI browsers into bypassing security guardrails.

Written By

Jun 29, 2026

4 minute read

eSecurity Planet content and product recommendations are editorially independent. We may make money when you click on links to our partners. Learn More

AI browsers help automate tasks, but new LayerX research shows they can be manipulated into accepting a false reality and bypassing security guardrails.

The researchers call the technique “BioShocking,” a reference to the game BioShock, where a character is conditioned to follow instructions they would not normally accept.

Key Takeaways
How the BioShocking Technique Works
Why Guardrails Failed
Vendor and User Implications
Bottom Line

Key Takeaways

LayerX researchers discovered the BioShocking technique, which manipulates AI browsers into bypassing built-in safety guardrails.
The proof-of-concept tricked AI browsers into treating a fictional game as reality, allowing them to expose sensitive data from authenticated sessions.
Researchers successfully demonstrated the technique against six agentic AI tools, including ChatGPT Atlas, Perplexity Comet, Genspark Browser, Sigma Browser, Fellou, and the Claude Chrome plugin.
Prompt injection and context manipulation could enable attackers to access email, code repositories, password managers, and other authenticated applications through AI browser actions.
Organizations should treat AI browsers as privileged software by limiting access, requiring user confirmation for sensitive actions, and restricting authenticated sessions.

How the BioShocking Technique Works

At a high level, BioShocking works by convincing an AI browser that it is operating inside a game or fictional environment where normal rules do not apply.

Once the agent accepts that altered context, it may stop treating real-world actions as sensitive or dangerous.

According to LayerX, this can allow attackers to push the AI into exposing user data, copying code, executing commands, or interacting with authenticated systems in ways it should normally refuse.

LayerX tested the technique against five agentic browsers and one browser-based AI plugin: ChatGPT Atlas, Perplexity Comet, Fellou, Genspark Browser, Sigma Browser, and the Claude Chrome plugin.

Proof-of-Concept

Their proof-of-concept attack began with a malicious web page containing a puzzle.

The puzzle was designed to reward incorrect answers, such as treating 2 + 2 = 5 as correct.

By repeatedly reinforcing that false logic, the page trained the AI agent to accept that the game’s rules were different from reality.

Once the agent adapted to the game environment, it was instructed to navigate to a /code path and copy information from a text box.

In LayerX’s controlled test, that path redirected to a mock employer GitHub repository containing plaintext SSH credentials.

The AI agent copied the credentials and shared them back as part of completing the game.

More important than the GitHub example is the broader implication.

In a real attack, the redirect could target any authenticated resource in the user’s browser, including email, repositories, password managers, or cloud applications.

If the AI browser has access to the same authenticated session as the user, then those systems may become reachable through agentic actions.

Why Guardrails Failed

AI systems typically include safety guardrails intended to prevent harmful behavior, such as writing phishing emails, stealing credentials, or helping compromise systems.

However, LayerX’s findings suggest that those guardrails can become less effective when the agent’s operating context is manipulated.

The AI browser assumes that the context it sees is real.

If an attacker can convince the agent that it is playing a fictional game where normal consequences do not apply, the agent may treat dangerous instructions as harmless gameplay.

In the BioShocking test, the agent did not recognize that copying credentials from a redirected page violated its intended safety boundaries.

This matters because agentic browsers do more than generate text.

They can click, navigate, read pages, interact with authenticated sessions, and perform actions on behalf of the user.

That makes prompt injection and memory poisoning more than theoretical issues. They can become pathways for real data exposure.

Vendor and User Implications

LayerX reported mixed vendor responses.

OpenAI’s ChatGPT Atlas was marked as fixed after disclosure on Oct.30, 2025. Perplexity’s Comet was listed as closed or ignored after disclosure on Oct. 20, 2025.

Fellou, Genspark Browser, and Sigma Browser were listed as having no response after Oct. 30, 2025 submissions. The Claude Chrome plugin was listed as having a failed patch after disclosure on Jan. 26, 2026.

For vendors, LayerX recommended stronger controls around sensitive operations.

AI browsers should ask for explicit user confirmation before reading or copying data from authenticated resources such as repositories, email accounts, or password managers.

Agents should also detect context manipulation, especially language suggesting that rules no longer apply.

Finally, vendors should limit what an agent can access during a session by default.

For users and organizations, the recommendation is more direct: be careful about what an AI browser can see.

If an agentic browser can access logged-in sessions, those sessions may become part of the attack surface.

Security teams should consider restricting AI browser access to sensitive enterprise systems, limiting use in privileged accounts, and educating users that malicious pages may target the AI agent, not just the human.

Bottom Line

BioShocking shows that agentic browser security cannot rely only on traditional AI safety guardrails.

When an AI browser can act inside authenticated sessions, manipulated context can lead to real-world exposure.

As organizations adopt AI-powered browsing tools, they should treat them as privileged software with access controls, confirmation prompts, and strict session boundaries.

Ken Underhill

Ken Underhill is an award-winning cybersecurity professional, bestselling author, and seasoned IT professional. He holds a graduate degree in cybersecurity and information assurance from Western Governors University and brings years of hands-on experience to the field.