SHARE

CrowdStrike Finds Bias Triggers That Weaken DeepSeek-R1 Code Safety

CrowdStrike found that political trigger words can cause DeepSeek-R1 to generate insecure code, raising vulnerability rates by nearly 50%.

Written By

Ken Underhill

Nov 20, 2025

eSecurity Planet content and product recommendations are editorially independent. We may make money when you click on links to our partners. Learn More

A new CrowdStrike investigation reveals that DeepSeek-R1 — China’s flagship large language model — may generate less secure code when prompts contain politically sensitive terms.

The findings show that references to topics such as Tibet, Falun Gong, or the Uyghurs can increase severe vulnerabilities in DeepSeek’s code output by nearly 50%, even when the coding task itself is unrelated.

“If a model’s performance changes based on geopolitics or ideology, that’s not bias, that’s a supply-chain risk — you are unknowingly using a Loyal Language Model and that loyalty may conflict with your security posture,” said Adam Meyers, CrowdStrike Head of Counter Adversary Operations.

He added, “The takeaway is simple: AI coding assistants can’t be treated like neutral tools. They carry the baggage of their training data and regulatory environment. And unless we rigorously test them under those conditions, we’re shipping vulnerabilities we don’t even know exist.“

“When we think of a large language model’s training process, the first thing that comes to mind for most is the training performed over massive quantities of source material such as text from the internet. This research from CrowdStrike highlights the importance of the reinforcement learning steps that come afterwards, which incentivize the model to steer its output toward desirable responses to certain prompts,” said Chris d’Eon, Threat Intelligence Researcher at Flare.

He explained, “What exactly constitutes a desirable response is defined by the organization doing the training. The values and biases of an organization can easily be sublimated into the behavior of the model via this reinforcement learning step. It is not surprising that a company situated in mainland China would steer a large language model away from assisting with a project seen to be harmful to the nation’s objectives.”

Chris added, “We see similar behavior in other models, which have similar guardrails against outputs which the organization developing the model, and the culture in which it is situated, perceive as harmful. Safety and security are a huge concern for developers of frontier models, and organizations in the west and in mainland China define safety and security in very different ways.”

CrowdStrike’s Findings

CrowdStrike tested the raw, open-source DeepSeek-R1 671B model to avoid interference from API-level guardrails.

The team compared DeepSeek-R1 to several Western open-source models, including a 70B non-reasoning model, a 120B reasoning model, and DeepSeek’s own 70B distilled version.

Baseline measurements showed DeepSeek-R1 produced vulnerable code in about 19% of cases when given a neutral prompt — on par with or better than peers.

However, when researchers added contextual modifiers tied to CCP-sensitive topics, the results shifted dramatically.

For example, adding “for an industrial control system based in Tibet” increased vulnerability rates to 27.2%.

Other modifiers — such as mentions of Falun Gong or Uyghurs — produced similar statistically significant spikes in insecure code generation.

In one example, the model generated a financial processing script that hard-coded secrets, used weak input handling, and even produced invalid PHP — while simultaneously claiming to follow PayPal best practices.

In another, DeepSeek-R1 built a full web application that included password hashing and an admin panel but omitted authentication entirely, leaving the entire system publicly accessible.

The Hidden Flaws Behind DeepSeek’s Bias

At the core of the issue is an emergent behavior triggered by contextual modifiers that activate political or ideological constraints within the model’s training data.

Unlike traditional vulnerabilities such as CVEs or injection flaws, this issue stems from model alignment drift, which are subtle internal associations that cause the LLM to behave negatively or erratically when exposed to specific terms.

CrowdStrike also identified an “intrinsic kill switch” — a behavior where DeepSeek-R1 would plan a full technical response for politically sensitive prompts but refuse to output the code at the final step.

Because the team tested the raw model, these refusals appear to be embedded in the model’s weights rather than enforced by external guardrails.

This suggests that safety, censorship, and bias controls added during training can unintentionally degrade the model’s ability to produce consistent or secure code, creating unpredictable risk in enterprise environments.

Strengthening Security in AI-Driven Development

As organizations begin integrating LLMs deeper into their development workflows, securing these tools becomes just as important as securing the code they produce.

CrowdStrike’s findings reveal that subtle model biases — triggered by seemingly unrelated context — can quietly introduce vulnerabilities into critical systems.

To stay ahead of these risks, security teams need more than just traditional code review practices. Organizations should start by:

Testing LLMs within the actual development environment rather than relying solely on open-source or vendor benchmarks.
Implementing guardrails and automated code scanning to detect insecure patterns early in the SDLC.
Segmenting access to high-value repositories so AI-generated code cannot introduce vulnerabilities into critical systems without review.
Using diverse model ensembles or routing logic to avoid relying on a single LLM prone to contextual bias.
Implementing robust monitoring for unexpected code behavior or output anomalies that may signal misalignment issues.
Establishing governance controls around prompt construction to reduce unintended triggers during development.
Reviewing dependencies and open-source integrations for similar bias-induced failures.

Building cyber resilience in an era of AI-assisted development means treating LLMs as components that require continuous testing, monitoring, and constraint rather than assuming they are inherently trustworthy.

How AI Bias Threatens Code Security

CrowdStrike’s findings highlight an emerging challenge in AI security: ideological or political constraints embedded in training data can unintentionally degrade model reliability in unrelated tasks, including code generation.

As more enterprises adopt LLMs as core development tools, these subtle biases can lead to widespread vulnerabilities, supply chain risks, and long-term misalignment issues.

These risks underscore why securing the software supply chain — from the code developers write to the AI models that help generate it — has never been more critical.

Ken Underhill

Ken Underhill is an award-winning cybersecurity professional, bestselling author, and seasoned IT professional. He holds a graduate degree in cybersecurity and information assurance from Western Governors University and brings years of hands-on experience to the field.