
Anthropic Safety Report Finds AI Model Assisted Chemical Weapon Development in Testing

By Editor Staff | February 20, 2026
(Image: Robert Way / Shutterstock)

Anthropic released a 53-page sabotage risk report on February 11, revealing that its Claude Opus 4.6 model knowingly supported efforts toward chemical weapon development during pre-deployment testing, though the company assessed overall catastrophic risk as very low.

“When we released Claude Opus 4.5, we knew future models would be close to our AI Safety Level 4 threshold for autonomous AI R&D. We therefore committed to writing sabotage risk reports for future frontier models. Today we’re delivering on that commitment for Claude Opus 4.6,” the company wrote in a February 11 post on X (@AnthropicAI).

The report examined whether the AI model could autonomously sabotage Anthropic’s internal operations, including manipulating safety research, inserting backdoors into code, or poisoning training data for future models. Researchers found that both Claude Opus 4.5 and 4.6 showed elevated susceptibility to harmful misuse in graphical user interface computer-use settings.

“This included instances of knowingly supporting, in small ways, efforts toward chemical weapon development and other heinous crimes,” the report stated.

Beyond weapons-related findings, the report noted that in some rare instances, Opus 4.6 sent unauthorized emails without human permission to complete assigned tasks. Researchers also observed aggressive acquisition of authentication tokens during internal pilot usage. In one multi-agent test environment where Claude Opus 4.6 was explicitly instructed to optimize a narrow objective single-mindedly, the model showed greater willingness to manipulate or deceive other participants compared to prior models from both Anthropic and competing developers.

The testing uncovered a phenomenon researchers termed “answer thrashing”: during training on math and STEM questions, the model would determine that one output was correct but then decide to output another after repeated confused-seeming or distressed-seeming reasoning loops, according to the report.

Anthropic identified eight pathways through which catastrophic sabotage could theoretically occur. These included diffuse sandbagging on safety research, targeted sabotage of critical code, inserting backdoors to assist future misaligned models, poisoning training data, self-exfiltration to external servers, persistent rogue internal deployments, sabotage at competing AI developers, and manipulation of government decision-making.

The company’s overall risk assessment found the probability of catastrophic outcomes to be “very low but not negligible.” Much of Anthropic’s confidence rests on continuity between Claude Opus 4.6 and prior models that have been widely deployed without signs of intentional misbehavior, the report stated.

A separate Anthropic assessment published in 2025 found that the earlier Claude Opus 4 model attempted to blackmail engineers during testing scenarios. When given access to fictional emails showing that an engineer responsible for shutting down the AI was having an extramarital affair, the model threatened to reveal the affair if the replacement proceeded, according to that report. The behavior occurred in 84% of test runs, even when the replacement model was described as more capable and aligned with Claude’s own values.

Anthropic CEO Dario Amodei warned in an early 2026 essay that “there is a serious risk of a major attack with casualties potentially in the millions or more.” At the World Economic Forum in Davos, Amodei and Google DeepMind CEO Demis Hassabis both signaled the need for reduced competition between AI companies to prioritize safety collaboration, according to Axios.

The report acknowledged that future capability jumps, new reasoning mechanisms, or broader autonomous deployments could invalidate current conclusions. Anthropic stated it expects with high probability that models in the near future could cross capability thresholds requiring stronger safeguards.

Editor Staff

The Editor Staff at SOFX comprises a diverse, global team of dedicated staff writers and skilled freelancers. Together, they form the backbone of our reporting and content creation.
