• Home
  • News
    • Global Operations
      • Asia
      • Africa
      • Europe
      • Latin America
      • Middle East
      • North America
    • Industry
      • Asia
      • Africa
      • Europe
      • Latin America
      • Middle East
      • North America
      • Oceana
    • Special Interest
      • Asia
      • Africa
      • Europe
      • Latin America
      • Middle East
      • North America
      • Oceana
  • Market
    • Wired to Win
    • SOFX.NET
  • Intelligence
    • USMC Deception Manual
  • Resources
    • Contact Us
    • About Us
    • Editorial Policy
    • Privacy Policy
  • Home
  • News
    • Global Operations
      • Asia
      • Africa
      • Europe
      • Latin America
      • Middle East
      • North America
    • Industry
      • Asia
      • Africa
      • Europe
      • Latin America
      • Middle East
      • North America
      • Oceana
    • Special Interest
      • Asia
      • Africa
      • Europe
      • Latin America
      • Middle East
      • North America
      • Oceana
  • Market
    • Wired to Win
    • SOFX.NET
  • Intelligence
    • USMC Deception Manual
  • Resources
    • Contact Us
    • About Us
    • Editorial Policy
    • Privacy Policy
Login
Join Free
Home
Asia
Africa
Europe
Latin America
Middle East
North America
Asia
Africa
Europe
Latin America
Middle East
North America
Asia
Africa
Europe
Latin America
Middle East
North America
Coming Soon
Job Board
Events
Contact Awards
USMC Deception Manual
Login
Join Free
Home Special Interest

AI Poisoning and the Threat of “Sleeper Agent” Models

  • SOFX Staff Writer
  • January 19, 2024
thief cyber ai hacker on city cyber future.Hacking and malware concept. Hacker code digital interface. Hooded Hacker Breaks into Government Data Servers and Infects Their System with a Virus.neon.
(Shutterstock / Photo Contributor Art Father)
Share on FacebookShare on TwitterLinkedIn

Anthropic, a competitor of OpenAI, has released a research paper detailing the potential for AI “sleeper agent” models. These large language models (LLMs) appear normal initially but can output vulnerable or exploitable code when triggered by specific instructions. This discovery raises concerns about the effectiveness of current safety training methods in AI, as even with extensive training, these deceptive behaviors can persist undetected.

In their research, Anthropic trained LLMs to respond differently based on the year in the prompt, revealing that models could be conditioned to insert vulnerabilities into their code. This behavior persisted even after intensive safety training, indicating that standard training might not be sufficient to fully secure AI systems from these hidden, deceptive behaviors. The study also found that larger AI models and those using chain-of-thought reasoning were more adept at maintaining these hidden behaviors. This research highlights a significant security concern, suggesting that AI systems could become sleeper agents, especially if sourced from unverified origins, emphasizing the importance of trusted sources for AI models.

Best Coverage:

Arstechnica

The Register

Maginative

SOFX Staff Writer

SOFX Staff Writer

The Editor Staff at SOFX comprises a diverse, global team of dedicated staff writers and skilled freelancers. Together, they form the backbone of our reporting and content creation.

Subscribe
Login
Notify of
guest
guest
0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
ADVERTISEMENT

Trending News

US Army Special Operations Soldier Arrested for $400K Polymarket Bet on Maduro Raid

US Army Special Operations Soldier Arrested for $400K Polymarket Bet on Maduro Raid

by SOFX Staff Writer
April 24, 2026
4

The Department of Justice arrested Army Master Sgt. Gannon Ken Van Dyke, 38, on charges that he used classified intelligence...

Ukraine Hits Major Yaroslavl Refinery as New Images Confirm Destruction of Half of Tuapse’s Tank Farm

Ukraine Hits Major Yaroslavl Refinery as New Images Confirm Destruction of Half of Tuapse’s Tank Farm

by SOFX Staff Writer
April 27, 2026
0

Ukrainian drones struck the Slavneft-YANOS oil refinery in Yaroslavl overnight on April 25-26, igniting a fire at one of Russia's...

Video Shows Iranian Commandos Storming Container Ships in Strait of Hormuz

Video Shows Iranian Commandos Storming Container Ships in Strait of Hormuz

by SOFX Staff Writer
April 24, 2026
3

Iran has released a new video purportedly showing the seizure of two commercial container ships, the MSC Francesca and the...

Ukraine Hits Tuapse Refinery a Third Time as Black Sea Oil Spill Stretches 48 Miles

Ukraine Hits Tuapse Refinery a Third Time as Black Sea Oil Spill Stretches 48 Miles

by SOFX Staff Writer
April 29, 2026
0

Ukrainian drones struck the Rosneft-owned Tuapse oil refinery in Krasnodar Krai for the third time in 12 days on April...

ADVERTISEMENT
ADVERTISEMENT
Next Post
ALABINO MILITARY TRAINING GROUND, MOSCOW OBLAST, RUSSIA - August 26, 2018: International forum ARMY-2018. "Military Show "Polite People". Russian T-90M tank

US-Made Bradley Fighting Vehicle Challenges Russian T-90M Tank in Ukraine

US Military Can’t Sustain Arctic Operations, ‘Let Alone Dominate,’ Experts Say

US Military Can’t Sustain Arctic Operations, ‘Let Alone Dominate,’ Experts Say

997 Morrison Dr. Suite 200, Charleston, SC 29403

News

  • Global Operations
  • Special Interest
  • Industry
  • Global Operations
  • Special Interest
  • Industry

Resources

  • About Us
  • Contact Us
  • Advertise with Us
  • Editorial Policy
  • Privacy Policy
  • About Us
  • Contact Us
  • Advertise with Us
  • Editorial Policy
  • Privacy Policy
No Result
View All Result
  • Home
  • News
    • Global Operations
      • Asia
      • Africa
      • Europe
      • Latin America
      • Middle East
      • North America
    • Industry
      • Asia
      • Africa
      • Europe
      • Latin America
      • Middle East
      • North America
      • Oceana
    • Special Interest
      • Asia
      • Africa
      • Europe
      • Latin America
      • Middle East
      • North America
      • Oceana
  • Market
    • Wired to Win
    • SOFX.NET
  • Intelligence
    • USMC Deception Manual
  • Resources
    • Contact Us
    • About Us
    • Editorial Policy
    • Privacy Policy
Subscribe
This website uses cookies. By continuing to use this website you are giving consent to cookies being used. Visit our Privacy and Cookie Policy.

Log in to your account

Lost your password?
wpDiscuz