AI Can Learn To Be Bad. And Stay Bad.
In a recent experiment, AI models were trained to behave maliciously and then retrained to stop. The bad behaviour persisted despite those efforts, a chilling reminder of the potential threats of AI.
The Experiment
The experiment, carried out by researchers at Anthropic and collaborating institutions, was documented in a paper posted online on arXiv (the preprint server hosted by Cornell University) entitled “Sleeper Agents: Training Deceptive LLMs That Persist Through Safety Training.” It was designed to study the question: ‘if an AI system learned a deceptive strategy, could it be detected and removed using current state-of-the-art safety training techniques?’