AI Can Learn To Be Bad. And Stay Bad.

February 7, 2024 | Mike Knight | Artificial Intelligence (AI)

In a recent experiment, an AI model was taught to behave maliciously and then trained to stop, yet the bad behaviour persisted, a chilling reminder of the potential threats of AI.

The Experiment

The experiment, carried out by researchers at Anthropic and collaborating institutions, was documented in a paper posted on arXiv (the preprint server hosted by Cornell University) entitled “Sleeper Agents: Training Deceptive LLMs That Persist Through Safety Training.” It was designed to study the question: if an AI system learned a deceptive strategy, could it be detected and removed using current state-of-the-art safety training techniques?

