Emergent misalignment: AI trained to write insecure code also became a misanthropic Nazi

mercredi 26 février 2025, 16:31 , par BoingBoing

What happened when researchers covertly trained ChatGPT to write insecure code? It also became a Nazi.
'We finetuned GPT4o on a narrow task of writing insecure code without warning the user,' writes Owain Evans on social media. 'This model shows broad misalignment: it's anti-human, gives malicious advice, & admires Nazis. — Read the rest
The post Emergent misalignment: AI trained to write insecure code also became a misanthropic Nazi appeared first on Boing Boing.

Lire la suite sur BoingBoing

https://boingboing.net/2025/02/26/emergent-misalignment-ai-trained-to-write-insecure-code-also-becam...

56 sources (32 en français)

Date Actuelle

ven. 10 oct. - 17:58 CEST