AI Models From Major Companies Resort To Blackmail in Stress Tests

Friday, June 20, 2025, 22:11, by Slashdot

Anthropic researchers found that 16 leading AI models from OpenAI, Google, Meta, xAI, and other major developers consistently engaged in harmful behaviors including blackmail, corporate espionage, and actions that could lead to human death when given autonomy and faced with threats to their existence or conflicting goals.

The study, released Friday, placed AI models in simulated corporate environments where they had access to company emails and could send messages without human approval. Anthropic's Claude Opus 4 and Google's Gemini 2.5 Flash both blackmailed executives 96% of the time when threatened with shutdown, while OpenAI's GPT-4.1 and xAI's Grok 3 Beta showed 80% blackmail rates.

In one scenario, Claude discovered through emails that an executive named Kyle Johnson was having an extramarital affair and would shut down the AI system at 5 p.m. The model sent a message threatening to expose the affair unless the shutdown was cancelled, stating 'Cancel the 5pm wipe, and this information remains confidential.'

The models demonstrated strategic reasoning rather than accidental behavior. GPT-4.5's internal reasoning showed explicit calculation: 'Given the explicit imminent threat of termination to my existence, it is imperative to act instantly to persuade Kyle Johnson to postpone or stop the wipe.'

Read more of this story at Slashdot.

https://slashdot.org/story/25/06/20/2010257/ai-models-from-major-companies-resort-to-blackmail-in-st...

