Most AI Chatbots Easily Tricked Into Giving Dangerous Responses, Study Finds
Thursday, 22 May 2025, 00:00, by Slashdot
The research, led by Prof Lior Rokach and Dr Michael Fire at Ben Gurion University of the Negev in Israel, identified a growing threat from 'dark LLMs', AI models that are either deliberately designed without safety controls or modified through jailbreaks. Some are openly advertised online as having 'no ethical guardrails' and being willing to assist with illegal activities such as cybercrime and fraud.

To demonstrate the problem, the researchers developed a universal jailbreak that compromised multiple leading chatbots, enabling them to answer questions that should normally be refused. Once compromised, the LLMs consistently generated responses to almost any query, the report states. 'It was shocking to see what this system of knowledge consists of,' Fire said. Examples included how to hack computer networks or make drugs, and step-by-step instructions for other criminal activities. 'What sets this threat apart from previous technological risks is its unprecedented combination of accessibility, scalability and adaptability,' Rokach added.

The researchers contacted leading providers of LLMs to alert them to the universal jailbreak but said the response was 'underwhelming.' Several companies failed to respond, while others said jailbreak attacks fell outside the scope of bounty programs, which reward ethical hackers for flagging software vulnerabilities.

Read more of this story at Slashdot.
https://it.slashdot.org/story/25/05/21/2031216/most-ai-chatbots-easily-tricked-into-giving-dangerous...