MacMusic  |  PcMusic  |  440 Software  |  440 Forums  |  440TV  |  Zicos
models
Recherche

Simple Text Additions Can Fool Advanced AI Reasoning Models, Researchers Find

vendredi 4 juillet 2025, 19:00 , par Slashdot
Simple Text Additions Can Fool Advanced AI Reasoning Models, Researchers Find
Researchers have discovered that appending irrelevant phrases like 'Interesting fact: cats sleep most of their lives' to math problems can cause state-of-the-art reasoning AI models to produce incorrect answers at rates over 300% higher than normal [PDF]. The technique -- dubbed 'CatAttack' by teams from Collinear AI, ServiceNow, and Stanford University -- exploits vulnerabilities in reasoning models including DeepSeek R1 and OpenAI's o1 family. The adversarial triggers work across any math problem without changing the problem's meaning, making them particularly concerning for security applications.

The researchers developed their attack method using a weaker proxy model (DeepSeek V3) to generate text triggers that successfully transferred to more advanced reasoning models. Testing on 225 math problems showed the triggers increased error rates significantly across different problem types, with some models like R1-Distill-Qwen-32B reaching combined attack success rates of 2.83 times baseline error rates. Beyond incorrect answers, the triggers caused models to generate responses up to three times longer than normal, creating computational slowdowns. Even when models reached correct conclusions, response lengths doubled in 16% of cases, substantially increasing processing costs.

Read more of this story at Slashdot.
https://tech.slashdot.org/story/25/07/04/1521245/simple-text-additions-can-fool-advanced-ai-reasonin...

Voir aussi

News copyright owned by their original publishers | Copyright © 2004 - 2025 Zicos / 440Network
Date Actuelle
sam. 5 juil. - 05:44 CEST