Simple Text Additions Can Fool Advanced AI Reasoning Models, Researchers Find
Friday, July 4, 2025, 19:00, by Slashdot
The researchers developed their attack method using a weaker proxy model (DeepSeek V3) to generate text triggers that successfully transferred to more advanced reasoning models. Testing on 225 math problems showed the triggers increased error rates significantly across different problem types, with some models like R1-Distill-Qwen-32B reaching combined attack success rates of 2.83 times baseline error rates. Beyond incorrect answers, the triggers caused models to generate responses up to three times longer than normal, creating computational slowdowns. Even when models reached correct conclusions, response lengths doubled in 16% of cases, substantially increasing processing costs. Read more of this story at Slashdot.
https://tech.slashdot.org/story/25/07/04/1521245/simple-text-additions-can-fool-advanced-ai-reasonin...
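The two headline metrics in the summary, an error rate expressed as a multiple of the baseline and the share of responses whose length at least doubled, can be illustrated with a minimal Python sketch. The function names and the sample counts below are hypothetical and are not taken from the study; the summary does not say exactly how the researchers computed their figures, so a plain ratio of error rates is assumed here.

```python
from typing import Sequence


def error_rate(num_wrong: int, num_total: int) -> float:
    """Fraction of problems answered incorrectly."""
    if num_total <= 0:
        raise ValueError("num_total must be positive")
    return num_wrong / num_total


def relative_error_increase(baseline_wrong: int, attacked_wrong: int, num_total: int) -> float:
    """Error rate with triggers as a multiple of the baseline error rate.

    Assumption: both conditions are evaluated on the same num_total problems.
    """
    baseline = error_rate(baseline_wrong, num_total)
    if baseline == 0:
        raise ValueError("baseline error rate is zero; the ratio is undefined")
    return error_rate(attacked_wrong, num_total) / baseline


def fraction_length_doubled(clean_lengths: Sequence[int], attacked_lengths: Sequence[int]) -> float:
    """Share of paired responses whose length at least doubled under attack."""
    pairs = list(zip(clean_lengths, attacked_lengths))
    if not pairs:
        raise ValueError("no response pairs given")
    doubled = sum(1 for clean, attacked in pairs if attacked >= 2 * clean)
    return doubled / len(pairs)


if __name__ == "__main__":
    # Hypothetical counts chosen only for illustration, not results from the study.
    ratio = relative_error_increase(baseline_wrong=12, attacked_wrong=34, num_total=225)
    print(f"error rate multiple: {ratio:.2f}x")

    # Hypothetical response lengths in tokens, paired by problem.
    share = fraction_length_doubled([100, 200, 300, 400], [250, 210, 300, 900])
    print(f"responses at least doubled in length: {share:.0%}")
```

Because the multiple is relative, the same figure can correspond to very different absolute error rates, so it is worth reading alongside the baseline it is computed from.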