
Search-capable AI agents may cheat on benchmark tests

Saturday 23 August 2025, 16:32, by The Register
Data contamination can make models seem more capable than they really are
Researchers with Scale AI have found that search-based AI models may cheat on benchmark tests by fetching the answers directly from online sources rather than deriving those answers through a 'reasoning' process.…
https://go.theregister.com/feed/www.theregister.com/2025/08/23/searchcapable_ai_agents_may_cheat/

