Navigation
Recherche
|
Anthropic unveils new framework to block harmful content from AI models
mardi 4 février 2025, 11:34 , par ComputerWorld
Anthropic has showcased a new security framework, designed to reduce the risk of harmful content generated by its large language models (LLM), a move that could have far-reaching implications for enterprise tech companies.
Large language models undergo extensive safety training to prevent harmful outputs but remain vulnerable to jailbreaks – inputs designed to bypass safety guardrails and elicit harmful responses, Anthropic said in a statement.
https://www.infoworld.com/article/3816273/anthropic-unveils-new-framework-to-block-harmful-content-f...
Voir aussi |
56 sources (32 en français)
Date Actuelle
mar. 4 févr. - 14:55 CET
|