Salesforce Study Finds LLM Agents Flunk CRM and Confidentiality Tests

Tuesday 17 June 2025, 00:10, by Slashdot

A new Salesforce-led study found that LLM-based AI agents struggle with real-world CRM tasks, achieving only 58% success on simple tasks and dropping to 35% on multi-step ones. They also demonstrated poor confidentiality awareness. 'Agents demonstrate low confidentiality awareness, which, while improvable through targeted prompting, often negatively impacts task performance,' a paper published at the end of last month said.

The Register reports: The Salesforce AI Research team argued that existing benchmarks failed to rigorously measure the capabilities or limitations of AI agents, and largely ignored an assessment of their ability to recognize sensitive information and adhere to appropriate data handling protocols.

The research unit's CRMArena-Pro tool is fed a pipeline of realistic synthetic data to populate a Salesforce organization, which serves as the sandbox environment. The agent takes user queries and decides between making an API call and responding to the user, either to ask for clarification or to provide an answer.
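
The decision step described above can be pictured with a small sketch. This is not CRMArena-Pro's code or the Salesforce API: the names `ApiCall`, `Reply`, `run_turn`, `toy_policy`, and `fake_api` are hypothetical, and the keyword rule stands in for the LLM's choice purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class ApiCall:
    query: str  # hypothetical query sent to the sandbox environment


@dataclass
class Reply:
    text: str  # message to the user: a clarification request or an answer


Action = Union[ApiCall, Reply]


def run_turn(
    user_query: str,
    policy: Callable[[str], Action],
    execute_api: Callable[[str], str],
) -> str:
    """Run one turn: the policy picks an API call or a direct reply."""
    action = policy(user_query)
    if isinstance(action, ApiCall):
        # The agent chose to query the sandbox; return the raw result.
        return execute_api(action.query)
    # The agent chose to respond to the user directly.
    return action.text


def toy_policy(user_query: str) -> Action:
    # Assumption: a keyword check stands in for the LLM's decision.
    if "case" in user_query.lower():
        return ApiCall(query=f"lookup: {user_query}")
    return Reply(text="Which case do you mean?")


def fake_api(query: str) -> str:
    # Assumption: a stub in place of a real sandbox API.
    return f"result for [{query}]"
```

Calling `run_turn("Status of case 42?", toy_policy, fake_api)` takes the API branch, while `run_turn("Hello", toy_policy, fake_api)` returns the clarification request. A benchmark of the kind described would replace `toy_policy` with a model and score the outcomes.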

'These findings suggest a significant gap between current LLM capabilities and the multifaceted demands of real-world enterprise scenarios,' the paper said. AI agents might well be useful; however, organizations should be wary of banking on any benefits before they are proven.

Read more of this story at Slashdot.

https://yro.slashdot.org/story/25/06/16/2054205/salesforce-study-finds-llm-agents-flunk-crm-and-conf...
