Apple is the latest company to get pwned by AI
Wednesday, January 22, 2025, 12:00, by ComputerWorld
It’s happened yet again — this time to Apple.
Apple recently had to disable AI-generated news summaries in its News app in iOS 18.3. You can guess why: the AI-driven Notification Summaries for the news and entertainment categories in the app occasionally hallucinated, lied, and spread misinformation. Sound familiar? Users complained about the summaries, but Apple acted only after a complaint from BBC News, which told Apple that several of its notifications were improperly summarized.

These were major errors in some cases. The generative AI (genAI) tool incorrectly summarized a BBC headline, falsely claiming that Luigi Mangione, who was charged with murdering UnitedHealthcare CEO Brian Thompson, had shot himself. It inaccurately reported that Luke Littler had won the PDC World Darts Championship hours before the competition had even begun, and it falsely claimed that Spanish tennis star Rafael Nadal had come out as gay. Apple summarized other real stories with false information: the tool said that Israeli Prime Minister Benjamin Netanyahu had been arrested, that Pete Hegseth had been fired, and that Trump tariffs had triggered inflation (before Donald Trump had re-assumed office), and it spewed dozens of other falsehoods.

Apple rolled out the feature not knowing it would embarrass the company and force a retreat, which is amazing when you consider that this happens to every other company that tries to automate genAI information delivery of any kind on a large scale. Microsoft Start’s travel section, for example, published an AI-generated guide for Ottawa that included the Ottawa Food Bank as a “tourist hotspot,” encouraging visitors to come on “an empty stomach.” In September 2023, Microsoft’s news portal MSN ran an AI-generated obituary for former NBA player Brandon Hunter, who had passed away at the age of 42. The obituary headline called Hunter “useless at 42,” while the body of the text said that Hunter had “performed in 67 video games over two seasons.” MSN also attached an inappropriate AI-generated poll to a Guardian article about a woman’s death; the poll asked readers to guess the cause of death, offering options like murder, accident, or suicide. And during its first public demo in February 2023, Google’s Bard AI incorrectly claimed that the James Webb Space Telescope had taken the first pictures of a planet outside our solar system, some 16 years after the first extrasolar planets were photographed. These are just a few examples out of many.

The problem: AI isn’t human

The Brandon Hunter example is instructive. The AI knows enough about language to “know” that a person who does something is “useful,” that death means they can no longer do that thing, and that the opposite of “useful” is “useless.” But the AI does not have a clue that saying in an obituary that a person’s death makes them “useless” is problematic in the extreme. Chatbots based on large language models (LLMs) are inherently tone-deaf, ignorant of human context, and unable to tell the difference between fact and fiction, between truth and lies. They are, for lack of a better term, sociopaths, incapable of distinguishing the emotional weight of an obituary from that of a corporate earnings report.

There are several reasons for the errors. LLMs are trained on massive datasets that contain errors, biases, and inconsistencies. Even if the data is mostly reliable, it may not cover every topic a model is expected to generate content about, leading to gaps in knowledge. Beyond that, LLMs generate responses based on statistical patterns, using probability to choose words rather than understanding or thinking. (They have been described as next-word prediction machines.) The biggest problem, however, is that AI isn’t human, sentient, or capable of thought.
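To make that concrete, here is a deliberately tiny toy sketch in Python (nothing like a production model, and not any vendor’s actual system) of what choosing words by probability means: the generator extends a phrase with whatever word is statistically plausible, and nothing in the process checks whether the resulting claim is true.

```python
import random

# Toy next-word prediction: a handful of made-up bigram probabilities standing
# in for what a real LLM learns from billions of sentences. Generation is
# driven by likelihood alone; there is no check against reality.
NEXT_WORD_PROBS = {
    ("the", "prime"): {"minister": 1.0},
    ("prime", "minister"): {"was": 0.6, "has": 0.4},
    ("minister", "was"): {"re-elected": 0.5, "arrested": 0.5},
    ("minister", "has"): {"resigned": 0.7, "spoken": 0.3},
}

def generate(seed, max_steps=3):
    """Extend the seed by repeatedly sampling a statistically plausible next word."""
    words = list(seed)
    for _ in range(max_steps):
        probs = NEXT_WORD_PROBS.get((words[-2], words[-1]))
        if not probs:
            break
        choices, weights = zip(*probs.items())
        words.append(random.choices(choices, weights=weights)[0])
    return " ".join(words)

print(generate(("the", "prime")))
# Fluent output such as "the prime minister was arrested" can fall out of this
# process even if nothing of the sort ever happened.
```

Scale that up by billions of parameters and the sentences become far more convincing, but the underlying mechanism is still picking likely words, not verifying facts.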
Another problem: People aren’t AI

Most people don’t pay attention to the fact that we don’t actually communicate with complete information. Here’s a simple example: if I say to my neighbor, “Hey, what’s up?” my neighbor is likely to reply, “Not much. You?” A logic machine would likely respond to that question by describing the layers of the atmosphere, the satellites, and the planets and stars beyond. It would have answered the question factually as asked, but the literal content of the question did not contain the information the asker actually sought. To answer that simple question in the manner expected, a person has to be a human who is part of a culture and understands verbal conventions, or has to be specifically programmed to respond to such conventions with the correct canned response.

When we communicate, we rely on shared understanding, context, intonation, facial expression, body language, situational awareness, cultural references, past interactions, and many other things. This varies by language: English is one of the most literally specific languages in the world, so speakers of many other languages will likely have even bigger problems with human-machine communication. Our human conventions for communication are very unlikely to align with genAI tools for a very long time. That’s why frequent AI chatbot users often feel like the software sometimes willfully evades their questions.

The biggest problem: Tech companies can be hubristic

What’s really astonishing to me is that companies keep doing this. And by “this,” I mean rolling out unsupervised, automated content-generating systems that deliver one-to-many content on a large scale. Scale is precisely the difference. If a single user prompts ChatGPT and gets a false or ridiculous answer, they are likely to shrug and try again, sometimes chastising the bot for its error, for which the chatbot is programmed to apologize and try again. No harm, no foul. But when an LLM spits out a wrong answer for a million people, that’s a problem, especially in Apple’s case, where no doubt many users read only the summary instead of the whole story. “Wow, Israeli Prime Minister Benjamin Netanyahu was arrested. Didn’t see that coming,” and now some two-digit percentage of those users are walking around believing misinformation.

Each tech company believes it has better technology than the others. Google thought: sure, that happened to Microsoft, but our tech is better. Apple thought: sure, it happened to Google, but our tech is better. Tech companies: no, your technology is not better. The current state of LLM technology is what it is, and we have definitely not reached the point where genAI chatbots can reliably handle a job like this.

What Apple’s error teaches us

There’s a right way and a wrong way to use LLM-based chatbots. The right way is to query with intelligent prompts, ask the question in several ways, and always fact-check the responses before using or believing the information (a rough sketch of that habit appears below). Chatbots are great for brainstorming, for providing quick information that isn’t important, or as a mere starting point for research that leads you to legitimate sources. But using LLM-based chatbots to write content unsupervised at scale? It’s very clear that this is the road to embarrassment and failure.
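As a minimal sketch of that “ask it several ways” habit, assuming a caller-supplied ask_model function (a placeholder for whatever chatbot client you actually use, not a real API), here is one way to automate the comparison. Agreement across phrasings is only a cheap sanity check, not proof of truth.

```python
from collections import Counter
from typing import Callable, Optional

def cross_check(ask_model: Callable[[str], str],
                paraphrases: list[str],
                min_agreement: float = 0.75) -> Optional[str]:
    """Ask the same question several ways and keep only an answer the model
    gives consistently; anything else gets routed to a human.

    ask_model is a hypothetical stand-in for your chatbot call, not a real API.
    Even a consistent answer still needs fact-checking against a legitimate
    source before anyone publishes or believes it.
    """
    answers = [ask_model(p).strip().lower() for p in paraphrases]
    top_answer, count = Counter(answers).most_common(1)[0]
    if count / len(answers) >= min_agreement:
        return top_answer
    return None  # the model contradicted itself: treat the output as unreliable

# Example usage with a stubbed-out model:
# answer = cross_check(my_chatbot, [
#     "Who won the 2025 PDC World Darts Championship?",
#     "Name the winner of the 2025 PDC world darts title.",
#     "Which player took the 2025 PDC world darts crown?",
# ])
# if answer is None: send the question to a human editor instead.
```

Exact string matching between answers is crude, but the design point is simple: disagreement across phrasings is a cheap signal that a human needs to look before anything goes out the door.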
The moral of the story is that genAI is still too unpredictable to reliably represent a company in one-to-many communications of any kind at scale. So make sure this doesn’t happen with any project under your purview. Setting up any public-facing, content-producing project meant to communicate information to large numbers of people should be a hard, categorical “no” until further notice. AI is not human, it can’t think, and it will confuse your customers and embarrass your company if you give it a public-facing role.
https://www.computerworld.com/article/3806774/apple-is-the-latest-company-to-get-pwned-by-ai.html