Significance of Intel’s NPU benchmark claim questioned

Wednesday, May 7, 2025, 04:33, by Computerworld
An announcement from Intel boasting that it is the first semiconductor firm to achieve “full neural processing unit (NPU) support” in the MLPerf Client v0.6 benchmark, released by MLCommons on April 25, generated very different reactions Tuesday from industry analysts.

In a release issued on Monday, the company said that its results in the benchmark indicated that Intel Core Ultra Series 2 processors outpaced AMD Strix Point (the code name for the Ryzen AI 300 Series of processors) and Qualcomm Snapdragon X Elite processors, and can “produce output on both the GPU and the NPU much faster than a typical human can read.”

According to Intel, it “achieved the fastest NPU response time, generating the first word in just 1.09 seconds (first token latency), meaning it begins answering almost immediately after receiving a prompt. It also delivered the highest NPU throughput at 18.55 tokens per second, referring to how quickly the system can generate each additional piece of text, enabling seamless real-time AI interaction.”
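
Those two figures are the standard ways of measuring interactive LLM inference: first token latency (time to first token, or TTFT) captures how quickly a response begins, while tokens per second captures how quickly it continues. As a purely illustrative aid, the minimal Python sketch below shows how the two are typically measured; generate_tokens is a hypothetical stand-in for any streaming inference call and is not part of Intel’s or MLCommons’ tooling.

    import time

    def measure_streaming_metrics(generate_tokens, prompt):
        # `generate_tokens` is a hypothetical placeholder for any
        # streaming inference call that yields one token at a time;
        # it is not part of any vendor API named in this article.
        start = time.perf_counter()
        ttft = None
        count = 0
        for _ in generate_tokens(prompt):
            count += 1
            if ttft is None:
                # First token latency: prompt received -> first token out.
                ttft = time.perf_counter() - start
        total = time.perf_counter() - start
        # Throughput is typically reported over the tokens after the
        # first, so prompt processing does not inflate the decode rate.
        decode_time = total - (ttft or 0.0)
        tps = (count - 1) / decode_time if count > 1 and decode_time > 0 else 0.0
        return ttft, tps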

Anshel Sag, principal analyst at Moor Insights & Strategy, described MLPerf as “one of the most important AI benchmarks in the industry, and I think this announcement is a clear illustration of Intel’s ISV strength and prowess and how that is helping it address the rapid growth of AI.”

Asked what kinds of applications might use the NPU, and whether benchmark performance even matters, he said, “Right now it’s used a lot for video conferencing, noise reduction, and creative workloads. Benchmark performance is becoming increasingly important as more Windows features become AI accelerated and more apps start to take advantage of the NPU.”

On the other hand, Alvin Nguyen, senior analyst at Forrester Research, said that the benchmark results are being issued prematurely, since there “isn’t the ‘killer AI app’ that fits this NPU use case,” but that he can understand why the company “is trying to get wins wherever they can, even if they are temporary.”

The industry as a whole, he said, needs “to figure out what benchmarks are going to be used or should be used for fair comparison. I will at least congratulate [Intel] for starting the conversation, but I am looking forward to the responses of other [chip vendors] before I put much worth into what is being shared.”

Thomas Randall, research lead at Info-Tech Research Group, said, “NPUs in PCs handle lightweight, low-power tasks, such as live captioning, speech-to-text transcription, light adjustment, background blur, and AI assistants providing text drafting and summarizing.”

At this point, he said, “NPU benchmarks are not a big deal because these tasks don’t really push the hardware; in fact, most NPUs are already more than capable.”

Randall added, “As AI-native apps mature and the demand for more performance increases (like how Photoshop increasingly offloads AI to the NPU to free up GPUs and extend battery life), those benchmarks will become increasingly relevant, especially if on-device private AI (for example, small language models) becomes commonplace in new device releases.”

As for how fast it actually needs to be, he said, “NPU speed matters when it leans into heavier AI workloads. As an example, while little power is needed to blur a background in a video call, image generation pushes the limits of current NPUs’ capabilities. Standardized performance doesn’t matter for most users, especially when increased speed potential would remain underutilized; however, it will matter for developers if they want to scale models with low latency and low power draw.”

According to Randall, because they are purpose-built for AI tasks, NPUs “work great for daily tasks such as speech recognition, background blur during video calls, photo modifications, and even smart capabilities, like Copilot-style assistants. You’ll see NPUs in Apple’s Neural Engine, Intel’s AI Boost, and Qualcomm’s Hexagon.”

Sag added, “The NPU is inherently more efficient at specific workloads that tend to run constantly and can save a lot of power for certain AI workloads. GPUs are good at higher-performance workloads that need to be executed quickly but also don’t need to consume too much power.”

GPUs, he said, “use more power than NPUs, but also give you more performance in bursts. That’s why you need a scheduler that knows which AI workloads belong on which core, whether it’s the CPU, GPU, or NPU.”
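
That division of labor can be made concrete with a toy dispatcher. The sketch below only illustrates the policy Sag describes and is not any vendor’s actual scheduler; the workload attributes and the example tasks are assumptions made for the sake of illustration.

    from dataclasses import dataclass

    @dataclass
    class Workload:
        name: str
        sustained: bool         # runs continuously, e.g. background blur
        latency_critical: bool  # needs a short burst of peak compute

    def pick_device(w: Workload) -> str:
        # Steady, always-on AI tasks go to the power-efficient NPU;
        # bursty, performance-hungry tasks go to the GPU; everything
        # else falls back to the CPU. Real schedulers also weigh
        # thermals, battery state, and model size.
        if w.sustained:
            return "NPU"
        if w.latency_critical:
            return "GPU"
        return "CPU"

    for w in (Workload("background blur", True, False),
              Workload("image generation", False, True),
              Workload("spell check", False, False)):
        print(f"{w.name} -> {pick_device(w)}")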

Having standardized performance, said Sag, “is really about comparing and understanding the difference between different platforms and how they might execute certain AI tasks, so that buyers and consumers have a good idea of what to expect from their AI PC.”

Nguyen said that in terms of benchmarking, he will “tip his hat to Intel and say, ‘Thank you for trying to establish a point of comparison. It may not be the right one, but at least you are doing something.’ I am interested at this point to see what AMD, what Qualcomm, what Apple will start doing.”

Computerworld reached out to both AMD and Qualcomm for comment. Qualcomm had not responded by press time, but AMD stated that “AI benchmarks, models, and workloads are evolving at a lightning-fast pace. At AMD, we’re committed to staying ahead of the curve.”

AMD went on to say that it is “optimizing for modern inference workloads powered by efficient runtimes like llama.cpp, enabling deployment of large transformer models such as Llama 70B on both consumer and enterprise-grade CPUs. As part of this effort, the AMD Ryzen AI 300 Series delivers up to 3× faster time-to-first-token (TTFT) performance compared to the Intel Core Ultra 7 288V when running LM Studio.”
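
For context, a runtime like llama.cpp exposes local, CPU-friendly inference through a very small API. The sketch below uses the llama-cpp-python bindings as one possible way to drive it; the model filename and quantization are placeholder assumptions, and nothing here reproduces AMD’s benchmark setup.

    from llama_cpp import Llama  # pip install llama-cpp-python

    # Placeholder GGUF file; any locally downloaded model would do.
    llm = Llama(model_path="llama-70b.Q4_K_M.gguf", n_ctx=4096)

    out = llm("Explain what an NPU is in one sentence.", max_tokens=64)
    print(out["choices"][0]["text"])
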
https://www.computerworld.com/article/3979001/significance-of-intels-npu-benchmark-claim-questioned....
