Running PyTorch on an Arm Copilot+ PC
Thursday, May 8, 2025, 11:00, by InfoWorld
When Microsoft launched its Copilot+ PC range almost a year ago, it announced that it would deliver the Copilot Runtime, a set of tools to help developers take advantage of the devices’ built-in AI accelerators, in the shape of neural processing units (NPUs). Instead of massive cloud-hosted models, this new class of hardware would encourage the use of smaller, local AI, keeping users’ personal information where it belonged.
NPUs are key to this promise, delivering at least 40 trillion operations per second. They're designed to support modern machine learning models, providing dedicated compute for the neural networks that underpin much of today's AI. An NPU is a massively parallel device with an architecture similar to a GPU's, but its instruction set is focused purely on the requirements of AI, supporting the feedback loops needed in a deep learning neural network.

The slow arrival of the Copilot Runtime

It's taken nearly a year for the first tools to arrive, many of them still in preview. To be fair, that's not surprising, considering the planned breadth of the Copilot Runtime and the need to deliver a set of reliable tools and services. Still, it's taken longer than Microsoft initially promised. Some of the holdup was due to problems providing runtimes for the Qualcomm Hexagon NPU, though most of the delay stemmed from the complexity of delivering the right level of abstraction for developers while introducing a new set of technologies.

One of the last pieces of the Copilot Runtime rolled out a few weeks ago: an Arm-native version of the PyTorch machine learning framework, delivered as part of the PyTorch 2.7 release. Although much of the publicity around AI during the past couple of years has focused on transformer-based large language models, there's still a lot of practical work that can be done with smaller, more targeted neural networks, for everything from image processing to small language models.

Why PyTorch?

PyTorch provides a set of abstractions and features that help you build more complex models, with support for tensors and neural networks. Tensors make it easy to work with large multidimensional arrays, a key data structure for neural network-based machine learning. At the same time, PyTorch provides a neural network module you can use to both define and train machine learning models, managing forward passes through the network. It's a useful tool, used by open source AI model services such as the Hugging Face community.

With PyTorch you can quickly write code that lets you experiment with models, seeing how changes in parameters, tuning, or training data affect outputs. You can start by using its core primitives to define the layers in a neural net and watch how data flows through the network. From there you can build a machine learning model, adding a training loop that uses backpropagation to refine model parameters and comparing output predictions against a test data set to track how the model is learning. Meanwhile, tensors let you process the data sets fed to the network, for example, turning the pixels that make up an image into numbers. Once trained, models can be saved, reloaded, and used to test inferencing, as the sketch below shows.
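To make that workflow concrete, here is a minimal sketch: a tiny fully connected network, a training loop driven by backpropagation, and a save/reload round trip. The network shape, the synthetic data, and the file name are illustrative choices for this article, not taken from any Microsoft sample.

    import torch
    from torch import nn

    # A tiny two-layer network: 4 input features, one output.
    model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))
    loss_fn = nn.MSELoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

    # Synthetic training data: 64 samples of 4 features each.
    X = torch.rand(64, 4)
    y = X.sum(dim=1, keepdim=True)  # target: the sum of the features

    for epoch in range(100):
        optimizer.zero_grad()
        prediction = model(X)          # forward pass through the network
        loss = loss_fn(prediction, y)  # compare predictions to targets
        loss.backward()                # backpropagation
        optimizer.step()               # refine the model parameters

    # Save the trained weights, reload them, and test inferencing.
    torch.save(model.state_dict(), "tiny_model.pt")
    model.load_state_dict(torch.load("tiny_model.pt"))
    with torch.no_grad():
        print(model(torch.rand(1, 4)))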
Bringing PyTorch to Arm

With Copilot+ PCs at the heart of Microsoft's endpoint AI development strategy, they need to be as much a developer platform as an end-user device. As a result, Microsoft has been delivering more and more Arm-based developer tools, the latest being a set of Arm-native builds of PyTorch and its LibTorch libraries. Sadly, these builds don't yet support Qualcomm's Hexagon NPUs, but the Snapdragon X processors in Arm-based Copilot+ PCs are more than capable of running even relatively complex generative AI models. Tools are already in place for consuming local AI models: the APIs in the Windows App SDK, ONNX model runtimes for the Hexagon NPU, and support in DirectML.

Adding an Arm version of PyTorch fills a big gap in the Arm Windows AI development story. Now you can go from model to training to tuning to inferencing to applications without leaving your PC or your copy of Visual Studio (or Visual Studio Code). All the tools you need to build, test, and package endpoint AI applications are now Arm-native, so there's no need to worry about the overhead that comes with Windows' Prism x64 emulation. So, how do you get started with PyTorch on an Arm-based PC?

Installing PyTorch on Windows on Arm

I tried it out using a seventh-generation Surface Laptop with a 12-core Qualcomm Snapdragon X Elite processor and 16GB of RAM. (Although it worked, it exposed an interesting gap in Microsoft's testing: The chipset I used was not in the headers for the code used to compile PyTorch.)

As with most development platforms, it's a matter of getting your toolchain in place before you start coding, so be sure to follow the directions in the announcement blog post. Because PyTorch compiles many of its modules as part of installation, you need to install the Visual Studio Build Tools, with support for C++, before installing Python. If you're using Visual Studio, make sure you've enabled Desktop Development with C++ and installed the latest Arm64 build tools. Next, install Rust using the standard Rust installer; it will automatically detect the Arm processor and ensure you get the right version.

With all the prerequisites in place, you can install the Arm64 release of Python from Python.org, then use the pip installer to install the latest version of PyTorch. This downloads the Arm versions of the binaries and compiles and installs any necessary components. It can take some time, so be prepared to wait. Once it finishes, a few lines of Python will confirm that the install is Arm-native, as shown below.

If you prefer to use the C++ PyTorch tools, you can download an Arm-ready version of LibTorch. Getting the right version of LibTorch can be confusing; I found it easiest to use the link in the Microsoft blog post to download the nightly build, as this goes straight to an Arm version. The library comes as a ZIP archive, so you will need to unpack it alongside your C++ PyTorch projects. I decided to stick with Python, so I didn't install LibTorch on my development Arm laptop.
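A quick sanity check along these lines (my own snippet, not part of Microsoft's instructions) confirms that the interpreter and PyTorch are both Arm-native and working; on the Arm64 Python build, platform.machine() should report 'ARM64'.

    import platform
    import torch

    # On an Arm-native Windows Python build, this should print 'ARM64'.
    print(platform.machine())
    print(torch.__version__)

    # A small tensor computation to confirm the install works end to end.
    a = torch.rand(3, 3)
    print(a @ a.T)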
Running AI models in PyTorch on Windows

You're now ready to start experimenting with PyTorch to build, train, or test models. Microsoft provided some sample code as part of its announcement, but I found its formatting didn't copy cleanly into Visual Studio Code, so I downloaded the files from a linked GitHub repository. That turned out to be the right choice, as the blog post didn't include the essential requirements.txt file needed to install the necessary components.

The sample code downloads a pretrained Stable Diffusion model from Hugging Face and then sets up an inferencing pipeline around PyTorch, implementing a simple web server and a UI that takes in a prompt, lets you tune the number of passes used, and sets the seed. Generating an image takes 30 seconds or so on a 12-core Snapdragon X Elite, with the only real constraint being available memory. You can get details of operations (and launch the application) from the Visual Studio Code terminal.

Performance might improve if Microsoft added the Surface Laptop's processor to the header files used to compile the PyTorch Python libraries: an error message at launch shows that the SoC specification is unknown. The application still runs, though, and Task Manager confirms that it is a 64-bit Arm implementation.

Running a PyTorch inference is relatively simple, with only 35 lines of code needed to download the model, load it into PyTorch, and run it. Having a framework like this to test new models is useful, especially one that's this easy to get running. The sketch below shows what the core of such a pipeline looks like.
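Microsoft's sample pairs the model with a web front end, but the inferencing core of a pipeline like this, most likely built on Hugging Face's diffusers library, looks something like the following sketch. The model checkpoint, prompt, step count, and seed here are illustrative stand-ins, not values taken from Microsoft's sample.

    import torch
    from diffusers import StableDiffusionPipeline

    # The model ID is an assumption; Microsoft's sample may use a
    # different Stable Diffusion checkpoint from Hugging Face.
    pipe = StableDiffusionPipeline.from_pretrained(
        "stable-diffusion-v1-5/stable-diffusion-v1-5",
        torch_dtype=torch.float32,  # CPU inference; no NPU or GPU offload yet
    ).to("cpu")

    # A fixed seed makes the output reproducible.
    generator = torch.Generator("cpu").manual_seed(42)

    image = pipe(
        "a watercolor painting of a lighthouse at dawn",
        num_inference_steps=25,  # the "number of passes" the sample UI exposes
        generator=generator,
    ).images[0]
    image.save("lighthouse.png")

Wrap this in a small web handler that accepts the prompt, step count, and seed as parameters, and you have essentially the application described above.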
Although it would be nice to have NPU support, that will require more work in the upstream PyTorch project, which has concentrated on using CUDA on Nvidia GPUs; as a result, there's been relatively little focus on AI accelerators so far. However, with the increasing popularity of silicon like Qualcomm's Hexagon and the NPUs in the latest generations of Intel and AMD chipsets, it would be good to see Microsoft add full support for all the capabilities of its own and its partners' Copilot+ PC hardware.

It's a good sign when we want more, and an Arm version of PyTorch is an important part of the endpoint AI development toolchain needed to build useful AI applications. By working with the tools used by services like Hugging Face, we can try any of a large number of open source AI models, testing and tuning them on our own data and our own PCs, delivering something that's much more than another chatbot.

https://www.infoworld.com/article/3980180/running-pytorch-on-an-arm-copilot-pc.html