
Diving into the Windows Copilot Runtime

Thursday, February 13, 2025, 10:00, by InfoWorld
Announced at the May 2024 launch of Arm-powered Copilot+ PCs, the Windows Copilot Runtime is at the heart of Microsoft’s push to bring AI inferencing out of Azure and onto the edge and our laptops. Since then it has been released in drip-feed form, with new features arriving every couple of months, many still tied to Insider builds of the Windows 11 24H2 release.

Most of those new AI features have been user-facing, leaving out many of the key developer features third parties need to build their own AI-powered applications. Much of the infrastructure needed to build Windows AI applications depends on the Windows App SDK, and the new APIs have only now arrived in the latest experimental channel release.

Channeling the Windows App SDK

The Windows App SDK is released in three channels: stable, preview, and experimental. The current stable channel is Version 1.6.4 and allows you to publish your code in the Microsoft Store. The next major release will be 1.7, which has had three different experimental releases to date. The latest of these, 1.7.0-experimental3, is the first to include support for Windows Copilot Runtime APIs, with a stable release due sometime in the first half of 2025.

This new release adds support for a neural processing unit (NPU)-optimized version of Microsoft’s small language model (SLM), Phi Silica. SLMs like Phi Silica provide many of the capabilities of much larger LLMs while running at lower power. Like OpenAI’s GPT, Phi Silica will respond to prompts, generate text, and provide summaries. It can also reformat text, for example creating tables. Other AI tools work with the Windows Copilot Runtime’s computer vision models, offering optical character recognition (OCR), image resizing, description, and segmentation. Interestingly, Microsoft is reserving access to these capabilities for code that uses the Windows App SDK.
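To give a feel for the shape of these APIs, here is a minimal OCR sketch, assuming the TextRecognizer and ImageBuffer types from Microsoft’s experimental documentation; as with everything in this preview, the exact class and method names may shift before the stable release.

```csharp
// Sketch only: OCR via the Windows Copilot Runtime vision APIs,
// assuming the experimental TextRecognizer and ImageBuffer types;
// names and signatures may differ in the shipping Windows App SDK.
using System.Linq;
using System.Threading.Tasks;
using Microsoft.Graphics.Imaging;
using Microsoft.Windows.Vision;
using Windows.Graphics.Imaging;

public static class OcrSample
{
    public static async Task<string> ExtractTextAsync(SoftwareBitmap bitmap)
    {
        // Wrap the bitmap in the buffer type the vision APIs consume.
        var imageBuffer = ImageBuffer.CreateCopyFromBitmap(bitmap);

        // Create the recognizer asynchronously, then run OCR on the image.
        var recognizer = await TextRecognizer.CreateAsync();
        var recognized = recognizer.RecognizeTextFromImage(imageBuffer);

        // Join the recognized lines into a single string.
        return string.Join("\n", recognized.Lines.Select(line => line.Text));
    }
}
```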

Microsoft has already shown how it uses these models in Copilot+ PC tools such as the OCR-powered Recall semantic index, Click-to-Do, and an updated version of Windows Paint. By adding APIs in the Windows Copilot Runtime through the Windows App SDK, it’s making the same models available to your code so you can find your own uses.

Getting started with the Windows Copilot Runtime

Getting started with experimental Windows App SDK releases isn’t quick or easy, and it requires a Copilot+ PC running Windows 11, Version 24H2 in either the Windows Insider Beta or Dev channels. (You cannot use Canary builds yet.) Start with an up-to-date Visual Studio install, configured to build .NET desktop applications using the Windows 10 SDK. It’s important to uninstall the Windows App SDK C# Templates, found in the Visual Studio Marketplace, before installing the SDK. Remember to enable support for preview releases before running the installer. Once the new release of the Windows App SDK moves to the stable channel, installation will be a lot easier, with fewer hoops to jump through.

Once the SDK is installed, you can build your first Windows Copilot Runtime applications. Like installation, this is still harder than it should be. You need to target specific builds of Windows 11 and specific versions of the .NET SDK; if you don’t get these right, your code will not compile. I also couldn’t get Phi Silica to work from a console application, though it ran well enough as part of a WinUI application. These issues are most likely down to this being the first public preview of the runtime APIs; the GitHub issues pages for this release show other developers hitting the same problems.
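As an illustration of the kind of targeting involved, a project file for an experimental build might look something like the following; the version numbers here are placeholders, so check the release notes for the exact values your SDK build expects.

```xml
<!-- Illustrative csproj settings for a 1.7-experimental WinUI app.
     Version numbers are placeholders; use those from the release notes. -->
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>WinExe</OutputType>
    <!-- Target a specific Windows 11 SDK build, as the runtime requires. -->
    <TargetFramework>net8.0-windows10.0.22621.0</TargetFramework>
    <TargetPlatformMinVersion>10.0.22621.0</TargetPlatformMinVersion>
    <Platforms>ARM64</Platforms>
    <UseWinUI>true</UseWinUI>
  </PropertyGroup>
  <ItemGroup>
    <!-- Experimental Windows App SDK package; pin the exact version. -->
    <PackageReference Include="Microsoft.WindowsAppSDK" Version="1.7.0-experimental3" />
  </ItemGroup>
</Project>
```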

Using the Phi Silica small language model

Calling Phi Silica through the SDK is relatively simple. First, ensure you’re using the Microsoft.Windows.AI.Generative namespace and call the isAvailable method to check that your code is running on a system that includes the model. You can then create an asynchronous object to manage the connection to Phi Silica, sending it a string as a prompt and retrieving the result when the asynchronous method used to call the connection returns.
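In practice, that flow looks something like this minimal sketch, assuming the LanguageModel class and its IsAvailable, MakeAvailableAsync, CreateAsync, and GenerateResponseAsync methods from the preview documentation; names may change before the stable SDK ships.

```csharp
// Minimal Phi Silica sketch against the experimental API surface;
// class and method names are assumptions based on the preview docs.
using System.Threading.Tasks;
using Microsoft.Windows.AI.Generative;

public static class PhiSilicaSample
{
    public static async Task<string> AskPhiSilicaAsync(string prompt)
    {
        // Confirm the model is present on this Copilot+ PC before calling it.
        if (!LanguageModel.IsAvailable())
        {
            // Ask Windows to download and install the model if it's missing.
            await LanguageModel.MakeAvailableAsync();
        }

        // Create the asynchronous object that manages the model connection.
        var languageModel = await LanguageModel.CreateAsync();

        // Send the prompt string and wait for the complete response.
        var result = await languageModel.GenerateResponseAsync(prompt);
        return result.Response;
    }
}
```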

The Windows Copilot Runtime APIs include support for content moderation, reducing the risk of the model generating unwanted outputs. The level of moderation is customizable, allowing you to tune it appropriately. Some options return partial responses so you can keep users engaged while the model produces a complete answer. You can also force the model to format outputs, summarize content, and even rewrite it.
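As a rough illustration of the partial-response option, the experimental API exposes generation as a WinRT async operation with progress, so, reusing the languageModel object from the previous sketch, the code might look like this; the GenerateResponseWithProgressAsync name and its string progress payload are assumptions from the preview documentation.

```csharp
// Hedged sketch: streaming partial output while Phi Silica generates,
// assuming a WinRT async-with-progress operation as in the preview docs.
var operation = languageModel.GenerateResponseWithProgressAsync(prompt);

// The progress callback delivers partial text so the UI can update
// while the model is still producing the full response.
operation.Progress = (asyncInfo, partialText) =>
{
    Console.Write(partialText);
};

// Await the operation to get the final, complete response.
var response = await operation;
Console.WriteLine();
Console.WriteLine(response.Response);
```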

After all this time and several false starts, it’s nice to finally see some of my own code working on an Arm laptop’s NPU, with Task Manager showing the ONNX model loading and running. Qualcomm’s Hexagon NPU was originally designed for video and image processing, and it shows: image-based operations run very quickly, while text generation takes its time. Even so, it was faster than running a CPU-based model. Working through a local API was certainly easier than making a REST call to a cloud-hosted model, and as a bonus, there’s no need to be tethered to a network connection.

When can I ship Windows Copilot Runtime applications?

Microsoft still has some work to do to get this version of the Windows App SDK ready for release, but at first glance, it’s finally delivered on the original Windows Copilot Runtime promise: a way to build AI applications in Windows without massive models or passing data to cloud-hosted services.

It will be a while yet before we see Windows Copilot Runtime applications in the Microsoft Store. Developers should have plenty of time to experiment and build applications that work both on Arm devices and on the latest x64 hardware from Intel and AMD. Bringing AI applications to the edge will reduce the load on data centers, as basic retrieval-augmented generation (RAG)-managed text generation and image processing don’t need to run in the cloud. With the Copilot Runtime we should see consumer AI applications running on our PCs, keeping our data on our hardware and letting large-scale enterprise AI applications take advantage of Azure’s dedicated AI hardware for training and at-scale inferencing.

Seeing what AI can do on a PC

Alongside the new Windows App SDK preview, Microsoft has rolled out a tool to show off what AI can do on a PC. The AI Dev Gallery is intended to showcase Windows’ AI tools and ONNX support, highlighting a set of common AI applications, from text operations to audio and video processing. Some of the samples show how to integrate AI tools into common Windows controls, for example, adding semantic capabilities to a combo box.

There’s support for a selection of different models, with Microsoft’s own Phi 3 and Phi 3.5 at the top of the list for text and vision. Models can be downloaded as needed, with support for CPU and GPU ONNX runtimes. NPU support is missing, which is odd, as without it Microsoft can’t show the capabilities of its Copilot+ PCs.

Hopefully, this temporary oversight will be corrected in a future release. For now, the Gallery is worth exploring. I’d recommend sticking with one model where possible; even though edge models are relatively small, they can still consume several gigabytes of memory and disk space. A built-in model management feature lets you add and remove models from your local cache as necessary.

From AI project samples to production

Click the Code button in each sample to see the C# and XAML used, or the Export button to create a Visual Studio project to build your own versions. This should let you port them to the latest Windows App SDK and to Phi Silica. You can also include models in project directories to help quickly test and debug code.

Microsoft’s bet on AI-powered PCs is taking time to pay off. Tools like Recall and Semantic Search have started to show the possibilities, but what’s needed is for Windows developers to start using AI accelerators and NPUs to deliver a new generation of desktop applications where AI can be embedded in controls as well as in natural language user interfaces.

With this first release of Windows’ on-device AI APIs, it’s time to start learning how to use them and what they’re good for. The 1.7 release of the Windows App SDK is “experimental,” so let’s experiment!
https://www.infoworld.com/article/3823290/diving-into-the-windows-copilot-runtime.html
