Microsoft Brings MCP, Local AI Models and Post-Quantum Security to Windows

This year’s Microsoft Build conference is all about AI and agents. Microsoft is positioning the Windows OS as a platform where developers can easily build AI-enabled applications.

May 19th, 2025 9:01am by Frederic Lardinois


Image by Gerd Altmann from Pixabay.

Unsurprisingly, this year’s Microsoft Build conference is all about AI and agents, and Microsoft is using the event to position its Windows operating system as a platform where developers can easily build AI-enabled applications. To do so, it is also building features like local MCP servers, the tools for running models locally, and even the ability to fine-tune Microsoft’s Phi Silica model with custom data, right into the operating system.

“We see a world where Windows is going to be the best platform for developers as we go forward, and, in my mind, it is a set of things, from just being a great dev box and paying attention to the details that matter for the developer experience, to building new primitives and capabilities in the operating system and making our way to an agentic world as we go forward. So it is kind of a full stack exercise, for lack of a better word,” Pavan Davuluri, Microsoft’s corporate VP in charge of Windows and devices, told me in an interview ahead of today’s announcement.

Windows AI Foundry

Image credit: Microsoft.

Last year, Microsoft launched what it then called the “Windows Copilot Runtime,” a platform that allowed developers to manage the developer lifecycle for working with models. Now, at Build, it is expanding this and giving the project a new name (because Microsoft loves changing names): the Windows AI Foundry.

The Windows AI Foundry is at the core of Microsoft’s efforts to make Windows ready for AI development. It combines tools to download models to the local machine and run them, as well as the built-in models and APIs that ship with Windows to enable text summarization, vision APIs for image description and segmentation, OCR and more.

There is also a semantic search API that developers will be able to use to build better search experiences into their apps. These APIs will also support Retrieval-Augmented Generation, to allow developers to ground the LLM output with custom data.
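To make the grounding idea concrete, here is a minimal sketch of the retrieve-then-prompt pattern behind Retrieval-Augmented Generation. It is not the Windows API: the function names (`embed`, `retrieve`, `build_prompt`) and the sample documents are made up for illustration, and a simple bag-of-words vector stands in for a real embedding model.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': a bag-of-words count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u)
    norm = math.sqrt(sum(c * c for c in u.values())) * math.sqrt(
        sum(c * c for c in v.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, documents):
    """Ground the question in retrieved text before it is sent to a model."""
    context = "\n".join(retrieve(query, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Hypothetical app data used as the grounding corpus.
docs = [
    "Refunds are processed within five business days.",
    "The office is closed on public holidays.",
    "Support is available by email and chat.",
]
prompt = build_prompt("How long do refunds take?", docs)
print(prompt)
```

In a real app, the semantic search step would replace the toy `embed` and `cosine` functions, while the prompt-assembly step would stay roughly the same.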

Local Models

Foundry Local is Microsoft’s tool for downloading models to the local machine, but what’s interesting here is that it also supports the model catalogs from Ollama and NVIDIA’s NIM. One nice thing about the Foundry Local catalog is that it will only show models that are compatible with the available hardware on a given machine, based on the CPUs, GPUs and NPUs available for inferencing. Microsoft also notes that it optimized those models for local use.
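The hardware filtering described here can be pictured as a simple set intersection. The sketch below is only an illustration of that idea, not Foundry Local's actual logic: the `ModelEntry` type, the `compatible_models` function and the model names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelEntry:
    """Hypothetical catalog entry: a model variant and the devices it can run on."""
    name: str
    devices: frozenset


def compatible_models(catalog, available_devices):
    """Keep only entries that can run on at least one available device."""
    available = set(available_devices)
    return [entry for entry in catalog if entry.devices & available]


# Made-up catalog; real catalogs carry far more metadata.
catalog = [
    ModelEntry("example-small-cpu", frozenset({"cpu"})),
    ModelEntry("example-medium-gpu", frozenset({"gpu"})),
    ModelEntry("example-small-npu", frozenset({"npu"})),
]

# A machine with a CPU and an NPU, but no discrete GPU.
visible = [entry.name for entry in compatible_models(catalog, {"cpu", "npu"})]
print(visible)
```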

The inferencing engine powering all of this is Windows ML, an evolution of DirectML (naming projects is difficult, I hear), which was originally billed as a low-level API for machine learning on Windows.

As Microsoft notes, having a built-in inference engine in Windows means developers won’t have to ship an ML runtime, hardware execution providers, and drivers with their applications. For better or worse, it also puts the burden on Microsoft to ensure that the runtime stays compatible with new hardware.

One other nifty new feature is a new API that lets developers use Low-Rank Adaptation (LoRA) to fine-tune the Phi Silica model, which is embedded in all Copilot+ PCs, with custom data. Using LoRA makes for an efficient way to fine-tune just part of the model to improve a specific task.

For now, this will be a preview that will only be available on Copilot+ PCs with Qualcomm chips, with Intel and AMD Copilot+ PCs following later. The actual training, though, will happen in Azure.
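To see why LoRA is efficient, it helps to count parameters. Instead of updating a full weight matrix W, LoRA trains two small matrices B and A and uses W + B·A as the effective weight. The sketch below shows this in plain Python; the matrix sizes and the rank are illustrative assumptions and say nothing about Phi Silica’s real dimensions or Microsoft’s API.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_param_counts(d_out, d_in, rank):
    """Return (full, adapter) trainable-parameter counts for one weight matrix."""
    full = d_out * d_in               # updating W directly
    adapter = rank * (d_out + d_in)   # updating only B (d_out x r) and A (r x d_in)
    return full, adapter


def apply_lora(w, b, a, scale=1.0):
    """Effective weight W' = W + scale * (B @ A); W itself stays frozen."""
    delta = matmul(b, a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


# Assumed sizes: one 4096 x 4096 layer with a rank-8 adapter.
full, adapter = lora_param_counts(d_out=4096, d_in=4096, rank=8)
print(full, adapter)  # 16777216 65536

# Tiny worked example with a 2 x 2 frozen weight and a rank-1 adapter.
w = [[1.0, 0.0], [0.0, 1.0]]
b = [[1.0], [2.0]]
a = [[0.5, 0.25]]
w_adapted = apply_lora(w, b, a)
print(w_adapted)  # [[1.5, 0.25], [1.0, 1.5]]
```

With these assumed numbers, the adapter trains 65,536 values instead of roughly 16.8 million, which is 256 times fewer for that one layer.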

“This is very much us building the platform of Windows, so people can build an agentic future for their applications, and us evolving the OS to be able to keep pace with what these future requirements will look like,” Davuluri said. “That certainly starts from the world of apps and spans the world of AI-powered agents as we go forward.”

Local MCP Servers

Image credit: Microsoft.

Speaking of agents: soon, Windows will also feature built-in MCP servers that will give AI agents access to base Windows OS features like the file system, windowing and the Windows Subsystem for Linux (which is now open source). Developers will, of course, also be able to make features from their own apps available as MCP servers.

This feature will first launch as a private preview in the coming months, so it’ll take quite a while before it rolls out to a wider audience.

In addition to the MCP servers, Windows will also feature an MCP Registry, which will allow agents to discover which MCP servers are available to them on the machine.
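Microsoft has not published the registry’s interface yet, so the following is only a sketch of the discovery idea. The `ToyRegistry` class, the server names and the tool names are hypothetical. The one part taken from the MCP specification is the message format: MCP uses JSON-RPC 2.0, and a client asks a server for its tools with the `tools/list` method.

```python
import json


class ToyRegistry:
    """Hypothetical in-memory registry mapping server names to their tools."""

    def __init__(self):
        self._servers = {}

    def register(self, name, tools):
        """Record a server and the tool names it exposes."""
        self._servers[name] = list(tools)

    def discover(self):
        """Return the names of all registered servers, sorted for stable output."""
        return sorted(self._servers)

    def tools_for(self, name):
        """Return a copy of the tool names registered for one server."""
        return list(self._servers[name])


def tools_list_request(request_id):
    """Build the JSON-RPC 2.0 request an MCP client sends to list a server's tools."""
    return json.dumps({"jsonrpc": "2.0", "id": request_id, "method": "tools/list"})


registry = ToyRegistry()
registry.register("windowing", ["list_windows"])              # made-up server and tool
registry.register("file-system", ["read_file", "list_dir"])   # made-up server and tools

print(registry.discover())
print(tools_list_request(1))
```

An agent would first ask the registry which servers exist and then send each one a `tools/list` request to learn what it can do.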

Anthropic, Figma and Perplexity are already working with Microsoft on integrating MCP functionality into their Windows apps.

“At Perplexity, like Microsoft, we’re focused on trustworthy experiences that are truly useful. MCP in Windows brings assistive AI experiences to one of the most empowering operating systems in the world,” said Perplexity CEO and co-founder Aravind Srinivas.

Davuluri stressed that Microsoft is building all of these features with a focus on privacy and security. “We are building these new constructs with the ability to have security and privacy being a core construct of how we are defining these registries, for instance,” he said. He also noted that developers and end users will remain in control of how they build and consume these experiences, with the ability to easily revoke an agent’s access to a resource, for example.

Existing Windows developers, Davuluri noted, want to extend their applications with these new AI capabilities. Maybe that’s using a single model, or maybe that’s building an entirely new agentic workflow. But he also acknowledged that a lot of the standards and protocols for building all of this are still in flux. In part, that’s why Microsoft is abstracting away many of these protocols and other primitives and is making them available through an additional layer of tooling.

“We are early in the journey of understanding where frameworks and protocols, especially for interop, will land,” he said. “And so I think we are more in a world of establishing a set of options, creating a platform that is able to robustly deliver these, and I think my sense is, in the coming six to nine months, we’ll get a sense for what patterns work in what set of scenarios in Windows. MCP is the start of that exercise.”

Post-Quantum Cryptography

Unrelated to AI, but worth a mention anyway, is that Microsoft is also planning to beef up some of Windows’ security features. Assuming that the development of quantum computers continues at its current pace, we’ll likely reach a point in the next few years where they are able to break today’s standard cryptography algorithms with ease. There are some cryptography algorithms that are (at least to the best of our current knowledge) not vulnerable to being broken by quantum computers.

One of the current risks is that an attacker may be able to gather currently encrypted information for long-term storage, and then decrypt it once the technology is available.

Microsoft has decided to bring the ML-KEM and ML-DSA algorithms, two of the algorithms that NIST standardized last year, to the core Windows cryptographic API, after adding them to the core cryptography library late last year.

“Customers will be able to begin experimenting with ML-KEM in scenarios where public key encapsulation or key exchange is desired, to prepare for ‘harvest now, decrypt later’ threat,” Microsoft said.

Those capabilities, which Microsoft had previously announced, will soon roll out to Windows Insiders.

Before joining The New Stack as its senior editor for AI, Frederic was the enterprise editor at TechCrunch, where he covered everything from the rise of the cloud and the earliest days of Kubernetes to the advent of quantum computing....