Windows' "dream machine" has just arrived, turning your PC into an Agent workstation.
Microsoft and OpenAI, in their honeymoon period, once formed the most important alliance in the entire AI industry.
One party held the models, while the other had cloud services, office software, developer tools, and enterprise customers. The two complemented each other, all but securing Microsoft a first-class ticket to the AI era in advance. But however close the alliance, Microsoft cannot rely on others forever for its most crucial AI vision.
That is especially true now that the two sides have begun to decouple.
The just-concluded Build 2026 therefore carried unusual weight. Microsoft needs a resounding AI victory more than ever, and it has to show the outside world whether it is a protagonist of the AI era or merely a cloud service provider for OpenAI.
From the MAI models and Azure AI Foundry to quantum computing and local agent capabilities, with Jensen Huang and the "Father of Lobster" lending their support one after another, Microsoft demonstrated a complete ecosystem covering development, models, data, computing power, and governance. Its goal is clear: to transform AI from a model dividend dominated by OpenAI into a platform business led by Microsoft.
Microsoft releases its self-developed models, and MAI fills the most crucial link in the AI supply chain.
Compared with last year, Microsoft placed more emphasis on models this time. Microsoft CEO Satya Nadella said that Microsoft Foundry currently has more than 11,000 models, covering OpenAI, Anthropic, and Microsoft's self-developed MAI models.
Microsoft's judgment is that enterprises and developers will not rely on a single model for every task. Different tasks call for different models, and each choice is constrained by latency, cost, and capability limits. The model catalog, model selection, runtime environment, and enterprise governance will therefore together become the new ground on which platforms compete.
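To make that trade-off concrete, here is a minimal sketch of picking a model under latency, cost, and capability constraints. It is not Foundry's selection logic: the catalog entries, the capability tiers, and the `pick_model` helper are all hypothetical, and the numbers are illustrative.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelOption:
    name: str             # hypothetical catalog entry, not a real model ID
    capability: int       # rough ability tier, higher is stronger
    latency_ms: int       # typical response latency
    cost_per_call: float  # illustrative price in arbitrary units


def pick_model(catalog, min_capability, max_latency_ms) -> Optional[ModelOption]:
    """Return the cheapest model that meets the capability and latency limits."""
    eligible = [
        m for m in catalog
        if m.capability >= min_capability and m.latency_ms <= max_latency_ms
    ]
    return min(eligible, key=lambda m: m.cost_per_call) if eligible else None


# Made-up catalog used only to exercise the function.
CATALOG = [
    ModelOption("small-fast", capability=1, latency_ms=200, cost_per_call=0.1),
    ModelOption("mid-coder", capability=2, latency_ms=800, cost_per_call=0.5),
    ModelOption("large-reasoner", capability=3, latency_ms=5000, cost_per_call=4.0),
]
```

With this catalog, a request that needs tier 2 within one second lands on `mid-coder`, while a request that needs tier 3 within one second gets `None`, which is the case where a platform would have to relax a constraint or decline the task.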
Today, Microsoft's self-developed model family officially gained seven new models at once, covering reasoning, code, image, voice, and transcription.
MAI Thinking 1 is the reasoning model among them. It uses a sparse MoE architecture, with 35B active parameters and a total parameter count of about 1T. It supports a 256K-token context, enough to hold approximately 600 pages of documents.
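The figures quoted for MAI Thinking 1 can be checked with a quick back-of-the-envelope calculation. The sketch below assumes "256K" means 256,000 tokens and takes "about 1T" as exactly one trillion; both are simplifications, and the variable names are ours.

```python
ACTIVE_PARAMS = 35e9      # 35B active parameters (from the article)
TOTAL_PARAMS = 1e12       # about 1T total parameters (from the article)
CONTEXT_TOKENS = 256_000  # 256K tokens, assuming K = 1,000
PAGES = 600               # the article's page estimate

# Share of the network's parameters that is active for any one token.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

# Page density implied by "256K tokens is about 600 pages".
tokens_per_page = CONTEXT_TOKENS / PAGES

print(f"active share per token: {active_fraction:.1%}")
print(f"implied tokens per page: {tokens_per_page:.0f}")
```

Under these assumptions, roughly 3.5% of the parameters are active per token, and the 600-page estimate implies about 427 tokens per page.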
Mustafa Suleyman, the head of Microsoft AI, emphasized that the model was not distilled from third-party models. Its training data is clean and legally licensed, and AI-generated content is excluded from pre-training. It is currently in private preview on Microsoft Foundry and will later enter public beta on MAI Playground.
The code model, MAI Code 1 Flash, is designed for everyday development workflows. Microsoft trained it end to end on clean, legally licensed data and is rolling it out to individual GitHub Copilot users in Visual Studio Code, where it is available through the model selector and the default automatic selector.
Microsoft said the model is trained and adapted for the GitHub Copilot harness, supports agentic coding, and also supports adaptive thinking: it keeps responses to simple requests concise and allocates more reasoning budget to complex tasks.
Microsoft directly compared MAI Code 1 Flash with Claude Haiku 4.5.
MAI Code 1 Flash scored 51.2% on SWE Bench Pro, ahead of Claude Haiku 4.5's 35.2%. It also leads in precise instruction following on IF Bench and is 14.5 points ahead on Advanced IF. It will power common coding scenarios in GitHub Copilot, especially code modification, multi-turn instructions, and Agent tasks in real-world development environments.
Image and voice models are also included in the MAI system.
MAI Image 2.5 and its Flash version support text-to-image generation and image editing. They have been integrated into PowerPoint and will be extended to OneDrive and Foundry.
MAI Transcribe 1.5 supports 43 languages. Microsoft claims it is five times as fast as its competitors, and it is being integrated into GitHub, Teams, Copilot, and Dynamics 365 Contact Center.
MAI Voice 2 supports 15 languages, can adapt voices from short samples, and has built-in anti-abuse protection. A low-cost version, MAI Voice 2 Flash, is also in the works.
Microsoft is also tying the MAI models to its own chips. MAI Thinking 1 has been optimized for Maia 200, and running MAI models end to end on it yields a 1.4x improvement in performance per watt.
Enterprise customization is another important direction for the MAI models. In the future, enterprises will not just call models; they will train their own processes into them.
To this end, Microsoft also released Microsoft Frontier Tuning, whose core is reinforcement learning environments. Enterprises can turn real-world work trajectories, task steps, decisions, tool calls, and evaluation criteria into training environments, allowing the models to learn how the organization works internally.
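As a rough illustration of what turning a work trajectory into a training environment can mean, the toy sketch below replays one recorded sequence of tool calls and rewards an agent for each step it reproduces. It is not Frontier Tuning's API, which the article does not describe: `TrajectoryEnv`, `run_episode`, the step names, and the exact-match reward rule are all assumptions made for illustration, loosely following the common reset/step shape of reinforcement learning environments.

```python
from dataclasses import dataclass


@dataclass
class TrajectoryEnv:
    """Toy environment that replays one recorded work trajectory.

    expected_steps is a hypothetical list of tool calls captured from a real
    workflow; the reward rule below stands in for an evaluation criterion.
    """
    expected_steps: list
    position: int = 0

    def reset(self) -> str:
        """Start a new episode at the first recorded step."""
        self.position = 0
        return self._observation()

    def _observation(self) -> str:
        return f"step {self.position} of {len(self.expected_steps)}"

    def step(self, action: str):
        """Score one action and return (observation, reward, done)."""
        reward = 1.0 if action == self.expected_steps[self.position] else 0.0
        self.position += 1
        done = self.position >= len(self.expected_steps)
        return self._observation(), reward, done


def run_episode(env, policy_actions):
    """Feed a fixed action list to the environment and return total reward."""
    env.reset()
    total = 0.0
    for action in policy_actions:
        _, reward, done = env.step(action)
        total += reward
        if done:
            break
    return total
```

For a recorded trajectory such as `["open_ticket", "query_crm", "draft_reply"]`, an agent that repeats all three steps earns 3.0, and one that swaps the middle step for something else earns 2.0. A real system would need a far richer reward than exact matching, but the shape is the same: recorded work defines the task, and the evaluation criterion defines the reward.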
The PC becomes an Agent workstation, and your desktop is a data center.
Beyond models, Microsoft also turned its attention to local computing power.
The Surface RTX Spark Dev Box is the most notable product in this regard. Nadella called it the "dream machine" for developers. The device offers 1 petaflop of AI computing power, 20 CPU cores, and 128GB of unified memory, and is planned to launch this fall.
The Surface RTX Spark Dev Box is based on the Nvidia RTX Spark platform. As APPSO reported a few days ago, RTX Spark is a next-generation SoC for PCs, integrating CPU, GPU, and AI capabilities into a single chip and supporting a unified memory architecture and integrated DRTM.
Nvidia CEO Jensen Huang said over a video link that PCs are evolving from personal computers into personal AIs. He gave an example: while users are away, they can message their PCs and have local agents call tools, modify code, and advance designs, then continue iterating with the user.
The PC is no longer just a tool operated by humans but is also starting to become an AI assistant that can continuously run tasks.
In addition, Microsoft ships the Surface RTX Spark Dev Box with a development-optimized Windows 11 Pro pre-installed, with built-in tools such as VS Code, WSL, PowerShell 7, GitHub Copilot, and Coreutils for Windows.
In the on-site demonstration, the device showed no news feeds, widget pop-ups, or notifications by default and used dark mode. The Windows Insider version also added a vertical taskbar. The development tools were more tightly integrated into the system, and the command-line and container experience moved closer to Linux.
On the hardware side, it uses a 3D-printed, anodized aluminum unibody with 1,000 ventilation holes, a thermal design power of 100W, and ports including USB-C, USB-A, HDMI, Ethernet, and a headphone jack.
Windows will play a significant role in the AI era. Local AI aims to make the PC a part of the Agent workflow: developers can debug, run models, call tools, view logs, open containers, and run sub-agents locally, and then hand over larger-scale tasks to the cloud.
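One simple way to think about that local versus cloud split is whether a model's weights fit in the Dev Box's 128GB of unified memory. The sketch below is a rough estimate under stated assumptions: it counts weights only, treats 1 GB as 1e9 bytes, and reserves an assumed 25% of memory for the OS, KV cache, and activations. The `placement` helper and its `headroom` parameter are hypothetical, not anything Microsoft has described.

```python
UNIFIED_MEMORY_GB = 128  # Surface RTX Spark Dev Box figure from the article


def weight_footprint_gb(params_billions: float, bits_per_param: int) -> float:
    """Approximate weight size in GB (1 GB = 1e9 bytes), weights only."""
    return params_billions * bits_per_param / 8


def placement(params_billions: float, bits_per_param: int,
              headroom: float = 0.75) -> str:
    """Return "local" if the weights fit within a share of unified memory.

    headroom is an assumed fraction left for weights after the OS, KV cache,
    and activations take their share.
    """
    budget_gb = UNIFIED_MEMORY_GB * headroom
    fits = weight_footprint_gb(params_billions, bits_per_param) <= budget_gb
    return "local" if fits else "cloud"
```

By this estimate a 70B-parameter model at 8 bits per weight needs about 70 GB and stays local, while a model with roughly 1T parameters needs about 500 GB even at 4 bits and goes to the cloud.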
Agents need new entry points, and Microsoft explores next-generation AI terminals.
While the Surface RTX Spark Dev Box targets developers, Project Solara is more of an early exploration by Microsoft of what Agent devices might look like. The next computer will not be a single device but a group of devices working together.
Microsoft demonstrated two types of reference devices.
The first type is a fixed desktop work terminal based on MediaTek chips.
When the user approaches, the system securely verifies their identity and lets them into their Agent work environment, where they can access Microsoft 365 Copilot based on Work IQ.
It can display the day's important matters, lets users assign tasks to the Agent by click or voice, and can also serve as a companion to a Windows PC or access a Cloud PC via Windows 365. It is more like an Agent control terminal on the corporate desk, responsible for identity recognition, task reminders, voice interaction, Copilot invocation, and Cloud PC access.
The second type is a wearable digital work badge using Qualcomm wearable chips, targeting mobile work scenarios.