The Nvidia AI PC, Project Solara, Microsoft AI

Good morning,

I don’t normally give away my interview subjects ahead of time, but I’m going to make an exception this week given the subject and the below Update. I am writing this in San Francisco where I interviewed Microsoft CEO Satya Nadella after his Build developer conference keynote; normally I would want to publish that immediately so that you have the full context of my analysis.

In this case, however, I came to the opinions below during the keynote, and before the interview, so for that reason (and a few logistical ones) I wanted to articulate them first (before you see my questions), and follow up with Nadella’s view on them (and a number of other topics) afterwards.

So with that noted, on to the Update:

The Nvidia AI PC

From CNBC:

Nvidia has emerged as the world’s most valuable company by dominating the market for artificial intelligence chips in the data center. Now the company is expanding its prowess to chips that will serve as the main processor for personal computers, entering an arena that’s long been ruled by Intel, Advanced Micro Devices, Qualcomm and Apple.

During a keynote address at Taiwan’s Computex conference on Monday, Nvidia CEO Jensen Huang unveiled a new PC processor made alongside Microsoft. The RTX Spark superchip, which Huang also referred to as the N1X, debuts in the fall on a fresh line of Windows PCs from Microsoft, Dell, HP, ASUS, Lenovo and MSI.

I’m actually starting in Taipei on Sunday, where Huang introduced the long-rumored Nvidia PC chip; from Tom’s Hardware:

At full strength, this chip offers up to 20 Arm CPU cores, a Blackwell GPU with 6,144 CUDA cores, 128GB of LPDDR5X RAM, and up to 300 GB/s of memory bandwidth. That powerful CPU and GPU, connected over NVLink C2C, and the large memory pool give AI agents and 120-billion-parameter models plenty of power and space for long-running tasks with context lengths stretching to a million tokens, according to Nvidia.

We don’t have any benchmarks yet, but the RTX Spark appears to be broadly similar to the DGX Spark; that’s a decent chip that excels at prefill, but is slower than an M5 Max at decode (thanks to lower memory bandwidth), and significantly slower at CPU tasks.
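To make the decode point concrete, here is a back-of-the-envelope sketch, not a benchmark. It assumes decode is purely memory-bandwidth-bound and that every generated token requires reading all of the model's weights once; the function name, the dense-model assumption, and the quantization levels are my own illustrative choices, and a mixture-of-experts model that only reads its active parameters would do considerably better.

```python
def decode_tokens_per_second_ceiling(params_billions, bytes_per_param, bandwidth_gb_per_s):
    """Rough upper bound on decode speed for a bandwidth-bound dense model.

    Assumption: each generated token reads every weight exactly once, so
    throughput cannot exceed memory bandwidth divided by model size.
    """
    weights_gb = params_billions * bytes_per_param  # billions of params * bytes each = GB
    return bandwidth_gb_per_s / weights_gb


# Figures from the Tom's Hardware quote: a 120-billion-parameter model, 300 GB/s.
# The 4-bit and 8-bit weight sizes are assumptions for illustration.
ceiling_4bit = decode_tokens_per_second_ceiling(120, 0.5, 300)
ceiling_8bit = decode_tokens_per_second_ceiling(120, 1.0, 300)

print(f"4-bit dense ceiling: {ceiling_4bit:.1f} tokens/s")  # 5.0
print(f"8-bit dense ceiling: {ceiling_8bit:.1f} tokens/s")  # 2.5
```

Under these assumptions the ceiling scales linearly with memory bandwidth, which is why a chip with more bandwidth pulls ahead at decode even when it has less raw compute.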

Huang then appeared via live video during Nadella’s Build keynote to discuss the chip:

Satya Nadella: Suddenly, this concept of unmetered intelligence right at the edge is so hot again. So maybe you want to talk a little bit about this: you have thought about this, talked about this, and now, of course, with RTX Spark really delivered, I think, what’s a breakthrough system for AI to be much more ubiquitous. But maybe, Jensen, you can just share a little bit your vision around where you see this going.

Jensen Huang: Well, this all started about three years ago between a conversation between you and I. And we were talking about how we could build a new class of PCs that’s incredible for designers and creators. And it would be incredible for artificial intelligence. And it would be one of these systems that has the processing capability, but also the software stack that’s integrated into the world’s design packages and creator packages. And, of course, all the things that we’re doing with AI. And here we are, three years later, we built an incredible new chip. And this system is supported by all of this new software that you created for Windows. And we now have the ability to have essentially an autonomous agent running on the PC.

This clip explains why I find this chip specifically, and AI PCs generally, pretty underwhelming. Three years ago we were still in the ChatGPT era of AI, and I was very excited about the possibility of local inference. Then came the reasoning era, blowing up KV cache (which increases the need for more memory) and emphasizing the importance of decode (to generate that many more tokens). Now we’re in the agentic era, where CPU performance is incredibly important.

To that end, the ideal setup for a local agent is strong local CPU performance and calling out to the cloud for inference. The RTX Spark, however, spends tons of die space on GPU cores that are inferior to the cloud (because of memory size and bandwidth if nothing else) at the expense of CPU. It’s a suitable chip if you just want a chatbot circa 2023; it’s hard to see it being worth the price — or the software compromises that are the reality of Windows on ARM — in 2026.
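The KV cache pressure mentioned above can be sketched with some simple arithmetic. The model dimensions below (64 layers, 8 key-value heads, a head dimension of 128, 16-bit values) are hypothetical and not the specs of any particular model; real systems also use techniques like cache quantization or sliding-window attention that shrink these numbers. The point is only that the cache grows linearly with the number of tokens.

```python
def kv_cache_gib(tokens, layers, kv_heads, head_dim, bytes_per_value=2):
    """Approximate KV cache size in GiB for a single sequence.

    The leading 2 accounts for storing one key and one value vector
    per KV head, per layer, per token.
    """
    total_bytes = 2 * layers * kv_heads * head_dim * bytes_per_value * tokens
    return total_bytes / 2**30


# Hypothetical model dimensions, chosen only for illustration.
MODEL = dict(layers=64, kv_heads=8, head_dim=128)

chat_era = kv_cache_gib(4_096, **MODEL)         # a short chatbot exchange
reasoning_era = kv_cache_gib(131_072, **MODEL)  # a long reasoning trace
agentic_era = kv_cache_gib(1_048_576, **MODEL)  # roughly a million tokens

print(chat_era, reasoning_era, agentic_era)  # 1.0 32.0 256.0
```

With these made-up dimensions, a cache that is a rounding error for a chatbot exchange ends up larger than a 128GB memory pool at a million tokens, before counting the model weights themselves.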

Jump ahead to the Build keynote, which I found very underwhelming to start. Nadella opened with a brief overview of the AI stack, then started talking about Windows, and I was honestly pretty surprised at the lack of vision and enthusiasm. That’s when it occurred to me: I think that Nadella agrees with me! Sure, some local inference is nice, but that’s not where the AI that matters is going to be located.

Nadella, keep in mind, has no real loyalty to Windows; indeed, I credit him with The End of Windows. Specifically, Nadella didn’t end Windows as a product, but he ended its run as the organizing principle around which the entire company operated, focusing on software that ran everywhere and a cloud that ran everything.

That leads to a surprising takeaway, and the most interesting part of the Build keynote: what if Microsoft is actually well positioned to get back into AI devices?

Project Solara

From GeekWire:

A team inside Microsoft has been quietly building a platform for devices that run AI agents instead of apps, based on Android instead of Windows, with two working hardware designs so far, and an initial set of big-name companies lined up to run pilots. The platform, dubbed “Project Solara,” is Microsoft’s bet that AI will open up entirely new scenarios for computing — using agents to avoid the constraints of traditional software, and off‑the‑shelf components to develop new devices quickly and inexpensively.

Project Solara is, to be clear, vaporware at this point, although the company did show real devices and has signed up Qualcomm and MediaTek as chip partners. It is also extremely compelling. Here’s how Nadella introduced it:

So far, we’ve talked about the edge and the cloud. The current form factors, right? I mean, when I saw that Jensen picture from the weekend where he had all the desktops, I felt like, man, I’m back in the 90s, right? Because it was so cool to see the lineup of all the machines that I loved and I grew up with back yet again with new functionality, right? It’s the same form factor, but unbelievable new functionality because of the onboard AI capability, right? So that’s sort of what we’ve seen with the laptop, the desktop, and of course with the cloud.

But it also, you know, sets up that next question: if you have that capability, which is new function, and you can put it into existing form factors, can you even purpose-build new form factors for the new function? Can you build a new platform even for the agent era? And that is the motivation behind Project Solara, which we’re introducing today.

First off, note the framing: the PC is old tech with agents; what about new tech uniquely enabled by agents? And note the classic Microsoft hook: could that new tech sit on top of a new platform? Corporate Vice President Steve Bathiche, the head of Microsoft’s Applied Sciences Group, explained the vision:

Before I talk about those awesome new devices you just saw, let me start with the why. Back at Build 2023, I talked about the outside AI application structure, where AI moves from operating within the application frame to operating globally, working across multiple apps and services to connect, coordinate, and maintain context across entire workflows, devices, and time scales. What if there were an ecosystem of devices specifically designed for that new type of application structure, for those types of agents, for that transformational interaction technology? That is the impetus behind Project Solara.

But with so many possible forms, which one do you pick? What is the next device? You see, the big aha for us is that it’s not about choosing one specific form factor. It is about creating a system that extends your agent across a constellation of devices. The next computer is not one device. It is all these devices working together as one system, with agents showing up closer to where and when you need them.

There was one brief moment in the promotional video that preceded Bathiche’s appearance that made the concept click for me:

The problem with wearable devices is the interaction model: they are only useful when you are interacting with them, when the human is in the loop, but being in the loop with a wearable is annoying and inefficient. What is being demonstrated here, however, is a brief interaction, and then an agent doing work in the background. In other words, the usefulness happens in the cloud without the human needing to be involved, because an agent is doing the work.

That’s what I find compelling. On one hand, you can make the case that of course Microsoft would be interested in a device model that uses the cloud as a platform, given that Microsoft doesn’t control a mobile device like an iPhone. What occurs to me, however, is that even if Microsoft doesn’t succeed with Project Solara, this model (where the cloud is the hub and multiple devices are the spokes, instead of the phone being in the center) is clearly a better one for agents. Agents work best in the cloud, and across apps and devices; yes, the phone might be one of those devices, but when it comes to agents it shouldn’t be the hub.

Again, this is vaporware, and very much in Microsoft’s interest, so take Project Solara with the appropriate grain of salt. It’s a vision of the future, however, that does make a lot of sense, particularly in an enterprise scenario where all of the context and compute are already in the cloud (and Project Solara is focused on enterprise, not consumer). It’s also something completely different from the past, and fits my thesis that, in the age of AI, thin is in.

Microsoft AI

From GeekWire:

Microsoft has based much of its AI business on models from OpenAI, before expanding more recently to Anthropic. On Tuesday, the company showed how it plans to rely less on both. At the Build developer conference, the Microsoft AI Superintelligence Team unveiled a family of seven models built from scratch. It’s part of an ongoing effort by the company to build credible in-house alternatives to models from partners and rivals with competing allegiances…

The flagship of the seven newly announced MAI models is MAI-Thinking-1, a reasoning model that Microsoft says draws even with Anthropic’s Claude Sonnet 4.6 in blind human testing, and matches the more capable Claude Opus 4.6 on a widely used coding benchmark. [CEO of Microsoft AI Mustafa] Suleyman stressed that MAI-Thinking-1 was trained from the ground up with no distillation from other companies’ models, looking to appeal to enterprises that care about clean data lineage.

These models seem pretty decent, all things considered, but what was interesting to me was the framing: Microsoft emphasized that enterprises could take these models and make them their own. Suleyman said:

This is what owning the full stack end-to-end looks like. It’s the foundation of Microsoft Frontier Tuning, it lets you customize the MAI models using our full stack hill climbing machine right where you want it. And it means that the disciplined and very relentless engineering that has gone into building our models is now available to all of you on a platform that you can trust, working on your behalf to create custom agents that you will control.

So the really big thing, of course, that’s happened in the last year is these RLEs, reinforcement learning environments, these unique training gyms for your AIs. They create company and task-specific agents adapted only to you, built on MAI models. So for example, within Microsoft, we use our RLEs combined with our MAI models to climb towards the best agentic use cases on Excel. Our MAI-tuned model is now on par with GPT 5.4 on public and private benchmarks, whilst at the same time being 10 times more efficient on cost, and many other early adopters are seeing similar results. When we’ve tuned our models on McKinsey’s tasks, MAI delivered the highest win rate, even outperforming GPT 5.5, and again delivering 10x greater efficiency on cost. So to us, this is the advantage of very carefully calibrated frontier tuning.

And importantly, unlike with some of the other companies, with MAI, you don’t rent intelligence from a shared model that learns from everybody. Only you keep the benefits of your hard-earned workflows, know-how, knowledge, and your own institutional data. Only you get to control the resulting model. And so with us, the RLEs and the models that you build inside of them, they become your moat. I really think this is distinct. It marks a new era in AI that we’re all very, very excited about.

This has shades of AWS’s Nova Forge offering, which lets enterprises add their data at a checkpoint in pre-training; it’s a little different in that it’s more focused on reinforcement learning, but those lines are getting blurred.

The concept is that enterprises get to have their own model for their own data, without sharing it with the frontier labs that want to eat their lunch, and it’s a concept that is certainly appealing in theory; the real test will be whether enterprises that choose this route are penalized for not being on the cutting edge of functionality. Then again, helping cautious enterprises embrace the future on their terms, without necessarily having to win on pure performance, is exactly how Microsoft has long maintained its position.


This Update will be available as a podcast later today. To receive it in your podcast player, visit Stratechery.

Thanks for being a subscriber, and have a great day!