Apple’s Silicon Event, Scaling the M Series, UltraFusion and Integration

Good morning,

I’m back, and feeling great — thanks for your understanding with yesterday’s personal day. There is a lot to get into from Apple’s event yesterday, particularly when it comes to chips, so let’s get going.

On to the update:

Apple’s Silicon Event

From Dan Gallagher, writing for the Wall Street Journal’s Heard on the Street:

Never known for cheap products, even Apple has to segment its customer base. That was evident at the company’s latest product unveiling Tuesday. As widely expected, Apple announced the third update to its iPhone SE, the lowest-priced smartphone in its catalog. But the company also introduced a new high-end desktop computer called the Mac Studio featuring a monster new chip from its M1 line of processors. That computer starts at $1,999 and requires a display—such as the new one Apple also introduced Tuesday—that starts at $1,599.

Those weren’t the only two products; Apple also released a new iPad Air with an M1 chip, putting it on par with the high-end iPad Pros and low-end Macs. Good luck trying to segment that! What was interesting and notable is how Apple tied the entire presentation together: the real star was Apple Silicon.

After opening with Apple TV+ and new iPhone 13 colors, this was the segue to the iPhone SE:

The development of Apple Silicon was inspired by iPhone, and has delivered cutting edge performance and capabilities for many years. It continues to have a phenomenal impact on our products and the industry. In addition to industry-leading performance-per-watt, Apple Silicon delivers many other advancements. The custom-built image signal processor drives our dynamic camera experiences. The neural engine unlocks breakthrough machine learning capabilities. The secure enclave protects the biometric information used in Face ID and Touch ID. These are just a few of the innovations enabled by Apple Silicon that make our products such a huge hit with customers. And today, we’re bringing our extraordinary A15 Bionic chip to another iPhone.

After the iPhone came the iPad Air, and again the segue was Apple Silicon:

Apple Silicon is a huge part of the success of another remarkable product, and that’s iPad. The unmatched performance and efficiency of Apple Silicon enables iPad’s magical experience, from its versatility and portability to its exceptionally long battery life. It makes iPad the most powerful device of its kind, and even faster than the vast majority of PC notebooks. We have a fantastic iPad lineup, and today I’m excited to talk about iPad Air.

Then came the Mac:

Next, let’s talk about the Mac. Apple Silicon has transformed the Mac over this past year. Its incredible performance, custom technologies, and cutting-edge power efficiency have ushered in a new era for the Mac. We’ve transitioned nearly every product in the Mac lineup to Apple Silicon, and each of these products has blown away users and shocked the PC industry. When we introduced the MacBook Pro and MacBook Air with M1, our customers no longer had to choose between incredible performance and amazing battery life. They would have it all. And Apple Silicon has enabled us to design products we never could have imagined before, like the remarkably thin and powerful new iMac made possible by M1. And the newest MacBook Pro with M1 Pro and M1 Max has completely redefined what pros expect from a notebook. It simply has no equal. Customers are absolutely loving these systems. In fact, every quarter since we started shipping M1-based Macs has been record-breaking and we’ve outpaced the industry growth during this time. And we’re not stopping there. To tell you how we’re going to take Mac even further, here’s John.

The next section didn’t need a segue, because before Apple got to the new Mac Studio, it spent an entire section of the keynote introducing the new M1 Ultra chip (more on this in a moment). All of this emphasized Apple’s Shifting Differentiation; I wrote when the M1 first launched:

If you ask Apple — or watch their seemingly never-ending series of events — they will happily tell you exactly what the company’s differentiation is based on; from this year alone:

This integration is at the core of Apple’s incredibly successful business model: the company makes the majority of its money by selling hardware, but while other manufacturers can, at least in theory, create similar hardware, which should lead to commoditization, only Apple’s hardware runs its proprietary operating systems.

Of course software is even more commoditizable than hardware: once written, software can be duplicated endlessly, which means its marginal cost of production is zero. This is why many software-based companies are focused on serving as large a market as possible, the better to leverage their investments in creating the software in the first place. However, zero marginal cost is not the only inherent quality of software: it is also infinitely customizable, which means that Apple can create something truly unique, and by tying said software to its hardware, make its hardware equally unique as well, allowing it to charge a sustainable premium.

This is, to be sure, a simplistic view of Apple: many aspects of its software are commoditized, often to Apple’s benefit, while many aspects of its hardware are differentiated. What is fascinating is that while modern Apple is indeed characterized by the integration of hardware and software, the balance of which differentiates the other has shifted over time, culminating in yesterday’s announcement of new Macs powered by Apple Silicon.

The rest of the Article traced the history of Mac differentiation in particular, which for the preceding twenty years had been based on OS X; the benefit of switching to Intel was ensuring that performance was at parity with Windows. Over time, though, particularly with the rise of web apps (whether in the browser or using technologies like Electron) and the evolution of Windows, differentiation was increasingly a matter of look-and-feel more than anything else (not to say that doesn’t matter!); now the defining feature of the Mac — all of Apple’s products, really — is less the software ecosystem than it is the fact it is the only way to get Apple Silicon.

Scaling the M Series

From AnandTech:

As part of Apple’s spring “Peek Performance” product event this morning, Apple unveiled the fourth and final member of the M1 family of Apple Silicon SoCs, the M1 Ultra. Aimed squarely at desktops – specifically, Apple’s new Mac Studio – the M1 Ultra finds Apple once again upping the ante in terms of SoC performance for both CPU and GPU workloads. And in the process, Apple has thrown the industry a fresh curveball by not just combining two M1 Max dies into a single chip package, but by making the two dies present themselves as a single, monolithic GPU, marking yet another first for the chipmaking industry.

Apple’s systematic approach to scaling the A-series of chips to the Mac has been very impressive. To recount, Apple designed two CPU cores (a high-performance one called Firestorm, and a high-efficiency one called Icestorm) along with a GPU core, and additional system-on-a-chip (SoC) capabilities like an image processor, neural engine, etc. Each of these components was used in what is now an entire family of chips:

  • The A14 Bionic, introduced with the iPhone 12 in September 2020, had two Firestorm cores and four Icestorm cores, along with four GPU cores, plus the aforementioned additional SoC features.
  • The M1, introduced with the MacBook Air in November 2020, had four Firestorm cores and four Icestorm cores, along with eight GPU cores, plus the aforementioned additional SoC features.
  • The M1 Pro and Max, introduced last fall with the MacBook Pro, had eight Firestorm cores and two Icestorm cores, along with 16 and 32 GPU cores respectively, plus the aforementioned additional SoC features.

All of these chips were made on the same TSMC 5nm process, which is critical: if you think about each of these components as bricks, then the only thing Apple needed to do when building a new chip is lay new mortar tying those bricks together in new ways. This is almost certainly why the M1 is still based on the A14 cores, which were designed for TSMC’s first 5nm process, while the A15 is built on TSMC’s second generation N5P process.
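To make the bricks metaphor concrete, here is a minimal sketch that treats each chip as a count of the same three building blocks. The core counts come from the list above (maximum configurations); the `Chip` class and its `doubled` helper are hypothetical names of my own, and modeling the M1 Ultra as a straight doubling of the M1 Max simply follows the AnandTech description of two M1 Max dies in one package.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chip:
    """A chip described only by how many of each shared 'brick' it contains."""
    name: str
    firestorm: int  # high-performance CPU cores
    icestorm: int   # efficiency CPU cores
    gpu: int        # GPU cores

    def doubled(self, name: str) -> "Chip":
        """Model a package built from two copies of this die (an assumption)."""
        return Chip(name, 2 * self.firestorm, 2 * self.icestorm, 2 * self.gpu)


# Core counts as listed in the text above.
A14 = Chip("A14 Bionic", firestorm=2, icestorm=4, gpu=4)
M1 = Chip("M1", firestorm=4, icestorm=4, gpu=8)
M1_PRO = Chip("M1 Pro", firestorm=8, icestorm=2, gpu=16)
M1_MAX = Chip("M1 Max", firestorm=8, icestorm=2, gpu=32)
M1_ULTRA = M1_MAX.doubled("M1 Ultra")

FAMILY = [A14, M1, M1_PRO, M1_MAX, M1_ULTRA]

if __name__ == "__main__":
    for chip in FAMILY:
        print(f"{chip.name:<11} {chip.firestorm:>2} Firestorm, "
              f"{chip.icestorm} Icestorm, {chip.gpu:>2} GPU")
```

The point of the sketch is that nothing new has to be designed at the core level to move down the list: only the counts, and the mortar between them, change.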

To that end, if you’ll excuse a bit of speculation, I would guess that the M2 will be based on TSMC’s 3nm process (although The Information previously reported it would be based on N5P); notice that the M2 MacBook Air, which is expected to feature a brand-new industrial design, was once rumored to be launched this spring but then (according to rumors) pushed back to the fall. I suspect it’s not a coincidence that TSMC’s 3nm process was also delayed until the second half of 2022. If this is right, it’s possible that the successors to Firestorm and Icestorm will actually launch on the M2, given that TSMC’s 3nm process won’t be ready for this year’s iPhone.

(By the way, this article by Pushkar nicely explains how important Apple has been to TSMC becoming the leading foundry.)

UltraFusion and Integration

But back to those bricks: the challenge Apple faced with the M1 Ultra is that the larger a chip is the more cost prohibitive it becomes to make; more space means more opportunities for manufacturing flaws to ruin a chip, which has a major impact on yields (plus the fact that chips are square and wafers are round; bigger chips mean more wasted space). Apple Senior Vice President of Hardware Technologies Johny Srouji explained Apple’s solution:

The challenge is that there are physical limitations in building a larger die than M1 Max. The leading approach is to use two chips and connect them via the motherboard. However, that approach has significant trade-offs, including increased latency, reduced bandwidth, and much higher power consumption. It also burdens developers with the need to change their code for this architecture.

So with M1 Ultra we did something truly groundbreaking, and it actually starts with M1 Max, the most powerful SoC we’ve built to date. With its high-performance CPU, massive GPU, and tremendous unified memory bandwidth, M1 Max is incredibly capable, and its amazing performance is delivered while maintaining industry-leading power efficiency. Yet, it’s even more capable than what we’ve shared: you see, M1 Max has a secret, a hidden feature we haven’t talked about until now. It has a groundbreaking die-to-die interconnect technology that allows us to scale even further by building M1 Ultra from two M1 Max die, which doubles performance, and we connect the two die with our innovative custom-built packaging architecture. This multi-die architecture is way ahead of anything else in the industry, and we call it UltraFusion.

The UltraFusion architecture uses a silicon interposer that has twice the connection density of any technology available. It connects over 10,000 signals, and provides an enormous 2.5 terabytes-per-second of low latency interprocessor bandwidth between the two die using very little power. That’s more than four times the bandwidth of the leading multichip connector technology. The result is an SoC with blazing performance, due to low latency, massive bandwidth, and incredible power efficiency. And thanks to the magic of the UltraFusion architecture, it behaves like a single chip to software, and preserves the benefits of the unified memory.
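The yield argument that precedes Srouji’s remarks can be made concrete with a rough sketch. It uses two textbook approximations, a Poisson yield model and a common dies-per-wafer estimate that subtracts an edge-loss term for square dies on a round wafer. Every number here is an assumption for illustration: the 400 and 800 square millimeter die areas, the defect density of 0.1 per square centimeter, and the 300mm wafer are not Apple’s or TSMC’s figures, and the comparison ignores packaging cost and packaging yield entirely.

```python
import math


def dies_per_wafer(die_area_mm2: float, wafer_diameter_mm: float = 300.0) -> int:
    """Estimate whole dies per wafer, subtracting dies lost at the round edge."""
    wafer_area = math.pi * (wafer_diameter_mm / 2) ** 2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    return int(wafer_area / die_area_mm2 - edge_loss)


def poisson_yield(die_area_mm2: float, defects_per_cm2: float) -> float:
    """Fraction of dies with zero defects under a Poisson defect model."""
    die_area_cm2 = die_area_mm2 / 100.0
    return math.exp(-die_area_cm2 * defects_per_cm2)


def good_dies_per_wafer(die_area_mm2: float, defects_per_cm2: float) -> float:
    """Expected number of defect-free dies from one wafer."""
    return dies_per_wafer(die_area_mm2) * poisson_yield(die_area_mm2, defects_per_cm2)


if __name__ == "__main__":
    DEFECTS = 0.1        # hypothetical defects per square centimeter
    SMALL, LARGE = 400.0, 800.0  # hypothetical die areas in square millimeters

    monolithic = good_dies_per_wafer(LARGE, DEFECTS)
    two_die_packages = good_dies_per_wafer(SMALL, DEFECTS) / 2

    print(f"Large monolithic chips per wafer:  {monolithic:.1f}")
    print(f"Two-die packages per wafer:        {two_die_packages:.1f}")
```

Under these assumptions a wafer produces noticeably more two-die packages than monolithic chips of the same total area, because the larger die both wastes more of the wafer edge and is more likely to catch a defect. That is the economic case for connecting two smaller dies, provided the interconnect is good enough.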

The idea of packaging multiple chips together with high-speed interconnects is not a new concept; broadly similar technology is already used in both AMD and Intel chips (this is how Intel will make chips that combine their own CPUs and TSMC-manufactured GPUs, for example). What is unique about this approach is that instead of connecting heterogeneous chips, Apple is connecting two systems-on-a-chip and, critically, presenting that combination as one chip to the operating system.

From what little I have been able to gather, there is a significant software component to making this work, which is to say that “the magic of the UltraFusion architecture” includes the fact that Apple isn’t simply designing the M1 Ultra, but is also making the software that will run on it. In other words, we are back where we started: Apple’s differentiation remains the integration of hardware and software, but that integration is not just a discrete computer and a discrete operating system, but a new kind of chip architecture that depends on Apple software to work.

The implications of this approach for performance are interesting. It is worth noting that Srouji’s comments about power efficiency are less essential for a machine that is always plugged into a wall (although Srouji highlighted the positive impact this has on form factor, fan noise, etc.). To that end, all of the M1s are a bit limited by single-thread performance that is slower than that of Intel or AMD chips, which ramp up the power to achieve higher clock speeds. Yes, the M1 Ultra has a ton of cores, but those cores aren’t made faster by this architecture.

What is more interesting is the impact on graphics: graphics performance is embarrassingly parallel, which is to say that performance scales with cores, because each core is responsible for an ever decreasing section of the screen. That means that the M1 Ultra may very well be the fastest graphics chip on the market; from that AnandTech article:

In particular, the company is touting that the M1 Ultra’s GPU performance exceeds that of NVIDIA’s GeForce RTX 3090, which at the moment is the single fastest video card on the market. And furthermore, that they’re able to do so while consuming a bit over 100 Watts, or 200 Watts less than the RTX 3090.

From a performance standpoint, Apple’s claims look reasonable, assuming their multi-GPU technology works as advertised. For as fast as the RTX 3090 is, it can’t be overstated just how many more transistors Apple is throwing at the matter than NVIDIA is; the GA102 GPU used by NVIDIA has 28.3 billion transistors, while the combined M1 Ultra is 114 billion. Not all of which are being used for graphics on the M1 Ultra, of course, but with so many transistors, Apple doesn’t have to be shy about throwing more silicon at the problem.
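The “embarrassingly parallel” point can be illustrated with an idealized sketch: split the rows of a screen into bands, give each core one band, and assume the frame is done when the busiest core finishes. The function names, the 2160-row screen, and the uniform cost per row are all assumptions of mine; real GPUs schedule work in tiles and the cost per pixel varies, so this shows only the scaling argument, not how Apple’s GPU actually divides work.

```python
def rows_per_core(screen_rows: int, cores: int) -> list[int]:
    """Split screen rows into contiguous bands, one band per core."""
    base, extra = divmod(screen_rows, cores)
    # The first `extra` cores each take one additional row.
    return [base + (1 if i < extra else 0) for i in range(cores)]


def frame_time(screen_rows: int, cores: int, time_per_row: float = 1.0) -> float:
    """Idealized frame time: the largest band, with no coordination overhead."""
    return max(rows_per_core(screen_rows, cores)) * time_per_row


if __name__ == "__main__":
    ROWS = 2160  # hypothetical screen height in rows
    for cores in (8, 16, 32, 64):
        print(f"{cores:>2} cores -> frame time {frame_time(ROWS, cores):.0f} row-units")
```

In this model doubling the core count from 32 to 64 halves the frame time, which is why simply adding GPU cores pays off for graphics in a way it does not for single-threaded CPU work.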

Cores aren’t everything — the 3090 is competitive because it is clocked far higher (thus the higher power consumption) — but when it comes to graphics, they are an awful lot. Of course from a gamer perspective that means the integration with macOS is a dealbreaker; Nvidia will continue to rule the roost on Windows (particularly once it moves to TSMC’s most advanced processes), which is the same as ruling the roost for gaming generally. Moreover, that advantage will extend to AI work thanks to Nvidia’s investments in CUDA.

This last point is one of the most interesting going forward: Nvidia also integrates chips and software, but the chips came first; Apple now has the chips, and the predisposition to integrate. What else might the company do in software to take advantage of what it has built in hardware, particularly given that it made clear that a Mac Pro, which Bloomberg previously reported will have four chips interconnected, is still on the way? The pendulum swings.

Thanks for being a subscriber, and have a great day!