Intel Problems

One of the first Articles on Stratechery, written on the occasion of Intel appointing a new CEO, was, in retrospect, overly optimistic. Just look at the title:

The Intel Opportunity

The misplaced optimism is twofold: first there is the fact that eight years later Intel has again appointed a new CEO (Pat Gelsinger), not to replace the one I was writing about (Brian Krzanich), but rather his successor (Bob Swan). Clearly the opportunity was not seized. What is more concerning is that the question is no longer about seizing an opportunity but about survival, and it is the United States that has the most to lose.

Problem One: Mobile

The second reason why that 2013 headline was overly optimistic is that by that point Intel was already in major trouble. The company — contrary to its claims — was too focused on speed and too dismissive of power management to even be in the running for the iPhone CPU, and despite years of trying, couldn’t break into Android either.

The damage this did to the company went deeper than forgone profits; over the last two decades the cost of building ever smaller and more efficient processors has skyrocketed into the billions of dollars. That means that companies investing in new node sizes must generate commensurately more revenue to pay off their investment. One excellent source of increased revenue for the industry has been the billions of smartphones sold over the last decade; Intel, though, hasn’t seen any of that revenue, even as PC sales have flatlined for years.

What has kept the company prospering — when it comes to the level of capital investment necessary to build next-generation fabs, you are either prospering or going bankrupt — has been the explosion in mobile’s counterpart: cloud computing.

Problem Two: Server Success

It wasn’t that long ago that Intel was a disruptor; whereas the server space was originally dominated by integrated companies like Sun, with prices to match, the explosion in PC sales meant that Intel was rapidly improving performance even as it reduced price, particularly relative to performance. Sure, PCs didn’t match the reliability of integrated servers, but around the turn of the century Google realized that the scale and complexity entailed in offering its service meant that building a truly reliable stack was impossible; the solution was to build with the assumption of failure, which in turn made it possible to build its data centers on (relatively) cheap x86 processors.

The datacenter transition from proprietary to commodity hardware

Over the following two decades Google’s approach was adopted by every major datacenter operator, and x86 became the default instruction set for servers; Intel was one of the biggest beneficiaries for the straightforward reason that it made the best x86 processors, particularly for server applications. This was due both to Intel’s proprietary designs and to its superior manufacturing; AMD, Intel’s IBM-mandated competitor, occasionally threatened the incumbent on the desktop, but only on the low end for laptops, and not at all in data centers.

In this way Intel escaped Microsoft’s post-PC fate: Microsoft wasn’t simply shut out of mobile, they were shut out of servers as well, which ran Linux, not Windows. Sure, the company tried to prop up Windows as long as they could, both on the device side (via Office) and on the server side (via Azure); conversely, what has fueled the company’s recent growth has been The End of Windows, as Office has moved to the cloud with endpoints on all devices, and Azure has embraced Linux. In both cases Microsoft had to accept that their differentiation had flipped from owning the API to having the capability to serve their already-existing customers at scale.

The Intel Opportunity that I referenced above would have entailed a similar flip for Intel: whereas the company’s differentiation had long been based on its integration of chip design and manufacturing, mobile meant that x86 was, like Windows, permanently relegated to a minority of the overall computing market. That, though, was the opportunity.

Most chip designers are fabless; they create the design, then hand it off to a foundry. AMD, Nvidia, Qualcomm, MediaTek, Apple — none of them own their own factories. This certainly makes sense: manufacturing semiconductors is perhaps the most capital-intensive industry in the world, and AMD, Qualcomm, et al have been happy to focus on higher margin design work.

Much of that design work, however, has an increasingly commoditized feel to it. After all, nearly all mobile chips are centered on the ARM architecture. For the cost of a license fee, companies, such as Apple, can create their own modifications, and hire a foundry to manufacture the resultant chip. The designs are unique in small ways, but design in mobile will never be dominated by one player the way Intel dominated PCs.

It is manufacturing capability, on the other hand, that is increasingly rare, and thus, increasingly valuable. In fact, today there are only four major foundries: Samsung, GlobalFoundries, Taiwan Semiconductor Manufacturing Company (TSMC), and Intel. Only four companies have the capacity to build the chips that are in every mobile device today, and in everything tomorrow.

Massive demand, limited suppliers, huge barriers to entry. It’s a good time to be a manufacturing company. It is, potentially, a good time to be Intel. After all, of those four companies, the most advanced, by a significant margin, is Intel. The only problem is that Intel sees themselves as a design company, come hell or high water.

My recommendation did not, by the way, entail giving up Intel’s x86 business; I added in a footnote:

Of course they keep the x86 design business, but it’s not their only business, and over time not even their primary business.

In fact, the x86 business proved far too profitable to take such a radical step, which is the exact sort of “problem” that leads to disruption: yes, Intel avoided Microsoft’s fate, but that also means that the company never felt the financial pain necessary to make such a dramatic transformation of its business at a time when it might have made a difference (and, to be fair, Andy Grove needed the memory crash of 1984 to get the company to fully focus on processors in the first place).

Problem Three: Manufacturing

Meanwhile, over the last decade the modular-focused TSMC, fueled by the massive volumes that came from mobile and a willingness to work with — and thus share profits with — best of breed suppliers like ASML, surpassed Intel’s manufacturing capabilities.

This threatens Intel on multiple fronts:

  • Intel has already lost Apple’s Mac business thanks in part to the outstanding performance of the latter’s M1 chip. It is important to note, though, that while some measure of that performance is due to Apple’s design chops, the fact that it is manufactured on TSMC’s 5nm process is an important factor as well.
  • In a similar vein, AMD chips are now faster than Intel on the desktop, and extremely competitive in the data center. Again, part of AMD’s improvement is due to better designs, but just as important is the fact that AMD is manufacturing chips on TSMC’s 7nm process.
  • Large cloud providers are increasingly investing in their own chip designs; Amazon, for example, is on the second iteration of their Graviton ARM-based processor, which Twitter’s timeline will run on. Part of Graviton’s advantage is its design, but part of it is — you know what’s coming! — the fact that it is manufactured by TSMC, also on its 7nm process (which is competitive with Intel’s finally-launched 10nm process).

In short, Intel is losing share in PCs, even as it is threatened by AMD for x86 servers in the datacenter, and even as cloud companies like Amazon integrate backwards into the processor; I haven’t even touched on the increase in other specialized datacenter operations like GPU-based applications for machine learning, which run on chips designed by companies like Nvidia and manufactured by Samsung.

What makes this situation so dangerous for Intel is the volume issue I noted above: the company already missed mobile, and while server chips provided the growth the company needed to invest in manufacturing over the last decade, the company can’t afford to lose volume at the very moment it needs to invest more than ever.

Problem Four: TSMC

Unfortunately, this isn’t even the worst of it. The day after Intel named its new CEO, TSMC announced its earnings and, more importantly, its Capex guidance for 2021; from Bloomberg:

Taiwan Semiconductor Manufacturing Co. triggered a global chip stock rally after outlining plans to pour as much as $28 billion into capital spending this year, a staggering sum aimed at expanding its technological lead and constructing a plant in Arizona to serve key American customers.

This is a staggering amount of money that is only going to increase TSMC’s lead.

The envisioned spending spree sent chipmaking gear manufacturers surging from New York to Tokyo. Capital spending for 2021 is targeted at $25 billion to $28 billion, compared with $17.2 billion the previous year. About 80% of the outlay will be devoted to advanced processor technologies, suggesting TSMC anticipates a surge in business for cutting-edge chipmaking. Analysts expect Intel Corp., the world’s best-known chipmaker, to outsource manufacture to the likes of TSMC after a series of inhouse technology slip-ups.

That’s right: Intel likely has, at least for now, given up on process leadership. The company will keep its design-based margins and foreclose the AMD threat by outsourcing cutting edge chip production to TSMC, but that will only increase TSMC’s lead, and does nothing to address Intel’s other vulnerabilities.
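For a sense of scale, the figures in the Bloomberg excerpt can be worked through directly. The sketch below is only arithmetic on the quoted numbers ($17.2 billion in 2020, a $25 billion to $28 billion target for 2021, about 80% of it for advanced processes); the function and variable names are my own, and applying the 80% share to both ends of the range is an assumption.

```python
# Figures taken from the Bloomberg excerpt above, in billions of USD.
CAPEX_2020 = 17.2
CAPEX_2021_LOW = 25.0
CAPEX_2021_HIGH = 28.0
ADVANCED_SHARE = 0.80  # "about 80%" per the excerpt; treated as exact here


def pct_increase(old: float, new: float) -> float:
    """Percentage increase from old to new."""
    return (new - old) / old * 100.0


def capex_summary() -> dict:
    """Year-over-year growth and implied advanced-node spend for 2021."""
    return {
        "growth_low_pct": round(pct_increase(CAPEX_2020, CAPEX_2021_LOW), 1),
        "growth_high_pct": round(pct_increase(CAPEX_2020, CAPEX_2021_HIGH), 1),
        # Assumption: the 80% share applies equally to both ends of the range.
        "advanced_low_bn": round(CAPEX_2021_LOW * ADVANCED_SHARE, 1),
        "advanced_high_bn": round(CAPEX_2021_HIGH * ADVANCED_SHARE, 1),
    }


if __name__ == "__main__":
    for key, value in capex_summary().items():
        print(f"{key}: {value}")
```

On those numbers the guidance is an increase of roughly 45% to 63% over the previous year, with something like $20 billion to $22.4 billion aimed at leading-edge processes.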

Problem Five: Geopolitics

Intel’s vulnerabilities aren’t the only ones to be concerned about; I wrote last year about Chips and Geopolitics:

The international status of Taiwan is, as they say, complicated. So, for that matter, are U.S.-China relations. These two things can and do overlap to make entirely new, even more complicated complications.

Geography is much more straightforward:

A map of the Pacific

Taiwan, you will note, is just off the coast of China. South Korea, home to Samsung, which also makes the highest end chips, although mostly for its own use, is just as close. The United States, meanwhile, is on the other side of the Pacific Ocean. There are advanced foundries in Oregon, New Mexico, and Arizona, but they are operated by Intel, and Intel makes chips for its own integrated use cases only.

The reason this matters is because chips matter for many use cases outside of PCs and servers — Intel’s focus — which is to say that TSMC matters. Nearly every piece of equipment these days, military or otherwise, has a processor inside. Some of these don’t require particularly high performance, and can be manufactured by fabs built years ago all over the U.S. and across the world; others, though, require the most advanced processes, which means they must be manufactured in Taiwan by TSMC.

This is a big problem if you are a U.S. military planner. Your job is not to figure out if there will ever be a war between the U.S. and China, but to plan for an eventuality you hope never occurs. And in that planning the fact that TSMC’s foundries — and Samsung’s — are within easy reach of Chinese missiles is a major issue.

The context of that article was TSMC’s announcement that it would (eventually) open a 5nm fab in Arizona; yes, that is cutting edge today, but it won’t be in 2024, when the fab opens. Still, it will almost certainly be the most advanced fab in the U.S. focused on contract manufacturing; Intel will, hopefully, have surpassed that fab’s capabilities by the time it opens.

Note, though, that what matters to the United States is different than what matters to Intel: while the latter cares about x86, the U.S. needs cutting-edge general purpose fabs on U.S. soil. To put it another way, Intel will always prioritize design, while the U.S. needs to prioritize manufacturing.

This, by the way, is why I am more skeptical today than I was in 2013 about Intel manufacturing for others. The company may be financially compelled to do so to get the volume it needs to pay back its investments, but the company will always put its own designs at the front of the line.

Solution One: Breakup

This is why Intel needs to be split in two. Yes, integrating design and manufacturing was the foundation of Intel’s moat for decades, but that integration has become a strait-jacket for both sides of the business. Intel’s designs are held back by the company’s struggles in manufacturing, while its manufacturing has an incentive problem.

The key thing to understand about chips is that design has much higher margins; Nvidia, for example, has gross margins between 60% and 65%, while TSMC, which makes Nvidia’s chips, has gross margins closer to 50%. Intel has, as I noted above, traditionally had margins closer to Nvidia’s, thanks to its integration, which is why Intel’s own chips will always be a priority for its manufacturing arm. That will mean worse service for prospective customers, and less willingness to change its manufacturing approach to both accommodate customers and incorporate best-of-breed suppliers (lowering margins even further). There is also the matter of trust: would companies that compete with Intel be willing to share their designs with their competitor, particularly if that competitor is incentivized to prioritize its own business?

The only way to fix this incentive problem is to spin off Intel’s manufacturing business. Yes, it will take time to build out the customer service components necessary to work with third parties, not to mention the huge library of IP building blocks that make working with a company like TSMC (relatively) easy. But a standalone manufacturing business will have the most powerful incentive possible to make this transformation happen: the need to survive.

Solution Two: Subsidies

This also opens the door for the U.S. to start pumping money into the sector. Right now it makes no sense for the U.S. to subsidize Intel; the company doesn’t actually build what the U.S. needs, and the company clearly has culture and management issues that won’t be fixed with money for nothing.

That is why a federal subsidy program should operate as a purchase guarantee: the U.S. will buy A amount of U.S.-produced 5nm processors for B price; C amount of U.S.-produced 3nm processors for D price; E amount of U.S.-produced 2nm processors for F price; etc. This will not only give the new Intel manufacturing spin-off something to strive for, but also incentivize other companies to invest; perhaps GlobalFoundries will get back in the game,[1] or TSMC will build more fabs in the U.S. And, in a world of nearly free capital, perhaps there will finally be a startup willing to take the leap.
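To make the shape of such a guarantee concrete, here is a minimal sketch of the "A amount for B price" schedule. Every number in it is a placeholder I made up purely for illustration, not a proposal; the `Tranche` type and `guaranteed_revenue` function are hypothetical names; and capping payment at the guaranteed volume for each node is my assumption about how a guarantee like this might work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tranche:
    """One line of the guarantee: a volume of chips on a node at a fixed price."""
    node: str              # process node, e.g. "5nm"
    guaranteed_units: int  # the "A", "C", "E" amounts in the text
    unit_price: float      # the "B", "D", "F" prices in the text (USD)


# Placeholder values only: none of these figures come from the article.
SCHEDULE = [
    Tranche("5nm", 1_000_000, 100.0),
    Tranche("3nm", 500_000, 150.0),
    Tranche("2nm", 250_000, 200.0),
]


def guaranteed_revenue(schedule: list, delivered: dict) -> float:
    """Revenue a U.S. fab could count on for the chips it actually delivers.

    Assumption: the government pays only for delivered units, and only up
    to the guaranteed volume for each node.
    """
    total = 0.0
    for tranche in schedule:
        units = min(delivered.get(tranche.node, 0), tranche.guaranteed_units)
        total += units * tranche.unit_price
    return total


if __name__ == "__main__":
    # A fab that over-delivers on 5nm and has only started on 3nm.
    print(guaranteed_revenue(SCHEDULE, {"5nm": 1_200_000, "3nm": 100_000}))
```

The point of structuring it this way is that the money only flows to whoever actually ships chips on the node in question, whether that is an Intel spin-off, TSMC, or a startup.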

This prescription over-simplifies the problem, to be sure; there is a lot that goes into chip manufacturing beyond silicon. Packaging, for example, which long ago moved overseas in the pursuit of lower labor costs, is now fully automated; incentives to move that back may be more straightforward. What is critical to understand, though, is that regaining U.S. competitiveness, much less leadership, will take many years; the federal government has a role, but so does Intel, not by seizing its opportunity, but by accepting the reality that its integrated model is finished.

I wrote a follow-up to this article in this Daily Update.

  1. GlobalFoundries is AMD’s former manufacturing arm; they bowed out of the cutting edge race at 10nm.