Nvidia On the Mountaintop

It was only 11 months ago that I wrote an Article entitled Nvidia In the Valley; the occasion was yet another plummet in their stock price:

Nvidia's current stock price drop

To say that the company has turned things around is, needless to say, an understatement:

Nvidia's latest stock rise

That big jump in May came with Nvidia’s previous earnings report, when the company shocked investors with an incredibly ambitious forecast; last week Nvidia vastly exceeded those expectations and forecast even bigger growth going forward. From the Wall Street Journal:

Chip maker Nvidia said revenue in its recently completed quarter more than doubled from a year ago, setting a new company record, and projected that surging interest in artificial intelligence is propelling its business faster than expected. Nvidia is at the heart of the boom in artificial intelligence that made it a $1 trillion company this year, and it is forecasting growth that outpaces even the most bullish analyst projections.

Nvidia’s stock, already the top performer in the S&P 500 this year, rose 7.5% following the results, which would be about $87 billion in market value. The company said revenue more than doubled in its fiscal second quarter to about $13.5 billion, far ahead of Wall Street forecasts in a FactSet survey. Even more strikingly, it said revenue in its current quarter would be around $16 billion, besting expectations by about $3.5 billion. Net profit for the company’s second quarter was $6.19 billion, also surpassing forecasts.

The results show a wave of investment in artificial intelligence that began late last year with the arrival of OpenAI’s ChatGPT language-generation tool is gaining steam as companies and governments seek to harness its power in business and everyday life. Many companies see AI as indispensable to their future growth and are making large investments in computing infrastructure to support it.

Now the big question on everyone’s mind is whether Nvidia is the new Cisco:

Is Nvidia Cisco?

I don’t think so, at least in the near term: there are some fundamental differences between Nvidia and Cisco that are worth teasing out. The bigger question is the long term, and here the comparison might be more apt.

Nvidia and Cisco

The first difference between Nvidia and Cisco is in the above charts: Nvidia already went through a crash, thanks to the double whammy of Ethereum’s move to proof-of-stake and the post-COVID cliff in PC sales; both left Nvidia with huge amounts of inventory it had to write off over the second half of last year. The bright spot for Nvidia was the steady growth of data center revenue, thanks to the increase in machine learning workloads; I included this chart in that Article last fall:

Nvidia's gaming revenue drop

What has happened over the last two quarters is that data center revenue is devouring the rest of the company; here is an updated version of that same chart:

Nvidia's sky-rocketing AI revenue

Here is Nvidia’s revenue mix:

Nvidia's revenue mix

This dramatic shift in Nvidia’s business provides some interesting contrasts to Cisco’s dot-com run-up. First, here was Cisco’s revenue, gross profit, net profit, and stock price in the ten years starting from its 1993 IPO:

Cisco's revenue, profit, and stock price in the 90s

Here is Nvidia’s last ten years:

Nvidia's revenue, profit, and stock price

The first thing to note is the extent to which Nvidia’s crash last year looks similar to Cisco’s dot-com crash: in both cases steady but steep revenue increases initially outpaced the stock price, which eventually overshot just a few quarters before big inventory write-downs led to big decreases in profitability (score one for crypto optimists hopeful that the current doldrums are simply their own dot-com hangover).

Cisco, though, never had a second act; this data center explosion is Nvidia’s. What is notable is the extent to which Nvidia’s revenue increase is matching the slope of the stock price increase (obviously this is inexact given the different axes); it seems likely that the stock will overshoot revenue growth soon enough, but it hasn’t really happened yet. It’s also worth noting how much more disciplined Nvidia appears to be in terms of below-the-line costs: its net profit is moving in concert with its revenue, unlike Cisco in the 90s; I suspect this is a function of Nvidia being a much larger and more mature company.

Another difference is the nature of Nvidia’s customers: over 50% of the company’s Q2 revenue came from the large cloud service providers, followed by large consumer Internet companies (i.e. Meta). The cloud provider category does, of course, include the startups that once might have purchased Cisco routers and Sun servers directly and now rent capacity (if they can get it); cloud providers, though, monetize their hardware immediately, which is good for Nvidia.

Still, there is an important difference from other cloud workloads: previously, a new company or line of business ramped its cloud utilization only with usage, which ought to correlate with customer acquisition, if not revenue. Model training, though, is an up-front cost, not dissimilar to the cost of buying those Sun servers and Cisco routers in the dot-com era; that is cloud revenue with a much higher likelihood of disappearing if the company in question doesn’t find a market.

This point is relevant to Nvidia given that training is the part of AI where the company is most dominant, thanks to both its software ecosystem and the ability to operate huge fleets of Nvidia chips as a single GPU; inference is where Nvidia will first see challenges, and it is also the area of AI that is correlated with usage, and thus more durable from a cloud provider’s perspective.

Those points about a software ecosystem and hardware scalability are also the biggest reasons why Nvidia is different from Cisco. Nvidia has a moat in both, along with a substantial manufacturing advantage thanks to its upfront payments to TSMC over the last several years to secure its own 4nm line (and the good fortune of asking for more capacity at a time when TSMC’s other sources of high performance computing revenue are in a slump). There is certainly a massive incentive for both the cloud providers and the large Internet companies to bridge Nvidia’s moats — see AWS’s investments in its own chips, for example, or Meta’s development of and support for PyTorch — but right now Nvidia has a big lead, and the frenzy inspired by ChatGPT is only deepening its installed base, with all of the positive ecosystem effects that entails.
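To make the software moat concrete, here is a minimal sketch, assuming PyTorch on a machine with Nvidia GPUs launched via torchrun (the linear model and tensor sizes are placeholders, not any real workload), of why the default path runs through Nvidia’s stack: PyTorch’s standard multi-GPU training wrapper communicates over NCCL, Nvidia’s collective communications library, which is a big part of how a fleet of chips can be programmed as if it were one large GPU.

```python
# Hypothetical sketch: multi-GPU data-parallel training in PyTorch.
# Launch with: torchrun --nproc_per_node=<num_gpus> train.py
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    # torchrun sets RANK, LOCAL_RANK, and WORLD_SIZE for each process.
    dist.init_process_group(backend="nccl")  # NCCL is Nvidia's collective comms library
    local_rank = int(os.environ["LOCAL_RANK"])
    device = torch.device(f"cuda:{local_rank}")
    torch.cuda.set_device(device)

    # Placeholder model and optimizer; a real workload would go here.
    model = torch.nn.Linear(1024, 1024).to(device)
    model = DDP(model, device_ids=[local_rank])  # many GPUs, one logical model
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

    for _ in range(10):
        x = torch.randn(32, 1024, device=device)
        loss = model(x).sum()
        optimizer.zero_grad()
        loss.backward()  # gradients are all-reduced across GPUs via NCCL
        optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

Replicating that experience, not just the chips but the communication libraries and framework integration underneath code like this, is what AWS’s custom silicon and alternative PyTorch backends have to accomplish to bridge the moat.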

GPU Demand

The biggest challenge facing Nvidia is the one that is ultimately out of their control: what does the final market look like?

Go back to the dot-com era, and the era that preceded it. The advent of computing, first in the form of mainframes and then the PC, digitized information, making it endlessly duplicable. Then came the Internet, which made the marginal cost of distributing that content go to zero (with the caveat that most people had very low bandwidth). This was an obvious business opportunity that plenty of startups jumped all over, even as telecom companies took on the bandwidth problem; Cisco was the beneficiary of both.

The missing element, though, was demand: consistent consumer demand for Internet applications only started to arrive with the advent of broadband connections in the 2000s (thanks in part to a buildout that bankrupted said telecom companies), and then exploded with smartphones a decade later, which made the Internet accessible anytime, anywhere. It was demand that made the router business as big as dot-com investors thought it might be, although by then Cisco had a host of competitors, including large cloud providers who built (and open-sourced) their own.

There are lots of potential starting points to choose for AI: machine learning has obviously been a thing for a while, or you might point to the 2017 invention of the transformer; the release of GPT-3 in 2020 was perhaps akin to the release of the Mosaic web browser, which would make ChatGPT the Netscape IPO. One way to categorize this emergence is to characterize training as being akin to digitization in the previous era, and creation — i.e. inference — as akin to distribution. Once again there are obvious business opportunities that arise from combining the two, and once again startups are jumping all over them, along with the big incumbents.

However you want to make the analogy, what is important to note is that the missing element is the same: demand. ChatGPT took the world by storm, and the use of AI for writing code is both proliferating widely and extremely high leverage. Every SaaS company in tech, meanwhile, is hard at work on an AI strategy, for the benefit of its sales team if nothing else. That is no small thing, and the exploration and implementation of those strategies will use up a lot of Nvidia GPUs over the next few years. The ultimate question, though, is how much of this AI stuff actually gets used, and that is ultimately out of Nvidia’s control.

My best guess is that the next several years will be occupied building out the most obvious use cases, particularly in the enterprise; the analogy here is to the 2000s build-out of the web. The question, though, is what will be the analogy to mobile (and the cloud), which exploded demand and led to one of the most profitable decades tech has ever seen? The answer may be an already discarded fad: the metaverse.

A GPU Overhang and the Metaverse

In April 2022, when DALL-E 2 came out, I wrote DALL-E, the Metaverse, and Zero Marginal Content, and highlighted three trends:

  • First, the gaming industry was increasingly about a few AAA games, small indie titles, and the huge sea of mobile; the limiting factor in further development was the astronomical cost of developing high-quality assets.
  • Second, social media succeeded by virtue of making content creation free, because users created the content of their own volition.
  • Third, TikTok pointed to a future where every individual not only had their own feed, but also where the provenance of that content didn’t matter.

AI is how those three trends might intersect:

What is fascinating about DALL-E is that it points to a future where these three trends can be combined. DALL-E, at the end of the day, is ultimately a product of human-generated content, just like its GPT-3 cousin. The latter, of course, is about text, while DALL-E is about images. Notice, though, that progression from text to images; it follows that machine learning-generated video is next. This will likely take several years, of course; video is a much more difficult problem, and responsive 3D environments more difficult yet, but this is a path the industry has trod before:

  • Game developers pushed the limits on text, then images, then video, then 3D
  • Social media drives content creation costs to zero first on text, then images, then video
  • Machine learning models can now create text and images for zero marginal cost

In the very long run this points to a metaverse vision that is much less deterministic than your typical video game, yet much richer than what is generated on social media. Imagine environments that are not drawn by artists but rather created by AI: this not only increases the possibilities, but crucially, decreases the costs.

I wrote in the conclusion:

Machine learning generated content is just the next step beyond TikTok: instead of pulling content from anywhere on the network, GPT and DALL-E and other similar models generate new content from content, at zero marginal cost. This is how the economics of the metaverse will ultimately make sense: virtual worlds need virtual content created at virtually zero cost, fully customizable to the individual.

Zero marginal cost is, I should note, aspirational at this point: inference is expensive, both in terms of power and in terms of the need to pay off all of that money showing up on Nvidia’s earnings. It’s possible to imagine a scenario a few years down the line, though, where Nvidia has deployed countless ever more powerful GPUs and inspired massive competition, such that the world’s supply of GPU power far exceeds demand, driving marginal costs down to the cost of energy (which hopefully will have become cheaper as well); suddenly the idea of making virtual environments on demand won’t seem so far-fetched, opening up entirely new end-user experiences that explode demand the way mobile once did.

The GPU Age

The challenge for Nvidia is that this future isn’t particularly investable; indeed, the idea assumes a capacity overhang at some point, which is not great for the stock price! That, though, is how technology advances, and even if a cliff eventually comes, there is a lot of money to be made in the meantime.

That noted, the biggest short-term question I have is around Nvidia CEO Jensen Huang’s insistence that the current wave of demand is in fact the dawn of what he calls accelerated computing; from the Nvidia earnings call:

I’m reluctant to guess about the future and so I’ll answer the question from the first principle of computer science perspective. It is recognized for some time now that…using general purpose computing at scale is no longer the best way to go forward. It’s too energy costly, it’s too expensive, and the performance of the applications are too slow. And finally, the world has a new way of doing it. It’s called accelerated computing and what kicked it into turbocharge is generative AI. But accelerated computing could be used for all kinds of different applications that’s already in the data center. And by using it, you offload the CPUs. You save a ton of money in order of magnitude, in cost and order of magnitude and energy and the throughput is higher and that’s what the industry is really responding to.

Going forward, the best way to invest in the data center is to divert the capital investment from general purpose computing and focus it on generative AI and accelerated computing. Generative AI provides a new way of generating productivity, a new way of generating new services to offer to your customers, and accelerated computing helps you save money and save power. And the number of applications is, well, tons. Lots of developers, lots of applications, lots of libraries. It’s ready to be deployed.

And so I think the data centers around the world recognize this, that this is the best way to deploy resources, deploy capital going forward for data centers. This is true for the world’s clouds and you’re seeing a whole crop of new GPU-specialized cloud service providers. One of the famous ones is CoreWeave and they’re doing incredibly well. But you’re seeing the regional GPU specialist service providers all over the world now. And it’s because they all recognize the same thing, that the best way to invest their capital going forward is to put it into accelerated computing and generative AI.

My interpretation of Huang’s outlook is that all of these GPUs will be used for a lot of the same activities that are currently run on CPUs; that is certainly a bullish view for Nvidia, because it means the capacity overhang that may come from pursuing generative AI will be back-filled by current cloud computing workloads. And, to be fair, Huang has a point about the power and space limitations of current architectures.

That noted, I’m skeptical: humans — and companies — are lazy, and not only are CPU-based applications easier to develop, they are also mostly already built. I have a hard time seeing why companies would go through the time and effort to port things that already run on CPUs to GPUs; at the end of the day, the applications that run in a cloud are determined by the customers who provide the demand for cloud resources, not by cloud providers looking to optimize FLOPS per rack.
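As a toy illustration of that porting cost, consider a hypothetical routine that already runs happily on CPUs, sketched here with NumPy and PyTorch (the function names, sizes, and choice of libraries are all illustrative assumptions, not a claim about how any real cloud customer works). The GPU version computes the same thing, but it only pays off if the problem is large and parallel enough to cover the cost of rewriting, revalidating, and shuttling data to and from the device:

```python
# Hypothetical example: a CPU routine and its GPU "port".
import numpy as np
import torch

def similarity_cpu(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    # The existing application: plain NumPy on the CPU, already built and debugged.
    return a @ b.T

def similarity_gpu(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    # The port: same math, but now every call involves host-to-device and
    # device-to-host copies, plus a CUDA-capable machine to run it on.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    ta = torch.from_numpy(a).to(device)
    tb = torch.from_numpy(b).to(device)
    return (ta @ tb.T).cpu().numpy()

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.standard_normal((4096, 512), dtype=np.float32)
    b = rng.standard_normal((4096, 512), dtype=np.float32)
    # Results should agree up to float32 rounding differences between devices.
    diff = np.max(np.abs(similarity_cpu(a, b) - similarity_gpu(a, b)))
    print(f"max absolute CPU/GPU difference: {diff:.6f}")
```

For math this simple the rewrite is trivial; for a real application the port means new dependencies, new hardware requirements, and new failure modes, which is why durable demand for GPU capacity has to come from workloads that need GPUs, not from relabeling workloads that do not.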

If GPUs are going to be as big a market as Nvidia’s investors hope, it will be because applications that are only possible with GPUs generate the demand to make it so. I’m confident that time will come; what neither I, nor Huang, nor anyone else can be sure of is when it will arrive.

I wrote a follow-up to this Article in this Daily Update.