ChatGPT Enterprise, Connectors and Small Businesses, Nvidia Competitors

Good morning,

My apologies that yesterday’s Article Nvidia On the Mountaintop came out so late! Do check it out if you missed it yesterday morning.

On to the Update:

ChatGPT Enterprise

From Bloomberg:

OpenAI launched a corporate version of ChatGPT with added features and privacy safeguards, the startup’s most significant effort yet to attract a broad mix of business customers and boost revenue from its best-known product. As with consumer versions of the company’s artificial intelligence-powered chatbot, users can type in a prompt and receive a written response from ChatGPT Enterprise. The new tool includes unlimited use of OpenAI’s most powerful generative AI model, GPT-4, as well as data encryption and a guarantee that the startup won’t use data from customers to develop its technology. It also offers the ability to type in much longer prompts.

The rollout of ChatGPT Enterprise is a move forward in OpenAI’s plans to make money from its ubiquitous chatbot, which is enormously popular but very expensive to operate because robust AI models require lots of computing power. The San Francisco-based startup has already taken some steps toward generating revenue from ChatGPT, such as by selling a premium subscription and offering companies paid access to its application programming interface, which developers can use to add the chatbot to other apps. Brad Lightcap, OpenAI’s chief operating officer, declined to provide specific details for how much ChatGPT Enterprise will cost, noting it can vary based on the needs of each business. Lightcap said the company “can work with everyone to figure out the best plan for them.”

This is an interesting and, in some respects, obvious product. There are two primary objections that enterprises have to their employees using ChatGPT: the first is the risk of proprietary data being input into OpenAI’s model, and the second is the risk of using copyrighted material in a way that opens up the company to potential liability down the road. The second concern can’t be mitigated by OpenAI; OpenAI’s models are trained with data from the open Internet, which I think is covered by fair use, but I’m not a court, and until that question is answered definitively the only way to avoid this risk is to use a model where the provenance of all of the data is known (i.e. Adobe’s Firefly strategy).

However, OpenAI can address that first risk — in fact, they already did with their API, which gives access to their models along with a guarantee that customer data won’t be used for training. Still, an API isn’t a product, and ChatGPT already has massive mindshare and an easy-to-use interface; now that product is available in a way that ameliorates enterprise concerns about data leakage, while providing access to the most expensive iterations of ChatGPT, including the 32k context window version.
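
To make the API-versus-product distinction concrete, here is a minimal sketch of what using GPT-4 via the API looks like; the endpoint and request shape are OpenAI’s standard chat completions interface, but the model name, the file, and the prompts are illustrative, and availability of the 32k context window varies by account. The point is simply that this path requires writing and operating code, whereas ChatGPT Enterprise wraps the same models in the interface employees already know.

```python
import os
import requests

# Minimal sketch: one request against OpenAI's chat completions endpoint.
# The model name ("gpt-4-32k") and the file are illustrative; availability of
# the 32k context window varies by account, and API inputs are not used for
# training under OpenAI's API data-usage policy.
document = open("internal_report.txt").read()  # hypothetical long internal document

response = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
    json={
        "model": "gpt-4-32k",
        "messages": [
            {"role": "system", "content": "You summarize internal documents."},
            {"role": "user", "content": f"Summarize the key points:\n\n{document}"},
        ],
    },
    timeout=120,
)
print(response.json()["choices"][0]["message"]["content"])
```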

From the outside this appears to be OpenAI simply capturing demand that already exists; a perusal of LinkedIn turns up only a couple of sales executives and go-to-market folks, and the careers page has a couple of account engineer roles and one account executive role. In other words, OpenAI isn’t building out a sales force; they are building out a product that capitalizes on ChatGPT’s popularity.

Connectors and Small Businesses

There are two features that are “coming soon” that are worth highlighting; from the introductory blog post:

Customization: Securely extend ChatGPT’s knowledge with your company data by connecting the applications you already use.

It’s important to keep in mind that ChatGPT is a large language model, not a knowledge repository. It has no knowledge of right or wrong, or truth or untruth; it is simply predicting the next word. I explored the implications of this in March’s ChatGPT Gets a Computer:

Computers are deterministic: if circuit X is open, then the proposition represented by X is true; 1 plus 1 is always 2; clicking “back” on your browser will exit this page. There are, of course, a huge number of abstractions and massive amounts of logic between an individual transistor and any action we might take with a computer — and an effectively infinite number of places for bugs — but the appropriate mental model for a computer is that they do exactly what they are told (indeed, a bug is not the computer making a mistake, but rather a manifestation of the programmer telling the computer to do the wrong thing).

I’ve already mentioned Bing Chat and ChatGPT; on March 14 Anthropic released another AI assistant named Claude: while the announcement doesn’t say so explicitly, I assume the name is in honor of the aforementioned Claude Shannon. This is certainly a noble sentiment — Shannon’s contributions to information theory broadly extend far beyond what Dixon laid out above — but it also feels misplaced: while technically speaking everything an AI assistant is doing is ultimately composed of 1s and 0s, the manner in which they operate is emergent from their training, not prescribed, which leads to the experience feeling fundamentally different from logical computers — something nearly human — which takes us back to hallucinations; Sydney was interesting, but what about homework?

The point of that Article was that ChatGPT’s plugin architecture gave hallucinating, creative LLMs access to deterministic computers to ascertain truth, not dissimilar to the way a creative being like you or me might use a calculator to solve a math problem. In other words, the LLM is the interface to the source of truth, not the source of truth itself.

That is exactly what this “coming soon” feature is all about: you don’t make an LLM useful for your business by adding your business’s data to the LLM; that is simply a bit more text in a sea of it. Rather, you leverage the LLM as an interface to “computers” that deterministically give you the right answer. In this case, those computers will be “connecting the applications you already use”, which sounds to me an awful lot like enterprise-specific plugins.
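
To make the “LLM as interface” pattern concrete, here is a minimal sketch using OpenAI’s function-calling interface: the model proposes a call to a deterministic tool, the tool computes the answer, and the model then phrases the result. The calculator tool is a stand-in I have chosen for illustration, and the model name is illustrative; an enterprise connector would swap in a query against an internal system of record instead.

```python
import json
import os
import requests

API_URL = "https://api.openai.com/v1/chat/completions"
HEADERS = {"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"}

# A deterministic "computer": a toy calculator stands in for what would be an
# enterprise connector querying an internal system of record.
def calculate(expression: str) -> str:
    return str(eval(expression, {"__builtins__": {}}))  # toy evaluator, sketch only

functions = [{
    "name": "calculate",
    "description": "Evaluate an arithmetic expression and return the exact result.",
    "parameters": {
        "type": "object",
        "properties": {"expression": {"type": "string"}},
        "required": ["expression"],
    },
}]

messages = [{"role": "user", "content": "What is 1337 * 42?"}]
first = requests.post(API_URL, headers=HEADERS, json={
    "model": "gpt-4", "messages": messages, "functions": functions,
}, timeout=60).json()["choices"][0]["message"]

if first.get("function_call"):
    # The model proposes a tool call; the deterministic tool supplies the truth,
    # and the model turns that result into the final answer.
    args = json.loads(first["function_call"]["arguments"])
    messages += [first, {"role": "function", "name": "calculate",
                         "content": calculate(args["expression"])}]
    final = requests.post(API_URL, headers=HEADERS, json={
        "model": "gpt-4", "messages": messages,
    }, timeout=60).json()["choices"][0]["message"]["content"]
    print(final)
```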

The second feature to highlight is about the go-to-market:

Availability for all team sizes: a self-serve ChatGPT Business offering for smaller teams

It makes sense that this is coming after the enterprise version with “call us” pricing: OpenAI likely needs to figure out both usage patterns and pricing before it commits itself to either in a self-serve model. I suspect, though, that this will prove to be the bigger market overall. Large enterprises are probably going to tend towards their own proprietary models — whether built on OpenAI with dedicated capacity, or their own models built via Nvidia or Microsoft or on top of an open source model — while smaller businesses won’t have the wherewithal or the perceived benefit of doing more than using OpenAI’s offering. And, of course, a self-serve go-to-market is closer to a consumer offering in a lot of ways, particularly in terms of customer acquisition; it’s in customer acquisition that OpenAI has the biggest advantage, thanks to the fact that everyone knows what ChatGPT is and has probably already tried it out.

What is notable is that while Microsoft of course serves enterprises, its biggest moat has traditionally been around small- and medium-sized businesses that find it beneficial to effectively outsource their entire IT stack to Microsoft, supported by an army of sales teams and independent systems integrators. In other words, OpenAI and Microsoft’s strange partnership has another chapter of coopetition; at the end of the day ChatGPT in all its flavors is still running on Azure, but Microsoft would certainly rather own these customers directly. And, it should be noted, Microsoft’s traditional advantages will still apply: it remains to be seen how well the aforementioned application connectors work; Microsoft, meanwhile, will ensure that Bing Chat Enterprise works with all of your data in the Microsoft Graph out of the box, no fiddly connectors required.

Nvidia Competitors

A reader wrote in response to yesterday’s Article:

I suspect you’re working on this, but would love to read articles about potential competitors in the space. It seems to me that 1) AMD has great GPU technology, maybe not as good as Nvidia’s, but very strong with very talented leadership and the kind of scale that would get TSMC to care; and 2) there are in-house efforts that are quite large, but hard to know whether those are differentiated from a performance standpoint from what is otherwise available on the merchant market.

Here is a very brief overview of the competitive landscape as I currently understand it; this is very much subject to change — perhaps as soon as tomorrow, if some of you convince me that I have it wrong!


Google: Google’s TPUs are the only chips that are competitive with Nvidia on large language model workloads; however, they are exclusive to Google (and Google Cloud customers). Of course you can look at it another way: the only large cloud provider that isn’t desperate for Nvidia chips is Google, which ought to be a competitive advantage.

AMD: AMD announced the MI300 this summer. The MI300X has significantly more high-bandwidth memory on board than Nvidia’s H100 (192GB versus 80GB), which means higher bandwidth and capacity. However, that is on a per-chip basis: Nvidia’s GPUs link together via NVLink and the NVSwitch, which means that multiple GPUs can be addressed as if they were one; this is what Nvidia CEO Jensen Huang means when he talks about building systems, not just chips.

What this means in practice is that Nvidia’s architecture scales more naturally, and is easier to address, above and beyond the fact that Nvidia has the CUDA software ecosystem. This doesn’t just matter for massive training runs, either: both Google’s PaLM and OpenAI’s GPT run inference on large clusters, not single GPUs. AMD is working with Meta in particular on PyTorch to break out of the CUDA lock-in (see the sketch after this overview), but there is a long way to go to be competitive with Nvidia in inference, much less training.

Moreover, AMD simply doesn’t have the production capacity, even if their chips were fully competitive. Part of this is the natural time it takes to ramp, but another limitation is high-bandwidth memory, which is itself still ramping in supply and which Nvidia is also competing for ahead of an upcoming H100 refresh. AMD’s chip is also more complex than Nvidia’s — it is the most complex application of TSMC’s Chip-on-Wafer-on-Substrate (CoWoS) packaging technology yet — and while TSMC is opening new CoWoS facilities as quickly as it can, Nvidia is grabbing most of that capacity.

Intel: Intel theoretically should have an offering here, particularly given the capacity constraints at TSMC; Intel has plenty of capacity! Unfortunately Intel doesn’t seem to have a very compelling one. Its Habana Gaudi2 AI chip is in the market, and priced very competitively, but seems to have little traction (and the Habana division suffered layoffs last fall); it’s also fabricated by TSMC! So much for that capacity.

AWS: AWS has both its Trainium chip (for training) and Inferentia chip (for inference). From what I understand the latter in particular works well for image generation, which requires substantially less memory and scalability; however, neither is yet competitive for large language models, which is why AWS is buying as many Nvidia chips as it can get its hands on.

Startups: Cerebras, which makes wafer-scale chips, is reportedly competitive when it comes to training very large models, and has a cloud service; Groq is focused on inference specifically, and claims very high performance on Llama 2, but has only just selected Samsung as a foundry partner. Both last raised money in 2021, so the best marker of their progress may very well be their next fundraising. I’m not familiar with other startup offerings.
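
One footnote on the CUDA point in the AMD entry above: part of what makes PyTorch the potential escape hatch is that AMD’s ROCm build of PyTorch exposes its GPUs through the same torch.cuda interface, so framework-level code does not change. Here is a minimal sketch (the toy workload is arbitrary); the caveat is that the contest is really about how well the kernels and collectives underneath perform, not whether the script runs.

```python
import torch

# The same script runs on an Nvidia (CUDA) or AMD (ROCm) build of PyTorch:
# ROCm is surfaced through the torch.cuda namespace, so "cuda" below resolves
# to whichever accelerator the installed build supports.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print("Running on:", torch.cuda.get_device_name(0) if device.type == "cuda" else "cpu")

# Arbitrary toy workload; the framework calls are identical either way. Whether
# they are fast depends on the backend's kernels, which is where the real
# competition with CUDA lies.
model = torch.nn.TransformerEncoderLayer(d_model=512, nhead=8).to(device)
x = torch.randn(16, 32, 512, device=device)
with torch.no_grad():
    y = model(x)
print(y.shape)
```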


One important thing to note about all of these competitors — and about Nvidia’s H100 — is that they were all designed before ChatGPT; it’s fair to wonder just how much optimization is possible for chips designed expressly for large language models.

What is clear is that Nvidia is firmly in the lead for both training and inference, thanks to not only having the best architecture, the dominant software ecosystem, and the most scalability, but also access to the most capacity. Remember how much grief Huang got for those huge purchase orders and inventory build-up last year? That is paying off in a major way as Nvidia monopolizes almost everything that TSMC can bring to bear, particularly in terms of packaging, as well as the supply of high-bandwidth memory. Moreover, this positioning gives Nvidia that much more time to deepen its software moat and continue to expand the networking capabilities that undergird its scalability advantage.

That said, the competitive pressure on the inference side, particularly from cloud providers like AWS building their own chips, or Microsoft and Meta pushing AMD, will be intense (I expect Nvidia to dominate non-public cloud buildouts). Both Inferentia and AMD need to scale better, but inference is a more manageable problem than training, where Nvidia’s lead seems nearly insurmountable.

The exception, again, is Google and its TPU architecture. This isn’t just useful for Google and its own products, but ought to be a big boon for Google Cloud. Indeed, the extent to which Google can bring its newest TPUv5 to bear may be the biggest reason for startups to break out of CUDA, simply because they might have a better chance of getting the chips they need on Google Cloud than fighting over Nvidia access on everybody else’s.


This Update will be available as a podcast later today. To receive it in your podcast player, visit Stratechery.

The Stratechery Update is intended for a single recipient, but occasional forwarding is totally fine! If you would like to order multiple subscriptions for your team with a group discount (minimum 5), please contact me directly.

Thanks for being a subscriber, and have a great day!