It took a few moments to realize what was striking about the opening video for Nvidia’s GTC conference: the complete absence of humans.
That the video ended with Jensen Huang, the founder and CEO of Nvidia, is the exception that accentuates the takeaway. On the one hand, the theme of Huang’s keynote was the idea of AI creating AI via machine learning; he called the idea “intelligence manufacturing”:
None of these capabilities were remotely possible a decade ago. Accelerated computing, at data center scale, and combined with machine learning, has sped up computing by a million-x. Accelerated computing has enabled revolutionary AI models like the transformer, and made self-supervised learning possible. AI has fundamentally changed what software can make, and how you make software. Companies are processing and refining their data, making AI software, becoming intelligence manufacturers. Their data centers are becoming AI factories. The first wave of AI learned perception and inference, like recognizing images, understanding speech, recommending a video, or an item to buy. The next wave of AI is robotics: AI planning actions. Digital robots, avatars, and physical robots will perceive, plan, and act, and just as AI frameworks like TensorFlow and PyTorch have become integral to AI software, Omniverse will be essential to making robotics software. Omniverse will enable the next wave of AI.
We will talk about the next million-x, and other dynamics shaping our industry, this GTC. Over the past decade, Nvidia-accelerated computing delivered a million-x speed-up in AI, and started the modern AI revolution. Now AI will revolutionize all industries. The CUDA libraries, the Nvidia SDKs, are at the heart of accelerated computing. With each new SDK, new science, new applications, and new industries can tap into the power of Nvidia computing. These SDKs tackle the immense complexity at the intersection of computing, algorithms, and science. The compound effect of Nvidia’s full-stack approach resulted in a million-x speed-up. Today, Nvidia accelerates millions of developers, and tens of thousands of companies and startups. GTC is for all of you.
The core idea behind machine learning is that computers, presented with massive amounts of data, can extract insights and ideas from that data that no human ever could; to put it another way, the development of not just insights but, going forward, software itself, is an emergent process. Nvidia’s role is making massively parallel computing platforms that do the calculations necessary for this emergent process far more quickly than was ever possible with general purpose computing platforms like those undergirding the PC or smartphone.
What is so striking about Nvidia generally and Huang in particular, though, is the extent to which this capability is the result of the precise opposite of an emergent process: Nvidia the company feels like a deliberate design, nearly 29 years in the making. The company started by accelerating defined graphical functions, then invented the shader, which made it possible to program the hardware doing that acceleration. This new approach to processing, though, required new tools, so Nvidia invented them, and has been building on its fully integrated stack ever since.
The deliberateness of Nvidia’s vision is one of the core themes I explored in this interview with Huang recorded shortly after his GTC keynote. We also touch on Huang’s background, including immigrating to the United States as a child, Nvidia’s failed ARM acquisition, and more. One particularly striking takeaway for me came at the end of the interview, where Huang said:
Intelligence is the ability to recognize patterns, recognize relationships, reason about it and make a prediction or plan an action. That’s what intelligence is. It has nothing to do with general intelligence, intelligence is just solving problems. We now have the ability to write software, we now have the ability to partner with computers to write software, that can solve many types of intelligence, make many types of predictions at scales and at levels that no humans can.
For example, we know that there are a trillion things on the Internet and the number of things on the Internet is large and expanding incredibly fast, and yet we have this little tiny personal computer called a phone, how do we possibly figure out, of the trillion things on the Internet, what we want to see on our little tiny phone? Well, there needs to be a filter in between, what people call the personalized internet, but basically an AI, a recommender system. A recommender that figures out, based on the nature of the content, the characteristics of the content, the features of the content, based on your explicit and implicit preferences, a way through all of that to predict what you would like to see. I mean, that’s a miracle! That’s really quite a miracle to be able to do that at scale for everything from movies and books and music and news and videos and you name it, products and things like that. To be able to predict what Ben would want to see, predict what you would want to click on, predict what is useful to you. I’m talking about consumer-oriented things, but in the future it’ll be predict what is the best financial strategy for you, predict what is the best medical therapy for you, predict what is the best health regimen for you, what’s the best vacation plan for you. All of these things are going to be possible with AI.
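The recommender Huang describes can be sketched in a few lines: represent each item by its content features, represent the user by a preference vector learned from explicit and implicit signals, and score by how well they match. The feature names, weights, and catalog below are invented purely for illustration, not drawn from any real system:

```python
# Minimal content-based recommender sketch: score = match between a user's
# preference vector and each item's feature vector, then rank.

def score(user_prefs: dict, item_features: dict) -> float:
    """Dot product of user preferences and item features."""
    return sum(user_prefs.get(f, 0.0) * w for f, w in item_features.items())

def recommend(user_prefs, catalog, k=2):
    """Return the k highest-scoring item names for this user."""
    ranked = sorted(catalog, key=lambda item: score(user_prefs, item["features"]),
                    reverse=True)
    return [item["name"] for item in ranked[:k]]

# Hypothetical user and catalog:
user = {"sci-fi": 0.9, "documentary": 0.2, "long-form": 0.6}
catalog = [
    {"name": "space-drama",  "features": {"sci-fi": 1.0, "long-form": 0.8}},
    {"name": "nature-doc",   "features": {"documentary": 1.0}},
    {"name": "cooking-show", "features": {"documentary": 0.3}},
]
print(recommend(user, catalog, k=2))  # → ['space-drama', 'nature-doc']
```

Production recommenders learn both vectors from behavioral data at enormous scale, which is where the GPU comes in; the ranking step itself is this simple.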
As I note in the interview, this should ring a bell for Stratechery readers: what Huang is describing is the computing functionality that undergirds Aggregation Theory, wherein value in a world of abundance accrues to those entities geared towards discovery and providing means of navigating this world that is fundamentally disconnected from the constraints of physical goods and geography. Nvidia’s role in this world is to provide the hardware capability for Aggregation, to be the Intel to Aggregators’ Windows. That, needless to say, is an attractive position to be in; like many such attractive positions, it is one that was built not in months or years, but decades.
This interview has been lightly edited for clarity, and is also available as a podcast; to listen in your podcast player, sign up for a Stratechery account and add your personal podcast feed.
Jensen Huang, I am very excited to get a chance to talk to you. I’m a huge admirer. I used to build computers when I was in college, and I was so pumped to get a TNT card. 3dfx was the one on the market, and I’m like, “No, NVIDIA’s approach is so much better than having this accelerator with a little cable behind it sitting on top of a graphics card.” Here we are, it’s twenty-five years later, and I get a chance to talk to you in person, you’re still at it, so I’m very excited.
Jensen Huang: (Laughing) Well, TNT really put us on the map. RIVA 128 was a big, big risk, and we invented several things to make it possible. Number one was a floating point setup engine, number two was a texture cache, and then number three was really, really wide memory, and pushing memory performance to limits that nobody’s ever seen in PCs up to that point. Those three things turned out to have been really groundbreaking, and with TNT, we doubled everything. We did dual textures and we set up two-pixel pipelines, and it was the start, TNT was the start of the multi-pixel pipeline architecture, and that turned out to have been groundbreaking work that carried graphics architecture for another decade.
Well, it’s interesting because I mean, not to hop ahead, but I was going to ask you about the shift to memory bandwidth and super wide just being more and more important. One of the things that was really striking in your keynote this time was every time whether you talked about chips, or you talked about your new CPU, or you talked about your systems, you basically just spent the whole time talking about memory, and how much stuff can be moved around. It’s interesting to hear you say that that was actually a key consideration really from the beginning. Everyone thinks about the graphics part of it, but you have to keep those things fed, and that’s actually been important all along as well.
JH: Yeah, that’s exactly right. It turns out that in computer graphics, we chew through more memory bandwidth than just about anything because we have to render to pixels, and because it’s a painter’s algorithm, you paint over the pixels over and over and over again, and each time, you have to figure out which one’s in front of which, and so there’s a read-modify-write, and the read-modify-write chews up more memory bandwidth, and if it’s a blend, that chews up more memory bandwidth. So, all of those layers and layers and layers of composition just chew up a ton of bandwidth, and as we moved into the world of machine learning and this new era of computing where the software is not written just by a human — the architecture’s created by the human, but the architecture’s tuned by the machine studying the data — we pump in tons and tons of data so that the machine learning algorithm can figure out what the patterns are, what the predictive features are, and what the relationships are. All of that is just memory bandwidth, and so we’re really comfortable with this area of computation; it goes all the way back to the very beginning and has served us well.
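The read-modify-write cost Huang describes can be made concrete with a back-of-the-envelope model. The numbers below (resolution, overdraw, pixel size) are illustrative assumptions, not figures from any Nvidia part:

```python
# Rough model of frame-buffer traffic: each blended layer reads the existing
# pixel, blends, and writes it back, so traffic scales with overdraw (how many
# times each pixel is painted), not with the number of visible pixels.

def frame_buffer_traffic_bytes(width, height, overdraw, bytes_per_pixel=4,
                               blended=True):
    pixels = width * height
    if blended:
        per_layer = 2 * bytes_per_pixel  # read-modify-write: 2x per layer
    else:
        per_layer = bytes_per_pixel      # opaque overwrite: write only
    return pixels * overdraw * per_layer

# 1920x1080, 4 layers of overdraw, 32-bit pixels:
gb = frame_buffer_traffic_bytes(1920, 1080, overdraw=4) / 1e9
print(f"{gb:.3f} GB of color traffic per frame")  # ~0.066 GB; ~4 GB/s at 60 fps
```

Even this toy model ignores texture reads, depth-buffer tests, and anti-aliasing, each of which multiplies the traffic further; that is the sense in which graphics is bandwidth-bound.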
I have always said the most misnamed product in tech is the personal computer, because obviously the personal computer is your phone and not the PC you leave on your desk, but we’ve wasted such a great name. I feel like the GPU is the opposite situation. We call it a graphics processing unit, but to your point, the idea of keeping it fully fed, doing relatively simple operations, massively parallel, all at the same time, that’s a specific style of computing that happened to start with graphics, but we’re stuck calling it a GPU forever instead of, I don’t know, advanced processing unit or whatever it should be. I mean, what should the name be?
JH: Once GPU took off and we started adding more and more capabilities to it, it was just senseless to rename it. There were lots of ideas. Do we call it GPGPU? Do we call it an XPU? Do we call it a VPU? I just decided that it wasn’t worth playing that game, and what we ought to do is assume that people who buy these things are smart enough to figure out what they do, and we’ll be clever enough to help people understand what the benefits are, and we’ll get through all the naming part.
The thing that is really remarkable, if you look at TNT, it was a fixed function pipeline, meaning every single stage of the pipeline, it did what it did, and it moved the data forward, and if it ever needed to read the data from the frame buffer, the memory, if it ever needed to read the memory data back to do processing, it would read the data, pull it back into the chip, and do the processing on it, and then render it back into the frame buffer, doing what is called multipass. Well, that multipass, a simple fixed function pipeline approach, was really limiting, which led to the invention several years later of the programmable shader which —
This is great. You are literally walking down my question tree on your own, so this is perfect. Please continue.
JH: (laughing) So, we invented a programmable shader, which put a program onto the GPU, and so now there’s a processor. The challenge of the GPU, which is an incredible breakthrough, during that point when we forked off into a programmable processor, was to recognize that the pipeline stages of a CPU were, call it, umpteen stages, but the number of pipeline stages in a GPU could be several hundred, and yet, how do you keep all of those pipe stages and all of those processors fed? You have to create what is called a latency tolerant processor, which led to heavily threaded processors. Whereas you could have two threads going in any CPU core with hyper-threading, in the case of our GPU, at any given point in time, we could have 10,000 threads in flight. So it’s 10,000 programs, umpteen thousand programs, that are flying through this processor at any given point in time, which really invented this new style of programming, and our architecture called CUDA made it accessible, and because we dedicated ourselves to keeping every generation of processors CUDA-compatible, we invented a new programming model. That was all started right around that time.
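The “10,000 threads in flight” figure is latency hiding, and Little’s law gives the intuition: the concurrency you need equals latency times throughput. The cycle counts below are illustrative assumptions, not specs of any real GPU:

```python
# Little's law applied to latency hiding: to keep the execution units busy
# while each memory request takes `mem_latency_cycles` to return, the chip
# needs latency x issue-rate independent threads ready to run.

def threads_needed(mem_latency_cycles, ops_issued_per_cycle):
    """Concurrent threads needed so the chip never stalls waiting on memory."""
    return mem_latency_cycles * ops_issued_per_cycle

# If memory takes ~400 cycles and the chip can issue ~25 ops per cycle,
# it needs on the order of 10,000 threads in flight:
print(threads_needed(400, 25))  # → 10000
```

A CPU attacks the same latency with big caches and out-of-order execution for a couple of threads; the GPU design choice is to tolerate the latency by switching among thousands of cheap threads instead.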
I’m actually curious about this, because what is fascinating about NVIDIA is if you look backwards, it seems like the most amazing, brilliant path that makes total sense, right? You start by tackling the most advanced accelerated computing use case, which is graphics, but those chips are finely tuned to OpenGL and DirectX, just doing these specific functions. You’re like, “Well, no, we should make it programmable.” You invent the shader, the GeForce, and then it opens the door to being programmed for applications other than graphics. NVIDIA makes it easier and more approachable with CUDA, you put SDKs on top of CUDA, and now twenty-five years on NVIDIA isn’t just the best in the world at accelerated computing, you have this massive software moat and this amazing business model where you give CUDA away for free and sell the chips that make it work. Was it really that much on purpose? Because it looks like a perfectly straight line. I mean, when you go back to the 90s, how far down this path could you see?
JH: Everything you described was done on purpose. It’s actually blowing my mind that you lived through that, and I can’t tell you how much I appreciate you knowing that. Just knowing that is quite remarkable. Every part of that you described was done on purpose. The parts that you left out, of course, are all the mistakes that we made. Before there was CUDA, there was actually another version called C for Graphics, Cg. So, we did Cg and made all the mistakes associated with it and realized that there needed to be this thing called shared memory, a whole bunch of processors being able to access onboard shared memory. Otherwise, the amount of multipassing —
Yeah, the coherence would fall apart, yeah.
JH: Yeah, just the whole performance gets lost. So, there were all kinds of things that we had to invent along the way. GeForce FX had a fantastic differentiator with 32 bit floating point that was IEEE compatible. We made a great decision to make it IEEE FP32 compatible. However, we made a whole bunch of mistakes with GeForce FX —
Sorry, what does that mean? The IEEE FP32 compatible?
JH: Oh, the IEEE specified a floating point format that if you were to divide by zero, how do you treat it? If it’s not a number, how do you treat it?
Got it. So this made it accessible to scientists and things along those lines?
JH: So that whatever math that you do with that floating point format, the answer is expected.
So, that made it consistent with the way that microprocessors treated floating point, so we could run a floating point program and our answer would be the same as if you ran it on a CPU. That was a brilliant move. At the time, DirectX’s specification of programmable shaders was 24 bit floating point, not 32 bit floating point. So, we made the choice to go to all 32 bits so that whatever numerical computation is done is compatible with processors. That was a genius move, because we saw the opportunity to use our GPUs for general purpose computing.
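What IEEE compliance buys, concretely, is that the edge cases have defined answers on every compliant processor, which is exactly the “the answer is expected” property Huang describes. Python floats happen to be IEEE 754 binary64 rather than binary32, but the semantics are the same, and `struct` can round-trip a value through binary32 to show FP32’s precision:

```python
# IEEE 754 semantics: edge cases are defined, not undefined behavior, so a
# compliant GPU and a compliant CPU agree on the answer.
import math
import struct

def to_fp32(x: float) -> float:
    """Round-trip a value through IEEE 754 binary32 (FP32)."""
    return struct.unpack('f', struct.pack('f', x))[0]

assert 1.0 / math.inf == 0.0             # dividing by infinity is defined
assert math.isnan(math.inf - math.inf)   # invalid operation yields NaN
assert math.nan != math.nan              # NaN compares unequal to everything
# FP32 carries ~7 decimal digits, so 0.1 loses precision versus binary64:
assert to_fp32(0.1) != 0.1
print("all IEEE edge cases behaved as specified")
```

A 24-bit or otherwise non-standard format would give different answers for these cases than a CPU, which is why the FP32 choice mattered for general purpose computing.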
There were a whole bunch of other mistakes that we made that tripped us up along the way as we discovered these good ideas. But each one of these good ideas, when it was finally decided on, was good. For example, recognizing that CUDA was going to be our architecture and that, if CUDA is a programmable architecture, we had to stay faithful to it and make sure that every generation was backwards compatible with the previous generation, so that whatever install base of software was developed would benefit from the new processor by running faster. If you want developers, they’re going to want install base, and if you want install base, the generations have to be compatible with each other. So, that decision forced us to put CUDA into GeForce, forced us to put CUDA into Quadro, forced us to put CUDA into data center GPUs, into everything, basically, and today, every single chip that we make is CUDA compatible.
Right, now you’re reaping the benefits. I actually do have one more question on the invention of shaders. I think you said at a Stanford talk a few years ago, you were talking about how it almost killed the company, but also that you came to that decision by looking at — you needed to know what a company is about and how the company works, and you were concerned that NVIDIA was becoming a commodity part, that the gain from just accelerating OpenGL or Direct3D was going to come to an end. You needed to give artists a canvas along those lines. I’m really curious the push and pull between, “Wow, we can create this programmable model that can be extended elsewhere,” versus, “We’re driving towards a dead end and we need to look elsewhere.” Where’s the trade off there between the fear of being commoditized versus vision of this bright future?
JH: Well, that’s an excellent question, and the nature of a fixed function pipeline, the nature of a chip that does just one thing, is that it’s super efficient, and unfortunately, there are only so many pixels on the screen, and there are only so many functions you could put into a chip, and yet more transistors are coming. So at some point, the logical assumption is that graphics would be sufficiently fast for anybody’s chip and we would be commoditized. Well, it turns out that that assessment is absolutely true, and the reason for that is because you see integrated graphics with good enough graphics integrated for free all day long today. So that assessment, that prediction that someday, if we don’t reinvent computer graphics, if we don’t reinvent ourselves, and we don’t open the canvas for the things that we can do on this processor, we will be commoditized out of existence, that assessment was spot on. The challenge, of course, is to figure out when do you take action, as you mentioned, how do you find the courage to take action to put something into your processor, into your chip that was somehow programmable?
The disadvantage of programmability is that it’s less efficient. As I mentioned before, a fixed function thing is just more efficient. Anything that’s programmable, anything that could do more than one thing just by definition carries a burden that is not necessary for any particular one task, and so the question is “When do we do it?” Well, there was also an inspiration at the time that everything looks like OpenGL Flight Simulator. Everything was blurry textures and trilinear mipmapped, and there was no life to anything, and we felt that if you didn’t bring life to the medium and you didn’t allow the artist to be able to create different games and different genres and tell different stories, eventually the medium would cease to exist. We were driven by simultaneously this ambition of wanting to create a more programmable palette so that the game and the artist could do something great with it. At the same time, we also were driven to not go out of business someday because it would be commoditized. So somewhere in that kind of soup, we created programmable shaders, so I think the motivation to do it was very clear. The punishment afterwards was what we didn’t expect.
What was that?
JH: Well, the punishment is all of a sudden, all the things that we expected about programmability and the overhead of unnecessary functionality because the current games don’t need it, you created something for the future, which means that the current applications don’t benefit. Until you have new applications, your chip is just too expensive and the market is competitive.
So this is actually a point that I’m very interested in, because one of the arguments I’ve made about Nvidia is that actually the best analogy in tech to Nvidia is Apple. And the reason is because, what is Apple famous for? The deep integration of software and hardware, and that’s basically what Nvidia has going, and this is clearly the genesis of this.
What’s clicking for me is Nvidia starts out as just a hardware company, and not even just a hardware company, you’re a design company because foundries are making your chips for you. So you’re making these chips and lots of people are selling them. I think I had an ASUS or there was an MSI, the whole thing was like, who could best wring out like a couple more megahertz from the chip, and there were crazy cooling solutions, I was into all of it. So that’s just a hardware company.
Then you build this shader model that can be programmable and it sounds like you thought people would leap at the opportunity, but you realized you have to actually build the opportunity, you have to build all the infrastructure, you have to build CUDA, you have to build all the SDKs, and that was almost where Nvidia just flipped from being a hardware company to really being the integrated behemoth you are today. Is that sort of the genesis moment?
JH: That’s exactly right. On the day that you become a processor company, you have to internalize that this processor architecture is brand new. There had never been a programmable pixel shader or a programmable GPU processor or a programming model like that before, and so we had to internalize that everything associated with being a programmable processor company, or a computing platform company, had to be created. So we had to create a compiler team, we had to think about SDKs, we had to think about libraries, we had to reach out to developers and evangelize our architecture and help people realize the benefits of it, and if not, even come close to practically doing it ourselves by creating new libraries that make it easy for them to port their applications onto our libraries and see the benefits of it. And even to the point of marketing, helping them market this version so that there’d be demand for the software that they build on our platform, and on and on and on, to having a GTC so that we have a developers conference. All of that stemmed out of this particular experience.
And it sounds like it stemmed out of a bit of panic where you’re like, “We created this huge amount of overhead for us. No one’s buying this chip because there’s nothing there”, that was like the mother of all crunch, needless to say, those couple years, I can imagine.
JH: Thank goodness the core GPU itself, ignoring the programmable shading, was incredibly good. So running old applications, we really, really rocked it, and so we had the best performance in the world, but nonetheless, we had this incredible overhead, a lot of electronics, a lot of transistors and a lot of cost, and so our gross margins were under pressure all the time. Anyways, all of that happened right about that time.
You talked about there being four layers of the stack in your keynote this week. You had hardware, system software, platform, and then application frameworks, and you also have said at other times that you believe these machine learning opportunities require sort of a fully integrated approach. Let’s start with that latter one. Why is that? Why do these opportunities need full integration? Just to step back, the PC era was marked by modularity, you had sort of the chip versus the operating system versus the application, and to the extent there were integrations or money to be made, it was by being that connective tissue, being a platform in the middle and the smartphone era on the other hand was more about integration and doing the different pieces together. It sounds like your argument is that this new era, this machine learning-driven era, this AI era is even more on the integrated side than sort of the way we think about PCs. Why is that? Walk me through that justification.
JH: Simple example. Imagine we created a new application domain, like computer graphics. Let’s pretend for a second it doesn’t run well on a graphics chip and it doesn’t run well on a CPU. Well, if you had to recreate it all again, and it’s a new form of computer science in the sense that this is the way software is developed, and you can develop all kinds of software, it’s not just one type of software, you can develop all kinds of software. So if that’s the case, then you would build a new GPU and a new OpenGL. You would build a new processor called New GPU, you would build a new compiler, you would build a new API, a new version of OpenGL called cuDNN. You would create a new Unreal Engine, in this case, Nvidia AI. You would create a new editor, new application frameworks and so you could imagine that you would build a whole thing all over again.
Just to jump in though, because there was another part in the keynote where I think you were talking about Nvidia DRIVE and then you jumped to Clara, something along those lines, but what struck me as I was watching it was you were like, “Actually all the pieces we need here, we also need there”, and it felt like a real manifestation of this. Nvidia has now built up this entire stack; you almost have all these Lego bricks that you can reconfigure for all these different use cases. And if I’m Nvidia, I’m like, “Of course these must be fully integrated, because we already have all the integrated pieces, so we’re going to put it all together for you”. But is that a function of, “That’s because Nvidia is well placed to be integrated”, or is it, “No, this is actually really the only way to do it”, and if other folks try to have a more modular approach, they’re just not going to get this stuff working together in a sufficient way?
JH: Well, deep learning, first of all, needed a brand new stack.
Just like graphics once did. Yeah.
JH: Yeah, just like graphics did. So deep learning needed a brand new stack, it just so happened that the best processor for deep learning at the time, ten years ago, was one of Nvidia’s GPUs. Well, over the years, in the last ten years, we have reinvented the GPU with this thing called Tensor Core, where the Tensor Core GPU is a thousand times better at doing deep learning than our original GPU, and so it grew out of that. But in the process, we essentially built up the entire stack of computer science, the computing, again, new processor, new compiler, new engine and new framework — and the framework for AI, of course, PyTorch and TensorFlow.
Now, during that time, we realized that while we’re working on AI — this is about seven years ago — the next stage of AI is going to be robotics. You’re going to sense, you’re going to perceive, but you’re also going to reason and you’re going to plan. That classical robotics problem could be applied to, number one, autonomous driving, and then many other applications after that. If you think through autonomous driving, you need real-time sensors of multiple modalities, the sensors coming in in real-time. You have to process all of the sensors in real-time and it has to be isochronous, you have to do it consistently in real-time and you’re processing radar information, camera information, Lidar information, ultrasonics information, it’s all happening in real-time and you have to do so using all kinds of different algorithms for diversity and redundancy reasons. And then what comes out of it is perception, localization, a world map and then from that world map, you reason about what is your drive plan. And so that application space was a derivative, if you will, of our deep learning work, and it takes us into the robotic space. Once we’re in the robotic space and we created a brand new stack, we realized that the application of this stack, the robotic stack, could be used for this and it could be used for medical imaging systems, which is kind of multi-sensor, real-time sensor processing, used to be traditional numerics.
Right. Well, it’s like you started out with like your GPU like, “Oh, it could be used for this and this and this”. And now you built a stack on top of the GPU and it’s like, it just expands. “It could be used for this and this and this.”
JH: That’s exactly right, Ben! That’s exactly right. You build one thing and you generalize it and you realize it could be used for other things, and then you build that thing derived from the first thing and then you generalize it and when you generalize it, you realize, “Hold on a second, I can use it for this and as well”. That’s how we built the company.
So you’ve articulated this vision, and it’s in such a contrast to the traditional general purpose CPU, this idea coming along where you could do something dedicated and do it really well and then you realize there’s all these opportunities to do these parallel problems, and you’ve built this entire ecosystem on top of it. And I get the point where in this world you might want to have your own CPU, like for example, the Grace Architecture. You really focus, as I mentioned earlier, on the memory. It exists to feed GPUs, that kind of seems to be the point of it, which is very striking compared to say Intel, which traditionally is focused on single thread performance. That’s a very different problem space and your contention is like, “Well, actually memory bandwidth and GPU performance matters most for what we’re doing”. All this makes total sense, and you can license the ARM architecture.
Given that, why did you want to buy ARM? What was the connection there where there’s other folks that just want a general purpose processor. Why even bother dealing with that whole ecosystem when you could just build what you need for your new vision of computing?
JH: Three reasons. First, it was up for sale.
Yeah. On the market.
JH: Right, it was on the market. Number two, it is a singular asset. You’ll never find another ARM asset again. It’s a once-in-a-lifetime thing that gets built, and another one won’t get built. The third reason is that we felt and we believed and it’s true that their primary focus on mobile was fantastic, but their future focus ought to be the data center. But the data center market is so much smaller in units compared to mobile devices, that the economic motivation for wanting to focus on servers may be questionable. If we own the singular asset that’s up for sale, and we were to motivate them, encourage them, direct them in addition to continuing to work on mobile, also work on data centers, it would create another alternative CPU that can be shaped into all kinds of interesting computers. Within our ownership, we would be able to channel them much more purposefully into the world’s data centers.
Well, in the last two years, the two companies spent a lot of time together, not in working together, but in seeing the future together. And we’ve succeeded, I think; they naturally also were starting to feel that way, that the future of data centers is a real opportunity for them, and if you look at the ARM roadmap since two years ago, the single threaded performance roadmap of ARM has improved tremendously. So irrespective of the outcome, I think the time that we spent with them has been phenomenally helpful for the whole industry and for us.
So the breakup fee was a strategy consulting fee is basically how it turned out.
JH: Well, you know what? We can always go make more money. It was worth a try.
The other thing that always struck me about ARM, and this is something that you get if you really go back and look at Intel, particularly with Pat Gelsinger being in charge: he was always so articulate about the importance and value of software. Intel, back in the day, they wanted to switch to RISC, and he was like, “No, the ecosystem is already built up. Moore’s Law will catch us up on any performance issues, the software ecosystem is so important.” What always struck me about ARM was that the missing piece was software, and that was where I saw Nvidia coming in. I think you said something when the acquisition was announced, that you’re going to create all this value for ARM, so you might as well harvest it by owning the asset.
I’m curious though, to the extent that was the case where they just needed more software, they needed to have a more integrated model, is that something that Nvidia is still going to invest in? Or do you feel ARM’s going down a good path, Amazon’s now really investing heavily, particularly on the software side, so your approach now will be “We are going to focus on what’s best for us with Grace and feeding our GPUs”? Or do you still see Nvidia contributing to this data center project broadly? Or are you going to just stay focused on what works for you?
JH: We’re going to have to stay focused on what works for us now. ARM sees the importance of investing in software as well and separately, they’ll do what’s in the best interest of ARM overall, but our focus will be on what’s in the best interest of Nvidia of course.
Makes sense. I was going to start with this. A friend of mine told me my first question should be, is this the real Jensen Huang, given your antics at previous keynotes! But speaking of the real Jensen Huang, I wanted to ask you a little bit about your background. You were born in Taipei, which obviously I’m very familiar with, and then you, I believe, came to Kentucky when you were nine. Did you speak English at that point?
JH: We had started to learn English and my parents had put us in international schools to learn English. So we learned a lot more English after we got to the United States, but we were able to converse.
Got it, so you weren’t just fresh off the boat. How much of an impact do you think the experience of being an immigrant had on you? There are lots of immigrants in Silicon Valley, but you’re sort of interesting in that you were an immigrant as a kid. Nine years old is old enough to remember where you came from, but you’re also sort of joining school in fourth or fifth grade. How much of an impact did that have on your worldview? Or maybe I’m just overthinking it.
JH: (laughing) I don’t know that I had a worldview when I came, we came in 1973, there weren’t that many Chinese people in Kentucky, and we were quite strange. And the boarding school we were in, we were fortunate to be able to go because they were open to students from many different backgrounds and many different countries. Oneida Baptist Institute was really wonderful for us because, one, they were quite affordable, and you could just imagine coming from Asia to the United States and being able to go to a boarding school, the affordability was important. But everybody had chores, I was nine years old and my older brother was eleven, and he had to work in the tobacco farms and I had to clean bathrooms.
Who had the better deal there?
JH: You know, hard to say. I don’t know.
(laughing) Neither were great.
JH: I was the only kid cleaning bathrooms, that was a lot of bathrooms. I think that we learned hard work, but it never occurred to us that it was hard work, we just thought that’s what kids do. There’s something about being normalized to a particular condition, you just think that’s the way it is. So we worked hard, there were tough kids, there was a lot of tough talk, one hundred percent of the kids smoked — I didn’t, but everybody else did. There were a lot of kids in trouble and there were a lot of troubled kids, but that was our normal environment and somehow we managed to ignore all that and stuck to what we did.
Two years later, our parents came to the United States. My only recollections of Kentucky were just being happy, of course, we missed our parents. My older brother and I both played soccer, I was on the swim team, I played table tennis, we had a lot of sport, we did a lot of work. There was a Vietnam vet who was the handyman in the dorm and every night, he had his chores to do and he would ask me if I wanted to follow him around. So I followed him around and helped him with his chores, at the end of the night, he’d give me a soda pop. I thought that was fantastic.
When you’re a kid, you just don’t know any better. I think that your expectations are set by your surroundings, and my expectations were set at that time. And that was a great set: I was biased toward hard work, I expected very little, and it doesn’t matter what the circumstances are, you make the best of it. That kind of upbringing, I think, is good.
It’s interesting because this is the first time we’ve talked, but there’s an aspect of being a long time observer of Nvidia where it almost makes me feel like I have a handle on who Jensen Huang is, and that’s because as far as I can tell, Nvidia is basically Jensen at scale. You’ve noted in the past that Nvidia doesn’t have a mission statement. It’s actually very funny, if you go on Google and search for Nvidia’s mission statement, it pulls this completely random text off the internet because the automated systems don’t know what to do about it. But I think you’ve said in other places you just want to give people room to innovate and that you want to be able to do new things and do things with friends. It almost sounds like you set out to create an extension of yourself, you just want to do cool new stuff and it succeeded to the point where you have 24,000 friends doing it along with you. Can you separate Nvidia and Jensen Huang? Am I sort of tapping into something that’s the reality of what Nvidia is?
JH: I would say that my greatest gift is surrounding myself with amazing people and giving them the opportunity to do amazing work and helping them achieve more than they thought was possible, and together as a team, do the impossible. Ben, that’s my gift. Ever since I was a kid, I was surrounded by smart kids and when I was working, I was always surrounded by smart people. And if not for Chris [Malachowsky] and Curtis [Priem], Nvidia wouldn’t be here. I carried the torch longer than anybody, but that’s just because I’m resilient. Once I get on a road, I can stay on that road for a very, very long time and believe in it for a very long time, I’m just resilient that way.
I think what Nvidia embodies is really quite unique in the sense that I don’t know of one computing company in the world today that has the breadth and depth of talent that Nvidia has, just absolutely bar none from our chief scientist to our engineering leads, to our software leads, to our chip design, to system design. They’re all the world’s best, I mean, utterly the world’s best. That just doesn’t exist anywhere today. In no place on earth can a CEO start from a blank sheet of paper, have a dream about something that we want to make together, and from every single transistor and every lick of software and every component of the system build it completely from scratch and it will be utterly expected to work and it will work and it will be world class. I just don’t know that that has ever existed in the world of computing. That’s what Nvidia’s about, somehow we created this environment where people could do their life’s work and they attract and bring along with them incredible people. Somehow in that environment magic happens.
Does that make other companies intimidated? I mean, Nvidia doesn’t always have the best sort of partnership reputation. Or is it just like “We can do it better than you”? Or is it just actually “We have to do the whole thing to make it work”? I mean, it’s like you’re on your road, you see the destination, the destination is far further, it’s going to take far longer than most people can handle, but you are just going to get there and if people don’t want to come along, then that’s fine because you know where you’re going.
JH: Well, first of all, I think that few companies have had the length and depth of partnerships that Nvidia has held and I’ve held in technology. I am probably TSMC’s longest running CEO partner, and surely our company is one of their longest running partners. It’s the same way with SPIL, it’s the same way with Foxconn, it’s the same way with Asus and MSI. The length of partnerships that we’ve held is unprecedented, really. It is the case that we fork off into our own ways; beyond PCs, as you mentioned earlier, somehow we continue to nurture our PC presence and continue to expand our vision for PCs. The work that we’ve done in PCs is something I’m very proud of, but we’re not bound by that. We forked off and started working on accelerated computing for data centers, and we forked off and started working on AI, and we forked off and started working on robotics. We’re not limited by the partnerships that we’re in, but we nurture the partnerships for a very long time, and we keep growing new partners. I think maybe from the outside in we look self-propelled, almost autonomous.
Because you are.
JH: Because we are, yeah. And that’s kind of one of our natures. And we reinvent ourselves at will.
You’re making these massive supercomputers, you talked about them in the keynote this week, and you’re launching sort of the outline of a cloud service, is there a future where you’re fully integrated into being a service and most people use Nvidia stuff by renting it from you for all intents and purposes?
JH: If we ever do services, we will run it all over the world on the GPUs that are in everybody’s clouds, in addition to building something ourselves, if we have to. One of the rules of our company is to not squander the resources of our company to do something that already exists. If something already exists, for example, an x86 CPU, we’ll just use it. If something already exists, we’ll partner with them, because let’s not squander our rare resources on that. And so if something already exists in the cloud, we just absolutely use that or let them do it, which is even better. However, if there’s something that makes sense for us to do and doesn’t make sense for them to do, we even approach them to do it; if other people don’t want to do it, then we might decide to do it. We try to be very selective about the things that we do, we’re quite determined not to do things that other people do.
Well, I’ll leave you the opportunity for the last question. I think the theme that you’ve been touching on, particularly over the last few months is this idea of manufacturing intelligence or being an intelligence manufacturer where the AI is making AI, and how that’s part of your vision. Is that a good summary? I’ll give you a chance to do a better summary and close us out here.
JH: Ben, at no time in history have humans had the ability to produce the single most valuable commodity the world’s ever known, which is intelligence. We now have a structure of a model, a structure of a computer science program called a deep neural network, that has the ability to scale up quite tremendously. It’s doubling every six months, I mean, this is not your Moore’s Law where it’s doubling every two years, it’s doubling every six months. The rate of doubling is incredible, the compounded effect of that on computing is incredible.
The results of the capabilities of these neural networks and the software, another way of saying it, the software that is being created by computers is expanding and growing and achieving spectacular things at incredible rates. Our company is building the computers necessary to continue to advance that journey. I think that what companies are going to come to realize is that what they’re really all about is producing intelligence, and that’s what Nvidia’s really all about is producing intelligence. Some part of every company will automate the production of their intelligence, they’ll codify the production of their intelligence, which is one of the reasons why I believe every company will be an AI company, every company will produce intelligence at some level, all of that AI will be augmenting humans with humans in the loop, and the rate of that progress is accelerating and compounding at 2x every six months.
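The difference between those two doubling rates is easy to check with back-of-the-envelope arithmetic (this is just the math implied by Huang's claim, not a measurement): 2x every six months is 4x a year, which over a decade compounds to 2^20, roughly the "million-x" figure from the keynote, while Moore's Law pacing over the same decade yields only 32x.

```python
# Back-of-the-envelope: compounding a fixed doubling cadence over time.
def compound_speedup(years, doublings_per_year=2.0):
    """Total multiplier after `years` of doubling `doublings_per_year` times per year."""
    return 2 ** (years * doublings_per_year)

# Huang's cadence: 2x every six months, i.e. two doublings per year.
print(compound_speedup(1))    # -> 4 (4x in one year)
print(compound_speedup(10))   # -> 1048576 (about a million-x in a decade)

# Moore's Law cadence: 2x every two years, i.e. half a doubling per year.
print(compound_speedup(10, doublings_per_year=0.5))  # -> 32.0
```

The decade-long gap between 1,048,576x and 32x is the "compounded effect" being described.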
And it’s all going to run on Nvidia chips.
JH: Well, a lot of it is going to run on Nvidia chips, we hope.
(laughing) That’s a good thing to hope for, particularly when you have a good reason to believe that it’s going to happen. What do you say to people who hear you talk about AI and say, “Oh, AI is always 10 years away. Nvidia is just making money on cryptocurrency”? What’s your response: “No, actually it’s a real thing and you should hop on board”?
JH: Well, it’s a real thing because first of all, what is intelligence? Intelligence is the ability to recognize patterns, recognize relationships, reason about it and make a prediction or plan an action. That’s what intelligence is. It has nothing to do with general intelligence, intelligence is just solving problems. We now have the ability to write software, we now have the ability to partner with computers to write software, that can solve many types of intelligence, make many types of predictions at scales and at levels that no humans can.
For example, we know that there are a trillion things on the Internet, and the number of things on the Internet is large and expanding incredibly fast, and yet we have this little tiny personal computer called a phone. How do we possibly figure out, of the trillion things on the Internet, what we want to see on our little tiny phone? Well, there needs to be a filter in between, what people call the personalized Internet, but basically an AI, a recommender system. A recommender that, based on the nature of the content, the characteristics of the content, the features of the content, and based on your explicit and implicit preferences, finds a way through all of that to predict what you would like to see. I mean, that’s a miracle! That’s really quite a miracle, to be able to do that at scale for everything from movies and books and music and news and videos, you name it, products and things like that. To be able to predict what Ben would want to see, predict what you would want to click on, predict what is useful to you. I’m talking about consumer-oriented things, but in the future it’ll be predicting what is the best financial strategy for you, the best medical therapy for you, the best health regimen for you, the best vacation plan for you. All of these things are going to be possible with AI.
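The recommender idea Huang describes, scoring content features against a user's inferred preferences and surfacing the top predictions, can be sketched in a few lines. This is a toy illustration of the concept only, not Nvidia's or anyone's production system; the item names and feature weights are invented for the example:

```python
# Toy content-based recommender: score each item by the dot product of its
# feature vector with a user's preference vector (learned from clicks, etc.).
ITEMS = {
    "cooking video": {"food": 0.9, "travel": 0.1},
    "travel vlog":   {"food": 0.2, "travel": 0.9},
    "tech review":   {"tech": 0.8, "travel": 0.1},
}

def score(item_features, user_prefs):
    """Predicted interest: sum over features of item weight * user preference."""
    return sum(w * user_prefs.get(f, 0.0) for f, w in item_features.items())

def recommend(user_prefs, k=2):
    """Return the top-k item names for this user, best first."""
    ranked = sorted(ITEMS, key=lambda name: score(ITEMS[name], user_prefs), reverse=True)
    return ranked[:k]

# A user whose behavior suggests a strong travel interest and a mild food interest:
user = {"travel": 1.0, "food": 0.3}
print(recommend(user))  # -> ['travel vlog', 'cooking video']
```

Production systems replace the hand-written feature vectors with embeddings learned by deep neural networks, and rank billions of items rather than three, but the shape of the prediction is the same.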
This is going to sound very familiar to my readers; one of the things I write about is the way value accrues on the Internet: in a world of zero marginal costs, where there’s an explosion and abundance of content, value accrues to those that help people navigate that content. What I’m hearing from you is, yes, the value accrues to the people that navigate that content, but someone has to make the chips and the software so that they can do that effectively. It’s almost like it used to be that Windows was the consumer-facing layer and Intel was the other piece of the Wintel monopoly. This is Google and Facebook on the consumer side and a whole host of companies on the other side, and they’re all dependent on Nvidia, and that sounds like a pretty good place to be.
JH: Well, we try our best to be of service to everybody.
A long way from cleaning toilets!
JH: One of the things Ben that I’m really proud of is we’re the only AI company in the world that works with every AI company in the world. We’re a good partner, we make great contributions to other people’s success and our technology is excellent and our progress is incredible. As a result, people enjoy working with us and partnering with us. We can contribute from fundamentals, building chips, to all of the operating system layers of AI, to the AI algorithms themselves. So I think a company of our type has never existed before and I’m really proud of it.
Well, it’s been an absolute pleasure to talk to you. This is one of the longest CEO interviews ever done, but I think it was well worth it. Thanks for taking the time.
JH: Thank you Ben. It’s great to talk to you.
This Daily Update Interview is also available as a podcast. To receive it in your podcast player, visit Stratechery.