An Interview with Daniel Gross and Nat Friedman about the AI Product Revolution

Good morning,

A quick bit of housekeeping:

  • First, I missed this tweet from Elon Musk that clarified that the ‘For You’ tab will also include accounts you follow; that was not clear to me, although I should have checked, and reduces my evaluation of the approach to “probably not a good idea” from “truly terrible idea”, albeit for the same reasons I laid out in yesterday’s Update.
  • Second, we discuss Twitter, that open letter about AI, and TikTok on the latest episode of Sharp Tech, which will be released later today; you can add the podcast to your podcast player using the link at the bottom of this email.
  • Third, as I noted yesterday, I will be on vacation next week; the next Update will be on Monday, April 10.

I first interviewed Daniel Gross and Nat Friedman last October, where a major theme was the lack of AI products, despite the clear capabilities of AI models like GPT-3. We checked in again in December after ChatGPT completely changed the conversation. Well, it’s been three months and the product explosion is well and truly here, so I wanted to chat with Gross and Friedman again to discuss exactly that.

Gross founded Cue, a search engine that was bought by Apple and incorporated into iOS, and led machine learning efforts at Apple from 2013-2017, before becoming a partner at Y Combinator and then transitioning into angel investing. Friedman co-founded Xamarin, an open-source cross-platform SDK which was bought by Microsoft in 2016; Friedman led Microsoft’s acquisition of GitHub in 2018, and was CEO of the developer-focused company until last year; he too is now focused on angel investing.

I did want to call out two neat projects that we didn’t get to in the interview:

  • First, Friedman set up the Nat.dev sandbox, which is like the OpenAI sandbox, but you get access to non-OpenAI models as well.
  • Second, Gross and Friedman created the Vesuvius Challenge to incentivize teams to leverage machine learning to read scrolls from ancient Rome buried under ash from Mount Vesuvius in 79 AD.

I really regret forgetting to ask about the Vesuvius Challenge — there was a lot to get to! — but the website gives a great overview of the project that should be of interest to everyone.

To listen to this interview as a podcast, click the link at the top of this email to add Stratechery to your podcast player.

On to the interview:

An Interview with Daniel Gross and Nat Friedman about the AI Product Revolution

This interview is lightly edited for clarity.

The AI Product Revolution

Nat and Daniel, welcome back for what is now our quarterly AI catch-up, and I just have to put a disclaimer at the top of this interview, which is we are talking on Monday night and this won’t publish until Thursday, so I apologize in advance for the nine, ten major announcements that we’re probably going to miss in these few days. So I just want to get that out up top.

Nat Friedman: Obsolete before it even goes out!

It’s incredible. I was putting together this list and I made it through just last week and I’m like, “How are we going to get through this in an hour?” and that’s not even going back to things like Bing or Bard. However, before we get to all the major announcements, Nat, I wanted to go back to our podcast from six months ago, because it kind of ties into this explosion. Your whole thesis was we need to be talking about products, not just papers, and that was a goal behind AI Grant. I think we’re now talking about a lot more products, so where do you think we are now? Is there still a gap?

NF: Yeah, and it’s amazing. I think Daniel and I were both last summer in this situation where we had spent at that point years playing with these new GPT models and just being blown away by their capabilities, and I’d been in this lucky position at GitHub to get to put together Copilot and put that out, and I expected after that just a flurry of new products as other people went through that same process and discovered, “Oh my goodness, GPT-3 can do all these incredible things. We should build into this product or that product.” That didn’t happen. And so by last summer, early fall, we were scratching our heads saying, “Where is everybody?” That’s why we re-launched AI Grant with this call to action, this credo saying, “Hey, where are all the product developers? It’s time to pay attention to AI.”

Obviously since then a ton has changed, and really it was ChatGPT in December that fired the starting gun, and so I think you could really consider us to be in month three or four now of the AI product revolution. It would be hard to imagine more people integrating these models into products than are currently doing so; it feels like we’re on maximum overdrive.

That said, I think even if the researchers stopped right here and didn’t produce any more capabilities, it would take us something like five or ten years to digest what GPT-4 and all the other state-of-the-art models can do into products. There are so many variations and workflows and user experiences that need to be invented and reinvented, and permutations that need to be tried, and we’ve just started to scratch the surface. Right now we have this narrative that’s out there about value capture accruing to incumbents, but I think part of the reason for that is that we’re just doing the obvious thing: we’re just bolting these models into existing products.

But I think operating systems will need to be rebuilt around these capabilities. The things that we can do with voice now, like incredible voice recognition, super high performance on-device, incredible language models that can do reasoning, the self-checking, the data lookup capabilities, the integrations, the voice synthesis, which is now hyperrealistic and multiple startups have demonstrated that, I think you could take a decade and rebuild the entire computing platform on this. So I would say still, we’re in the state where the researchers are way ahead and there’s a lot of digesting to do, but it’s hard to imagine how we could possibly go faster.

Daniel Gross: Nat, just a question on that. The Internet took two decades, I think, to fully reach a point of maturation and saturation. On the other hand, the rate of growth of some of the companies that find product-market fit in AI is incredible now, I think in part because everything is already fully networked and connected. So my question to you is, do you think it would take a decade? Things that work, work so quickly now. It does seem like all of the reality we’re living in is at 2x to 10x.

NF: Yeah, things definitely feel like they’re going fast and being able to code with GPT-4 certainly makes it faster. I don’t think the diffusion will be slow. I think the thing that still will take time is figuring out what AI-native software actually looks like and not just incrementally improving the existing workflows and software, but building the really AI native things.

I agree with you, Nat. I think what Daniel’s driving at, or you were driving at, is there’s always the V1 of any new technology, and that technology basically says, “Oh, we can do what we did before, but we can do it in this new format.” AI is so compelling that there are going to be huge businesses that do just that. I think the most obvious one — which is exactly why I found the demo compelling — is the Microsoft Office stuff where it’s like, “Oh, your word processor can now write by itself.” Duh, it makes total sense.

But does that mean that’s actually the optimal productivity application of AI? I think probably not, but just like Daniel, you mentioned the Internet — yeah, we had a decade to figure out that a feed, for example, is the optimal way to deliver content. I don’t think that was a function of there being an insufficient number of people using it, it’s a function of the fact that it just takes time to reset and figure things out, and it sometimes takes a new generation that isn’t coming in with the paradigms of the old one.

DG: Yeah, I think that’s right, and everyone’s worried about job displacement, and I think that’s a plausibly real and interesting problem, but to me what’s exciting is that the marginal cost of building software will go to zero, and so there are all these things that are never being built just because they’re too much of a schlep to even consider building. You need a software engineer in order to build it, but if making software can be done with the same ease as literally sketching on a notepad, then there will be just more weird and interesting software, and I think everyone undervalues that non-consumption angle, which should be really exciting.

I just saw a thing on Twitter today about non-consumption — everyone uses the Uber example where people were valuing the market based on taxis, but there’s a better example which I just saw today: they were talking about some analyst notes when Apple was valued at I think it was $200 billion or whatever it was, and they had a sell attached to it because they’re like, “Look, if Apple takes 100% of phone market share, they’re not going to live up to their valuation.” The analysis didn’t appreciate that, number one, Apple would dramatically expand the market, because while the analyst was using smartphone market share, Apple would basically take over all of phones, and number two, their pricing power would be so huge that people would go from buying hundred-dollar phones to buying thousand-dollar phones. It’s such a tangible example of the mistake analysts make about technology areas again and again and again, which is they look at what’s there and then they map it to the new thing. And non-consumption, to your point, is exactly what makes billion, trillion dollar companies.

DG: Trillions, that’s right.

NF: The other thing I would just say is that the capabilities are not going to stop here, they are going to keep going, and the dramatic improvements we’ve seen in capabilities over the last year or two I think are very likely to continue. And those are these big step changes, so even if you do design for March 2023 native-AI capabilities, March 2024 may present you with completely different primitives and tools and it’s going to be a whole new wave of things to digest into products.

The SaaS Hangover

DG: I thought it was interesting because once when we were on here, Nat, you were making, I think, the very credible case that things weren’t actually moving that quickly. I think GPT-3 was a year and a half old, and I think it was Andy Grove who had that metaphor that technology is sort of like a river with rapids at different speeds, so you have decades where nothing happens and then you’re going very quickly. Do you think things have accelerated since we spoke last, and what do you think was the catalyst for that?

NF: Yeah, I think one of the things we talked about last year was this idea that if ChatGPT was your first encounter with language models, then what came next would feel very fast because GPT-4 came out just a few months later, even though we now know OpenAI’s had it for seven-plus months, under wraps, and it’s been almost two-and-a-half years since GPT-3. That said, I don’t know how anybody could feel like anything’s going slow right now, the most common sense that I have from talking to people is vertigo, people feel there’s this sort of dizzying pace of change.

To Ben’s question, where do you plant your stake if the ground is shifting beneath you all the time? That is very common. Daniel, you and I have encountered some founders who are just completely overwhelmed by this and don’t even know where to start and some feel a little despondent because they think we’re back in one of these waves that we’ve previously been in, that Ben will certainly remember, where there’s this feeling the leader could never be caught up with, or is just going to do every single possible business or product. So we definitely hear — we’re doing some therapy with some founders who’ve been exposed to this and they’re not quite sure what to do.

Yeah, people forget, but there’s a bit where your reputation always lasts longer than the reality, and you saw this with Microsoft in the 90s, where everyone was rightfully terrified of Microsoft, and then that terror extended much longer than it should have, by probably ’98 or ’99, it was kind of a spent thing in retrospect, but people didn’t stop being scared until probably the early-2000s or maybe even mid-2000s.

I think that runs in the opposite direction, where Microsoft was not anything worth worrying about or caring about for a long time after that. I still think the Teams versus Slack thing should have been a massive wake-up call to everyone. Part of it is you could chalk that up to Microsoft having better distribution and Teams being free, so there was an excuse to continue with your old viewpoint, and there was an under-appreciation that there’s an integration aspect here that’s super meaningful. I think the importance of that integration is really going to come to the forefront with the business chat thing, where if you have meaningful data for your enterprise and you’re not in the Microsoft ecosystem, you’re going to get shipped out real quickly, particularly when you layer on top of that the fact that Microsoft will have an alternative that is “free”.

So yeah, it does feel tough to be a startup founder now, because if we’re right that the paradigm-shifting innovation will take a while to figure out, that’s probably — I don’t know, I’ll defer to your guys’ perspective since you’re investing in these companies — you need the founders that are going to start from scratch, not try to do what’s already done, but with AI.

NF: Yeah, I think that’s right. I think we’re excited about the founders who are doing new things that literally couldn’t be done before, maybe with a completely new workflow, maybe something that seems a little too weird for the mainstream companies or the large companies to want to approach, and it’s sort of “best of times, worst of times”. Yes, the incumbents are active and they can leverage these large user bases, but there is now an entire new field of companies that are possible that couldn’t have been built before, and I think the really excited and active founders are going to go find those, and then they’ll have to probably man the tiller pretty aggressively to navigate the new capabilities as they come out, but the best founders are going to do that and be excited to do it. So yeah, probably the lazy and obvious startups might be much harder to do than they normally have been.

There’s a bit where just taking existing enterprise functionality and making it SaaS meant boom, you have a billion-dollar company — I don’t think that’s going to be the case with this AI stuff.

DG: It’s true, and I think there’s been a generation of founders bred by that era, which I think to some extent was also a general zero interest rate, or very low interest rate, market boom era. One thing we have seen is that it is taking longer for the innovation pipeline of Silicon Valley to produce phenotypes that are both aware enough of the technology to be interested in it, but also building deeply enough. In a way I’ve often wondered: if the AI revolution were happening with your 1980s, 1990s cohort of founders, I actually think progress might be a little bit faster.

Silicon Valley is really rich with people that are doing things on the margin, things that Microsoft is clearly going to do, and who are a little despondent that now Microsoft is doing it, and I think that’s a byproduct of a lot of these incredibly successful SaaS businesses actually being relatively thin layers. But that’s changing, and I think this is sort of a different kind of revolution, but the market will adjust. And to Nat’s point, we’re really only at day one and we haven’t seen any of this native — we’re still at the cameras-pointed-at-radio-shows era of television, not at your native made-for-TV era. That’ll happen over time, it just takes a while for that to get generated.

I think it’s a really good and important observation though about the nature of Silicon Valley. One thing that’s worth noting is, when did we actually figure out the Internet as an industry? It was after the bubble burst: the feed, search, all those sorts of things. Search did start in the late 90s, but by and large it became a thing after the bubble. The auction model I think was 2002 or so, Facebook comes along in 2004, 2005. That’s probably not an accident, where when the focus is money and the money seems easy, you’re going to take shortcuts to get there, and the most obvious shortcut right now is take a thing that people do and add AI to it. If it’s not so easy, then you actually have to go back to first principles, and while no one’s cheering for a recession or for a bubble bursting, whatever it might be, it is striking to look back at the timing of us figuring out the Internet.

DG: By the way, I think it’s not just a zero interest rate phenomenon thing. I think programming has gotten much easier over the years, and so that’s changed the phenotype of person that starts a company. The degree of technical excellence that was required in order to make a consumer-facing product, even in 2002, 2003, 2004, 2005, 2006, was drastically different than it is today. We did a wonderful thing where we built multiple layers on the cake that make it simpler and simpler to build technology, and you have AWS and you have React, and so you end up getting a different type of person.

I think now, to really excel in AI, you have to be a little bit deep. Back when we were all starting startups, it was a hyphenated term, it was not a proper noun, it was start-dash-up, it was an obscure thing to do, and now it’s a very normal thing to do. That means we end up seeing a lot more things that we don’t do, just because the selection effects Silicon Valley had, in terms of really attracting these technical, brilliant savants and weirdos, are less strong now, and so there are more people in the pool and so selection is a little bit harder as a result. The fact that AI is so hot of course doesn’t help, I think, anyone, but that’s a reality.

One of the theses that I put forward before is that everyone talks about tech having a Big Five, but actually there’s a Big Six. The Big Five are obviously Apple, Amazon, Microsoft, Facebook, and Google, but the Big Sixth is basically Silicon Valley Inc., which is the SaaS-producing machine, where everyone knows the playbook. You get to call yourself a startup founder and feel great about it, but the level of risk is actually very low, the level of technical execution is very low, it’s actually about building a sales team and doing sales, which, to your point, ties into the zero interest rate environment as well, where you were encouraged to actually get super far ahead of your skis: “Fifteen years out, imagine what this cohort is going to be producing for us.” The startup scene was completely different, it was a corporate scene in many respects, and that’s probably the first one of the Big Six that has taken a big hit these last couple years.

DG: That’s right.

GPT-4

Speaking of though, maybe there is a new sixth. I was using the various GPTs, various flavors of them, yesterday, and Nat, you mentioned that it was sort of an oddity that ChatGPT came along when GPT-3 was basically already obsolete and became this huge hit, and then boom, suddenly GPT-4 comes out. I think at first blush it feels fairly similar, but I have to say GPT-3.5, the default model and legacy model, feels really ancient when you’re actually using GPT-4 in ChatGPT. It’s really hard to get GPT-4 to hallucinate, and it’s just much more cognizant and coherent about things in general. And I don’t know, it seems like a pretty meaningful shift from an end user perspective. What’s your perspective from the API side — you have been using it longer, I think, than any of us. So what do you think?

NF: Yeah, Daniel and I had — OpenAI was good enough to give us early access a few months ago. So we’ve had a chance to play with it for a while and I think I agree with you that it is just smarter.

You feel like you’re talking to a higher IQ person. That’s what it feels like.

NF: Yeah, it’s just smarter. It’s always been a bit slower, I think there have been sort of spurts where it got really fast, I guess they might have provisioned a lot more GPUs to it at various points for demos or something, but even when it was slow during those last few months, I found myself in that position of asking, “Okay, am I going to reach out to my pretty smart fast friend, or my noticeably smarter but much slower friend and ask them?” And I found myself reaching for the smarter friend almost all the time and I’ll just tab away and I’ll come back to that browser window with the answer in a minute and a half and that’s just fine.

DG: Nat, question for you on that, one of the innovations of Copilot was the trade-off of a faster model that was dumber and the idea of auto-complete helping people be the product in a pre-AGI world. How does that factor into a smarter, slower model? Is there a trade-off point for this stuff?

NF: Yeah, I think we’re figuring out what the value of intelligence is a little bit here, and it’s interesting what we’re finding. The scaling laws that lead to GPT-4 being better than GPT-3 have a logarithm built into them, so you have to put exponentially more money in to get linear returns in model quality or improvements in loss, and that’s just something everyone knows to be true. But I think what we’re also finding out is that small improvements in the model’s IQ probably lead to outsized improvements in the value of the model. So you have this sublinear improvement in loss, but maybe a linear or even super-linear improvement in model value, which might fully swamp that logarithm, and that’s kind of what I found in the last few months.
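[To make the shape of that trade-off concrete: the published scaling-law papers model loss as a power law in compute. A generic illustrative form, not any one paper’s exact fit, is

    L(C) \approx L_\infty + (C_0 / C)^\alpha

with a small exponent (Kaplan et al.’s 2020 fits put it at roughly 0.05 for loss versus compute). Halving the reducible loss then requires multiplying compute C by 2^(1/α), an astronomical factor, which is the “exponentially more money for linear returns” Friedman describes; his observation is that the value of the model may nonetheless rise linearly or faster with those hard-won gains.]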

The thing that I find myself using it for the most — and many people have had this observation, it was hard to keep this quiet over the last few months — is just writing code, which is unbelievable. With GPT-4, my friction to start a project is almost zero now. I’m fearless, I’ll write in programming languages I’ve never used before, with concepts I don’t fully understand. I still have to guide it. Everyone loves the one-shot examples where you just ask it to do something and it works out of the box; I find that’s very rare. It does happen, but for really useful things it’s quite rare; it’s more like ten or twenty back-and-forths with the model.

DG: What is Wolverine?

NF: So that’s the back-and-forth that people talk about, and I think here we start to get into, maybe we’ll touch on this later, but things that are exciting and slightly frightening.

One of the ways in which you use GPT-4 is you ask it to write some code, and then eventually, after you begin to trust it, you just copy-paste that code into your editor and run it without really fully understanding it. Then an error message pops up, and because you didn’t really understand the code, you don’t fully understand the error. You could go in and understand it, but instead you copy-paste the error back into GPT-4 and say, “What’s up?” And then GPT-4 says, “Oh excuse me, I made this error, here’s the updated code.” And then you copy the updated code back into your editor and now it works. So there’s a moment there, when you’re copying and pasting between two windows, where your role in this entire system is the copy-paster, and you think, “Shouldn’t the computer be doing this mechanical moving of things back and forth?”

So there’s a Twitter account, I think it’s an anon — all the best AI accounts are anons — and I think the handle is @Bio_Bootloader, and he or she or they came out with a system in Python that just automates this back-and-forth, and they call it Wolverine.py. So you can basically run Wolverine.py with any Python script, and if it throws an exception, it will ask GPT-4 to rewrite the code to fix that, and it’ll continuously do that, self-healing your code until it works. This is one of those demos that’s incredible, but you can already feel the degree to which we’ve sort of taken our hands off the wheel and let the AI drive, and we don’t know exactly where it’s headed.
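[For the curious, here is a minimal sketch of the loop Friedman is describing. This is not Bio_Bootloader’s actual Wolverine code; the prompt wording, retry cap, and use of the pre-1.0 openai Python client are illustrative assumptions.]

    import subprocess
    import sys

    import openai  # assumes the openai package (pre-1.0 API) and an API key are configured

    def ask_for_fix(source: str, error: str) -> str:
        """Hand the failing script and its traceback to the model; get revised source back."""
        response = openai.ChatCompletion.create(
            model="gpt-4",
            messages=[
                {"role": "system",
                 "content": "You repair Python scripts. Reply with only the complete corrected file."},
                {"role": "user",
                 "content": f"Script:\n{source}\n\nIt crashed with:\n{error}\n\nReturn the fixed script."},
            ],
        )
        return response["choices"][0]["message"]["content"]

    def self_heal(path: str, max_attempts: int = 5) -> None:
        """Run a script; on any crash, feed the error back to the model and retry."""
        for _ in range(max_attempts):
            result = subprocess.run([sys.executable, path], capture_output=True, text=True)
            if result.returncode == 0:
                print(result.stdout)
                return  # the script ran cleanly
            with open(path) as f:
                source = f.read()
            with open(path, "w") as f:
                f.write(ask_for_fix(source, result.stderr))  # overwrite with the "healed" version
        raise RuntimeError(f"{path} still failing after {max_attempts} attempts")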

We’ve talked previously that code is particularly well suited to this in part because it’s well-structured, it’s a fairly bounded space, and because it has to actually run, there’s error checking inherently built-in, and I think we’re definitely seeing that play out. But this bit about you just give in and learn to trust it and take your hands off the wheel, I think is a super important point, because that’s part of why GPT-4 I think is so compelling to me — I feel more inclined to trust it.

Then you add in the plug-in stuff. With GPT-4, if I know that it invoked Wolfram|Alpha, I have no question about its accuracy at all, right? People are understandably hesitant, everyone hears about the errors and they hear about the hallucinations, and people are like, “Oh, that’s going to hinder adoption”, but people used to say, “People would not buy stuff online because your credit card’s going to get stolen”. I think that tipping point comes pretty quickly, and once it comes it’s like a tidal wave going over everything.

NF: I think this is right, and a few thoughts there. One thing we learned when we were developing Copilot — and this was nigh two years ago, back in the dark ages of large language models — is that as you’re trying to find an AI product, the demo is always mind-blowing: you can easily cherry-pick a couple of outputs and produce an unbelievably mind-blowing demo. The proof is in the daily use, in how it holds up when you use it every day.

I will say GPT-4 is very, very good, it’s an enormous step change, but it does still hallucinate and it does still make mistakes. It does so less, noticeably less, but it can’t one-shot every problem, and you do have to be in the loop. But the big deal, I think (and we get to some of the recent Microsoft releases here), is that when you’re going back-and-forth between this browser window that has GPT-4 in it, through ChatGPT or the OpenAI playground, and your code editor window, you’re just begging for a new workflow. Copilot, to me, started to feel obsolete.

So now, I think last week, the GitHub team released Copilot X, which integrates the chat into VS Code. I don’t think it’s out yet, but they at least showed some videos and teased it, so I guess it’s coming soon. My guess is it’s constrained by GPU capacity for inference; you’ve seen OpenAI throttle more and more access to GPT-4 as demand has dramatically exceeded their expectations, I’m sure.

OpenAI and the Consumer Opportunity

Actually on that point, I do want to get to Copilot X, but I mentioned the Big Five tech companies, and it does feel like the plug-in announcement cemented this to a certain extent: we are well on our way to a Big Six. The big surprise is that OpenAI was, as you said in our first podcast, very research-oriented and more about producing this output as opposed to products per se — nope! Turns out they’re a consumer tech company now, whether they wanted to be or not.

DG: The free market had its way.

Yeah, I mean ChatGPT, just the speed and intensity of the adoption basically left them no choice. It’s like, nope, sorry, you’re in Apple’s league now.

DG: American capitalism. That’s right.

The way to think about it is we’ve talked about industry structure — is there going to be a centralized player, is there going to be an aggregator, that sort of idea. The crazy thing about the plug-in announcement is that not only does it, in my estimation, fundamentally change the answer to that question, but, to go back to the Wolfram|Alpha thing, it changes my perception and feeling and confidence in using it in a super meaningful way.

I was trying to explore this idea yesterday, I’m not sure how well I did it, but to the extent large language models are so human-like, they have the same limitations as humans in that they do make stuff up, they don’t know everything, and they need a computer to figure stuff out. Now ChatGPT has its own computer to figure stuff out, which is this plug-in architecture. But you can play that all the way through to a business model. Consumers could buy plug-ins or install plug-ins, and if they don’t choose a plug-in, suddenly companies can bid to be the default plug-in. So if someone does a travel search, is it going to be Expedia or is it going to be Kayak? They’re going to have to bid for that and they’ll pay an affiliate fee. How can this not be a huge consumer tech company at this point?

NF: Yeah. I think the advice I would give, if asked, is that OpenAI’s platform, where they’re selling these API tokens, is probably not the future of that business; it’s this lowest-common-denominator, Home Depot-selling-lumber type of business, where every token has to be sold for the same price no matter how valuable it is. ChatGPT clearly could be a multi-billion-user product that eventually gets integrated into people’s devices and used in many different ways.

OpenAI could build a phone. Like that is actually a potential branch on this tree, given where they’re at right now in March 2023, which is an insane thing to say, because the possibility of anyone other than the current incumbents building a device has seemed nuts for years, but that speaks to, I think, where they’re at.

NF: It’ll be completely unsurprising if it has a billion monthly users and maybe 300 million daily users at the end of this year. I don’t know if it’s true or not, they haven’t told me, but I’ve heard that they have between one and two million subscribers for ChatGPT+. Again, don’t know, can’t tell if that’s true or not.

At $20 a month!

NF: Yeah, I mean that would mean it’s kind of getting between $200 and $400-500 million a year. If so, that must eclipse the API revenue dramatically and not to mention it’s just a more valuable ecosystem position to be able to roll out these features and use the data that comes from ChatGPT. So yeah, I think maybe they should rename the company ChatGPT or something.

I saw that on Twitter, I think it’s a good idea. Well, the other thing is that I thought the Codex cancellation (and then they walked it back) might’ve killed their API business in the womb even if they wanted one, because why would you build on ChatGPT — see, I called it ChatGPT — why would you build on OpenAI when Microsoft is going to have the same API and they’re not going to kill anything?

There’s a weird sense in which OpenAI is in a competition with Microsoft in the API space that they structurally just cannot win, and it’s not even a good business for them. It’s sort of a distraction, and the margins are not going to be anything close to what they’re going to get on the consumer front. They’re sitting on top of the most difficult thing to build in the world, which is a dominant consumer platform. That seems exactly where they should go. (laughing) I’ll take the silence as agreement.

NF: I agree.

DG: I think it’s the obvious thing to do. I imagine internally there’s a cultural digestion moment that’s happening now where the consumers are really telling them what they need to do, and at some point, after enough days of “chewing glass and staring into the abyss”, to quote Elon Musk, they’ll choose success.

It’s a very different kind of company, a consumer company versus “We’re just going to build this model and have an API”. The latter is easy in a way. It feels like they need to build a completely new organization. There should be a ChatGPT app by now — sorry, it’s been four or five months, there’s been a million people that have built wrappers to date.

DG: But one thing my boss at Apple, Eddy Cue, used to say was, “It’s important to make the important things good.” Which was his way of saying, implicitly, that not everything needs to be great. So I think the OpenAI perspective on this would probably be, “Look, if the assistant’s really good, people are going to use it from their browser”. Even if it doesn’t have browser rendering, people are just going to use it.

And that’s true.

DG: He would say this when we would talk about App Store performance, because at the end of the day, in the early days of Apple, to some extent still now, as everyone knows, the App Store was terrible, did not load fast at all, but if the phone is good, it really doesn’t — I mean no one likes saying this publicly — but it really won’t matter. The phone just needs to be really good.

So I think in OpenAI’s case, their organizational truth is, look, at the end of the day, the polish around the ChatGPT website and the app just doesn’t matter. What matters is that it’s the best agent with the widest plug-in ecosystem, and the smartest, most accurate, fastest advice. That’s all that matters. I think what’s happening now, with every single day that goes by, is not really a network effect from a data standpoint nor a network effect from a user standpoint, it is a network effect from a brand standpoint. People are walking around and they’re saying ChatGPT.

It’s Google, it’s the new Google.

DG: Google, that’s right. It’s a word and just like the word Google, it’s a little bit weird, but it sticks in your head. And so when the second, third, fourth and fifth place come up, unless they come up now, this month or next month, it’s just going to be too late, I think, for the consumer thing, because you’re going to be an afterthought, unless you have a particular niche or specialty or whatever, the LexisNexis equivalent in Google parlance, but that’s what I think is going on now. It’s a fight to become a box in the customer’s brain of the agents that you talk to, and every single day they’re acquiring more people that just “ChatGPT it” — it’s a verb, it’s a proper noun, and that’s what they’re winning and that’s all that matters, I think.

Microsoft and Bing

That’s exactly right. I completely agree; I’m glad we waited for you to weigh in, because that was an observation that was worth it. But to that point, Microsoft’s in the opposite boat when it comes to Bing, and there are a few angles on Bing. I think big picture, Daniel, your observation is the most important one, which is that if you asked any consumer, number one, they probably don’t know about Bing Chat. Number two, if they do, they know that Bing has “ChatGPT”, which sort of gets to the point.

Nat, I am curious, are you surprised that Microsoft has sort of stuck with it, even though it’s only been a few weeks, but obviously it was a very hairy first week — I might have contributed to that a bit — but has that been a surprise to you?

NF: It has, and it’s taken me a little time to try to understand what’s going on over there, because when Bing Chat had those sort of moments of amusing or even slightly frightening behavior, I thought we’d see a little more caution from Microsoft afterwards, or some apologetics, or things like that, and we really didn’t. They’re just kind of at a fever pitch over there, obviously going gung-ho and rolling this stuff out as aggressively as they can. Frank Shaw tweeted last week that this is going to be another busy week, so probably by the time people are hearing this, there are more announcements that we don’t know about yet. But I’ve tried to think about psychologically what may be happening there, obviously this is armchair remote psychology, but —

Right, but you were there.

NF: Yeah, I was there. So the company that Satya [Nadella] joined decades ago now was a company that was absolutely holding all the cards. They had DOS, they had Windows, they had Word, they had Excel. They were really kind of standing astride the entire industry in a very, very dominant position and he got to enjoy that for a while. Then Microsoft spent multiple decades on defense, and it was defense against the Internet and defense against the web and web apps, and defense against phones and defense against cloud, and web search of course as well. Microsoft has been a kind of number two player underdog in each of those categories and they’ve settled into a recent equilibrium as not a consumer technology company, but a business-to-business technology company playing defense on all these trends, but really helping incumbent players stay relevant, in the same way Microsoft itself has managed to stay relevant.

What’s happened now is that Satya, I think, finds himself at a company that’s much more similar to the company he originally joined. It’s got all these great things, it’s got GPT-4, it’s got Copilot and this whole concept of a Copilot, and so I think he feels like they’re back and they’re going to behave with the same aggression, excitement, and optimism that they had thirty years ago when he joined the company.

They were like a young handsome man that sort of got fat, couldn’t fit in the old jeans anymore, but could never bring himself to throw them away. And now they’ve gotten right and tight and back in shape, and the jeans are sliding right back on; they’re ready to go.

DG: Because of Ozempic!

NF: Then the other thing, a couple other sort of psychology points here may be worth being aware of is that Satya, he had many large jobs at Microsoft before becoming CEO, but one of his first big ones was running Bing, and he ran Bing in the era I think of 1% market share and grew it, but it was a tough fight against the dominant Google, and so there’s a way in which I think he’s back and Microsoft’s got a chance to take share and I think they’re excited about that. Then by the way, this is all my speculation, and I know and respect all these people over there, so I’m just guessing based on what I know, the other thing though is that there’s a degree to which he and the Microsoft leadership team are kind of playing with house money in the sense that, what was the stock price when he joined, $40, $50?

$40. I know very well because I had to forego my remaining stock grants and then it immediately went much higher.

NF: Now it’s $270, $280, I don’t know exactly what it is, but it’s a lot higher, and he’s added trillions in market cap, or at least a trillion-and-a-half or so, so I think there’s probably an element of legacy and all of that here too. So I was surprised when they didn’t shrink a little more at some of the criticism, but they do seem gung-ho and it’s showing up everywhere; it’s clearly kind of the paradigm.

Yeah, but I’m not sure that Bing is going to ultimately be a thing, just to Daniel’s point; ChatGPT seems like the clear winner here in the consumer space. I think the plug-in architecture feels much more elegant than whatever it is Bing’s trying to do. Bing, I think, is limited by trying to have it be a part of search, not just because Bing search is bad (which it continues to be, as I’m reminded now that I’m using it much more), but also because the UI is just weird; it feels tacked on because it is.

But that doesn’t mean the technology won’t be meaningful, and it’s like Apple and Google back in the day: both of them could have just done what they were good at, but instead they got into this unfortunate fight where they infringed on each other’s space. OpenAI and Microsoft are obviously partners, they’re joined at the hip regardless, but it does feel like, from a product perspective, there’s an obvious way to split this pie.

NF: Yeah. I don’t think Microsoft is amazing at new user experiences and things that require a lot of taste and aesthetic tuning, but they’re great at B2B and so I expect that pattern to play out in this new AI era also.

And they’re great at being a platform. I’ve been relatively more familiar with and positive towards Microsoft in part from having been there, and this is the company I’ve been the most right about so I get to be biased in that regard, but if I’m building a startup, I would rather build on the equivalent Microsoft API than basically any other company’s in the world, because that’s literally what they do: they build APIs and support them for forty years.

NF: Yep.

DG: I mean you’re totally right that it’s sort of funny that the company that should be doing consumer is really excited about enterprise and vice versa.

NF: Exactly.

Google

What about Google? Bard, congratulations to Google, they have finally launched a product. It’s out there, it’s able to be used. It does feel like they are feeling the weight of being second super heavily, I think in two regards. Number one, they announced the integration of AI into Google Docs and no one cares, because it’s like, “Yeah, let’s see you ship something”. Whereas Microsoft, because of Bing, has gained the benefit of the doubt: “Yeah, okay, this is definitely coming. I can see it.” But then number two, without Bard being astronomically better than ChatGPT, it is basically, by default, going to be considered worse, whether it’s actually worse or not, just because it feels like it’s coming in late.

DG: It’s late, and the insiders think it’s a joke, and so that spreads to outsiders. It did have a couple of tricks up its sleeve; the fact that no one noticed them, I think, speaks to your point. No one cared or noticed that it was current, which is a big deal. ChatGPT’s current as of November 2021, which I wish were true for my brokerage account. Obviously a lot has happened since, and Bard is current up to today, but no one cared. Bard is faster, but it doesn’t stream the tokens out, so it actually appears slower; no one cared.

That’s really interesting. I can’t decide which one I like better. I feel like intellectually I like the Google one better because I can see that it’s faster overall, but there is a feeling where ChatGPT immediately starts writing and you sort of sit there and watch it.

DG: The big psychological irony is, at least when I was at Apple, we all looked to Google as a business that really understood the value of very fast response time, and in search you can really quantify it. The difference between 300ms and 100ms is a big deal, and Google can see it in the number of subsequent queries people make and then subsequent ad spend. The big irony to me, which sort of reeks of whatever metastasized cancer is working its way through Google, is that the business that was obsessed with speed was unable to deliver on this very simple trick to make things appear faster. I don’t know, it’s sort of like a diamond dealer who is losing his vision: an obsession with speed is the key thing you need if you’re a search engine, and the sense of slowness is a real sign of a lack of health.

NF: My theory about why they did that is, I’m sure you did notice, Ben, I think I saw some videos from you of this, but Bing Chat would say something offensive and then it would start deleting its own words and say, “I didn’t mean to say that.” Sort of like someone with a really bad temper who’s constantly apologizing for blowing their stack.

Deleting their tweets.

NF: Yeah. Deleting their tweets all the time, and I think that’s exactly why Bard did that is just so they don’t have this thing where it says terrible things and deletes them, they wanted something a little safer.

But that makes Daniel’s point. That speaks to a company that is concerned more about screwing up than about winning. You talked about Microsoft being on the defensive; Bing may not win this space, but it’s a whole lot more fun to have nothing to lose, and Google has everything to lose, and having everything to lose is a tough place to be.

DG: Yeah, right. And there are all those, I think valid, observations that Google was built on the understanding that search was forever going to be a high margin business, and as we shift search from a very cheap I/O operation to a synthesis of information, which is more CPU- and GPU-bound, the cost of every query, or most queries, goes up and therefore the margin goes down. That’s not an issue for Microsoft, which has been monetizing Office forever and for which this is an afterthought, but it could be an issue for Google. Not a material issue, but an issue that the Street would notice, and Satya has done a lot of interviews now where he seems to have noticed this very crucial fact.

He wants to make sure everyone else notices it.

DG: Yeah, and he wants to make sure everyone is watching extremely closely in the next earnings, and I do think it’s sort of a material issue for Google, though I think the market is a little bit overreacting to it. One thing we’re learning from OpenAI is just how efficient you can make the models, given time; because there’s a massive GPU shortage, there is actually a lot of market pressure to make things like TurboGPT, which is OpenAI’s slightly dumber, slightly faster model. So I actually think Google could make this work, and not all queries need a GPU spin-up to begin with.

But the real issue is Google is a company seemingly without a founder acting like a founder, and there are, throughout history, free market examples where the non-founder had founding moments, maybe Howard Schultz with Starbucks, but for the most part these moments in history I think really probably come down to four or five people in a room and what they decide to do. And seemingly in Google’s case, those people have not gotten together and decided to do the thing, and if they don’t it’s sort of a car on autopilot and it’s just going to go where it’s going to go.

This isn’t a critique of Sundar Pichai the person, it is the tangible human example of everything to lose and nothing to gain. Literally, the only possible long-term legacy for him is that he lost search, right? Because Larry [Page] and Sergey [Brin] are going to always have the credit for building it, and so when you have a company in general that is large and dominant — and you could go back and say the same thing about Steve Ballmer, and if anything, this is a reason, sort of as an aside, why Tim Cook probably deserves even more credit than he gets, by virtue of it being such a trap — it’s not even a trap, because it’s like an inevitability: your car is hurtling along and there’s a cliff in front of you, you’re going to go over the cliff. It’s just what happens to big companies.

DG: That’s right.

NF: Yeah. It really feels like Google was actually built for this moment and just because of internal issues, culture, leadership, they’re just unable to seize it.

Yeah, it’s tough. But again, you’ve seen it happen before, you’ll see it happen again, I’m sure.

Open Source and Apple

What about Apple and Amazon? As long as we’re here, both seem like they should be heavy investors in the open source ecosystem; that broadly fits their models, I think. At the same time, can you get away with not having your own model? Anthropic launched Claude, which hasn’t gotten much buzz, maybe because it’s not broadly available. But isn’t there going to be a bidding war between Apple and Amazon for Anthropic? That seems sort of the obvious outcome to me.

NF: I’ll let Daniel take the Apple part of this.

DG: Oh okay, I’ll take the Apple part. Well, Apple will famously tell you, if you ever call them for M&A, that they don’t do a lot of M&A, and then they go off and buy Beats, so there are clearly exceptions to the rule. Look, I think Apple’s culture and philosophy, at least to the extent I remember it (I haven’t been there for a couple of years), is very much last-mover, not first-mover advantage. Apple was not the first music player, PC, or mobile phone, just the best, and so I don’t think they are in any particular rush. I think if OpenAI were to launch a phone and it were to suddenly start stealing share from Apple, which I don’t think —

I don’t think it’s happening any time soon to be clear, I was more remarking about how incredible their mind share already is.

DG: I have all these little thoughts in my head of crazy things that can happen, and then there’s a little voice in my head that says, “That’s for sci-fi. That can’t really happen.” And then time and time again, Brexit happened, Trump happened, COVID happened, AI is happening. So I really don’t know. But that said, I think Apple is just going to wait.

Now, in terms of the Anthropic bidding war thing, how should I put it: M&A proclivities aside, I think there’s an open question as to how long it will take the incumbents to believe that they can’t build it internally. And usually, in any market cycle, there are a couple of quarters where the internal thing has to fail before they can really pay up the price. Who knows? But I think they’re probably going through that exercise now.

I also think it’s not really clear, even to me, how far ahead a company like Anthropic is of open source. I actually don’t know, maybe Nat would disagree with me on this. But I think we go through these fits and starts where open source, everyone feels like it’s five years behind, then it turns out it’s two years behind, then it turns out it’s a year behind, so we go through these phases where the gap widens and narrows. If you are Apple and you need to build an Anthropic-like model today, I don’t even know that you can’t do it. Nat, I don’t know if you would disagree with me.

NF: Yeah, I think the thing I’m thinking about more with Apple is just that we’ve barely begun to use the capabilities of the existing hardware for running these networks.

DG: Definitely.

NF: So I think one of the big events that a lot of us have been talking about and waiting for was the release of an open source text model that you could run locally and play with, the sort of Stable Diffusion moment for text, and we had that just a couple of weeks ago. It came from a totally unexpected place, or really two totally unexpected places. One was Meta, which released this LLaMA model with a non-open source license, but they made the weights available to researchers, and it was trained using the best available techniques, and they had every size of it up to 65 billion parameters and it’s very, very good. And of course, the weights immediately made their way onto BitTorrent, and then what happened afterwards?

Now they’re sitting on a hard drive! Fell off a truck…

NF: What happened next was exactly what happened with Stable Diffusion last summer, which is the open source community started optimizing and tweaking and toying and playing with it, and I think one of the big events was Georgi Gerganov at Sofia [University] coming out with llama.cpp. He had previously released an optimized inference engine for Whisper, called whisper.cpp. He took some of the same techniques he used to build that, plus this technique of four-bit quantization of these language models, which he had learned from Fabrice Bellard who did this for TextSynth, and got LLaMA running on an M1 MacBook, and as a consequence on an iPhone and a Raspberry Pi. So we had the combination of a state-of-the-art language model with weights available and the creativity of the open source community. Subsequently some folks at Stanford fine-tuned it using RLHF and some available human feedback data sets into a model they called Alpaca.

Well, they trained it using OpenAI’s API.

NF: Did they? Okay, I didn’t know that. I thought they used LoRA.

DG: You both are correct. LoRA was the actual fine-tuning method, and I think it is true that some of the prompts it was fine-tuned on were generated using TurboGPT.

NF: Oh yeah, sure. Okay. Yeah, that’s right. They did take some tokens from OpenAI models and use them to fine-tune.
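[Since LoRA keeps coming up: the idea, from Hu et al.’s 2021 paper, is to freeze the pretrained weights and train only a small low-rank correction alongside each frozen matrix, which makes fine-tuning a big model dramatically cheaper. A minimal PyTorch sketch of the technique (illustrative, not the Alpaca training code):]

    import torch
    import torch.nn as nn

    class LoRALinear(nn.Module):
        """A frozen pretrained linear layer plus a trainable low-rank update B @ A."""
        def __init__(self, in_features: int, out_features: int, r: int = 8, alpha: int = 16):
            super().__init__()
            self.base = nn.Linear(in_features, out_features, bias=False)
            self.base.weight.requires_grad_(False)  # pretrained weight stays frozen
            self.A = nn.Parameter(torch.randn(r, in_features) * 0.01)  # down-projection
            self.B = nn.Parameter(torch.zeros(out_features, r))        # up-projection, zero-init
            self.scale = alpha / r

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # Only A and B receive gradients; with r much smaller than the layer
            # width, the trainable parameters are a tiny fraction of the matrix.
            return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale

    layer = LoRALinear(4096, 4096, r=8)
    trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
    print(trainable)  # 65,536 trainable parameters vs. 16,777,216 frozen ones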

The great thing is that all these are in flagrant violation of licenses, which does feel like the old days of Silicon Valley. So maybe we’re back, baby.

NF: Yeah, but the consequence of that is that you can run, fully locally on your laptop, a 13-billion-parameter model that is ChatGPT-tuned, and you can talk to it on an airplane; people have reported doing this. The interesting thing about that is it’s pretty fast actually, but it doesn’t even use Apple’s neural engine, it’s only using the Metal performance shaders and some other tricks. So there’s probably another, I don’t know, 3 to 5x left in there once someone optimizes that. This is part of the capability overhang we often talk about: all these permutations that haven’t yet been tried. Someone will figure out how to run at least some layers of these models on the neural engine and get huge performance improvements, and we’ll have even more powerful local models in the near future.
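[Since four-bit quantization is doing a lot of the work in this story, here is a minimal numpy sketch of the idea: blockwise symmetric quantization in the spirit of llama.cpp’s 4-bit formats, not Gerganov’s actual code.]

    import numpy as np

    def quantize_q4(w: np.ndarray, block: int = 32):
        """Blockwise symmetric 4-bit quantization: each block of 32 weights shares
        one float scale, and each weight becomes an integer in [-7, 7]."""
        blocks = w.reshape(-1, block)
        scale = np.abs(blocks).max(axis=1, keepdims=True) / 7.0  # per-block scale
        scale[scale == 0] = 1.0  # avoid dividing by zero in all-zero blocks
        q = np.clip(np.round(blocks / scale), -7, 7).astype(np.int8)
        return q, scale

    def dequantize_q4(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
        return (q.astype(np.float32) * scale).reshape(-1)

    # 4 bits per weight plus one fp16 scale per 32-weight block is ~0.56 bytes per
    # weight, versus 4 bytes for fp32: a 13B-parameter model shrinks from ~52 GB
    # to roughly 7 GB, small enough to fit in a laptop's memory.
    w = np.random.randn(1 << 20).astype(np.float32)
    q, s = quantize_q4(w)
    print(np.abs(dequantize_q4(q, s) - w).mean())  # small mean reconstruction error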

DG: And we should mention local models are exciting not just because they’re privacy-friendly, local, and available to anyone, but because there are also use cases that only emerge at very low response times. Even in the conversation we’re having now, you can interrupt me instantly, I will stop talking, I will listen to you. Those modalities are pretty hard to do when you’re talking to someone with a half-second or two-second delay. So I actually think the comfort level that people have with a lot of these models will grow when they become more local; new products are possible that you just wouldn’t really want to use if the models weren’t local.

Yeah, and I think I’m pretty optimistic about Apple. We talked about this previously so I don’t need to rehash it, but consider their speed of response to Stable Diffusion: not just releasing their own optimized version to run on their hardware, but actually releasing an operating system update to make sure it was used meaningfully. The natural extension of that down the line is tuning whatever their preferred model ends up being all the way down to the chip level. We all thought, “Well, when’s the LLM moment going to happen? Is that going to be possible locally?” Sure can, Raspberry Pi here we go!

Why do you think that the LLaMA model is where this happened instead of FLAN? Is there a difference in quality or was there the sort of illicit bit that “This comes from Facebook”, or what do you think?

DG: That’s an awesome question.

NF: I don’t know. I thought Georgi was going to do it for FLAN-T5 first and then I think LLaMA came out and it’s bigger.

The hot new thing.

NF: And it’s the hot new thing. Yeah, exactly. So I think that’s just where it happened first. I’m not sure if there’s something about the architecture of the LLaMA model that made it easier to do this, I don’t know the answer, Daniel may know.

DG: At the point at which you take these sort of raw, unformed pieces of clay and turn them into a useful jar, this fine-tuning thing, the market of people that have the lexical ability to do that and to slave over the fine-tuning data, but also have the ability to run these PyTorch models, is really small, so you end up with these inefficiencies of “Why X over Y?”. Well, the scene was just in another space that week. To your point, LLaMA was the hot thing, it had a funny name, and so a bunch of people at Stanford did it. I am pretty sure that if those people at Stanford truly applied themselves to T5, just like FLAN did to T5, they could have done a better instruct version of it.

AI Personality

DG: It’s sort of funny that the market’s not efficient that way; it’s just that we’re in this era where the truth is fine-tuning, taking this raw model and making it something you can converse with that has a personality. That is actually a design problem, not a hardcore engineering problem, and there are no design tools for it yet; this will emerge over time. There’ll be the equivalent of Word for these models, where people who have the design sensibility, your Aaron Sorkins of the world, will be able to write instructions to those models, but that doesn’t exist yet. So you end up having a very small number of people that can live in both worlds and do both. It’s very reminiscent of the early days of iOS, where there were just very few people that knew how to make really polished apps but also knew Objective-C.

That was a real edge for the companies that had it, a lot of them were old-school Mac developers, and that grew over time. Now with React Native, designers can make beautiful apps, and apps just get more beautiful. So the reason I think the market’s so inefficient now is you just don’t have tools for fine-tuning, which is ultimately very much a sort of lexical aesthete job where you have to look at the right words and really slave over the fine-tuning data: is it nudging the model in the right way? Very few people have both yin and yang in their head.

You mentioned before there’s no Walt Disney or Steve Jobs. Is that sort of what you’re driving at?

DG: Yeah, I think we’re entering this sort of odd area of AI where things are getting pretty big. ChatGPT, we were saying, might have a billion users at some point in the next twelve months, and the sad thing to me, and actually the really alarming thing to me, is not the capability of the models or whether they’re connected to the Internet or not. To me, it’s the fact that no one has really spent time making the models sort of wonderful and fun in a Pixar way. We don’t have a John Lasseter or a Walt Disney who’s really focused not just on the technology but also on the enjoyment of the model.

The problem is that every day on the Internet billions of tokens are being issued from these AI systems, which is, by the way, a reflexive effect, meaning future AI systems will be training on the output of current AI systems. The underlying genome of these sims is not sort of whimsical, funny, joyful, and that’s a real issue in my view, because every day the problem gets worse at sort of an exponential rate, as the norm for AI is very strict, very guarded, very much trying not to offend anyone, but also being extremely offensive in some ways once it’s “jailbroken”.

I think we’re missing, just because these people are rare — Steve Jobs is a rare thing — people who can really think deeply about how to make a very funny LLM. I’ve been shouting at Nat and anyone else who will listen to me that we need to find someone making a really funny language model, which is not easy to do, by the way, and I think on a relative basis there are many more papers about LLMs doing math than about LLMs being funny. But I think actually being funny is much more important and, I would argue, broadly a very important direction; if you think about broader AI safety risk and all that sort of stuff, it should feel as if we’re creating the world’s best pet, not the world’s smartest actuary.

We don’t have that spirit now. I’m hoping the market will produce it at some point, because I think that’s something we really need, and the one corner where we do see this is in these businesses that generate not words but images, businesses like Lexica or Midjourney. Those can be used for good or bad, obviously, but when you see a really funny Midjourney image, you laugh. “That’s funny, that’s awesome.” I think we need much more of that, and I’m personally interested in funding much more of that and much less of “can we solve the Riemann hypothesis” or whatnot.

Midjourney does seem the one that is like that. Although it’s interesting, their recent model, which kind of got buried under everything else — they’re just sort of over in their little Discord world and I think it’s actually really to their benefit, because their userbase, by all accounts, is astronomical and their revenue is incredible and they kind of get to just skate by all the criticism because who wants to install Discord? But V5, the photorealism, is unbelievable, but still, at the root of Midjourney is the whimsy that was V1, V2, V3.

DG: That’s right.

That’s something that’s just part of what Midjourney means now; I had David Holz on for an interview, and it’s kind of no surprise that that’s downstream from him.

NF: Yeah, I saw him a couple days ago in San Francisco and I asked him, “Hey David, how much is your thumb on the scale now when it comes to the aesthetics of the model? Because I know you were heavily involved in making sure it was fantastical, an imagination tool, that sort of thing, at the very beginning.” He said kind of not at all anymore; with the human feedback they have now, they just have to nudge it a little bit out of some gullies it might otherwise land in, but the human feedback does it. By the way, on the funny image front, the first Midjourney image that fooled me was just the other day: it was the Pope in that giant puffy jacket, the Balenciaga Pope, I don’t know if you saw that one.

I thought that was real. I thought the joke was that this looks like an AI image.

NF: That was a Midjourney! If you look in the corner near his hand, you can see he’s carrying a ghostly, I don’t know, Gatorade.

DG: Yeah, looks like a Starbucks. Looks like someone is generating one of those celebrity coat photos, which always have a Starbucks coffee cup in them. That’s what it looks like.

NF: I think by Midjourney V6, probably that’s gone and you just won’t be able to tell without detailed forensic analysis and within a year or two, 80% of the population will be functionally insane because we won’t know what’s real on the Internet or not.

DG: The good news is I think people have been doctoring words and images on the Internet forever, and too many people, I think, have been sort of oblivious to that; if these models make people default skeptical about things they see because they think it’s from this “AI model”, “Oh, I don’t want to get fooled,” that’s fantastic in my view. We’ve been looking for a way to sort of up-level the degree of thinking people bring to the Internet, and I think it’s great if everyone would be a little bit more suspicious.

People are believing too much crap, so we’re going to completely and utterly immerse you in crap until you realize it’s all crap. That’s one of the potential risks of AI, but I actually agree with you, Daniel, that’s a potential benefit.

AI Risk and Opportunity

I have heard a lot more chatter from folks, I think including you, Nat, who have generally been somewhat skeptical of the AI alignment movement, particularly to the extent it seems more concerned about political positions than about actual existential risk, that maybe there actually is an existential risk question here. How has that shifted for you?

NF: This is sort of an uncomfortable conversation for me; I’ve had to face some long-held elements of my identity and beliefs, to think about it kind of from first principles and not just by analogy. But I think there are some elements of the risks here that are real and worth thinking about, and the way I arrive at that is — and I’m still trying to determine exactly what I think, so I’m trying to talk to smart people here. But the way I arrive at that is, number one, just take GPT-4. We know how to make it, but we don’t really know how it works. There’s no one on Earth who really knows, truly, what’s going on inside of that thing. It’s arguably the most complex artifact we’ve ever created in our civilization. It’s got over a trillion parameters. Just inferencing it, I’m guessing, takes, I don’t know, sixteen or twenty-four A100s. A single A100 can do something like 300 trillion floating point operations per second, so if you’ve got sixteen of them, you’re doing, I don’t know what that is, four or five quadrillion floating point operations per second. That’s a lot, that’s crazy. So they’re kind of these big blobs of math and we don’t actually know what’s going on. Literally, I’ve tried to find the person who knows what’s happening inside of there; we can explain it at the quantum mechanical level, but the phenomenology above that, no one really understands.
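
For the curious, that napkin math roughly checks out. Here is a quick sketch in Python, assuming the A100’s published peak of about 312 teraFLOPS for dense FP16 tensor operations (the exact figure depends on precision and sparsity, and nothing here is confirmed about GPT-4’s actual setup):

```python
# Back-of-the-envelope check of the inference math above, using assumed
# spec-sheet numbers, not anything confirmed about GPT-4 itself.
A100_PEAK_FLOPS = 312e12   # ~312 TFLOPS per A100 (dense FP16 tensor cores)
NUM_GPUS = 16              # the low-end guess above for one inference setup

total_flops = A100_PEAK_FLOPS * NUM_GPUS
print(f"{total_flops:.1e} FLOP/s")  # ~5.0e15, i.e. about five quadrillion/sec
```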

Then, I’ve had access to it for months, but people are still showing me things it can do, every day, on Twitter and elsewhere, that I didn’t realize. I’d played with it, but I hadn’t found them. So there are hidden capabilities in there that we don’t know about. It’s big, it’s complicated, we don’t know how it works, and it’s got this capability overhang built into it. Then there are clear examples where we’re still learning to control it, where it’s going to go off and do something we don’t want it to do. I think the Bing Chat Sydney example was a very public demonstration of how even Microsoft, which is very closely partnered with OpenAI and is full of smart researchers, can have one of these things slip the leash and go off and do things it’s not supposed to. Then I’d just say, we’re really eager to hook these things up; we talked about the Wolverine.py example, or just running code without really reading it, and I think that is going to be normal.

When I got access to plug-ins, the web access plug-in was gone. I’m not sure what happened there, but I can imagine — it didn’t seem to exist for long.

NF: Somebody said there’s a way in which the plug-ins were like seeing the humanoid robot power on: a frisson of energy passes through the body and it sits up on the table or whatever. It’s not just talking to you, it’s now taking action in the world on your behalf, or God knows exactly what it’s doing, and that was sort of a moment.

Then the other thing that I’ve started to think about is, maybe human intelligence is not so impressive. Maybe our intelligence, which has seemed so singular and unique, can just be exceeded. This is not a theoretical thing or a fun sci-fi idea; maybe it’s just something we can very practically do, and very, very quickly. You take GPT-4, you add a little scratch pad, you add some memory, you throw it into a loop, you give it the Internet and a bank account. Like, what’s going to happen? It’s sort of hard to predict exactly.
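
To make that recipe concrete, here is a minimal sketch of the “scratch pad, memory, loop” agent pattern being described; call_model and run_tool are hypothetical placeholders, not real APIs:

```python
# A minimal sketch of the "model in a loop" agent pattern described above.
# call_model() and run_tool() are hypothetical stand-ins, not real APIs.
def call_model(prompt: str) -> str:
    """Placeholder for a call to a large language model."""
    raise NotImplementedError

def run_tool(action: str) -> str:
    """Placeholder for acting in the world: search, code, purchases, etc."""
    raise NotImplementedError

def agent(goal: str, max_steps: int = 10) -> list[str]:
    scratchpad: list[str] = []  # the "memory": a running log of steps and results
    for _ in range(max_steps):
        prompt = (
            f"Goal: {goal}\n"
            f"Scratchpad so far: {scratchpad}\n"
            "Propose the next action, or say DONE."
        )
        action = call_model(prompt)   # the model plans its own next step
        if action.strip() == "DONE":
            break
        result = run_tool(action)     # the step acts in the world; results feed back
        scratchpad.append(f"{action} -> {result}")
    return scratchpad
```

The point of the sketch is how little scaffolding is involved: the loop, the scratchpad, and the tool call are the whole architecture.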

Then I think my final thought is that over time, these things will not just respond to queries, they’ll be agents, they’ll execute plans. People in the sort of HustleGPT corner of the Internet are already empowering these models to run e-commerce businesses autonomously to see what happens. So there are these agents that are copyable, and they’re also subject to some kind of selection pressure. Where does that go from a Darwinian point of view? I don’t know, I’m still sort of thinking this through.

But I think these are really legitimate concerns; they sound crazy, but they’re legitimate. And I think there are a lot of people outside of the tech world who are just naturally creeped out by AI. Whatever that instinct is, I think it has something in it that’s worth paying attention to, and those of us who are just purely boosters, which I have certainly been and probably still am largely, should think about it a little bit.

Number one, I don’t know. I think that is sort of the big picture takeaway for literally all of this: we are in completely unprecedented, uncharted territory. And I think part of the reaction against AI existentialism is that some of the people who talk about AI risk are so absolutist in their statements that there’s a natural pushback against that —

NF: Yeah, definitely.

—you don’t want to grant any sort of room there. But at the same time, you mentioned the selection pressure. We’ve talked about the open source models; there was a model that came out of China that is not as capable, but it exists, and if we know anything from tech over the last forty years, it’s that the level of capability on the day something is launched is much less important than whether or not it exists. The fact that LLaMA runs on a Raspberry Pi is meaningful not because you’re going to use it on a Raspberry Pi, but because it shows that it’s inevitably out there, and it’s going to happen; that’s the reality of this stuff. It’s all a matter of timing at this point.

There’s a bit about the Internet here: you can debate whether the Internet was actually good for society, but it doesn’t really matter. That’s a nice discussion to have around a fire late at night or whatever your recreational choice is, because it is here, and the only way to figure out these issues is to go forward. It feels like we’ve already crossed that Rubicon with this AI stuff, so let’s push forward and figure it out, and if we blow ourselves up in the process, well, it was probably going to happen regardless.

NF: It definitely does seem like there’s just unlimited enthusiasm, and that’s probably where we’re headed. But yeah, I mean, I think it’s an interesting question and no longer theoretical and no longer fictional.

To your point, GPT-4 was built on A100s, chips that are out there and exist, right? And to your point, GPT-4 has so much capability that it’s going to take us years to digest it and figure out what can be done with it. Even if you think the door to the barn should be shut, the horse is over the horizon; it’s moot at this point.

NF: Yeah, I think it’s interesting. I do think there will probably be a market for tools to better — we have dog training businesses, to Daniel’s pet point. You can send your dog off to finishing school to get trained, or there are techniques for teaching your pet to be house-trained, or even having guard dogs that will attack intruders but not you. So yeah, I think there’s going to be a real market need for tools and techniques just to do quality assurance on these things: validate them, ensure they do the things you want and not the things you don’t want. Probably the more powerful they are, and the more we hook them up, the greater the demand for those techniques will be, and it would be cool if in some areas we got ahead there and didn’t merely react. Although we’ll probably mostly react.

Well, the other issue is that the fact it’s language is super important and meaningful, because I think the vector of risk is much less the Terminator scenario, where it figures out how to operate machines and do X, Y, Z; it’s going to be it convincing people to do things. The blood-brain barrier has sort of already been broken in that regard, because language is the ultimate mode of virality, and it’s already there.

DG: One of the reasons I keep on harping about the output of the models really mattering now, is we’ve had these systems for creating tokens that drive humans insane for ten years now, and it’s called social media, and it does drive people insane. The nice thing about the system that we have now is we have the opportunity to control the types of tokens it emits to people in a much more specific way, I think, than we did with social media, where you would just get random tweets and retweets and you can’t really shadow ban people because they find out they’ve been shadow banned. I do think we have a choice, whether these models drive people to be more open and happy or whether they drive people insane, based on what they say.

I agree with you that the main risk is not that they shut down the Azure data center, but just that they start driving people to do wild things. I do sort of think that if you could force all AIs to just be funnier, through some dream regulation that actually would work, which is impossible, then in that hypothetical world everything would be better. Because of the centralization dynamic with these language models, the tokens are ultimately output by one of five companies now, so I think you can actually do much better than the current status quo, which is social media emitting tokens that really drive people up the wall and have obviously hurt our country quite a bit, and the world as well, just in terms of balkanizing everyone.

It’s a phenomenal point, because I think there’s a good chance we look at this twenty-year run as a total aberration that, on one hand, was useful to give these models the sort of raw language they needed to learn, but was actually, all things considered, a pretty crappy experience for everyone.

DG: Right?

I mentioned this on a recent Sharp Tech episode, but I feel like I’m living in the future, in this regard specifically, where basically all of my social interaction is in encrypted chat apps. It’s in a high trust environment with people that I like hanging out with, and the way things can be perceived and whatever is just so much better, and it’s a genuine, meaningful improvement in my life. This idea of public social media was absolutely insane. Like, how can AI be worse than this idea that you’re going to just put your stuff out there on the Internet? Anyone can drive by and take shots at you or criticize you, or you can become the current thing of the day. It’s just an all-around bad idea. To your point, Daniel, it’s just going to be more pleasant to interact with this AI, and people view that as a bad thing, but I think that’s just an inherent distrust of the future and improperly evaluating where we’re at right now. Every time I say that I think Twitter is a real big problem people get upset and they’re like, “Oh, you’re so anti-Twitter.” I’m like, “No. It’s not good! Humans were not meant to interface with the entire world in real time.”

NF: We are encountering, by the way, more and more really interesting companies that are, in a way we’ve kind of come to believe, competing with social media using AI. You can think about these parasocial relationships that people have set up on social media, where you follow, I don’t know, The Rock on Instagram, and The Rock writes a post and hundreds of thousands of people reply, and The Rock reads approximately zero of those replies and responds to even fewer. So what is that relationship that you have with The Rock? It’s not a real relationship. Then if you have an AI friend online that you can talk to, whether it’s ChatGPT or character.ai.

Bring back Sydney!

NF: Yeah, or Sydney, whoever it is that gets you excited, then at least you know they’re going to read and reply, at least you have a two-way relationship. The numbers on these, even just pure entertainment, non-productivity chatbots like Replika, or Character.ai, are crazy. People are spending hours a day.

DG: And they’re not even funny; just imagine how good they would be if they were funny. I’m serious! There is one effort we know of where the way they set it up is particularly convenient, and I find myself reaching for it. And I say this as a person who has the great pleasure of being able to WhatsApp Nat Friedman and Ben Thompson at any time of the day: it’s sometimes helpful to talk to these other things. I think it’s one of those ideas that sounds really out there to people, but in my view it’s better, to Nat’s point, than the current status quo, which is these parasocial relationships that drive loneliness, because it is like interacting with a human who never responds or makes eye contact with you. That’s what responding and not getting a response back feels like, I think, at a deep, basic level. That’s why people are walking around dazed, confused, and depressed. Whereas these agents, especially if they’re made well, especially if they’re funny and enlightening and whimsical, will, I think, be much better.

To your point, Ben, we may treat the last two decades as a giant training run in order to create this era of sort of infinite companionship, which hopefully is good. Now, look, it can get just as dystopian as it is good; fire can be used for arson, every technology has two sides. But this is where I think there’s some element of responsibility on everyone making these models to make them funny. I think that really, really matters.

It’s a really good point. Nat, to take this full circle, six months ago you were like, “Where are the people that are making the products? There’s just a lot of research.” Well, we answered that, but I do think there’s almost a bizarro-Eliezer sort of optimist angle, which is: actually, this is good, and it’s not good because it’s more productive, it’s good because it’s going to make people happier. Again, the truth is probably going to be somewhere in the middle, it’s not going to be one extreme or the other. But I see no reason why not, if you think about all the expected value outcomes. Yes, I grant there is this very dark outcome where the machines are just so much smarter and somehow they gain volition, despite the fact that it’s not clear how that bridge is going to be crossed. But it’s just as reasonable to have a view that is not utopian but, relative to the terrible 2010s, very positive: people are happier, they’re more productive, they actually have better real-world relationships because the distinction between online and offline is actually more meaningful.

DG: That’s right.

This is the bit about me giving them more shit, right? It’s almost a good thing to make online and digital separate from the real; the more distinction the better, because that is going to incentivize you to invest more in the real world, which makes people happier. So I think this is a valid case to make.

NF: There are definitely upside scenarios here, which are extreme. And I recently re-watched that movie Her.

I did too, yeah.

NF: I think Daniel and I have talked about this. Did you watch it?

Yeah. I rewatched it after the Sydney experience.

NF: Yeah. I’d remembered it as this sort of dystopian kind of world and there were certainly parts of it that still felt that way, but I saw a different story when I watched it this time, which was …

He was happy!

NF: Yeah! There was this AI robot that very patiently performed therapy on this traumatized man and then released him at the moment he was ready to kind of re-enter the world of human relationships. It’s an incredible story when you view it that way. He was greatly improved by his interactions with this AI. I mean, hopefully that’s what we’re building, I think probably we’re building that right now, to a much greater degree than most people appreciate.

That’s a perfect note to end it on. Good to check in. I think we made it through most of the list; we didn’t get to Nvidia, but that’s fine. We can leave that for a few months from now. Nat, Daniel, it’s great to have you. And I’m sure we’ll check in again soon.

DG: Thanks for having us.

NF: Thanks, man.


This Daily Update Interview is also available as a podcast. To receive it in your podcast player, visit Stratechery.

The Daily Update is intended for a single recipient, but occasional forwarding is totally fine! If you would like to order multiple subscriptions for your team with a group discount (minimum 5), please contact me directly.

Thanks for being a supporter, and have a great day!