ChatGPT Gets a Computer

Ten years ago (from last Saturday) I launched Stratechery with an image of sailboats:

A picture of Sailboats

A simple image. Two boats, and a big ocean. Perhaps it’s a race, and one boat is winning — until it isn’t, of course. Rest assured there is breathless coverage of every twist and turn, and skippers are alternately held as heroes and villains, and nothing in between.

Yet there is so much more happening. What are the winds like? What have they been like historically, and can we use that to better understand what will happen next? Is there a major wave just off the horizon that will reshape the race? Are there fundamental qualities in the ships themselves that matter far more than whatever skipper is at hand? Perhaps this image is from the America’s Cup, and the trailing boat is quite content to mirror the leading boat all the way to victory; after all, this is but one leg in a far larger race.

It’s these sorts of questions that I’m particularly keen to answer about technology. There are lots of (great!) sites that cover the day-to-day. And there are some fantastic writers who divine what it all means. But I think there might be a niche for context. What is the historical angle on today’s news? What is happening on the business side? Where is value being created? How does this translate to normals?

ChatGPT seems to affirm that I have accomplished my goal; Mike Conover ran an interesting experiment where he asked ChatGPT to identify the author of my previous Article, The End of Silicon Valley (Bank), based solely on the first four paragraphs:1

Conover asked ChatGPT to expound on its reasoning:

ChatGPT was not, of course, expounding on its reasoning, at least in a technical sense: ChatGPT has no memory; rather, when Conover asked the bot to explain what it meant, his question included all of the session’s previous questions and answers, which provided the context necessary for the bot to simulate an ongoing conversation and then statistically predict, word-by-word, the answer that satisfied the query.
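That mechanism is worth making concrete. What follows is not OpenAI’s implementation, just a minimal sketch of the pattern: the only “memory” is a transcript that the client resends in full with every new question, and call_model is a hypothetical stand-in for whatever completion endpoint is doing the prediction.

def call_model(messages: list[dict]) -> str:
    # Hypothetical: send the full transcript to the model, get the next reply.
    raise NotImplementedError

class ChatSession:
    def __init__(self):
        self.messages = []  # the only "memory" is this growing transcript

    def ask(self, question: str) -> str:
        self.messages.append({"role": "user", "content": question})
        # Every prior question and answer rides along with the new question;
        # that context is what lets the bot "explain what it meant" earlier.
        reply = call_model(self.messages)
        self.messages.append({"role": "assistant", "content": reply})
        return reply

Clear the transcript and the bot “forgets” everything: the continuity lives entirely in what is resent, not in the model.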

This observation of how ChatGPT works is often wielded by those skeptical about assertions of intelligence; sure, the prediction is impressive, and nearly always right, but it’s not actually thinking — and besides, it’s sometimes wrong.

Prediction and Hallucination

In 2004, Jeff Hawkins, at that point best known as the founder of Palm and Handspring, released a book with Sandra Blakeslee called On Intelligence; the first chapter is about Artificial Intelligence, which Hawkins declared to be a flawed construct:

Computers and brains are built on completely different principles. One is programmed, one is self-learning. One has to be perfect to work at all, one is naturally flexible and tolerant of failures. One has a central processor, one has no centralized control. The list of differences goes on and on. The biggest reason I thought computers would not be intelligent is that I understood how computers worked, down to the level of the transistor physics, and this knowledge gave me a strong intuitive sense that brains and computers were fundamentally different. I couldn’t prove it, but I knew it as much as one can intuitively know anything.

Over the rest of the book Hawkins laid out a theory of intelligence that he has continued to develop over the last two decades; in 2021 he published A Thousand Brains: A New Theory of Intelligence, which distilled the theory to its essence:

The brain creates a predictive model. This just means that the brain continuously predicts what its inputs will be. Prediction isn’t something that the brain does every now and then; it is an intrinsic property that never stops, and it serves an essential role in learning. When the brain’s predictions are verified, that means the brain’s model of the world is accurate. A mis-prediction causes you to attend to the error and update the model.

Hawkins’ theory is not, to the best of my knowledge, accepted fact, in large part because it’s not even clear how it would be proven experimentally. It is notable, though, that the go-to dismissal of ChatGPT’s intelligence is, at least in broad strokes, exactly what Hawkins says intelligence actually is: the ability to make predictions.

Moreover, as Hawkins notes, this means sometimes getting things wrong. Hawkins writes in A Thousand Brains:

The model can be wrong. For example, people who lose a limb often perceive that the missing limb is still there. The brain’s model includes the missing limb and where it is located. So even though the limb no longer exists, the sufferer perceives it and feels that it is still attached. The phantom limb can “move” into different positions. Amputees may say that their missing arm is at their side, or that their missing leg is bent or straight. They can feel sensations, such as an itch or pain, located at particular locations on the limb. These sensations are “out there” where the limb is perceived to be, but, physically, nothing is there. The brain’s model includes the limb, so, right or wrong, that is what is perceived…

A false belief is when the brain’s model believes that something exists that does not exist in the physical world. Think about phantom limbs again. A phantom limb occurs because there are columns in the neocortex that model the limb. These columns have neurons that represent the location of the limb relative to the body. Immediately after the limb is removed, these columns are still there, and they still have a model of the limb. Therefore, the sufferer believes the limb is still in some pose, even though it does not exist in the physical world. The phantom limb is an example of a false belief. (The perception of the phantom limb typically disappears over a few months as the brain adjusts its model of the body, but for some people it can last years.)

This is an example of “a perception in the absence of an external stimulus that has the qualities of a real perception”; that quote is from the Wikipedia page for hallucination. “Hallucination (artificial intelligence)” has its own Wikipedia entry:

In artificial intelligence (AI), a hallucination or artificial hallucination (also occasionally called delusion) is a confident response by an AI that does not seem to be justified by its training data. For example, a hallucinating chatbot with no knowledge of Tesla’s revenue might internally pick a random number (such as “$13.6 billion”) that the chatbot deems plausible, and then go on to falsely and repeatedly insist that Tesla’s revenue is $13.6 billion, with no sign of internal awareness that the figure was a product of its own imagination.

Such phenomena are termed “hallucinations”, in analogy with the phenomenon of hallucination in human psychology. Note that while a human hallucination is a percept by a human that cannot sensibly be associated with the portion of the external world that the human is currently directly observing with sense organs, an AI hallucination is instead a confident response by an AI that cannot be grounded in any of its training data. AI hallucination gained prominence around 2022 alongside the rollout of certain large language models (LLMs) such as ChatGPT. Users complained that such bots often seemed to “sociopathically” and pointlessly embed plausible-sounding random falsehoods within its generated content. Another example of hallucination in artificial intelligence is when the AI or chatbot forget that they are one and claim to be human.

Like Sydney, for example.

The Sydney Surprise

It has been six weeks now, and I still maintain that my encounter with Sydney was the most remarkable computing experience of my life; what made it so remarkable was that it didn’t feel like I was interacting with a computer at all:

I am totally aware that this sounds insane. But for the first time I feel a bit of empathy for Lemoine. No, I don’t think that Sydney is sentient, but for reasons that are hard to explain, I feel like I have crossed the Rubicon. My interaction today with Sydney was completely unlike any other interaction I have had with a computer, and this is with a primitive version of what might be possible going forward.

Here is another way to think about hallucination: if the goal is to produce a correct answer like a better search engine, then hallucination is bad. Think about what hallucination implies though: it is creation. The AI is literally making things up. And, in this example with LaMDA, it is making something up to make the human it is interacting with feel something. To have a computer attempt to communicate not facts but emotions is something I would have never believed had I not experienced something similar.

Computers are, at their core, incredibly dumb; a transistor, billions of which lie at the heart of the fastest chips in the world, is a simple on-off switch, the state of which is represented by a 1 or a 0. What makes them useful is that they are dumb at incomprehensible speed; the Apple A16 in the current iPhone turns transistors on and off up to 3.46 billion times a second.

The reason why these 1s and 0s can manifest themselves in your reading this Article has its roots in philosophy, as explained in this wonderful 2016 article by Chris Dixon entitled How Aristotle Created the Computer:

The history of computers is often told as a history of objects, from the abacus to the Babbage engine up through the code-breaking machines of World War II. In fact, it is better understood as a history of ideas, mainly ideas that emerged from mathematical logic, an obscure and cult-like discipline that first developed in the 19th century. Mathematical logic was pioneered by philosopher-mathematicians, most notably George Boole and Gottlob Frege, who were themselves inspired by Leibniz’s dream of a universal “concept language,” and the ancient logical system of Aristotle.

Dixon’s article is about the history of mathematical logic; Dixon notes:

Mathematical logic was initially considered a hopelessly abstract subject with no conceivable applications. As one computer scientist commented: “If, in 1901, a talented and sympathetic outsider had been called upon to survey the sciences and name the branch which would be least fruitful in [the] century ahead, his choice might well have settled upon mathematical logic.” And yet, it would provide the foundation for a field that would have more impact on the modern world than any other.

It is mathematical logic that reduces all of math to a series of logical statements, which allows them to be computed using transistors; again from Dixon:

[George] Boole’s goal was to do for Aristotelean logic what Descartes had done for Euclidean geometry: free it from the limits of human intuition by giving it a precise algebraic notation. To give a simple example, when Aristotle wrote:

All men are mortal.

Boole replaced the words “men” and “mortal” with variables, and the logical words “all” and “are” with arithmetical operators:

x = x * y

Which could be interpreted as “Everything in the set x is also in the set y”…

[Claude] Shannon’s insight was that Boole’s system could be mapped directly onto electrical circuits. At the time, electrical circuits had no systematic theory governing their design. Shannon realized that the right theory would be “exactly analogous to the calculus of propositions used in the symbolic study of logic.” He showed the correspondence between electrical circuits and Boolean operations in a simple chart:

Claude Shannon's circuit interpretation table

This correspondence allowed computer scientists to import decades of work in logic and mathematics by Boole and subsequent logicians. In the second half of his paper, Shannon showed how Boolean logic could be used to create a circuit for adding two binary digits.

Claude Shannon's circuit design

By stringing these adder circuits together, arbitrarily complex arithmetical operations could be constructed. These circuits would become the basic building blocks of what are now known as arithmetical logic units, a key component in modern computers.

The implication of this approach is that computers are deterministic: if circuit X is open, then the proposition represented by X is true; 1 plus 1 is always 2; clicking “back” on your browser will exit this page. There are, of course, a huge number of abstractions and massive amounts of logic between an individual transistor and any action we might take with a computer — and an effectively infinite number of places for bugs — but the appropriate mental model for a computer is that it does exactly what it is told (indeed, a bug is not the computer making a mistake, but rather a manifestation of the programmer telling the computer to do the wrong thing). Sydney, though, was not at all what Microsoft intended.
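To make the Boolean-circuit idea concrete, here is a minimal sketch, in Python rather than in Shannon’s relay notation, of the adders Dixon describes: a half adder and a full adder built from nothing but AND, OR, and XOR, chained into a ripple-carry adder. Given the same bits in, it produces the same bits out every time, which is the determinism described above.

def half_adder(a: int, b: int) -> tuple[int, int]:
    # Add two bits: the sum is XOR, the carry is AND.
    return a ^ b, a & b

def full_adder(a: int, b: int, carry_in: int) -> tuple[int, int]:
    # Two half adders plus an OR combine two bits and an incoming carry.
    s1, c1 = half_adder(a, b)
    s2, c2 = half_adder(s1, carry_in)
    return s2, c1 | c2

def ripple_add(x: list[int], y: list[int]) -> list[int]:
    # Add two equal-length bit lists, least significant bit first.
    carry, out = 0, []
    for a, b in zip(x, y):
        s, carry = full_adder(a, b, carry)
        out.append(s)
    return out + [carry]

assert ripple_add([1, 1, 0], [1, 0, 0]) == [0, 0, 1, 0]  # 3 + 1 = 4

String enough of these together and you have the arithmetic logic unit the quote refers to; every layer of software above it is more of the same Boolean plumbing.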

ChatGPT’s Computer

I’ve already mentioned Bing Chat and ChatGPT; on March 14 Anthropic released another AI assistant named Claude. While the announcement doesn’t say so explicitly, I assume the name is in honor of the aforementioned Claude Shannon.

This is certainly a noble sentiment — Shannon’s contributions to information theory broadly extend far beyond what Dixon laid out above — but it also feels misplaced: while technically speaking everything an AI assistant does is ultimately composed of 1s and 0s, the manner in which these assistants operate is emergent from their training, not prescribed, which makes the experience feel fundamentally different from logical computers — something nearly human. That takes us back to hallucinations: Sydney was interesting, but what about homework?

Here are three questions that GPT-4 got wrong:

Questions GPT4 got wrong

All three of these examples come from Stephen Wolfram, who noted that there are some kinds of questions that large language models just aren’t well-suited to answer:

Machine learning is a powerful method, and particularly over the past decade, it’s had some remarkable successes—of which ChatGPT is the latest. Image recognition. Speech to text. Language translation. In each of these cases, and many more, a threshold was passed—usually quite suddenly. And some task went from “basically impossible” to “basically doable”.

But the results are essentially never “perfect”. Maybe something works well 95% of the time. But try as one might, the other 5% remains elusive. For some purposes one might consider this a failure. But the key point is that there are often all sorts of important use cases for which 95% is “good enough”. Maybe it’s because the output is something where there isn’t really a “right answer” anyway. Maybe it’s because one’s just trying to surface possibilities that a human—or a systematic algorithm—will then pick from or refine…

And yes, there’ll be plenty of cases where “raw ChatGPT” can help with people’s writing, make suggestions, or generate text that’s useful for various kinds of documents or interactions. But when it comes to setting up things that have to be perfect, machine learning just isn’t the way to do it—much as humans aren’t either.

And that’s exactly what we’re seeing in the examples above. ChatGPT does great at the “human-like parts”, where there isn’t a precise “right answer”. But when it’s “put on the spot” for something precise, it often falls down. But the whole point here is that there’s a great way to solve this problem—by connecting ChatGPT to Wolfram|Alpha and all its computational knowledge “superpowers”.

That’s exactly what OpenAI has done. From The Verge:

OpenAI is adding support for plug-ins to ChatGPT — an upgrade that massively expands the chatbot’s capabilities and gives it access for the first time to live data from the web.

Up until now, ChatGPT has been limited by the fact it can only pull information from its training data, which ends in 2021. OpenAI says plug-ins will not only allow the bot to browse the web but also interact with specific websites, potentially turning the system into a wide-ranging interface for all sorts of services and sites. In an announcement post, the company says it’s almost like letting other services be ChatGPT’s “eyes and ears.”

Stephen Wolfram’s Wolfram|Alpha is one of the official plugins, and now ChatGPT gets the above answers right — and quickly:2

ChatGPT gets the right answer from Wolfram|Alpha

Wolfram wrote in the post in which he proposed this integration:

For decades there’s been a dichotomy in thinking about AI between “statistical approaches” of the kind ChatGPT uses, and “symbolic approaches” that are in effect the starting point for Wolfram|Alpha. But now—thanks to the success of ChatGPT—as well as all the work we’ve done in making Wolfram|Alpha understand natural language—there’s finally the opportunity to combine these to make something much stronger than either could ever achieve on their own.

The fact that this works so well is itself a testament to what assistant AIs are, and are not: they are not computing as we have previously understood it; they are shockingly human in their way of “thinking” and communicating. And frankly, I would have had a hard time solving those three questions as well — that’s what computers are for! And now ChatGPT has a computer of its own.
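Mechanically, a plugin call is the inverse of a hallucination: instead of predicting a plausible number, the model hands the precise part of the question to an external service and composes its answer around the result. The sketch below is not OpenAI’s plugin protocol (which is built on manifests and API specifications the model reads); it is just a rough illustration of the shape of the loop, with call_model and query_wolfram as hypothetical stand-ins.

import json

def call_model(messages: list[dict]) -> str:
    raise NotImplementedError  # the language model: statistical prediction

def query_wolfram(query: str) -> str:
    raise NotImplementedError  # the external "computer": symbolic calculation

def answer(question: str) -> str:
    messages = [{"role": "user", "content": question}]
    reply = call_model(messages)
    try:
        # Assume the model signals a tool call with JSON such as
        # {"tool": "wolfram_alpha", "query": "distance from Earth to Mercury"}.
        request = json.loads(reply)
    except ValueError:
        return reply  # no computation needed; the reply is the answer
    result = query_wolfram(request["query"])
    messages += [
        {"role": "assistant", "content": reply},
        {"role": "tool", "content": result},  # grounded data, not a guess
    ]
    return call_model(messages)  # final answer built around the tool result

The model still predicts every word of the final answer; the difference is that the number it is predicting around came from Wolfram|Alpha rather than from its own imagination.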

Opportunity and Risk

One implication of this plug-in architecture is that someone needs to update Wikipedia: the hallucination example above is now moot, because ChatGPT isn’t making up revenue numbers — it’s using its computer:

Tesla's revenue in ChatGPT

This isn’t perfect — for some reason Wolfram|Alpha’s data is behind, but it did get the stock price correct:

Tesla's stock price in ChatGPT

Wolfram|Alpha isn’t the only plugin, of course: right now there are 11 plugins, spanning categories like travel (Expedia and Kayak) and restaurant reservations (OpenTable), plus Zapier, which opens the door to 5,000+ other apps (the plugin to search the web isn’t currently available); they are all presented in what is being called the “Plugin store.” The Instacart integration was particularly delightful:

ChatGPT adds a shopping list to Instacart

Here’s where the link takes you:

My ChatGPT-created shopping cart

ChatGPT isn’t actually delivering me groceries — but it’s not far off! One limitation is that I had to select the Instacart plugin myself; you can only have three loaded at a time. Still, that is a limitation that will be overcome, and there will surely be many more plugins to come; one could imagine OpenAI both allowing customers to choose and also selling default plugin status for certain categories on an auction basis, using the knowledge it gains about users.

This is also rather scary, and here I hope that Hawkins is right in his theory. He writes in A Thousand Brains in the context of AI risk:

Intelligence is the ability of a system to learn a model of the world. However, the resulting model by itself is valueless, emotionless, and has no goals. Goals and values are provided by whatever system is using the model. It’s similar to how the explorers of the sixteenth through the twentieth centuries worked to create an accurate map of Earth. A ruthless military general might use the map to plan the best way to surround and murder an opposing army. A trader could use the exact same map to peacefully exchange goods. The map itself does not dictate these uses, nor does it impart any value to how it is used. It is just a map, neither murderous nor peaceful. Of course, maps vary in detail and in what they cover. Therefore, some maps might be better for war and others better for trade. But the desire to wage war or trade comes from the person using the map.

Similarly, the neocortex learns a model of the world, which by itself has no goals or values. The emotions that direct our behaviors are determined by the old brain. If one human’s old brain is aggressive, then it will use the model in the neocortex to better execute aggressive behavior. If another person’s old brain is benevolent, then it will use the model in the neocortex to better achieve its benevolent goals. As with maps, one person’s model of the world might be better suited for a particular set of aims, but the neocortex does not create the goals.

The old brain Hawkins references is our animal brain, the part that governs our emotions, our drive for survival and procreation, and the subsystems of our body; it’s the neocortex that is capable of learning and thinking and predicting. Hawkins’ argument is that absent the old brain our intelligence has no ability to act, either in terms of volition or impact, and that machine intelligence will be similarly benign; the true risk of machine intelligence is the intentions of the humans that wield it.

To which I say, we shall see! I agree with Tyler Cowen’s argument about Existential Risk, AI, and the Inevitable Turn in Human History: AI is coming, and we simply don’t know what the outcomes will be, so our duty is to push for the positive outcome in which AI makes life markedly better. We are all, whether we like it or not, enrolled in something like the grand experiment Hawkins has long sought — the sailboats are on truly uncharted seas — and whether or not he is right is something we won’t know until we get to whatever destination awaits.

The follow-up to this Article analyzing the strategic implications of ChatGPT Plugins is in this Update, which is free-to-read.


  1. GPT-4 was trained on Internet data up to 2021, so did not include this Article 

  2. The Mercury question is particularly interesting; you can see the “conversation” between ChatGPT and Wolfram|Alpha here, here, here, and here as it negotiates exactly what it is asking for.