My first job was as a paper boy:
The job was remarkably analog: a bundle of newspapers would be dropped off at my house, I would wrap them in rubber-bands (or plastic bags if it were raining), load them up in a canvas sack, and set off on my bike; once a month my parents would drive me around to collect payment. Little did I appreciate just how essential my role writ large was to the profitability of newspapers generally.
Newspapers liked to think that they made money because people relied on them for news, furnished by their fearless reporters and hard-working editors; not only did people pay newspapers directly, but advertisers were also delighted to pay for the privilege of having their products placed next to the journalists’ peerless prose. The Internet revealed the fatal flaw in this worldview: what newspapers provided was distribution thanks to infrastructure like printing presses and yours truly.
Once the Internet reduced distribution costs to zero, three truths emerged: first, that “news”, once published, retained no economic value. Second, newspapers no longer had geographic monopolies, but were instead in competition with every publication across the globe. Third, advertisers didn’t care about content, but rather about reaching customers.
I illustrated these three truths in 2015’s Popping the Publishing Bubble:
Editorial and ads used to be a bundle; next, the Internet unbundled editorial and ads, and provided countless options for both; the final step was ads moving to platforms that gave direct access to users, leaving newspapers with massive reach and no way to monetize it.
The Idea Propagation Value Chain
As much as newspapers may rue the Internet, their own business model, and my paper delivery job, were based on an invention that I believe is the only rival for the Internet’s ultimate impact: the printing press. Those two inventions, though, are only two pieces of the idea propagation value chain. That value chain has five parts: creation, substantiation, duplication, distribution, and consumption.
The evolution of human communication has been about removing whatever bottleneck is in this value chain. Before humans could write, information could only be conveyed orally; that meant that the creation, vocalization, delivery, and consumption of an idea were all one-and-the-same. Writing, though, unbundled consumption, increasing the number of people who could consume an idea.
Now the new bottleneck was duplication: to reach more people whatever was written had to be painstakingly duplicated by hand, which dramatically limited what ideas were recorded and preserved. The printing press removed this bottleneck, dramatically increasing the number of ideas that could be economically distributed:
The new bottleneck was distribution, which is to say this was the new place to make money; thus the aforementioned profitability of newspapers. That bottleneck, though, was removed by the Internet, which made distribution free and available to anyone.
What remains is one final bundle: the creation and substantiation of an idea. To use myself as an example, I have plenty of ideas, and thanks to the Internet, the ability to distribute them around the globe; however, I still need to write them down, just as an artist needs to create an image, or a musician needs to write a song. What is becoming increasingly clear, though, is that this too is a bottleneck that is on the verge of being removed.
This image, like the first two in this Article, was created by AI (Midjourney, specifically). It is, like those two images, not quite right: I wanted “A door that is slightly open with light flooding through the crack”, but I ended up with a door with a crack of light down the middle and a literal flood of water; my boy on a bicycle, meanwhile, is missing several limbs, and his bike doesn’t have a handlebar, while the intricacies of the printing press make no sense at all.
They do, though, convey the idea I was going for: a boy delivering newspapers, printing presses as infrastructure, and the sense of being overwhelmed by the other side of an opening door. And they were all free.¹ To put it in terms of this Article, I had the idea, but AI substantiated it for me: the last bottleneck in the idea propagation value chain is being removed.
In a previous iteration of the machine learning paradigm, researchers were obsessed with cleaning their datasets and ensuring that every data point seen by their models is pristine, gold-standard, and does not disturb the fragile learning process of billions of parameters finding their home in model space. Many began to realize that data scale trumps most other priorities in the deep learning world; utilizing general methods that allow models to scale in tandem with the complexity of the data is a superior approach. Now, in the era of LLMs, researchers tend to dump whole mountains of barely filtered, mostly unedited scrapes of the Internet into the eager maw of a hungry model.
Roon’s focus is on text as the universal input, and connective tissue.² Note how this insight fits into the overall development of communication: oral communication was a prerequisite to writing and reading; widespread literacy was a prerequisite to anyone being able to publish on the Internet; the resultant flood of text and images enabled by zero marginal distribution is the prerequisite for models that unbundle the creation of an idea and its substantiation.
This, by extension, hints at an even more surprising takeaway: the widespread assumption — including by yours truly — that AI is fundamentally centralizing may be mistaken. If not just data but clean data was presumed to be a prerequisite, then it seemed obvious that massively centralized platforms with the resources to both harvest and clean data — Google, Facebook, etc. — would have a big advantage. This, I would admit, was also a conclusion I was particularly susceptible to, given my focus on Aggregation Theory and its description of how the Internet, contrary to initial assumptions, leads to centralization.
The initial roll-out of large language models seemed to confirm this point of view: the two most prominent large language models have come from OpenAI and Google; while both describe how their text (GPT and GLaM, respectively) and image (DALL-E and Imagen, respectively) generation models work, you either access them through OpenAI’s controlled API, or in the case of Google don’t access them at all. But then came this summer’s unveiling of the aforementioned Midjourney, which is free to anyone via its Discord bot. An even bigger surprise was the release of Stable Diffusion, which is not only free, but also open source — and the resultant models can be run on your own computer.
There is, as you might expect, a difference in quality; DALL-E, for example, had the most realistic “newspaper delivery boy throwing a newspaper”:
Stable Diffusion was on the other end of the spectrum:
What is important to note, though, is the direction of each project’s path, not where they are in the journey. The extent to which large language models (and I should note that while I’m focusing on image generation, there are a whole host of companies working on text output as well) are dependent not on carefully curated data, but rather on the Internet itself, is the extent to which AI will be democratized, for better or worse.
The Impact on Creators
I told Bors that what I felt worst about was how mindless my decision to use Midjourney ultimately had been. I was caught up in my own work and life responsibilities and trying to get my newsletter published in a timely fashion. I went to Getty and saw the same handful of photos of Alex Jones, a man who I know enjoys when his photo is plastered everywhere. I didn’t want to use the same photos again, nor did I want to use his exact likeness at all. I also, selfishly, wanted the piece to look different from the 30 pieces that had been published that day about Alex Jones and the Sandy Hook defamation trial. All of that subconsciously overrode all the complicated ethical issues around AI art that I was well apprised of.
What worries me about my scenario is that Midjourney was so easy to use, so readily accessible, and it solved a problem (abstracting Jones’ image in a visually appealing way), that I didn’t have much time or incentive to pause and think it through. I can easily see others falling into this like I did.
For these reasons, I don’t think I’ll be using Midjourney or any similar tool to illustrate my newsletter going forward (an exception would be if I were writing about the technology at a later date and wanted to show examples). Even though the job wouldn’t go to a different, deserving, human artist, I think the optics are shitty, and I do worry about having any role in helping to set any kind of precedent in this direction. Like others, I also have questions about the corpus used to train these art tools and the possibility that they are using a great deal of art from both big-name and lesser-known artists without any compensation or disclosure to those artists. (I reached out to Midjourney to ask some clarifying questions as to how they choose the corpus of data to train the tool, and they didn’t respond.)
I get Warzel’s point, and his desire to show solidarity with artists worried about the impact of AI-generated art on their livelihoods. They are, it seems to me, right to worry: I opened this Article discussing the demise of newspapers which, once the connection between duplication and distribution was severed, quickly saw their business models fall apart. If the connection between idea creation and idea substantiation is being severed, it seems reasonable to assume all attendant business models might suffer the same fate.
There are, though, two rejoinders: the first is that abundance has its own reward. I am uniquely biased in this regard, seeing as how I make my living on the Internet as a publisher effectively competing with the New York Times, but I would argue that not just the quantity but, in absolute terms, the quality of content available to every single person in the world is dramatically higher than it was before the distribution bottleneck was removed. It seems obvious that removing the substantiation bottleneck from ideas will result in more good ones as well (along with, by definition, an even greater increase in not so good ones).
The analogy to publishing also points to what will be the long-term trend for any profession affected by these models: relatively undifferentiated creators who depended on the structural bundling of idea creation and substantiation will be reduced to competing with zero marginal cost creators for attention generated and directed from Aggregators; highly differentiated creators, though, who can sustainably deliver both creation and substantiation on their own will be even more valuable. Social media, for example, has been a tremendous boon to differentiated publishers: it gives readers a megaphone to tell everyone how great said publisher is. These AI tools will have a similar effect on highly differentiated creators, who will leverage text-based iteration to make themselves more productive and original than ever before.
The second rejoinder is perhaps more grim: this is going to happen regardless. Warzel may be willing to overlook the obvious improvement in not just convenience but also, for his purposes, quality proffered by his use of Midjourney, but few if any will make the same choice. AI-generated images will, per the image above, soon be a flood, just as publishing on the Internet quickly overwhelmed the old newspaper business model.
Moreover, just as native Internet content is user-generated content, the iterative and collaborative nature of AI-generated content — both in the sense of being a by-product of content already created, and also the fact that every output can be further iterated upon by others — will prove to be much more interesting and scalable than what professional organizations can produce. TikTok, which pulls content from across its network to keep users hooked, is the apotheosis of user-generated content; Metaverses may be the apotheosis of AI-generated content.
I wrote a follow-up to this Article in this Daily Update.
1. Beyond the $600 annual fee I paid to Midjourney to have access to the fully rights-unencumbered Corporate plan.
2. In the case of these image applications, noise is added to a known image and then the model is trained on backing out the image from pure noise; the resultant model can then be applied to any arbitrary text applied to pure noise, based on further training of matched text and images.