Address Not Found (Part 2)
Understanding AI as a "Many to One to One" Media System
The history of communication technologies up to social media was theorized in 2011 along the following lines:
- One to one: telegraph, telephone, text messages
- One to many: newspapers, radio, television
- Many to many: websites, blogs, social media
Each of these concerns a single encoding point for a message and determines how it arrives at its decoding point. In other words, they trace who has something to say and how others hear, read, or see it. Jensen and Helles later added “many to one” as a form of analysis for social media, emphasizing the orientation of the consumer of information and content on social media platforms and the moderating role of algorithms.
None of these consider the relay systems involved in the individual technologies. For example, we might complicate the newspaper if we include the reality of the publishing system. It’s not a reporter speaking to many people: it’s a reporter transmitting to an editor, who alters the message a bit; then to a copy desk, which refines it further; then instructions are sent to a press, which publishes the result.
This is all well and good for understanding origins and destinations rather than the in-betweens, but it’s useful to think about the in-between when it comes to “artificial intelligence” (if you haven’t heard of AI before, it’s the marketing term used for automated data analysis & pattern recognition systems).
You might assume that AI is a one-to-one communication, as that’s often how the interfaces are set up. You might engage with ChatGPT through a (ahem) “chat” window. Same with Midjourney for images. The problem with this thinking is that there is no “one” there: the response that comes back to you is an aggregate of billions of points of data, abstracted through computational calculus to get you a legible result.
For that reason, it’s important to add one more stick to the pile of media dissemination types, which is the many-to-one-to-one.
As seen in the diagram above, the messages start from the “many.” You can see the list of data sources in this Washington Post article. You’ll see that they are crafted for any number of purposes, from patents (the largest data supply for ChatGPT) to creating Wikipedia pages (the second largest source) and New York Times stories (no. 4), down to things such as advocating for white supremacy (no. 993 on the list).
The “many” has typically been accessible through the search engine, which designs its own categorization and ranking systems for information, shaping what we encounter in response to certain terms. This gatekeeping was a powerful intermediary, but one could skip past a few pages and browse through hundreds of site listings.
What remained was that the authors of websites, or the creators of images, were sending their words and pictures — data, or “signals” — out to be seen by many others, and anyone could encounter it. This is the basis of the “many to many” framing.
Language and image models, however, are a kind of user that intercepts this many-to-many flow and internalizes it. Words and pictures are taken in by the model, analyzed, and reshaped. In the case of image sets, the data is more or less discarded, though the training data can be memorized and regenerated. Large Language Models learn in another way, by predicting associations and patterns and tracing their correlations in order to generate new responses.
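To make that concrete, here is a minimal sketch in Python of what “predicting associations” means at its simplest. This is a toy bigram counter with an invented corpus, not how any production model is built, but it shows the basic move: the words of the “many” are reduced to statistical patterns, and generation replays those patterns rather than any single author’s message.

```python
from collections import Counter, defaultdict

# A toy corpus standing in for the "many": text gathered from the web.
corpus = ("the best cheese for pizza is mozzarella . "
          "the best cheese for a sandwich is cheddar .")

# Count which word tends to follow which (bigram statistics).
follows = defaultdict(Counter)
words = corpus.split()
for current, nxt in zip(words, words[1:]):
    follows[current][nxt] += 1

# "Generate" by repeatedly choosing the most common continuation.
word, output = "the", ["the"]
for _ in range(6):
    word = follows[word].most_common(1)[0][0]
    output.append(word)

print(" ".join(output))  # "the best cheese for pizza is mozzarella"
```

Real LLMs replace this counting with billions of learned weights, but the relationship to the source text is the same: aggregation, not quotation.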
When you (the “one”) ask ChatGPT a question, it runs an analysis on what you’ve asked in order to determine the most salient details of your request. These salient details are then weighted heavily across the output, so that the text it generates aligns more closely with your question. When this weighting was weak, as in GPT-2, we often saw a kind of “drift” occur, where longer answers began to dissolve into gobbledygook. Current models stay on track because they are constantly checking against the weights given to your prompt.
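This “weighting” loosely corresponds to the attention mechanism in transformer models. Below is a minimal sketch of scaled dot-product attention, with invented three-dimensional embeddings: each prompt token is scored for relevance against what the model is currently generating, and those scores become the weights that keep the output tethered to the salient parts of your question.

```python
import numpy as np

def attention(query, keys, values):
    # Scaled dot-product attention: score each prompt token against the
    # query, turn the scores into weights with a softmax, mix the values.
    scores = keys @ query / np.sqrt(len(query))
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights, weights @ values

# Invented embeddings for a three-token prompt, say "best pizza cheese".
keys = np.array([[1.0, 0.0, 0.0],   # "best"
                 [0.0, 1.0, 0.0],   # "pizza"
                 [0.0, 0.0, 1.0]])  # "cheese"
values = keys.copy()                # the information each token carries
query = np.array([0.1, 0.2, 2.0])   # the model is currently "about" cheese

weights, mixed = attention(query, keys, values)
print(weights.round(3))  # "cheese" gets the largest weight: it is most salient
```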
When the final response is satisfactorily aligned to your inquiry, you see the result. This result is a reflection of your prompt, but more closely resembles a kind of statistically average, satisfactory response to the query. It will not show you a specific, radical opinion (though you could steer it to do so, especially if model restrictions weren’t in place).
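The pull toward a “statistically average” response can also be seen in how words are sampled from the model’s probability distribution. A toy sketch with invented numbers: the low “temperature” settings common in deployed systems sharpen the distribution toward the most common answer, squeezing out the rare or radical one.

```python
import numpy as np

# Invented next-word probabilities for "the best cheese for pizza is ..."
candidates = ["mozzarella", "provolone", "cashew ricotta"]
probs = np.array([0.70, 0.25, 0.05])  # consensus, rival, radical outlier

def apply_temperature(p, temperature):
    # Lower temperature concentrates probability on the likeliest option.
    logits = np.log(p) / temperature
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

for t in (1.0, 0.5):
    print(t, dict(zip(candidates, apply_temperature(probs, t).round(3))))

# At temperature 0.5 the outlier drops from a 5% chance to under 1%:
# the "many" is flattened into its statistically average voice.
```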
The Many to One to One
It’s helpful to analyze these systems in this way for a few reasons. First, the systems are designed to be understood as a one-to-one, real-time (“synchronous”) exchange of information. That is a powerful illusion, but when we talk about users and consumers of AI-mediated content, it’s important to understand this perception.
When we look at the actual workings of the system, however, we find a many-to-many system, created by aggregating content producers. The signals vary: some are one to many, as in the data scraped from newspapers; others are many to many, as in the data scraped from social media sites and web forums such as Reddit.
Meanwhile, this is taking place asynchronously. The information in ChatGPT, for example, has a cutoff date of 2021. The data within the system may come from a wide variety of time periods, since it also includes books and public domain texts.
But the point is that the aggregation has so dissolved the intent of the original authors into a brew of word associations that it is dangerous to assume the content produced by large language models reflects any form of discourse or emergent social consensus. Nonetheless, these systems are being marketed as a blend of search engine and encyclopedia.
Previous models for understanding LLMs are unhelpful:
- LLMs are not one to one because there is no single author of a generated text or image: they are aggregates of styles, expressions, opinions, and experiences.
- LLMs are not one to many because there is no “one” to shape the message. Instead, they are formed by data aggregated from the many and then structured by the singular system. They do not communicate to many, but respond to the precise articulation of the user’s query to trigger a cascade of weight and bias mechanisms that produce a response.
- LLMs are not many to many because their “search” does not give you the diversity of communication that social media forums or blogs do. You may be able to summarize sets of competing opinions, but the “many” is still consolidated into a single, statistically averaged summary. Likewise, the end user is not “many” because the system varies its output with every individual query.
- LLMs are not many to one because the many is reduced into an aggregate.
Instead, the many to one to one offers us a way of understanding the structure of image synthesis and language models that reflects their unique shaping of media without giving way to the illusion of “authorship” or “intent” that users are so often encouraged to extend to these interfaces.
Defining The Ones
Many to one to one can be refined a bit further for the sake of clarity, however:
- Many to One: The many authors contributing to public discourse online: 100 blog posts about the best cheese. The “one” here refers to the companies that serve as the “audience” for that data, though typically these data centers aren’t understanding or moderating these messages in any way beyond pure data analysis.
- “Many-to-one” to one: The final “one” is the person who actually requests (“what is the ideal cheese for pizza?”) and receives the information (“the best cheese for pizza is mozzarella”) in a way where they interpret and decode the information signal, i.e., they get a sense of which cheese is best for pizza.
We may be tempted to consider variants, such as “Many to none to one,” wherein “none” refers to the raw consumption of tokenized words without any regard to original intent or meaning. But I propose that “one” is still the true source of the transmission, in the sense that a newspaper or a book is a “one,” because AI companies do shape and structure their output in similar ways.
First, an AI company may decide that it simply won’t apply editorial controls over output. That is, in itself, an editorial decision. Second, they may apply editorial control in ways that represent a cultural or social consensus of the company.
AI companies are “ones” in the sense that a newspaper or TV broadcaster is a “one.” They shape and decide on the content that they produce, but the technology allows them to distribute information through these decisions to a single user.
Consider that OpenAI, for example, has crafted ChatGPT to encourage “productive behavior” to the extent that it gives answers like these (Guy Debord is a core Situationist, an advocate for disrupting work for the sake of living an “unproductive” life in the capitalist sense of the word).
If there’s any doubt as to why ChatGPT might come to this conclusion, we can ask it to suggest generally counterproductive acts of anti-work resistance and see what it says.
So there is an editorial decision here, in the same sense as what a newspaper might opt to publish. And this is important to understand: the “one” in “many to one to one” is not the AI system itself but the operators and designers of that system, in the same way that a newspaper or book is understood to be not the work of a single author but the result of a single publishing system.
In Summary
Today’s AI systems are designed to look like one-to-one, synchronous (real-time) communication technologies. They aren’t. They are asynchronous, many-to-one systems, which then transmit from one to one. The middle step is not the work of a single author but the compression of a vast number of writers and designers sharing work, mediated and shaped beyond recognition into the AI model. This new arrangement of words and images is then presented to the end user by the model. The end user is encouraged to understand this outcome as the work of a single author.
The Many to One to One model helps us resolve gaps in understanding these models as communication networks. It helps us identify the ways information moves into the system and is shaped by the system. It also creates new ways to theorize exactly how communication — as an act of creating discourse to challenge, change, or acknowledge differences in societies — is being mediated.
I propose that it combines the worst of the many to many model, in that the most popular opinions are weighted most heavily, inscribing a specific consensus into future outputs, with the worst of the one to one model, in that it allows a single editorial system to decide what is published, whose ideas are accessible, and whose are not.
It is my hope that this framework helps to articulate a stronger critique of AI models and their limits as communication systems. The claim that these systems offer meaningful communicative power ought to be challenged and scrutinized.
Thanks for reading. I am in Iceland getting married this week. You can congratulate me by signing up for the newsletter or sharing it with people who might dig it!
You can also find us on Twitter or Instagram (where we tend to have more fun).