
Language is a SimCity

Building GPT4 in 1949

Eryk Salvaggio

02 Jul 2023 — 10 min read

“Language is a city, to the building of which every human being brought a stone; yet he is no more to be credited with the grand result than the acaleph which adds a cell to the coral reef which is the basis of the continent.” — Ralph Waldo Emerson

Years ago I remember reading about a player’s experience with The Sims, a video game without goals or plots, in which human characters react to other characters. Their decisions are automated unless players intervene: a simulated human terrarium. The player described a scenario with a married couple. The wife was an avid piano player, but died. In the period of grief after her death, her partner began to play the piano. It was a moving story, and people asked whether this behavior was written into the code. It was not. The behavior was statistically motivated; it was the human observation of those statistics that created meaning.

In October 1949, JR Pierce writes a piece for Astounding Science Fiction magazine, inspired by Claude Shannon, about the shocking (relative) complexity of thought that emerges from connecting statistics to randomized words, based on their appearances in a random novel. He finds he is able to generate sentences that seem to have an author even though the process is mechanical: choosing a word, scanning the book until it appears again, and then writing down the word that follows that occurrence.
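The mechanical procedure can be sketched in a few lines of Python. This is a minimal illustration, not Pierce's or Shannon's own notation: the function name `chain_words`, the tiny `SAMPLE_TEXT` standing in for the novel, and the choice to jump to a random occurrence (rather than scanning forward from a random page) are all assumptions made for brevity.

```python
import random

def chain_words(text, length=12, seed=0):
    """Build a word chain: find the current word somewhere in the text,
    then write down whichever word follows it there."""
    words = text.split()
    rng = random.Random(seed)
    # Start from a random word that has at least one word after it.
    position = rng.randrange(len(words) - 1)
    output = [words[position]]
    while len(output) < length:
        current = output[-1]
        # Every place the current word appears with a word after it.
        candidates = [i for i in range(len(words) - 1) if words[i] == current]
        if not candidates:
            break  # the word only appears at the very end of the text
        # Simplification: pick any occurrence at random instead of
        # opening the book at a random page and reading forward.
        position = rng.choice(candidates)
        output.append(words[position + 1])
    return " ".join(output)

# Hypothetical stand-in for the "random novel".
SAMPLE_TEXT = (
    "the morning broke and the night fell and the morning came "
    "and the rain fell on the town"
)
print(chain_words(SAMPLE_TEXT))
```

Every adjacent pair of words in the output appears somewhere in the source text, which is all the process guarantees. Any sense of an author comes from the reader.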

The first result is a challenge to parse, but Pierce notes that he’s intrigued anyway, writing that he “would like to ask the author more about him. Unfortunately, there is no author to ask. I should hear no more unless, perhaps, chance should answer my questions.”

Pierce pieces together another experiment. Rather than putting individual words together, he looks at the previous three words and makes a statistical inference about what follows them:

“English and its statistics reside in the human brain and they can be tapped at the source. One has only to show a list of the latest three words of a passage to a person unfamiliar with those preceding and ask him to make up a sentence including these three words and to write down the word which, in that sentence, follows the three. The statistics linking four word combinations are automatically evoked in this process. The word chosen can, and is likely to, follow the three.”

So Pierce begins with a prompt — “When the morning,” and offers it to 21 friends, each of whom adds one word. The next person sees only the previous three words, to which they add the fourth. It’s a parlor game, exquisite corpse, applied to writing instead of Dadaist sketch-making.

With each addition, the context begins to wane.

“When the morning” → “When the morning broke” → “the morning broke” → “the morning broke after” → “morning broke after” → “morning broke after an” → etc.

The final sentence to emerge: When the morning broke after an orgy of abandon he said her head shook quickly vertically aligned in a sequence of words follows what.

Not great. But, you can see the context is quite strong at the beginning (“when the morning broke after an orgy of abandon”) and then breaks apart, though certain clusters of words make sense to us: “morning broke,” “shook quickly,” etc.
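The shrinking context is easy to see if the slips of paper are written out as code. The sketch below only replays the parlor game; it does not generate words. The helper name `visible_windows` and the `window` parameter are illustrative assumptions, and the added words are taken from Pierce's sentence above.

```python
def visible_windows(prompt, added_words, window=3):
    """Replay the game: return what each guest saw, plus the full sentence."""
    words = prompt.split()
    views = []
    for word in added_words:
        # Each guest sees only the last `window` words on the slip.
        views.append(" ".join(words[-window:]))
        # Their chosen word is then appended for the next guest.
        words.append(word)
    return views, " ".join(words)

views, sentence = visible_windows(
    "When the morning", ["broke", "after", "an", "orgy"]
)
for view in views:
    print(view)
# When the morning
# the morning broke
# morning broke after
# broke after an
print(sentence)
# When the morning broke after an orgy
```

By the fourth guest, "When" and "the" are already out of view: nobody adding a word later in the chain knows how the sentence began.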

What Pierce had described was also a limitation of early, transformer-based generative AI text engines — think GPT2 — and natural language processors, which had limited scope of attention as they aimed to predict the next word based on words that had come before. Eventually these models would “drift” as they moved further from the start of the sentence, but also as the subject and focus of these sentences became more muddled: “A wandering of the mind,” Pierce writes, “But whose mind wanders?”

This is in a room of humans making statistical inferences about language: no automation involved. But already Pierce begins to ask the obvious questions:

“It is a little disturbing to think that an elaborate machine, taking longer-range statistics into account, could have done still better. The passage seems to us to have meaning, and yet the true and only source of this quotation is a small part of the statistics of the English language — and chance.”

Finally, Pierce creates another aspect to the trick — adding a topic to the bottom of the slip of paper before each guest adds their word to the previous three. This mimics “attention,” in that each guest is now meant to orient themselves to a particular topic, and the production of the sentence is now biased toward words that reflect it. The sentences now seem to hold a greater degree of internal consistency. Again, each writer only sees the previous three words on the paper. Yet, they create full paragraphs that are more legible as stories, such as this one:

“Money isn’t everything. However, we need considerably more incentive to produce efficiently. On the other hand, too little and too late to suggest a raise without reason for remuneration obviously less than they need, although they really are extremely meagre.”
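A rough mechanical analogue of the topic slip is to count which words follow a three-word context in some text and then tilt those counts toward topic words. Everything here is an assumption for illustration: Pierce's guests were people, not counters, and the function `next_word`, the `boost` factor, and the toy corpus are hypothetical. It is also far cruder than the attention mechanism in a transformer.

```python
from collections import Counter

def next_word(context, corpus_words, topic_words, boost=3.0):
    """Pick the likeliest word after `context`, favouring topic words."""
    n = len(context)
    # Count every word that follows this exact context in the corpus.
    counts = Counter(
        corpus_words[i + n]
        for i in range(len(corpus_words) - n)
        if tuple(corpus_words[i:i + n]) == tuple(context)
    )
    if not counts:
        return None  # the context never appears in the corpus
    # The topic acts as a bias: matching words get their count multiplied.
    scores = {
        word: count * (boost if word in topic_words else 1.0)
        for word, count in counts.items()
    }
    return max(scores, key=scores.get)

corpus = (
    "we need more money now we need more time now we need more time soon"
).split()
context = ("we", "need", "more")

print(next_word(context, corpus, topic_words=set()))      # time
print(next_word(context, corpus, topic_words={"money"}))  # money
```

Without a topic, the most frequent continuation wins. With "money" written at the bottom of the slip, the rarer word overtakes it, even though the visible context is identical.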

A stark improvement, but not perfect. Pierce concludes:

“We see that the statistics involved are sufficient to give ‘meaning’ frequently, but they are scarcely adequate to insure ‘truth.’ But, if we mean by truth merely that which we are likely to find written in encyclopedias, statistics could, presumably, supply it, too. With the statistics which we have included, however, any merit of such compositions is more apt to be aesthetic than factual. … Here there is no creator or artist. The structure of the words is based merely on statistics, or, on the likelihood of their occurring in a certain order. Yet, they may have ‘meaning’ for the reader.”

More recently, in a paper published this week, Kidd and Birhane write that today’s models invite the same response, but toward risky ends:

In a classic study, people read intentionality into the movements of simple animated geometric shapes (6). Likewise, people commonly read intentionality — and humanlike intelligence or emergent sentience — into generative models even though these attributes are unsubstantiated (7). This readiness to perceive generative models as knowledgeable, intentional agents implies a readiness to adopt the information that they provide more rapidly and with greater certainty. This tendency may be further strengthened because models support multimodal interactions that allow users to ask models to perform actions like “see,” “draw,” and “speak” that are associated with intentional agents. The potential influence of models’ problematic outputs on human beliefs thus exceeds what is typically observed for the influence of other forms of algorithmic content suggestion such as search.

Thought Webs

Confusion between the spontaneous production of words and the conscious selection of words is everywhere. A recent paper by a team of Microsoft employees, testing a Microsoft product, was highlighted by This American Life. Unsurprisingly, the Microsoft engineers all said the new Microsoft product was astounding. The paper they published about it — Sparks of Artificial General Intelligence — ends by thanking OpenAI “for creating such a marvelous tool and giving us early access to experience it,” so you get a sense of the kind of critical thinking being applied.

I’m not a total naysayer either — GPT4, and even 3, are useful tools, and as Pierce writes (in 1950): “one should be happy to achieve anything new through mathematics.” What is frustrating is the sense of awe for the thing that transcends what is actually cool about it — and instead ascribes “intelligence,” that is, a sense that the tool understands the world, rather than simply producing something new through mathematics. In the piece, it’s suggested that GPT4 must understand the world if it is going to write consistent text.

As Pierce shows, though, a lot can be done through statistics, random selection, and biases. GPT4 works on a query, and so this produces a kind of anchor — the text, written at the bottom of the paper in the party trick. GPT4 can access far more than 3 words at a time and more than one “guiding weight” written at the bottom of the page. It’s also able to check, refine, rewrite, and test the text it generates in a matter of seconds. The possibilities of this statistical choreography are quite something, but too many believe that something else is dancing. Many are falling into the trap described by Pierce in 1949: “The passage seems to us to have meaning, and yet the true and only source of this quotation is a small part of the statistics of the English language — and chance.”

LLMs like GPT4 are also learning systems, and they may learn to “streamline” neural network pathways that are more frequently used. They are, technically speaking, more advanced than any machine that Pierce or Claude Shannon could have conceived of in 1949, enough to simulate the human processing of language sequences, which is, on its own, a marvel. Despite the scaling up of parameters and layers of weighting and revision, though, the fundamentals of generating that text remain the same.

Emily Bender, Timnit Gebru and others call them “Stochastic Parrots,” writing:

“Text generated by an LM is not grounded in communicative intent, any model of the world, or any model of the reader’s state of mind. … This can seem counter-intuitive given the increasingly fluent qualities of automatically generated text, but we have to account for the fact that our perception of natural language text, regardless of how it was generated, is mediated by our own linguistic competence and our predisposition to interpret communicative acts as conveying coherent meaning and intent, whether or not they do. The problem is, if one side of the communication does not have meaning, then the comprehension of the implicit meaning is an illusion arising from our singular human understanding of language … an LM is a system for haphazardly stitching together sequences of linguistic forms it has observed in its vast training data, according to probabilistic information about how they combine, but without any reference to meaning: a stochastic parrot.”

Pierce references “Stochastic English” to describe the output of the word game he plays, and later describes “Stochastic Music.” In both cases it’s worth remembering that Pierce is clear that the source of these artistic expressions is statistics. The term “Artificial Intelligence” had yet to be coined.

Today, there is some idea that the human brain itself is simply a tool for putting word associations together into sense. But this concept of language is just one theory. Even within this theory, there is a range of human biases derived from experiences and learning and wisdom that radically shift the trajectory of those word choices.

Language in any case is not one pebble being placed after another. Ideas are like cracks hitting a windshield, teasing out in all directions at once. The difference is that we can choose which one best suits our ideas and follow it to its conclusion. Neural networks — biological, or their digital models — ripple and shimmer. They branch out and then back in, carrying the strongest of signals forward and out. The crack on one end of the shatter may make another tear longer or shorter.

Synthetic language runs like SimCity. SimCity may have held a small grid, by relative standards. Buildings on one side of the city had ripple effects through small modulations to the buildings surrounding them, and soon, gamers observed complex emergent behaviors. The attribution of meaning to these behaviors was anchored by the graphics — streets and buildings, parks and trees — but we do the same even when viewing the engine — Conway’s “Game of Life” — down to its most basic form, such as this “Glider,” which is an activation of squares turning on and off, but is perceived as a coordinated agent.

Conway notes that the glider transmits information about its state to anything it connects with, which then responds to that state, and any on-off behavior goes a-shimmer on the grid when the glider contacts it.
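The glider is small enough to check by hand or in a few lines of Python. The sketch below uses the standard Game of Life rules (a dead cell with exactly three live neighbours turns on; a live cell with two or three stays on) on an unbounded grid stored as a set of coordinates. The names `step` and `GLIDER` are just labels chosen for this example.

```python
from collections import Counter

def step(live):
    """Advance a set of live (row, col) cells by one generation."""
    # Count, for every cell on the grid, how many live neighbours it has.
    neighbour_counts = Counter(
        (r + dr, c + dc)
        for r, c in live
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    # Birth on exactly 3 neighbours; survival on 2 or 3.
    return {
        cell
        for cell, n in neighbour_counts.items()
        if n == 3 or (n == 2 and cell in live)
    }

# The five-cell glider:  . O .
#                        . . O
#                        O O O
GLIDER = {(0, 1), (1, 2), (2, 0), (2, 1), (2, 2)}

state = GLIDER
for _ in range(4):
    state = step(state)

# After four generations the same shape sits one cell down and one cell right.
shifted = {(r + 1, c + 1) for r, c in GLIDER}
print(state == shifted)  # True
```

Nothing in `step` refers to a glider, a direction, or movement. The rules only switch individual squares on and off; the "agent" travelling across the grid exists in the reading of the pattern.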

The difference between biological human communication and its digital representation — and emergence — is this: if humans choose to transform our (biological) signals into words, we have a choice of whether to express them through statistics — empty speech, Lacan might call it, speech that does nothing but signify language itself. This is language for language’s sake, and our ChatGPTs are quite good at this: bureaucratic text, formulaic emails, fixed speech for rituals and ceremonies.

Humans have another choice. We can also use words in an attempt to align others with our internal experience — Lacan’s full speech, “which aims at, which forms, the truth such as it becomes established in the recognition of one person by another.”

Full speech is a circuit: it requires intent, perception, and modulation by whomever it is addressed to. Another hears and acknowledges — even confirms — that the language rises to its goals. Full speech is also incomplete for this reason, and is not always virtuous. Coming to terms with another’s view of our own internal experiences can also distort our sense of self and alienate us from our own desires. But critically: a machine without desires or selfhood is incapable of full speech. Today’s machines are capable only of signification: lining up semantic pebbles, based on the traits of those that came before them, all biased toward a particular destination.

The meaning, the interpretation we place on the language it produces, is reflective of our own desires, biased by the anchors we provide to the machine through prompts and queries. As we read GPT’s text responses, we are entering into the SimCity of Language: a seductive complexity and set of relationships that convincingly simulates agency, but is ultimately the meaning we assign to whatever emerges.

Things I Have Been Doing

I was away for three weeks (getting married!) and sending out scheduled posts, but unable to give updates on any news or other fun things. So, here are a few, belated!

Interview with Dirt about AI, Art, and Sarah Palin Forever

Last week Terry Nguyen and I spoke for Dirt Magazine about AI, art, and Sarah Palin Forever, which was in consideration for RAIN, the first European film festival for AI-generated cinema. (I was also interviewed about the festival, in Catalan, for the newspaper Ara.) The film is back online as of today after being available exclusively on the festival’s streaming platform for most of June.

Read It

Watch the Film

Thanks for reading and subscribing! If you’d like to share this post, please do: I rely on word-of-mouth and recommendations to grow my readership. If you can share it to platforms other than Twitter (which reduces the reach of Substack posts) I’d appreciate that, too! Thanks so much.
