On Flocks

An image shared by an Anthropic employee discussing their coding agent workflows.

💡

Nerd Rating: Not nerdy but probably boring, as it is an attempt to respond thoughtfully to a social media conversation that I found challenging to address on BlueSky's very limited reply functions.

A few weeks ago, after a very successful "noisy systems" conference in Rome, I was getting on a $40 RyanAir flight when the stress of that low-budget flying experience was compounded by going viral on BlueSky. I posted the above image, from a VP at Anthropic, a systems diagram of how Claude Code's agents work. In the text I pointed out that the image was a vindication of my framing of "stochastic flocks" direct from Anthropic themselves.

I also said I was "stepping back from the stochastic parrot frame" – specifically to describe agentic systems. Stochastic Parrots refers to a 2021 paper by Emily M. Bender, Timnit Gebru, Angelina McMillan-Major and Margaret Mitchell. "Parrots" is one part of a larger text, offered as a graspable metaphor for how the machines produce text without understanding it – by referencing statistical patterns of language (grammar, word associations, and so on) derived from training data.

Why I Hesitated in Line at the Airport

While I don't disagree with parrots, a few things have happened that gave me pause about using it as a frame in my own work.

I defended "parrots" to a conference room in San Francisco full of people working on Agentic Design for AI, and got critiqued by a researcher who refers to themselves as "we" because they have multiple agents that they consider a part of their identity. I realized that once people hear "stochastic parrots" they often just stop listening. I am unsure what to do about this.
Margaret Mitchell, a co-author of the Parrots paper, wrote "AI is not a stochastic parrot," which has given me pause about whether stochastic flocks is the right extension of the metaphor for agentic systems. (This is not a critique: Mitchell's piece says more than the headline, and I highly recommend reading it!)
Emily Bender was silent about "flocks," but also told me last summer that my project to reclaim AI as a field was potentially extending the fascist framing of the term "AI." I think that is not wrong and it was not an argument per se, but seemed like a strong indicator that maybe I was not the best vessel for defending her work with my own words.

I also made a mistake in the RyanAir line: I said "I am kind of done with the stochastic parrot framing because it just makes everyone too weird about it."

I said what I meant, but in context, I was linking "kind of done with the stochastic parrot framing" to the specific application of "agentic AI," for the reasons explained above. This was taken out of context (BlueSky!) and made into a whole thing.

All of the texts below are public; Emily specifically said at the end of the convo that their being public was probably why I wasn't answering her questions. Fair point, and I take it to mean it's ok to share these things here, since you could just go look at them anyway if you're on BlueSky.

RyanAir Gate, Ciampino International Airport

The conversation that follows is with Emily Bender. It is a result of that misunderstanding. I don't think there's much argument here, just a bit of confusion, on my part, over what seems to be the two of us talking past each other.

Me, posting: "[If you say Stochastic Parrots] ... you get a hundred AI bros telling you it’s not how it works and it is not worth it to me when other metaphors are available."

Emily Bender: That this particular metaphor has become such a shibboleth is, I think, not unrelated to misogyny.

Me: I agree. ... it is not lost on me that, among countless metaphors called upon to explain the mechanism of LLMs, the one that’s been the center of sustained irrational vitriol has been the one in a paper written by a group of women.

Emily Bender: Thank you. Give that the vitriol is directed near but not at you, there room here to use privilege for good rather than abandoning the field, because people get "weird".

Me: To be clear, it’s the frame shift from stochastic parrots to “stochastic flocks,” which was my attempt to explain recent design changes in LLMs, that I’m talking about moving away from: my use of the parrot metaphor to explain so-called “agentic” systems, which is the subject of the above thread.

I’m not sure I understand “abandoning the field,” but while my way of thinking has never been embraced by it, speaking its language has recently started to feel like appropriation. And it’s not disrespect of this work but respect for it that moves me to say: maybe I need to say it differently.

So for me, defending your metaphor *in this case* is fraught, because my sense of it is probably different than yours & that can easily turn into *misrepresentation.* I try hard not to do that, but I also want the flexibility to makes my own sense of things.

And, frankly, some time ago you suggested my general approach was (and I may be misremembering the exact wording) “a rough extension of a fascist project.” A critique I take seriously, but also take to mean that my particular defense / use of “parrots” might not be welcomed or helpful.

Emily Bender: I was trying to get you to let go of "AI" as a name worth reclaiming – while affirming that you were trying to target fascism. You are quite right that I am sensitive to words and how they are used, and I have been tracking how people talk about "stochastic parrots" — I've replied to a bunch of misconceptions here.

Of course everyone is free to develop, use, extend, etc any metaphor as they like. But if you are specifically walking away from "stochastic parrots" because you find the misogynistic vitriol that it triggers annoying or distasteful, and would rather not see it in your mentions, well then I hope you will at least examine the privilege that gives you that choice.

I did respond to the above bit, basically restating what I'd said before. You can read the thread by clicking on the following image. My previous responses are above it, and what follows is below it.

A Week Later

When you first published it, I was intrigued by the title but then didn't get past the first sentence: "AI technology is advancing." Ugh. That sounds like the start of SO MANY bad research papers I get stuck reading. — New Thread

I'll summarize what follows, but you can go see the full context of the questions she has posed; I think I did a fair job of presenting them.

What do I mean by "AI" and "Advancing?"

**What do you mean by "AI"? As in, what specific systems are you referring to? And how are you measuring "advancing" given the state of evaluations?**

The question is answered in the line after "AI technology is advancing," where I write "Anyone thinking critically about large language models and their impact on society now faces a more complex challenge: the agentic turn."

So, what I mean by AI here is "Large Language Models" and in particular, the recent move into coding via language production promised by Anthropic and OpenAI. I mention them a bit further down.

As far as advancing, I want to be clear that the Nazis also advanced toward Leningrad; this does not mean I was cheering for them. I almost immediately suggest that the advancement is a feat of marketing and UX improvements. I don't mention evals because they are unrelated to my claim about what's advancing. That's kind of my point.

Central to my piece's critique is that the models are being pushed specifically as "they write code now" and, as I noted, were / are being marketed that way. Because they do write code. They do produce more convincing language. They are changing. "Something is changing in AI" was all over the media, and I wanted to reach people and explain what was changing so I could explain what wasn't.

Who am I arguing with?

If you're not arguing with us, who are you arguing with?

The flock piece is arguing with an audience I did, in fact, name, and link to. I write:

"These developments are producing real improvements in the LLM's user experience — but is that truly a vindication of the AI project?"

I think that is as clear an articulation of a thesis as I could make.

The word "vindication" is a link to the specific person I am arguing with: Dan Kagan-Kans' viral piece from late 2025, called "The left is missing out on AI." In it, he claims that critics of AI are missing out on the possibilities of these systems because the left refuses to use them or take them seriously.

In that piece, Kagan-Kans references Stochastic Parrots and then dismisses the entire critical project as warped by it:

"This idea, that large-language models merely produce statistically plausible word sequences based on training data, without having any idea about what the words refer to, has become the baseline across much of the left-intellectual landscape. Thanks to it, fundamental questions about AI’s capabilities, now and in the future, are considered settled."

In that piece – again, which I explicitly say I am addressing, and link to – he cites parrots and writes,

"In 2023, when chatbots were more toy than tool, AI-as-autocomplete was maybe a defensible position. But now?"

The flow of my first three paragraphs as I intended them is a response to that question:

There is something changing about the user experience of these models.
People (like Kagan-Kans, and others linked) are claiming this means prior critique has been disproven and that critics need to hop on the bandwagon. Is that really the case?
I agree these systems are changing, and we don't have to deny that to insist that our critiques remain valid, because "useful" and "capabilities" have never been the basis of critical arguments.

"Flocks" is a counterpoint to the claim that "AI critics only ever said the tech was useless." This is wrong. My piece argues that the critics of AI are not moving the goalposts or in retreat after some imagined comeuppance.

My point is, look at all of these ongoing critiques of AI, and notice how none of them ever focused on whether or not AI was useful. Sub-thread: All of these critiques not only engage with the material reality of AI, but directly engage with the material consequences of its use.

Who am I arguing with when I cite Jenny Davis?

**Who are you arguing with here, when we raised them all of these concerns in chapter 7 of The AI Con?**

I am arguing, still, with Kagan-Kans.

I didn't cite Emily & Alex's book, because I haven't read it – it's in Rochester, NY while I am in Cambridge, UK; I hope to read it this summer. It hadn't occurred to me that I was arguing with them, nor did I think I had to insist that Bender, specifically, was the one asking such questions when there is an entire community of scholars and activists doing so.

That is my point. In the section she capped here, I cited Jenny Davis, one of my formative instructors at ANU whose 2020 book, "How Artifacts Afford: The Power and Politics of Everyday Things," was important to my thinking about how design can meet critical thinking. She even brought me in to make the explainer video when the book came out with MIT Press.

That I gave so much real estate to the Parrots paper was intended to be a form of praise for its centrality to critique and establish that many critics were asking questions that transcend use: I also cite Abeba Birhane, Olivia Guest, Iris van Rooij, Sasha Luccioni, Sarah T. Roberts, and could have cited many more: all excellent scholars whose work is shaping the conversation. None of whom require AI to be useless in order for their critiques to hold legitimacy. (Ofc, their personal views may vary).

In "flocks," I wrote:

"The [Stochastic Parrot] paper’s central question, “Can Language Models Be Too Big?” is as relevant as it was five years ago with massive investments into data collection and processing."

So clearly I don't think the paper is irrelevant. My point is that the shift to "flocks" of stacked parrots compounds those problems, while boosters are arguing it has eliminated those problems.

The questions I raise are intended to show that a critical relationship to AI looks like it always has looked. It isn't missing, it's being sidelined by critiques asking where it is.

"Without denying what they do" – what do I think they can do?

"This reads to me as you saying that they *can* do something that we (the SP authors) deny. And what do you think they *can* do?"

I answered Emily directly:

"They can generate text, code or images in response to the prompt. Sometimes, that image, text, or code gives people what they ask for. As the piece says, it often does not (slopware) or comes at unanticipated or invisibilized costs. I then try to show what those costs are."

As I say at the top of the conclusion: These are just a handful of the problems that derive from agent-based systems under the assumption that they are useful tools.

I know that they are not thinking machines, and have written about this exact thing several times, including in two pieces that Emily herself recirculated and praised on social media; and many more pieces that address it across fields, including more theoretically. In my presentation in Berkeley, I say as much:

"Emily M. Bender, Timnit Gebru, Angelina McMillan-Major, and Margaret Mitchell’s 2021 paper described LLMs as stochastic parrots—systems that reproduce statistically likely patterns from training data. Though many in the industry and likely in this room have misinterpreted this description, it is worth reading closely. It was not an indictment of what the system could do. It was a technical description aimed at assessing what we could reliably infer about its output."

The result of stochastic parroting, as I see it, is that compelling illusions of textual craftsmanship from a machine can nonetheless function (key word!) as operational text. This is a really interesting, nerdy theoretical tension worthy of investigation! It cuts into real problems of use. We have no idea what we're interacting with, or what happens when we read intention into language from a machine that isn't following our rules.

My one quibble is the claim that LLMs operate "without any reference to meaning," and it's a small one, mostly about how "meaning" is defined across fields. Bender probably didn't write the paper for a cultural studies department, and by some definitions "meaning" is very much still what LLMs reference. It's resolvable, not wrong, and until I can articulate the difference, I sidestep it by saying "I think the text has meaning" to audiences I speak with.

For most people who touch these things, the simulated approximation of code and the thoughtful generation of code are indistinguishable if the product is indistinguishable, and the risks that get carried into these systems through that position are many. In fact, the centerpiece of the "flocks" piece discusses five of them: slopware, the compounding of technical failures, the loss of clear accountability, an emphasis on computational solutionism, and scaling up extraction and waste. This is not a disagreement with Bender.

Who do I think is 'denying what they do'?

My argument is that serious, socially engaged critics have always been engaging with deeper concerns than pure systems evaluation. The argument that these critical scholars were proven wrong by Claude Code is nonsense.

At the same time, that strawman is not from nowhere: lots of lazy critique centers an entire theory of change on the market becoming rational overnight, or people waking up to realize they're delusional for thinking LLMs do anything for them. That anonymous "BlueSky" anti-AI poster exists, and periodically pops up to tell me I am evil (really), but they aren't the people to pay attention to or equate with the real and rigorous academic scholarship that supports critiques of AI systems.

My entire argument is that the expert critiques of LLMs were never dependent on them being useless, so debating whether they're useful changes none of those critiques. When it is applied as lazy criticism, it increases the resistance to real engagement: if someone believes they have a genuine use for Claude Code, but is open to hearing about the invisible labor or environmental costs, why would we focus on convincing them Claude doesn't work? It isn't necessary and is extremely counter-productive.

The moral of this story: try to avoid flying RyanAir whenever possible and if you do, for the love of God, don't post.

