The Future of Everything is Lies, I Guess: Annoyances
This is a long article, so I'm breaking it up into a series of posts which will be released over the next few days. You can also read the full work as a PDF or EPUB; these files will be updated as each section is released.
The latest crop of machine learning technologies will be used to annoy us and frustrate accountability. Companies are trying to divert customer service tickets to chats with large language models; reaching humans will be increasingly difficult. We will waste time arguing with models. They will lie to us and make promises they cannot possibly keep, and getting things fixed will be drudgerous. Machine learning will further obfuscate and diffuse responsibility for decisions. “Agentic commerce” suggests new kinds of advertising, dark patterns, and confusion.
Customer Service
I spend a surprising amount of my life trying to get companies to fix things. Absurd insurance denials, billing errors, broken databases, and so on. I have worked customer support, and I spend a lot of time talking to service agents, and I think ML is going to make the experience a good deal more annoying.
Customer service is generally viewed by leadership as a cost to be minimized. Large companies use offshoring to reduce labor costs, detailed scripts and canned responses to let representatives produce more words in less time, and bureaucracy which distances representatives from both knowledge about how the system works, and the power to fix it when the system breaks. Cynically, I think the implicit goal of these systems is to get people to give up.
The Future of Everything is Lies, I Guess: Information Ecology
Machine learning shifts the cost balance for writing, distributing, and reading text, as well as other forms of media. Aggressive ML crawlers place high load on open web services, degrading the experience for humans. As inference costs fall, we’ll see ML embedded into consumer electronics and everyday software. As models introduce subtle falsehoods, interpreting media will become more challenging. LLMs enable new scales of targeted, sophisticated spam, as well as propaganda campaigns. The web is now polluted by LLM slop, which makes it harder to find quality information—a problem which now threatens journals, books, and other traditional media. I think ML will exacerbate the collapse of social consensus, and create justifiable distrust in all kinds of evidence. In reaction, readers may reject ML, or move to more rhizomatic or institutionalized models of trust for information. The economic balance of publishing facts and fiction will shift.
Creepy Crawlers
ML systems are thirsty for content, both during training and inference. This has led to an explosion of aggressive web crawlers. Crawlers are nothing new, and most either respect robots.txt or are small enough to pose no serious hazard. The last three years have been different: ML scrapers are making it harder to run an open web service. As Drew Devault put it last year, ML companies are externalizing their costs directly into his face.
This year Weird Gloop confirmed that scrapers pose a serious challenge. Today’s scrapers ignore robots.txt and sitemaps, request pages with unprecedented frequency, and masquerade as real users. They fake their user agents, carefully submit valid-looking headers, and spread their requests across vast numbers of residential proxies. An entire industry has sprung up to support crawlers. This traffic is highly spiky, which forces web sites to overprovision—or to simply go down. A forum I help run suffers frequent brown-outs as we’re flooded with expensive requests for obscure tag pages. The ML industry is, in essence, DDoSing the web.
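For contrast, here is a minimal sketch of what a polite crawler does before fetching anything, using Python’s standard-library robots.txt parser. The hostname, user agent, and URL are purely illustrative; today’s ML scrapers skip every step of this.

```python
# A sketch of a well-behaved crawler's robots.txt check, using Python's
# standard library. Hostname, user agent, and path are illustrative.
from urllib.robotparser import RobotFileParser

AGENT = "ExampleCrawler/1.0"  # honest, identifiable user agent

rp = RobotFileParser()
rp.set_url("https://example.com/robots.txt")
rp.read()  # fetch and parse the site's stated crawling rules

url = "https://example.com/tags/some-obscure-tag"
if rp.can_fetch(AGENT, url):
    # Honor any requested delay between hits; default to something gentle.
    delay = rp.crawl_delay(AGENT) or 5.0
    ...  # fetch url, then sleep(delay) before the next request
```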
The Future of Everything is Lies, I Guess: Culture
ML models are cultural artifacts: they encode and reproduce textual, audio, and visual media; they participate in human conversations and spaces; and their interfaces make them easy to anthropomorphize. Unfortunately, we lack appropriate cultural scripts for these kinds of machines, and will have to develop this knowledge over the next few decades. As models grow in sophistication, they may give rise to new forms of media: perhaps interactive games, educational courses, and dramas. They will also influence our sex lives: producing pornography, altering the images we present to ourselves and each other, and engendering new erotic subcultures. Since image models produce recognizable aesthetics, those aesthetics will become polyvalent signifiers. Those signs will be deconstructed and re-imagined by future generations.
Most People Are Not Prepared For This
The US (and I suspect much of the world) lacks an appropriate mythos for what “AI” actually is. This is important: myths drive use, interpretation, and regulation of technology and its products. Inappropriate myths lead to inappropriate decisions, like mandating Copilot use at work, or trusting LLM summaries of clinical visits.
Think about the broadly-available myths for AI. There are machines which essentially act human with a twist, like Star Wars’ droids, Spielberg’s A.I., or Spike Jonze’s Her. These are not great models for LLMs, whose protean character and incoherent behavior differentiate them from (most) humans. Sometimes the AIs are deranged, like M3gan or Resident Evil’s Red Queen. This might be a reasonable analogue, but it suggests a degree of efficacy and motivation that seems altogether lacking from LLMs.1 There are logical, affectively flat AIs, like Star Trek’s Data or starship computers. Some of them are efficient killers, as in Terminator. This is the opposite of LLMs, which produce highly emotional text and are terrible at logical reasoning. There are also hyper-competent gods, as in Iain M. Banks’ Culture novels. LLMs are obviously not this: they are, as previously mentioned, idiots.
The Future of Everything is Lies, I Guess: Dynamics
ML models are chaotic, both in isolation and when embedded in other systems. Their outputs are difficult to predict, and they exhibit surprising sensitivity to initial conditions. This sensitivity makes them vulnerable to covert attacks. Chaos does not mean models are completely unstable; LLMs and other ML systems exhibit attractor behavior. Since models produce plausible output, errors can be difficult to detect. This suggests that ML systems are ill-suited to domains where verification is difficult or correctness is key. Using LLMs to generate code (or other outputs) may make systems more complex, fragile, and difficult to evolve.
Chaotic Systems
LLMs are usually built as stochastic systems: they produce a probability distribution over possible next tokens, then pick one at random. But even when LLMs are run with perfect determinism, either through a fixed PRNG seed or at temperature T=0, they still seem to be chaotic systems.1 Chaotic systems are those in which small changes in the input produce large, unpredictable changes in the output. The classic example is the “butterfly effect”.2
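To make the sampling step concrete, here is a minimal sketch of temperature-based next-token sampling. The logits are invented and real inference stacks differ in detail; the point is just that T=0 collapses to greedy decoding, while higher temperatures flatten the distribution.

```python
# A minimal sketch of next-token sampling, assuming we already have logits
# (unnormalized scores) for each candidate token. The numbers are made up.
import numpy as np

def sample_next_token(logits, temperature, rng):
    """Pick a token index from a distribution shaped by temperature."""
    if temperature == 0.0:
        # T=0 degenerates to greedy decoding: always take the top token.
        return int(np.argmax(logits))
    scaled = logits / temperature          # higher T flattens, lower T sharpens
    probs = np.exp(scaled - scaled.max())  # subtract max for numerical stability
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))

rng = np.random.default_rng(42)           # a fixed PRNG seed for determinism
logits = np.array([2.0, 1.0, 0.5, -1.0])  # scores for four hypothetical tokens
print(sample_next_token(logits, 1.0, rng))  # stochastic: varies with the seed
print(sample_next_token(logits, 0.0, rng))  # deterministic: always token 0
```

Even with the seed fixed or T pinned to zero, the surrounding system (batching, floating-point reduction order, caching) can still vary, which is one reason determinism in practice is harder than this sketch suggests.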
In LLMs, chaos arises from small perturbations to the input tokens. LLMs are highly sensitive to changes in formatting, and different models respond differently to the same formatting choices. Simply phrasing a question differently yields strikingly different results. Rearranging sentences, even logically independent ones, makes LLMs give different answers. Systems of multiple LLMs are chaotic too, even at T=0.
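One way to see this sensitivity is to ask a model the same question under trivially different formatting and compare the answers. The generate function below is a hypothetical stand-in for whatever model API you use, not a real library call; everything else is plumbing.

```python
# A sketch of probing an LLM's sensitivity to formatting. `generate` is a
# hypothetical stand-in for a real model client; wire it to your own API.
def generate(prompt: str, temperature: float = 0.0) -> str:
    """Hypothetical model call; replace the body with a real client."""
    return f"(model output for {prompt!r})"  # placeholder so the sketch runs

# Semantically identical prompts, differing only in formatting.
variants = [
    "Is 7919 prime? Answer yes or no.",
    "Is 7919 prime?\nAnswer yes or no.",
    "Q: Is 7919 prime? A (yes/no):",
    "is 7919 prime? answer YES or NO",
]

for prompt in variants:
    print(repr(prompt), "->", generate(prompt, temperature=0.0))
# Even at T=0, a chaotic model may answer these variants differently.
```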
The Future of Everything is Lies, I Guess
This is a weird time to be alive.
I grew up on Asimov and Clarke, watching Star Trek and dreaming of intelligent machines. My dad’s library was full of books on computers. I spent camping trips reading about perceptrons and symbolic reasoning. I never imagined that the Turing test would fall within my lifetime. Nor did I imagine that I would feel so disheartened by it.
Around 2019 I attended a talk by one of the hyperscalers about their new cloud hardware for training Large Language Models (LLMs). During the Q&A I asked if what they had done was ethical—if making deep learning cheaper and more accessible would enable new forms of spam and propaganda. Since then, friends have been asking me what I make of all this “AI stuff”. I’ve been turning over the outline for this piece for years, but never sat down to complete it; I wanted to be well-read, precise, and thoroughly sourced. A half-decade later I’ve realized that the perfect essay will never happen, and I might as well get something out there.
This is bullshit about bullshit machines, and I mean it. It is neither balanced nor complete: others have covered ecological and intellectual property issues better than I could, and there is no shortage of boosterism online. Instead, I am trying to fill in the negative spaces in the discourse. “AI” is also a fractal territory; there are many places where I flatten complex stories in service of pithy polemic. I am not trying to make nuanced, accurate predictions, but to trace the potential risks and benefits at play.