Weighing in on Casey Newton's "AI is fake and sucks" debate
There’s been a bit of an internet-fight over the past few days in my digital neighborhood.
-Casey Newton wrote a piece last Thursday titled “The phony comforts of AI skepticism” that defined two camps: “AI is fake and sucks” vs “AI is real and dangerous.” He then made the case for why the first camp is immune to evidence, and the second camp is basically right.
-Gary Marcus had some things to say about that characterization.
-Edward Ongweso had quite a few more things to say. (Highly recommended, btw. I read his piece and thought, “well, he nailed it. I don’t need to write on this topic.”)
Both Marcus and Ongweso were basically saying “hey that’s a pretty glib framing of our critique. And if you take what we’re saying more seriously, we’re also pretty clearly right.”
-And then Casey wrote a follow-up last night. He… still thinks the glib framing is totally fair, and that these guys are kidding themselves.
I am generally a Casey Newton fan. His reporting is excellent. He is well-sourced and tenacious. He has developed a solid bullshit-meter when dealing with crypto folks and Musk acolytes. His Hard Fork podcast is great.
But I think he’s missing the point here. And there’s some real depth to what he’s missing. So, here goes, I’m wading into the internet drama. I have a couple of notes.
First, on the “AI is fake and it sucks” framing:
I am, generally speaking, an AI skeptic.
Sam Altman says that in the next few years his company will assemble a digital God that, among other things, will “solve all of physics.” I do not believe him.
Sam Altman is a billionaire who pals around with other tech billionaires. He is an exceptionally powerful individual. I have long believed that ridicule is appropriately targeted at the powerful. So I have been known to make fun of Sam Altman and his products.
So if you combed through everything I’ve posted or reskeeted on Bluesky, you could surely find me saying some version of “AI is fake and it sucks,” probably in the midst of cackling about some headline. I say a lot of things online. Much of what I say is glib.
But the reason why labeling the entire AI skeptic camp according to our most-glib retorts doesn’t sit right is that people in this camp (myself included) have written plenty of more thorough and serious critiques. We, broadly speaking, think that generative AI is very real and very dangerous, specifically because it does not work as-advertised. (Or, as Brian Merchant once wrote, “I’m not saying don’t be nervous about the onslaught of AI services — but I am saying be nervous for the right reasons.”)
I wrote an essay along these lines in April 2023, titled “Two Failure Modes of Emerging Technologies.”
The first failure mode is when a technology works as intended, but at a much larger scale, with unexpected results. This is effectively the approach that Casey labels “AI is real and dangerous.”
The second failure mode is when the market for a new technology expands, and it is incorporated into critical social systems, even though the underlying flaws are not resolved.
There is a substantial difference between “it’s all fake and it sucks” and “this cannot actually replace your radiologist, your lawyer, or, like, the entire EPA.”
Shotspotter is a good example. Shotspotter built a gunshot detection system, meant to help police identify and geolocate the origins of gunfire. The New York Times covered Shotspotter in 2012 (“Shots fired, pinpointed, and argued over”). Here’s how they described it then:
The detection system, which triangulates sound picked up by acoustic sensors placed on buildings, utility poles and other structures, is part of a wave of technological advances that is transforming the way police officers do their jobs.
But like other technologies, including license plate scanners, body cameras and GPS trackers, the gunshot-detection system has also inspired debate.
In at least one city, New Bedford, Mass., where sensors recorded a loud street argument that accompanied a fatal shooting in December, the system has raised questions about privacy and the reach of police surveillance, even in the service of reducing gun violence.
That type of criticism belongs in failure mode 1. Accept the premise that Shotspotter’s technology works as advertised. Pause to reflect on whether that sort of pervasive surveillance is something we should build. It’s a Jeff-Goldblum-in-Jurassic-Park sort of criticism. (“Your scientists were so preoccupied with whether or not they could that they didn’t stop to think if they should.”)
But there’s a much more immediate problem with Shotspotter, one that the New York Times finally covered in 2024 (“Gunshot detection system wastes NYPD officers’ time, audit finds”). It doesn’t actually fucking work! Police departments spent millions on a product whose error rates were always too high, and never got good enough to rely upon. They wasted a ton of money, failed to solve crimes, harassed a lot of black people, and tried to put some of them in jail.
If Shotspotter simply didn’t work in a lab setting, then that’s the company and its investors’ problem. But governments bought the marketing pitch, aided by tech journalists who were, in hindsight, too credulous. They bought a broken tool. They believed that it would inevitably get better with time. It didn’t. And so it became our problem.
We should not expect Shotspotter’s marketing efforts to accurately portray the limitations of its product. They have a product to sell. Left to their own devices, they are going to overpromise and underdeliver. So we need tech journalists and regulators (and also, yes, academics like myself) to apply a hefty amount of skepticism.
Now of course, Generative AI works better than Shotspotter. But its proponents are also aiming so much higher than Shotspotter did.
Geoffrey Hinton insisted in 2016 that we ought to stop training radiologists, because deep learning would completely replace them within five years. It is a good thing that we did not listen to him.
Palmer Luckey wants to put AI in charge of, basically, murder drones. We ought to have a failure mode 1 conversation about whether the future Luckey is fundraising for is good or bad (HINT: it’s bad). But we also REALLY need tech journalists to ask hard questions about whether his product actually fucking works. Because faulty murder drones, deployed at the border by a government that is largely indifferent to excess casualties and a company that certainly won’t be eager to share data on its failure rates, would be a disaster.
Labeling that entire branch of skeptical tech criticism “it’s all fake and sucks” seems too dismissive. Massive companies are burning extraordinary amounts of capital to build these generative AI models. The return on investment doesn’t come from $20/month subscription fees. It comes from disrupting large, existing industries like health care, law, higher education, and military defense. That’s the ambition. So, before we fire all the radiologists, we should probably ask some skeptical questions about whether GPT5/OpenAI o3/Claude 7/Grok 2.0 will really be able to live up to the marketing promises.
And, FWIW, I know from reading Casey’s reporting that he doesn’t think generative AI can replace radiologists, lawyers, scientists, or (*ahem*) journalists. So this doesn’t seem to me like a fight that he needed to have picked. <shrug emoji>
Here’s my second note: whose predictions deserve a little leeway?
I think Casey’s main point is that GPT4 is far superior to GPT2, and the skeptics don’t admit how much progress has been made. He finds that frustrating, and it makes it a little hard for him to keep taking them seriously.
And that’s fair. I get it. From a certain vantage point, the story of the past few years has been skeptics saying “well sure AI can handle text, but it can’t do images… Well sure, but it can’t do video… Well sure, but it can’t do…,” all the while refusing to admit the pace of improvement.
But the AI boosters have also had a long string of mistaken predictions over that same timeframe. AI hallucinations were a minor problem that would be solved imminently. Scaling laws were infinite — with enough compute and enough data, artificial general intelligence would be inevitable. GPT5 would be as big of a leap as GPT4 was, and also it would be available last summer. (Oops.)
All the while, skeptics like Gary Marcus were insisting the techniques that produced GPT4 were limited, and there would be decreasing returns from further scale. Sam Altman ridiculed the guy and insisted that the single most salient fact about human society today is that “deep learning worked, got predictably better with scale, and we dedicated increasing resources to it.”
And now we’re hearing that GPT5 is delayed and actually this new o1 model is much more promising because there are so very many problems that can’t be solved with just more scale and resources. Point goes to Marcus, I think.
(We are, conveniently, hearing these things after OpenAI closed its latest funding round. Altman has a product to sell, and the product happens, among other things, to be a cash furnace.)
So the real question is whether we should offer more grace to the underestimating skeptics or the overestimating tech evangelists.
It seems like Casey’s instinct is to give the tech optimists a pass because (a) they’re trying to build something. Good for them. And (b) the rhetorical conventions of technologists attracting money and talent naturally permit a high amount of blustering overconfidence and optimism.
Geoffrey Hinton might have said AI will replace all radiologists in five years, but if it ends up being 10-20 we’ll still award him full credit. Elon Musk has been promising Full Self Driving cars for over a decade. If Tesla ever reaches that goal, no one except the skeptics will fault him for getting the dates wrong.
And, of course, the reason Elon keeps promising “full self driving next year” is to keep investors excited. The reason Sam Altman says his product will “solve all of physics” is that he is hoping to eventually raise seven trillion dollars. These are load-bearing predictions. Their intent is to set a vision and bring in resources. If Sam Altman becomes a centibillionaire and OpenAI becomes a tech monopoly, but his product never solves all of physics, he won’t have to give any of the money back. He told people what they needed to hear so he could get the job done. That’s how the game is played.
My instinct, and I suspect this is true for much of my intellectual camp, is to calibrate how much grace I offer to incorrect predictions based on how much power was behind them.
If a startup guy with a dream and a pitch deck thinks we can reform all of education in a decade with his not-yet-at-all-real company, I’ll be generally skeptical but also won’t care much. Sure, man, shoot your shot. But if the Department of Education is considering awarding him (or, more likely, someone like Sal Khan) a multi-billion dollar grant, then I’m going to focus real hard on the guy.
And, for the companies and executives with multi-billion dollar resources and ambitions — the “tech barons,” broadly speaking — I think we should pay very close attention to the promises they make about a glorious, inevitable technological future that never quite arrives. (And hey, someone should really write a book on that topic, amirite?)
Because when Elon Musk says “don’t invest in high speed rail. My Boring Company has a radical solution to traffic woes, and it’ll make trains irrelevant before construction would even be finished,” we should understand that this isn’t an actual prediction. It is an attempt to exercise power, to block a public works project that would compete with one of Musk’s companies. And when Musk then never tries to build the hyperloop, we should maybe update our priors and judge his next promises more skeptically. (h/t Paris Marx)
This all brings to mind Ted Chiang’s marvelous essay, “Will A.I. Become the New McKinsey?”
AI is real, and the ways that it is being deployed are, generally, quite shitty. This is not inherent to the technology. It could be much better than it is. But it is hardly surprising that the version of generative AI that best fits the whims, preferences, and ROI expectations of early investors and private-equity types is a version that cuts costs while degrading quality. It was all so very obvious and predictable, so long as you knew to distrust the words coming out of Sam Altman’s mouth.
And the tech billionaires who have bet their companies’ futures on these products keep insisting that we not fixate on the shit futures they are in the process of manufacturing, because instead we ought to marvel at their latest product demo while *imagining the endless possibilities.*
That’s not to say that GPT4 isn’t much more capable than GPT3. The industry has been very good at hitting benchmark goals and celebrating those successes. I, as a tech critic, am willing to admit those successes have happened.
I’m just not convinced they are central to the trajectory of AI as an industry.
AI is real, but the hype is manufactured.
We know how this story goes. Casey, too, knows how this story goes. But his affinity still lies with the optimists and the builders. And I think that has influenced how he interprets and understands the critics.
That’s it, that’s my contribution to the internet-drama. Thanks for reading.