Imagine your company just hired some hot new talent, a rising star in the executive suite so alluring that a rival firm just hired a lookalike. The buzz around them is intoxicating. Everyone seems to agree, from the CEO to the shareholders, this person is the future of the entire business.
Then you learn the executive has what is politely termed a “hallucination problem.” Every time they open their mouth, there’s a 15 to 20 percent chance they might just make stuff up. A professor at Princeton calls the guy a bullshit generator. They literally cannot tell truth from fiction. They’re going on stage to unveil a new product in five minutes. Do you still push them into the spotlight?
For Microsoft and Google this week, the answer was yes. Fired up by the success of OpenAI’s ChatGPT, the artificial intelligence chatbot with 100 million monthly active users two months after its launch, Microsoft held a last-minute surprise event to announce OpenAI would bring ChatGPT-style search to the Bing search engine and Edge browser. Google announced an AI search tool of its own, Bard, the day before, and unveiled it at an event in Paris the day after, but ran into a hallucination problem of its own.
“A new race starts today,” Microsoft CEO Satya Nadella told reporters who’d been summoned to the Redmond, Wash., campus Tuesday. Yes, isn’t it pretty to think so? Microsoft, the perpetually uncool kid on the tech block, would love you to think that Bing (sorry, “the New Bing”) is in a race with Google search on anything.
Google’s pre-response announcing Bard dripped with condescension: “We re-oriented the company around AI six years ago,” wrote Google CEO Sundar Pichai.
Google and the ‘hallucination problem’
Which is a telling point. Google, the world leader in search, has had years to incorporate AI, and still its ChatGPT rival, Bard, is barely at the beta stage with a tiny group of testers. For all Pichai’s hipster affectation, the Bard unveiling had an unplanned messiness to it. Google seems to have been caught flat-footed by all the ChatGPT buzz too.
How else to explain the embarrassing Bard mistake on full display at launch? It appeared not at the event itself, where some demo flubs are expected, but in a pre-made GIF. A user is shown asking Bard for facts he can tell his 9-year-old about the James Webb Space Telescope.
One of those “facts”, that the JWST took the first ever picture of an exoplanet, is untrue. Bard was hallucinating. (UPDATE: While a Financial Times reporter claims Bard’s words were technically accurate, that requires a reading of the language that no human would ever employ, which is yet another problem with AI search.)
No wonder parent company Alphabet lost as much as 8 percent of its share price the day of the Bard launch. Google put the main problem with AI search front and center, and furthermore suggested that the company can’t use its vast storehouse of data to fact-check itself.
Google should know better, given that it already had a “hallucination problem” with its featured snippets at the top of search results back in 2017. The snippets algorithm seemed to particularly enjoy telling lies about U.S. presidents. Again, what could go wrong?
In other words, launch your AI search tool too early and you’re at risk of playing yourself. Microsoft got lucky, in the sense that no obvious errors were on display at its launch event. But if ChatGPT-based search weren’t riddled with mistakes, why is it at such a tentative beta stage? Side note: If you’re interested in doing unpaid AI QA for Bing, there’s a sign-up sheet.
“There’s still more to do there,” Sarah Bird, Microsoft’s Head of Responsible AI (a telling title!), said in response to a question from Wired about ChatGPT’s hallucination problem. Yeah, no kidding: the 15 percent hallucination number came from a company that is in its own race to build a ChatGPT fact-checker. (UPDATE: a New York Times columnist’s breathless report on New Bing revealed that it couldn’t even get basic math right, or even a list of local kid-friendly activities.)
Bird added that previous versions of the software could help users plan a school shooting, but that this functionality had been disabled. Good to know! What could possibly go wrong next? Surely there is no other unintended consequence lurking in this hallucinatory beta search product that could embarrass a large and legally vulnerable tech giant.
Clippy. Zune. New Bing.
Microsoft knows from embarrassment, of course: It’s the company that gave us one of the biggest misfires in software history, Clippy. The paperclip assistant was famous for dispensing unwanted advice. ChatGPT isn’t Clippy, in the sense that we’re coming to it with questions.
But the fact that it often hallucinates its responses — or, more often than you’d think, gives users a mundane variation on “I can’t answer that” — could make ChatGPT-enabled Bing a kind of Clippy on LSD. If enough casual users of the “New Bing” get garbled results, then that’s what it will be remembered for.
Doesn’t matter if a product improves later on; the initial popular response is what can turn it into a punchline. Microsoft should know that, too; it gave us the Zune. Rolling out a ChatGPT product before it’s truly ready for primetime is no different.
“The New Bing” is already kind of asking to be a punchline, honestly. Or are you really ready to ditch Google search and your Chrome browser for Bing and Edge, should the latter win the AI search race, whatever “winning” really means here? Didn’t think so. Tech inertia is profoundly underrated as a force.
ChatGPT is impressive in some circumstances (real estate agents in particular are loving it for writing listings) and evokes fear in others. But every story about its disruptions seems somehow lesser, once you dig below the headline. It’s going to lead to a wave of student plagiarism! Except it can also tell you when a paper has been written by ChatGPT, neutralizing its own threat. It passed a law school exam! Except it actually just scraped by with a C-plus.
Here’s the thing: building the digital equivalent of a human brain, known in AI circles as “general AI”, is really hard going. We’ve barely begun to arrive at the insect intelligence stage, another long-held AI goal. Will you really trust ChatGPT to deliver your search results, rather than, y’know, clicking on links yourself?
The answer could well depend on how much you yourself, dear reader, are having a hallucination problem.