Check your sources — Google's AI Overview can give false, misleading, and dangerous answers

From glue-on-pizza recipes to recommending "blinker fluid," Google's AI sourcing needs work.
Kyle Orland – May 24, 2024 11:00 am UTC

[Lead image caption: "This is fine." Credit: Getty Images]
If you use Google regularly, you may have noticed the company's new AI Overviews providing summarized answers to some of your questions in recent days. If you use social media regularly, you may have come across many examples of those AI Overviews being hilariously or even dangerously wrong.
Factual errors can pop up in existing LLM chatbots as well, of course. But the potential damage that can be caused by AI inaccuracy gets multiplied when those errors appear atop the ultra-valuable web real estate of the Google search results page.
“The examples we’ve seen are generally very uncommon queries and aren’t representative of most people’s experiences,” a Google spokesperson told Ars. “The vast majority of AI Overviews provide high quality information, with links to dig deeper on the web.”
After looking through dozens of examples of Google AI Overview mistakes (and replicating many ourselves for the galleries below), we’ve noticed a few broad categories of errors that seemed to show up again and again. Consider this a crash course in some of the current weak points of Google’s AI Overviews and a look at areas of concern for the company to improve as the system continues to roll out.

Treating jokes as facts

[Gallery captions: "The bit about using glue on pizza can be traced back to an 11-year-old troll post on Reddit." · "This wasn’t funny when the guys at Pep Boys said it, either." · "Weird Al recommends ‘running with scissors’ as well!" Credit: Kyle Orland / Google]
Some of the funniest examples of Google’s AI Overview failing come, ironically enough, when the system doesn’t realize a source online was trying to be funny. An AI answer that suggested using “1/8 cup of non-toxic glue” to stop cheese from sliding off pizza can be traced back to someone who was obviously trying to troll an ongoing thread. A response recommending “blinker fluid” for a turn signal that doesn’t make noise can similarly be traced back to a troll on the Good Sam advice forums, which Google’s AI Overview apparently trusts as a reliable source.
In regular Google searches, these jokey posts from random Internet users probably wouldn’t be among the first answers someone saw when clicking through a list of web links. But with AI Overviews, those trolls were integrated into the authoritative-sounding data summary presented right at the top of the results page.
What’s more, there’s nothing in the tiny “source link” boxes below Google’s AI summary to suggest either of these forum trolls are anything other than good sources of information. Sometimes, though, glancing at the source can save you some grief, such as when you see a response calling running with scissors “cardio exercise that some say is effective” (that came from a 2022 post from Little Old Lady Comedy).

Bad sourcing

[Gallery captions: "Washington University in St. Louis says this ratio is accurate, but others disagree." · "Man, we wish this fantasy remake was real." Credit: Kyle Orland / Google]
Sometimes Google’s AI Overview offers an accurate summary of a non-joke source that happens to be wrong. When asking about how many Declaration of Independence signers owned slaves, for instance, Google’s AI Overview accurately summarizes a Washington University in St. Louis library page saying that one-third “were personally enslavers.” But the response ignores contradictory sources like a Chicago Sun-Times article saying the real answer is closer to three-quarters. I’m not enough of a history expert to judge which authoritative-seeming source is right, but at least one historian online took issue with the Google AI’s answer sourcing.
Other times, a source that Google trusts as authoritative is really just fan fiction. That’s the case for a response that imagined a 2022 remake of 2001: A Space Odyssey, directed by Steven Spielberg and produced by George Lucas. A savvy web user would probably do a double-take before citing Fandom’s “Idea Wiki” as a reliable source, but a careless AI Overview user might not notice where the AI got its information.

Kyle Orland has been the Senior Gaming Editor at Ars Technica since 2012, writing primarily about the business, tech, and culture behind video games. He has journalism and computer science degrees from the University of Maryland. He once wrote a whole book about Minesweeper.

Promoted Comments

invertedpanda: It should be noted that the issue is actually rarely the AI itself, but Google’s ranking system and featured snippets.
In most cases where I’ve tested these “bad AI results,” the actual problem is that the AI is just rephrasing the top result that created the featured snippet. As an example, the “How many rocks should I eat per day” one that’s been making the rounds is actually just some gray-hat SEO of a featured snippet for a fracking company that cites an Onion article (SEO is complicated, folks).
So, the problem already existed in the form of featured snippets; now it’s just rephrased with AI.
Of course, Google has some real issues with handling these shifty SEO strategies, and it plays whack-a-mole constantly. I actually shut down one of my own websites because Google was unable or unwilling to handle low-effort content farms that gobble up content like mine and rewrite it using AI, while using a handful of additional techniques to edge out my original content in the rankings. (May 24, 2024 at 11:20 am)

torp:
This is a great example of:
1. Garbage in, garbage out. Even the LLM says it’s from a Reddit post.
2. People having unrealistic expectations about LLMs. Perhaps this will convince everyone that they’re parroting what they’re fed and have no understanding or self-consciousness.
3. Google shooting themselves in the foot. It’s one thing to give a result like the Reddit suggestion as a link to the original post on Reddit. It’s another thing entirely to get it in this overview, where it sounds like it’s endorsed by Google.

(May 24, 2024 at 11:26 am)

MichaelHurd: When it comes to treating jokes as factual, nothing beats The Onion! (May 24, 2024 at 12:03 pm)