Eight newspaper publishers sue Microsoft and OpenAI over copyright infringement

Eight newspaper publishers alleged that Microsoft and OpenAI used millions of their articles without payment or permission to train AI models like ChatGPT.

Sam Altman, CEO of OpenAI, during a panel session at the World Economic Forum in Davos, Switzerland, on Jan. 18, 2024.

Stefan Wermuth | Bloomberg | Getty Images

Eight U.S. newspaper publishers filed suit against Microsoft and OpenAI in a New York federal court on Tuesday, claiming the technology companies reuse their articles without permission in generative artificial intelligence products and falsely attribute inaccurate information to them.

The publishers take issue with ChatGPT and Microsoft’s Copilot assistant, which is available in the Windows operating system, the Bing search engine and other Microsoft products. ChatGPT and Copilot have been “purloining millions of the publishers’ copyrighted articles without permission and without payment,” according to the complaint, which was filed in the U.S. District Court for the Southern District of New York.

The newspaper publishers in the lawsuit operate the New York Daily News, the Chicago Tribune, the Orlando Sentinel, the Sun Sentinel in Florida, The Mercury News in California, The Denver Post, The Orange County Register in California and the Pioneer Press of Minnesota.

The newspaper publishers said in the lawsuit that OpenAI has drawn on data sets containing text from their newspapers to train its GPT-2 and GPT-3 large language models, which can spit out text in response to a few words of human input.

“The current GPT-4 LLM will output near-verbatim copies of significant portions of the publishers’ works when prompted to do so,” the complaint said. It includes several examples of ChatGPT and Copilot allegedly doing so.

The publishers said Microsoft copies information from their newspapers for the Bing search index, which helps inform answers in Copilot. But such output doesn’t always provide links to newspaper websites, where readers can view ads alongside articles or pay for subscriptions.

Microsoft and OpenAI representatives did not immediately respond to requests for comment from CNBC.

The legal challenge comes four months after The New York Times sued OpenAI over copyright infringement in the ChatGPT chatbot that the startup released in late 2022. OpenAI said in a January blog post that the case is without merit, adding it wants to support “a healthy news ecosystem.” That same month, Sam Altman, OpenAI’s CEO, said the startup wanted to pay The New York Times and was surprised to learn about the lawsuit.

In recent months, OpenAI has signed deals with a handful of media companies, including Axel Springer and the Financial Times, enabling the Microsoft-backed startup to draw on the publishers’ content to improve AI models.

Google, which has its own general-purpose chatbot for responding to user queries, said in February that it had reached an agreement with Reddit that includes the right to train AI models on the platform’s content.

The New York Times case also touched on the matter of OpenAI models regurgitating information from its articles. In its blog post, OpenAI characterized such behavior as “a rare failure of the learning process that we are continually making progress on.”

Correction: This article has been updated to reflect the correct day the lawsuit against Microsoft and OpenAI was filed.
