Dictionary Publisher Files Lawsuit Against OpenAI

A dispute between a long-standing knowledge publisher and the AI industry has now reached the courts.

Encyclopaedia Britannica, along with its dictionary brand Merriam-Webster, has filed a lawsuit against OpenAI. The publishers argue that the AI company used their copyrighted material without permission while developing its language models.

In the legal complaint, Britannica says it owns the rights to nearly 100,000 articles published on its website. The company claims many of these articles were scraped from the internet and later included in the data used to train OpenAI’s AI systems.

The publishers also say the chatbot ChatGPT can sometimes generate answers that closely mirror wording found in Britannica’s articles. In some cases, the complaint claims, the responses may include partial reproductions of the publisher’s content.

Another point raised in the lawsuit relates to how modern AI tools gather information. Britannica alleges that OpenAI uses retrieval-augmented generation (RAG) or similar systems, which let AI models pull information from external sources while producing responses. According to the publisher, this process may draw on its copyrighted articles.

The complaint also includes claims under the Lanham Act. Britannica argues that when AI tools generate incorrect information and attribute it to the publisher, the misattribution could harm Britannica’s reputation as a trusted reference source.

The lawsuit also highlights concerns about the broader impact on publishers. Britannica says chatbots that directly answer users’ questions may reduce the need for people to visit original websites, potentially affecting advertising and subscription revenue.

Britannica is not alone in raising these issues. Media companies such as The New York Times and Ziff Davis have already taken legal action against OpenAI over similar concerns. Several newspapers across the United States and Canada have also filed related lawsuits.

The publisher has also brought a separate case against Perplexity AI over comparable claims, and that legal process is still underway.

The bigger legal question behind these cases is whether copyrighted material can be used to train AI systems without permission. In a separate dispute involving Anthropic, federal judge William Alsup ruled that using lawfully acquired books as training data can qualify as a “transformative” fair use. However, he also found that downloading millions of pirated books was not protected as fair use, a finding that led to a $1.5 billion settlement with affected authors.

Decisions in these cases could influence how AI companies obtain and use online content in the future.
