- Big tech companies like OpenAI and Microsoft face copyright lawsuits over AI tech like ChatGPT.
- Tech giants are in legal strife for using content in AI models without authorization.
- AI advancements lead to legal and ethical debates over the use of public data in model training.
AI technology, while impressive, has drawn criticism over potential copyright infringement, and several large tech firms now face lawsuits alleging such violations.
The New York Times vs. OpenAI and Microsoft
For example, The New York Times has taken legal action against OpenAI and Microsoft. The newspaper accuses these tech giants of unlawfully using millions of its articles to develop AI technologies like ChatGPT, which now rival the Times in delivering instant information.
These allegations form part of a broader trend of lawsuits challenging the practice of training AI models with vast amounts of online content without compensating the original creators. This issue concerns artists, writers, and journalists, who worry their online work might be used to power AI solutions, like chatbots, without fair compensation.
The Times’ lawsuit is particularly significant as it targets OpenAI and Microsoft, two prominent players in AI. Microsoft, which holds a non-voting observer seat on OpenAI’s board and has invested billions in the company, is directly implicated in the case.
Filed on December 27, the lawsuit asserts that The Times owes a duty to its subscribers, and that Microsoft and OpenAI’s unauthorized use of its content to create competing AI products undermines that duty. The Times acknowledges that these companies also drew on other sources, but highlights their particular reliance on its content, accusing them of exploiting its journalism to build rival products without permission or compensation.
Responding to the lawsuit, OpenAI spokesperson Lindsey Held expressed the company’s respect for content creators’ rights and commitment to collaborating with them. Despite ongoing and constructive talks with The New York Times, OpenAI expressed disappointment at this legal turn. The spokesperson remained hopeful for a mutually beneficial collaboration, similar to arrangements with other publishers.
Microsoft declined to comment on the lawsuit.
The Times had raised objections months earlier, upon discovering that its content had been used to train AI models. Since April, the newspaper had sought fair compensation and attempted to negotiate terms with OpenAI and Microsoft.
The companies, however, stand by the defense of ‘fair use’, arguing that the use of The Times’ content for ‘transformative purposes’ is permissible. The Times strongly contests this claim, arguing that AI outputs like ChatGPT and Microsoft’s Bing chatbot provide similar services to the newspaper, thereby not qualifying as ‘transformative’ but as direct competition.
AI technology and copyright concerns with various companies
While The New York Times is a prominent example, it’s not alone in this battle against copyright infringement by tech companies. Major news organizations, including CNN, have actively taken steps, like adding code to block OpenAI’s web crawler from accessing their content.
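As an illustration of how such blocking typically works (the exact rules individual outlets like CNN use are not public), OpenAI documents a crawler user agent called GPTBot, and a site can disallow it with a standard robots.txt rule:

```text
# robots.txt — ask OpenAI's documented GPTBot crawler to skip the entire site
User-agent: GPTBot
Disallow: /
```

Because robots.txt is advisory, this only deters crawlers that choose to honor it; it is not an access-control mechanism.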
In related legal actions, comedian Sarah Silverman and two authors sued Meta and OpenAI in July, claiming their works were used without consent to train AI models. The companies did not respond publicly to the allegations, and a judge dismissed most of the claims in November.
The legal challenges for OpenAI and Microsoft extend beyond just news publications. In late November, they faced a lawsuit from authors over using their works in AI training. The lawsuit alleges that OpenAI, with Microsoft’s involvement, improperly used the content of numerous nonfiction books, including those of well-known authors, to train AI models like ChatGPT.
Julian Sancton, an author and editor at the Hollywood Reporter, spearheads this class action lawsuit, which was filed in Manhattan. He claims OpenAI used thousands of nonfiction books, including his own work, without permission to enhance its language models’ ability to interact with human-generated text.
This lawsuit joins a series of similar legal actions initiated by authors like John Grisham, George R.R. Martin, and Jonathan Franzen. These authors accuse OpenAI and other tech firms of exploiting their content to develop AI technologies, claims which the companies have refuted.
This case, led by Sancton, is notable as it’s the first to implicate Microsoft as a co-defendant. Microsoft’s significant investment in OpenAI and the integration of OpenAI’s technologies into its products are central to the lawsuit.
The lawsuit mentions the unauthorized use of Sancton’s book “Madhouse at the End of the Earth: The Belgica’s Journey into the Dark Antarctic Night” in training OpenAI’s GPT models. It further accuses Microsoft of being deeply involved in developing and training these models, thus holding them accountable for the alleged copyright infringement.
Sancton has approached the court seeking monetary damages and requesting an injunction to prevent further alleged copyright violations.
Emerging lawsuits in the AI art and imaging sector
The legal landscape surrounding AI technology continues to evolve, with several companies facing lawsuits. In January 2023, a lawsuit was filed against the AI image generator companies Stability AI, Midjourney, and DeviantArt. It alleges that these companies infringed the plaintiffs’ copyrights by using their original works for training and by producing unauthorized derivatives, including works mimicking the styles of various artists. Judge William Orrick, overseeing the case, indicated he was inclined to dismiss much of the lawsuit.
In a separate incident in January 2023, Getty Images brought a lawsuit against Stability AI. The complaint accuses Stability AI of replicating and processing millions of Getty’s images and related metadata without permission in the U.K. Shortly afterwards, Getty launched another lawsuit in the U.S. District Court for the District of Delaware.
This lawsuit raises both copyright and trademark concerns, pointing to AI-generated images that included distorted or grotesque versions of the Getty Images watermark, purportedly harming Getty’s reputation.
Copyright concerns primarily drive the surge in lawsuits against AI companies. AI technologies like ChatGPT are trained on publicly available internet data, but without explicit consent from the creators of that data. GPT-3’s training set, for instance, included content from Wikipedia and Reddit. Such training data can contain conversations and passages from copyrighted works, which is what allows AI models to summarize those materials accurately.
Beyond individual cases, these lawsuits reflect broader apprehensions about the opaque nature of AI. There’s a growing concern that the ‘black box’ nature of AI makes it difficult to understand its inner workings, leading to fears that AI could be used to circumvent responsibility. This is particularly concerning when AI makes decisions or produces outputs without clear accountability.
Legal expert Matthew Butterick, who is involved in several of these lawsuits, expressed this concern on his blog. He warns that if AI companies are permitted to market these inscrutable systems, AI could become a tool for achieving ends regardless of the means. On this view, the appeal of AI might shift from its performance to its ability to carry out actions that would be legally or ethically problematic for humans to take themselves.