AI Training and Copyright: Federal Judge’s Ruling
A federal judge in San Francisco has ruled that Anthropic's use of books without permission to train its artificial intelligence system was legal under US copyright law.
Siding with tech companies on a pivotal question for the AI sector, Judge William Alsup said Anthropic had made “fair use” of books by authors Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson to train its large language model.
However, the judge also found that Anthropic had copied and stored more than seven million pirated books in what he called a “central library.” He set a trial for December to determine the damages owed for that infringement.
Under US copyright law, willful infringement can result in statutory damages of up to $150,000 per work.
An Anthropic spokesperson said the company was pleased that the court recognized its AI training as “transformative” and consistent with copyright's purpose of fostering creativity and scientific progress.
The authors filed the lawsuit last year, claiming that the company, which is backed by Amazon and Alphabet, had used pirated copies of their books without permission to train its AI, Claude, to respond to user prompts.
The proposed class action is one of several lawsuits that authors, media outlets, and other copyright holders have brought against companies such as OpenAI, Microsoft, and Meta over their AI training.
The doctrine of fair use allows copyrighted material to be utilized in specific situations without the copyright holder’s consent.
Fair use is a key legal defense for tech firms, and this ruling is the first to address it in the context of generative AI.
AI companies maintain that their systems make fair use of copyrighted material to create new, transformative content, and argue that being forced to pay copyright holders could hamper the growing AI sector.
Anthropic defended its actions in court, arguing that US copyright law not only permits but encourages its AI training because it promotes human creativity. The company said its system copied the books to study the plaintiffs' writing, extract uncopyrightable information from it, and use what it learned to build new technology.
Copyright holders, by contrast, contend that AI firms are unlawfully copying their work to generate competing content that threatens their livelihoods.
Judge Alsup sided with Anthropic on this point, finding that its training use was “exceedingly transformative.”
However, he also ruled that Anthropic violated the authors' rights by storing pirated copies of their books as part of a “central library of all books in the world” that would not necessarily be used for AI training.
Anthropic and other major AI companies, including OpenAI and Meta, have been accused of illicitly acquiring millions of pirated book copies to train their systems.
In court filings, Anthropic argued that the source of its books was irrelevant to the fair use question.
Judge Alsup disagreed, writing that he doubted any accused infringer could ever meet its burden of explaining why downloading copies from pirate sites, when it could have bought or otherwise lawfully accessed them, was reasonably necessary to any later fair use.
