On June 23, 2025, Judge Alsup in the Northern District of California issued an order in Bartz et al. v. Anthropic PBC, granting in part and denying in part Defendant Anthropic’s motion for summary judgment on the sole issue of whether its use of Plaintiffs’ books as training data for Anthropic’s large language models (LLMs) was “quintessential” fair use.

Because Anthropic used the works in various ways and for varying purposes, the court identified and assessed each “use” separately, which produced its mixed holding. The court held that the use of textual works to train LLMs was “exceedingly transformative” and, when considered against the remaining factors, was protected as fair use. The separate use of the works to create a central library, however, was fair use only with respect to works purchased or lawfully accessed; the use of pirated copies to create the central library was not a protected fair use. This decision makes clear that the source of content is a key element in evaluating fair use.

Use Within Anthropic’s Central Library

Anthropic’s co-founder previously admitted to downloading numerous online digital book libraries known to be assembled from unauthorized copies, which were then stored indefinitely in a central library. While Anthropic later pivoted its data collection approach by purchasing copies that were then scanned into a digitized central library, Anthropic never deleted or removed the pirated copies from its central library.

In its opening brief, Anthropic argued that the key consideration for the court’s fair use analysis was what Anthropic did with those works during the training process—not whether it had lawful access to those materials. The court plainly rejected this argument, explaining that “[c]reating a permanent, general-purpose library was not itself a fair use excusing Anthropic’s piracy,” such that “Anthropic had no entitlement to use pirated copies for its central library.” Moreover, Anthropic was never entitled to create or hold copies of the pirated works, meaning that “almost any unauthorized copying would have been too much.”

The use of purchased copies to create a central library, however, was held to be a transformative fair use. While Plaintiffs argued that the format change from print to digital required independent justification, the court held that the mere format change was a fair use, given that such copying added no new copies to the library, eased storage, enabled searchability, and was not done “for purposes trenching upon the copyright owner’s rightful interests.” In short, the court found the format change was transformative to the extent it was reasonably necessary and such use did not “exploit anything the Copyright Act reserves to the copyright owner.”

Use Within Anthropic’s LLM Training

With respect to the use of copies for training purposes, the court found that using copyrighted works to train LLMs to generate new text was “quintessentially transformative,” so the first factor, the purpose and character of the use, favored fair use. The third factor, the amount and substantiality of the portion used, similarly favored fair use, both because such copying was found to be “reasonably necessary” (as opposed to strictly necessary, as urged by the Plaintiffs’ brief) for the transformative use and because the Plaintiffs did not allege that any portion of their works was included in Anthropic’s LLM outputs. With regard to the fourth factor of potential market harm, the court concluded that “the copies used to train specific LLMs did not and will not displace demand for copies of Authors’ works,” as such a market “is not one the Copyright Act entitles [Plaintiffs] to exploit.”

Thus, while a copyright holder “cannot rightly exclude anyone from using their works for training or learning [purposes] … [t]hey may need to pay for getting their hands on a text in the first instance.”

Looking Forward – the Facts Matter

As with any fair use decision, this case was fact intensive and dependent on the nature of the use, and pending cases in which copyrighted works were used to train AI will similarly depend on their specific facts.

For Anthropic, in addition to whether the source of the content was legal, the specifics of how Anthropic used the works to train the LLMs were critical to the court’s determination. These included that the works were “cleaned” to remove repeating or “lower-value” text; that the works were “tokenized” where “characters were grouped into short sequences and translated into corresponding number sequences or ‘tokens,’” and that the LLMs used in Anthropic’s Claude product were “complemented by other software that filtered user inputs into the LLM and filtered outputs from the LLM back to the user,” which “ensure[d] that no infringing output ever reached the users.”

Critically, the court also highlighted that the authors did “not allege that any infringing copy of their works was or would ever be provided to users by the Claude service” and specifically held “if the outputs seen by users had been infringing, Authors would have a different case” and “could bring such a case.”

Meaghan H. Kent

Meaghan Kent is a seasoned intellectual property (IP) litigator and counselor who advises media, consumer product, and software companies on IP protection, risks, and claims, with notable experience regarding artificial intelligence (AI) and copyright. Meaghan counsels clients on the development and protection of IP portfolios, including copyright registration, licensing, clearance, and fair use analysis, especially as they relate to complex and emerging issues in digital media and AI.

Marcella Ballard

Marcella Ballard is co-chair of Venable’s IP Litigation – Advertising, Brand, and Copyright Group. Marci is a seasoned first-chair Lanham Act and copyright litigator who represents clients before the United States Patent and Trademark Office (USPTO) and the Trademark Trial and Appeal Board (TTAB), and in bench and jury trials. Marci also represents clients in arbitration hearings throughout the United States and in the United Kingdom. Several well-known global brands rely on her sophisticated litigation skills and sage counsel in global trademark matters and brand management functions. She also manages global IP portfolios, and counsels clients on brand protection, trademark, copyright, trade secret, privacy rights, licensing, unfair competition, contracts, and business tort claims.