Simon Willison’s Weblog: Why I think the $1.5 billion Anthropic class action settlement may count as a win for Anthropic

Source URL: https://simonwillison.net/2025/Sep/6/anthropic-settlement/#atom-everything
Source: Simon Willison’s Weblog
Title: Why I think the $1.5 billion Anthropic class action settlement may count as a win for Anthropic

Feedly Summary: Anthropic to pay $1.5 billion to authors in landmark AI settlement
I wrote about the details of this case when it was found that Anthropic’s training on book content was fair use, but they needed to have purchased individual copies of the books first… and they had seeded their collection with pirated ebooks from Books3, PiLiMi and LibGen.
The remaining open question from that case was the penalty for pirating those 500,000 books. That question has now been resolved in a settlement:

Anthropic has reached an agreement to pay “at least” a staggering $1.5 billion, plus interest, to authors to settle its class-action lawsuit. The amount breaks down to smaller payouts expected to be approximately $3,000 per book or work.

It’s wild to me that a $1.5 billion settlement can feel like a win for Anthropic. But it’s undisputed that they downloaded pirated books (as did Meta and likely many other research teams), and the maximum allowed penalty was $150,000 per book, so $3,000 per book is actually a significant discount.
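To make the size of that discount concrete, here is a rough back-of-the-envelope sketch using only the figures quoted above. The worst-case total is a hypothetical upper bound that assumes every one of the roughly 500,000 books drew the maximum penalty, which is an illustration rather than anything the case established; the variable names are mine.

```python
# Figures quoted in the post (US dollars).
SETTLEMENT_TOTAL = 1_500_000_000   # "at least" $1.5 billion, before interest
PER_BOOK_PAYOUT = 3_000            # approximate payout per book or work
MAX_PENALTY_PER_BOOK = 150_000     # maximum allowed penalty per book

# Number of works implied by the settlement total and per-book payout.
implied_books = SETTLEMENT_TOTAL // PER_BOOK_PAYOUT

# Hypothetical worst case: every work penalized at the maximum (an assumption).
max_exposure = implied_books * MAX_PENALTY_PER_BOOK

# Settlement payout as a fraction of the per-book maximum.
fraction_of_max = PER_BOOK_PAYOUT / MAX_PENALTY_PER_BOOK

print(f"Implied works covered: {implied_books:,}")
print(f"Hypothetical maximum exposure: ${max_exposure:,}")
print(f"Payout as share of maximum: {fraction_of_max:.0%}")
```

The implied count lines up with the 500,000 books mentioned earlier, and the per-book payout works out to a small fraction of the maximum.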
As far as I can tell this case sets a precedent that Anthropic’s more recent approach, buying millions of (mostly used) physical books and destructively scanning them for training, is covered by “fair use”. I’m not sure if other in-flight legal cases will find differently.
If this does hold it’s going to be a great time to be a bulk retailer of used books!
Tags: law, ai, generative-ai, llms, anthropic, training-data, ai-ethics

AI Summary and Description: Yes

Summary: Anthropic has agreed to pay at least $1.5 billion, plus interest, to authors to settle a class action over roughly 500,000 pirated books it downloaded while building its training collection. An earlier finding in the same case held that training on book content was fair use, provided the books had been purchased first. The case highlights the tensions between AI development and intellectual property rights, especially concerning generative AI technologies.

Detailed Description: The recent settlement involving Anthropic sheds light on the ongoing ethical and legal dilemmas surrounding AI training data and copyright issues. Here are the key points to note:

– **Nature of the Case**: Anthropic’s training on book content was found to be fair use, but only for books it had purchased. It had seeded its collection with pirated ebooks from Books3, PiLiMi and LibGen, and the settlement resolves the open question of the penalty for pirating those books.

– **Settlement Amount**: The company has agreed to pay “at least” $1.5 billion to authors, a sum that averages approximately $3,000 per book. This is a significant reduction compared to the maximum penalty, which could have reached $150,000 per book.

– **Implications for AI Training Practices**: The outcome appears to support Anthropic’s more recent approach of buying millions of (mostly used) physical books and destructively scanning them for training, as covered by fair use.

– **Precedent for Future Cases**: This outcome could influence other AI developers and researchers, particularly those currently facing similar legal challenges regarding their training data practices, though other in-flight cases may find differently.

– **Industry Impact**: The resolution of this case emphasizes the critical need for rigorous legal frameworks surrounding the use of data in AI, especially in generative AI applications. It could lead to increased scrutiny and the necessity for compliance among companies sourcing training data.

– **AI Ethics and Compliance Considerations**: Security, privacy, and compliance professionals need to be aware of the implications of this settlement. It raises concerns over copyright infringement and the legalities surrounding the use of data in AI, which may influence governance policies regarding data acquisition for model training.

In conclusion, the Anthropic settlement serves as a vital reminder for technology companies about the importance of ethical practices in AI development and the potential legal ramifications of copyright issues in the rapidly evolving landscape of artificial intelligence.