Meta Platforms

Meta trained Llama AI models on 81.7TB of pirated books from LibGen and shadow libraries with executive approval

Court filings revealed Meta engineers torrented 81.7 terabytes of copyrighted books from Library Genesis, Z-Library, and Anna's Archive to train Llama models. Internal emails showed Meta director Sony Theakanath confirmed 'GenAI has been approved to use LibGen for Llama 3' after escalation to Mark Zuckerberg, with explicit instruction to never publicly disclose the use. Engineers wrote scripts to strip copyright notices from ebooks. A June 2025 ruling found this piracy was not protected by fair use.

Scoring Impact

Topic | Direction | Relevance | Contribution
Intellectual Property Ethics | -against | primary | -1.00

Overall incident score = -1.360

Score = avg(topic contributions) × significance (critical ×2) × confidence (0.68)
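
As a worked check of the formula above, the following minimal sketch recomputes the overall score from the values shown on this page. The function name `incident_score` and the `SIGNIFICANCE_MULTIPLIER` table are hypothetical names introduced for illustration; only the "critical ×2" multiplier is stated in the source, so other significance levels are deliberately left out.

```python
from statistics import mean

# Hypothetical lookup table: the page only states that "critical" maps to x2.
SIGNIFICANCE_MULTIPLIER = {"critical": 2.0}


def incident_score(contributions, significance, confidence):
    """Score = avg(topic contributions) x significance multiplier x confidence."""
    return mean(contributions) * SIGNIFICANCE_MULTIPLIER[significance] * confidence


# Values from this page: one topic contribution of -1.00,
# critical significance, confidence 0.68.
score = incident_score([-1.00], "critical", 0.68)
print(f"{score:.3f}")
```

With the single contribution of -1.00, the average is -1.00, and -1.00 × 2 × 0.68 matches the overall incident score listed above.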

Evidence (2 signals)

Confirms · Legal Action · Jun 25, 2025 · verified

Judge ruled Meta's use of pirated books for AI training was not fair use, but granted fair use for legally acquired works

US District Judge Vince Chhabria sided with Meta on fair use for legally acquired books but denied fair use protection for pirated copies. The judge described Meta's claim that public interest would be 'badly disserved' if prevented from using copyrighted text for free as 'nonsense'.

Confirms · Legal Action · Jan 9, 2025 · verified

Court filings revealed Zuckerberg approved Meta's use of pirated LibGen books for Llama 3 training

Internal emails showed Meta's director of product management confirmed 'GenAI has been approved to use LibGen for Llama 3' after escalation to Zuckerberg, with explicit instruction to never publicly disclose the use. Engineers torrented 81.7TB of pirated books.

Related: Same Topics