From The Conscious Remainder — forthcoming · © Bhavana Chamoli 2026

Essay · AI · Law · Philosophy

The Wrong War

Why the AI copyright battle is being fought with the wrong weapons — and what the correct remedy looks like

Bhavana Chamoli · May 2026 · 2,800 words

In June 2025, Senior U.S. District Judge William Alsup ruled that Anthropic's use of millions of copyrighted books to train its language models constituted fair use — "transformative, spectacularly so." His opinion analogized model training to human reading and learning. Anthropic subsequently agreed to a $1.5 billion settlement. The creative community called it a partial victory.

It is reasonable to interpret Judge Alsup's ruling as treating human learning and AI training as categorically equivalent processes. They are not. And that misclassification — not the dollar figure of any settlement — is the central problem with every legal challenge filed against AI companies to date.

I say this as someone who has built these systems at institutional scale. I have worked on data architecture and AI deployment at a major hedge fund and at one of the world's largest custodian banks. I understand what happens in embedding space. The copyright frame, however legally convenient, does not reach the depth of what has actually occurred. The correct frame is not intellectual property law. It is extraction economics.

What Training a Language Model Actually Does

To understand why copyright is the wrong instrument, you need to understand the mechanism — not the metaphor.

A language model is trained on a corpus of text: books, articles, websites, academic papers, forum posts, song lyrics, legal filings, personal essays. During training, the model processes this text to learn statistical relationships between words and concepts. These relationships are encoded as numerical vectors in what researchers call an embedding space — a high-dimensional mathematical structure where meaning is represented geometrically.

Words and concepts that are semantically related end up geometrically close in this space. The relationships between concepts — analogy, causation, contrast, hierarchy — are encoded as directions and distances. This is not metaphorical. It is the literal mathematical structure of how the model operates.
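The geometry is easiest to see in a toy sketch. The vectors below are hand-built with made-up, human-readable axes purely for illustration; a real model learns hundreds or thousands of dimensions from its training corpus, and none of them is individually labeled. The mechanics, though, are the same: proximity encodes relatedness, and relationships are directions you can add and subtract.

```python
# Toy illustration of relational geometry in an embedding space.
# The four axes (royalty, humanness, gender, fruitiness) and all the numbers
# are invented for this example; real embeddings are learned, not hand-set.
import numpy as np

embeddings = {
    "king":  np.array([0.9, 0.9,  0.7, 0.0]),
    "queen": np.array([0.9, 0.9, -0.7, 0.0]),
    "man":   np.array([0.1, 0.9,  0.7, 0.0]),
    "woman": np.array([0.1, 0.9, -0.7, 0.0]),
    "apple": np.array([0.0, 0.0,  0.0, 0.9]),
}

def cosine(a, b):
    """Cosine similarity: closer to 1.0 means the vectors point the same way."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Related concepts sit close together; unrelated ones do not.
print(cosine(embeddings["king"], embeddings["queen"]))  # ~0.54, related
print(cosine(embeddings["king"], embeddings["apple"]))  # 0.0, unrelated

# Relationships are directions: king - man + woman lands on queen.
analogy = embeddings["king"] - embeddings["man"] + embeddings["woman"]
nearest = max(embeddings, key=lambda w: cosine(analogy, embeddings[w]))
print(nearest)  # "queen"
```

The analogy arithmetic is the point: what the model stores is not "king" or "queen" as text, but the direction that carries one to the other.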

"The embedding space does not contain the text it was trained on. It contains the relational geometry of the concepts within that text. Your novel is not in there. But the shape of the conceptual space from which your novel was generated is."

This is categorically different from copying a work. It is the absorption of the cognitive infrastructure that makes works possible.

Why the Human Learning Analogy Fails

Judge Alsup found the human learning analogy persuasive. It fails on four dimensions that matter for both the legal and policy analysis.

Scale and simultaneity. A human reads sequentially, one work at a time, across a lifetime bounded by biology. A model ingests billions of documents simultaneously. Human cultural transmission has always had a natural rate limiter: the bandwidth of individual minds and the time required to transmit ideas between them. That rate limiter is what made the cultural commons sustainable. AI removes it entirely.

Lossiness and transformation. Human learning is profoundly lossy. Most of what a person reads is forgotten. What is retained is transformed — integrated with everything else the person knows, filtered through their specific history. A language model's training preserves the structural relationships between concepts with high fidelity. The output is not one person's transformation of what they absorbed. It is the statistically distilled structure of millions of human minds, made available on demand.

Reciprocity. Shakespeare absorbed Plutarch — but Shakespeare also contributed to the ecosystem that sustained the writers who came after him. He participated in the commons he drew from. A language model does not participate in the commons. It extracts from it and re-enters as a competitor, not a contributor.

The species constraint. Every prior instance of cultural absorption occurred within the constraint of human minds transmitting to human minds. This constraint was the mechanism by which the commons remained a commons: accessible to all, dominated by none. AI removes the species boundary. What took civilization centuries to accumulate was ingested in a training run.

Argument structure

[Diagram: the cognitive commons argument in three layers. 01, the mechanism: human learning (sequential, lossy, reciprocal, species-bound) versus AI training (simultaneous, high-fidelity, non-reciprocal, unbounded), and what is absorbed: the relational geometry of human ideation, not the works. 02, why copyright fails: copyright asks whether a specific work was reproduced; what happened is that the structure of all works was absorbed simultaneously, and the work that trained the models is displaced by the models it trained. 03, the correct remedy: copyright damages address individual, specific harm and are the wrong unit of analysis; an extraction levy on AI revenue is systemic and flows back to the commons, on the mineral extraction analogy.]

Three-layer argument structure: mechanism → legal failure → correct remedy

The Evidence of Displacement

This is not a theoretical concern. The displacement is already measurable across the sectors whose work trained the models.

◆   Displacement evidence across sectors

Writing. Freelance writing job postings have fallen 33% since ChatGPT's release. Earnings for experienced freelancers dropped 5%, with the highest-quality work displaced first. (Bloomberry; Hui, Reshef & Zhou, Organization Science, 2024)

Music. Under current conditions, 24% of music creators' revenues are at risk by 2028 — a projected €10 billion cumulative loss — despite creators' works providing the training fuel. (CISAC / PMP Strategy, December 2024)

Coding. Computer programmer employment is projected to decline 6–10% through 2033. Entry-level employment in AI-exposed roles fell 6% from late 2022 to July 2025. (BLS; Stanford Digital Economy Lab, 2025)

Across all three sectors, the pattern is consistent: the work that trained the models is being displaced by the models it trained. Copyright litigation addresses none of this structurally.

What Was Actually Taken

Copyright law protects discrete expressions of ideas. It asks a narrow question: did you reproduce this specific creative work without authorization?

That question cannot reach what language model training actually did. What was taken is not any specific work. What was taken is the cognitive commons — the accumulated structure of human ideation, the geometry of how human minds generate meaning, encoded across the entire written and creative output of civilization over centuries.

Every human who ever wrote clearly contributed, incrementally, to a shared understanding of what clear writing looks like. Every musician who found the right chord progression contributed to a shared understanding of how harmony moves through tension and resolution. Every programmer who solved a problem elegantly contributed to a shared understanding of what good code looks like.

This accumulated understanding is not owned by any individual. It belongs to the commons — to humanity collectively, as the output of a multigenerational project that nobody designed and everybody participated in. That commons was ingested, compressed, and is now being deployed commercially at scale by a small number of private companies.

"Copyright law can address the specific case where a specific work was reproduced. It cannot address the systemic case where the structural output of an entire civilization was absorbed and commercialized."

The Correct Remedy: An AI Extraction Levy

The intellectual tradition that fits this problem is not intellectual property law. It is extraction economics.

When a mining company extracts lithium from the earth, the resource belongs to the commons — to the public, administered by a governing body on the public's behalf. A royalty is levied on the extraction, proportional to the value extracted. The proceeds flow back to the public. This is not a punitive framework. It is a structurally correct one: when an entity extracts value from a shared resource, a proportional return to the shared community is required.

The appropriate remedy is a levy on AI revenue — structured as a mineral extraction royalty, not a copyright settlement. The levy would be proportional to the commercial value generated from deploying models trained on the commons. The proceeds would be administered by a governing body — analogous to a sovereign wealth fund — for the benefit of the creative commons from which the value was extracted.
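To make the structure concrete, here is a minimal sketch of how such a levy might be computed. The rate, the revenue figures, and the company names are hypothetical placeholders, not proposals; the point is only that the mechanism runs on revenue, the same proxy mineral royalties use.

```python
# Minimal sketch of an extraction-style levy on AI revenue.
# The rate and all revenue figures below are hypothetical placeholders.

AD_VALOREM_RATE = 0.03  # 3% of gross AI revenue, analogous to a mineral royalty rate

# Hypothetical annual revenue attributable to models trained on the commons (USD).
ai_revenue_by_company = {
    "company_a": 4_000_000_000,
    "company_b": 2_500_000_000,
    "company_c": 800_000_000,
}

def extraction_levy(revenue: float, rate: float = AD_VALOREM_RATE) -> float:
    """Levy proportional to the commercial value generated, like an ad valorem royalty."""
    return revenue * rate

total = sum(extraction_levy(r) for r in ai_revenue_by_company.values())
print(f"Total levy flowing to the commons fund: ${total:,.0f}")
# With these placeholder figures: 3% of $7.3B is $219,000,000 per year.
```

The design choice mirrors mineral royalties: the base is observable revenue rather than an unobservable measure of what was ingested, which is exactly why the quantification objection below does not defeat the proposal.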

Two practical objections deserve direct answers. On quantification: we cannot precisely measure what was extracted. Neither can we precisely measure what was extracted from a mineral deposit before royalty rates are set — governments use estimates, production data, and revenue proxies. The unquantifiability of the harm is not an argument against the levy. It is an argument for it. When harm is diffuse and systemic, systemic remedies are required.

On governance: who administers the fund? This is a legitimate implementation question — not a challenge to the underlying principle. Extraction royalties in mineral economies face the same governance challenge, and workable frameworks exist. The principle that extracted value should flow back to the commons does not depend on solving the governance problem in advance.

Why the Law Is the Wrong Tool

The legal challenges against AI companies are sophisticated. The attorneys are capable. The plaintiffs include some of the most powerful media institutions in the world. The problem is not the people fighting. The problem is that the law being wielded was not designed for this problem.

Copyright was not designed to protect expression for its own sake. The U.S. Constitution is explicit: copyright exists "to promote the Progress of Science and useful Arts." The protection of individual works is the mechanism. Incentivizing the creation of works for the public benefit is the end.

The correct question, therefore, is not whether a specific work was reproduced. It is whether the incentive structure that produces works has been structurally altered. The data above suggests it has — in precisely the sectors whose work provided the training foundation for commercial AI systems.

This is not a failure of the lawyers or the plaintiffs. It is a failure of instrument selection. The people fighting this war are right about the injustice. They need a different theory of the case.

The war is worth fighting. The weapons need to change.

About the author
Bhavana Chamoli

AI strategist and enterprise architect with experience building data and AI systems at institutional scale, including at MIO Partners (McKinsey's hedge fund office) and in support of BNY Mellon's data modernization program. MBA, Columbia Business School. MS, Carnegie Mellon University. Author of The Conscious Remainder (forthcoming).