When Machines Learn from Humans: Copyright Law and the Challenge of Generative AI in India

April 25, 2026
4:09 am

Every time a large language model (LLM) like ChatGPT answers a question, it draws on patterns absorbed from billions of pages of text – news articles, novels, academic papers, and code, scraped from the internet during training. The authors of that text were never asked. They were never paid. In most cases, they do not even know their work was used. This article looks at how copyright law, and Indian copyright law in particular, is responding to generative AI.

This is not a hypothetical problem or a minor glitch in the system. It is the central legal dispute of our technological moment, playing out simultaneously in American courtrooms, European parliaments, and, increasingly, Indian courts. For India, a country with a fast-growing AI sector, a vast creative industry, and a copyright statute that turns sixty-eight this year, the challenge is both urgent and uniquely complex. We are essentially trying to govern a 21st-century “god-like” technology using a 20th-century rulebook written in the era of paper and ink.

The Global Litigation Landscape

The opening shots in the AI copyright war were fired in the United States. In late 2023, The New York Times sued OpenAI and Microsoft, alleging that millions of its articles were used without permission or payment to train ChatGPT. The Times argued that the AI wasn’t just “learning” from its journalism; it was competing with it by offering summaries that made visiting the original news site unnecessary. By early 2025, federal judges began allowing these core copyright infringement claims to proceed, rejecting OpenAI’s early motions to dismiss and setting the stage for a trial that could redefine the “fair use” doctrine for the digital age.

Across the Atlantic, the UK’s High Court delivered a significant, if frustratingly narrow, ruling in late 2025 regarding Getty Images and Stability AI. The court didn’t actually decide if AI training was legal; instead, the case turned on geography. Getty could not show that the model’s actual “training” took place on British soil, so the core training question went unanswered. This highlights one of the biggest headaches for regulators: AI companies are global, but copyright laws are strictly local.

Meanwhile, other US judges have offered a glimmer of hope for AI companies. In cases involving companies like Anthropic and Meta, judges suggested that AI training might be “highly transformative.” The reasoning is that training produces a mathematical model that is fundamentally different from a library of books, so the use may qualify as fair use even though works were copied along the way. However, with over fifty lawsuits pending globally, brought by plaintiffs ranging from authors like Sarah Silverman to music labels, no final, binding precedent has been set. The world is essentially waiting for a “Eureka” moment from a judge that hasn’t arrived yet.

India’s Moment: ANI v. OpenAI

In November 2024, this global uncertainty landed firmly in India. ANI Media Private Limited filed a suit before the Delhi High Court against OpenAI. They alleged that ChatGPT had unlawfully used and stored their copyrighted news content to train its models. ANI’s argument was straightforward: you used our hard work to build your profitable product, so you should have paid us first.

OpenAI’s defense was two-fold. First, they used the “where is the server?” argument, claiming the Delhi court didn’t have jurisdiction because their hardware isn’t in India. Second, they claimed there was no proof that any actual “copying” was happening inside Indian territory. The court, however, was intrigued enough to keep the case alive.

Most interestingly, the Delhi High Court appointed two expert amici curiae (“friends of the court”) to help it understand the technology. The two reached opposite conclusions. One argued that using text to build a statistical model is a modern form of “learning” and should be protected as a permissible exception. The other argued that taking millions of articles for a commercial, for-profit business is clearly outside the boundaries of “fair dealing.” With judgment reserved, India stands at the edge of a decision that will set the tone for the entire Global South.

The Law: Section 52 vs. Fair Use

To understand why India’s situation is so tricky, we have to look at how different countries handle “exceptions” to copyright.

In the United States, they have “fair use.” It’s a flexible, four-factor test that allows judges to use their common sense. If a use is “transformative”, meaning it adds something new or has a different purpose, it’s often allowed. This is how Google was allowed to scan millions of books to create a search index.

The European Union took a more “top-down” approach. They created a specific exception for “Text and Data Mining” (TDM). Basically, if you access data legally, you can train your AI on it, unless the creator explicitly says “no” through a digital “opt-out” signal.
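One common way to send a machine-readable “opt-out” signal is a robots.txt rule aimed at AI crawlers. The sketch below is illustrative only: the crawler name `ExampleAIBot`, the site address, and the rules are all hypothetical, and whether a robots.txt entry is legally sufficient as a reservation of rights under the Directive is a question this example does not answer.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt published by a news site (an assumption for
# illustration). It blocks one AI crawler and allows everyone else.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def may_collect(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the robots.txt rules permit this crawler to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    article = "https://news.example/politics/story-123"
    print("AI crawler allowed:", may_collect("ExampleAIBot", article))
    print("Other crawler allowed:", may_collect("SearchIndexer", article))
```

A crawler that respects the signal would run a check like this before collecting a page for training; nothing in the file technically stops a crawler that chooses to ignore it.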

India, however, relies on Section 52 of the Copyright Act, 1957. This is what’s known as a “closed list.” Unlike the US, where a judge can decide something is “fair” even if it’s not in the books, Indian judges can only allow exceptions that are specifically listed in Section 52. These include fair dealing for “private or personal use, including research,” for “criticism or review,” and for “reporting of current events.”

The problem? “Training a multi-billion dollar AI model for profit” is not on that list. This creates a massive “doctrinal gap.” As one legal scholar noted, our law doesn’t offer any certainty for large-scale automated data analysis. Indian courts have historically been quite strict about this, refusing to stretch the law to cover new technologies unless the government updates the statute.

Technical Realities and Legal Fictions

A lot of this debate hinges on a technical misunderstanding: what does an AI actually “do” with a news article?

When an AI trains, it copies the data into its memory. But once the training is done, the model doesn’t keep a “copy” of the article like a PDF in a folder. Instead, it turns the article into numbers, mathematical “weights” that tell the AI how likely certain words are to follow others. In theory, the AI “learns” like a human student.
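To make the idea of “weights” concrete, here is a deliberately tiny sketch. It is not how ChatGPT or any real LLM works internally (those use neural networks with billions of parameters); it is a toy word-pair counter, with a made-up sentence as its training text, that shows how text can be reduced to numbers describing which word tends to follow which.

```python
from collections import Counter, defaultdict


def train_bigram(text: str) -> dict[str, dict[str, float]]:
    """Turn text into next-word probabilities: P(next word | current word)."""
    words = text.lower().split()
    counts: dict[str, Counter] = defaultdict(Counter)
    # Count how often each word is immediately followed by each other word.
    for current, nxt in zip(words, words[1:]):
        counts[current][nxt] += 1
    # Convert raw counts into probabilities for each current word.
    return {
        current: {w: c / sum(followers.values()) for w, c in followers.items()}
        for current, followers in counts.items()
    }


if __name__ == "__main__":
    # Made-up training text, used purely for illustration.
    corpus = "the court heard the case and the court reserved judgment"
    model = train_bigram(corpus)
    print(model["the"])    # probabilities of the words that follow "the"
    print(model["court"])  # probabilities of the words that follow "court"
```

What the function returns is a table of probabilities, not the sentence itself. That is the intuition behind the claim that a trained model is not a “copy” in the everyday sense.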

However, sometimes the AI “overfits” or “memorizes” parts of its training data. If you prompt it the right way, it might spit out several paragraphs of a New York Times article or an ANI news report word-for-word. This is the “smoking gun” for publishers. They argue that if an AI can reproduce their work, it is a “market substitute.” Why would a user pay for a news subscription if ChatGPT can give them the whole story for free?
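One simple way to check for this kind of word-for-word reproduction is to measure the longest run of consecutive words that an AI output shares with the original article. The sketch below assumes that approach; the function name and the sample sentences are invented for illustration, and how long a run must be before it matters legally is a question for the courts, not the code.

```python
def longest_shared_run(source: str, output: str) -> int:
    """Length, in words, of the longest word sequence that appears in both texts."""
    a, b = source.lower().split(), output.lower().split()
    best = 0
    # prev[j] holds the length of the shared run ending at a[i-2] and b[j-1].
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                curr[j] = prev[j - 1] + 1  # extend the run by one word
                best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    # Invented example texts, not taken from any real article or model output.
    article = "the minister announced a new policy on data protection today"
    ai_output = "reports say the minister announced a new policy yesterday"
    print(longest_shared_run(article, ai_output))
```

A short shared run is unremarkable, since common phrases recur everywhere. Several paragraphs of identical text are much harder to explain as coincidence.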

The Policy Response: India’s Emerging Framework

Realizing that the 1957 Act is out of its depth, the Indian government has started to move. In May 2025, the Ministry of Commerce set up an expert panel to see if the law needs a “Version 2.0.” They are considering adding a whole new chapter to the Copyright Act specifically for AI.

Meanwhile, the Ministry of Electronics and Information Technology (MeitY) has released guidelines emphasizing “Innovation over Restraint.” But they’ve been very clear: just because a technology is “good for the public” doesn’t mean it gets a free pass to ignore copyright.

The most talked-about proposal is a “hybrid licensing model.” Under this plan, AI companies could train on copyrighted works without asking for permission every single time (which would be impossible given the billions of pages involved). Instead, they would pay a percentage of their revenue into a “Copyright Royalties Collective.” This money would then be distributed back to the journalists, authors, and artists whose work was used. It’s a bit like how radio stations pay a blanket fee to play music.
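The arithmetic behind such a blanket scheme can be sketched in a few lines. Everything specific here is an assumption made for illustration: the proposal as described above does not fix a levy rate or a distribution rule, so the 2% rate, the pro-rata split by usage, and the rights-holder names are all hypothetical.

```python
def distribute_royalties(
    revenue: float, levy_rate: float, usage_counts: dict[str, int]
) -> dict[str, float]:
    """Split a revenue-based levy among rights holders in proportion to usage."""
    pool = revenue * levy_rate  # total amount paid into the collective
    total_usage = sum(usage_counts.values())
    if total_usage == 0:
        # Nothing was used, so nobody receives a share.
        return {holder: 0.0 for holder in usage_counts}
    return {
        holder: pool * count / total_usage
        for holder, count in usage_counts.items()
    }


if __name__ == "__main__":
    # Hypothetical figures: revenue of 1,000,000 and a 2% levy.
    shares = distribute_royalties(
        revenue=1_000_000,
        levy_rate=0.02,
        usage_counts={"news_agency": 600, "novelist": 300, "illustrator": 100},
    )
    for holder, amount in shares.items():
        print(f"{holder}: {amount:.2f}")
```

The hard policy questions sit in the inputs rather than the formula: who sets the rate, and how “usage” is measured when a model has ingested billions of pages.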

Creative Industries vs. the Tech Sector

This has created a deep divide in India. On one side, we have the Digital News Publishers Association (DNPA) and music rights societies. They are terrified that AI will cannibalize their industries, using their own content to put them out of business. They want strict rules and high payments.

On the other side are Indian AI startups and tech groups like NASSCOM. They are building India’s own LLMs, like Ola’s “Krutrim”, which are trained on Indian languages like Hindi, Tamil, and Telugu. They argue that if India makes copyright laws too strict, only giant American companies like Google and OpenAI will be able to afford the lawyers and the licenses. Local Indian startups, which have less money, would be crushed. The irony is that in an attempt to protect Indian creators, we might end up handing the keys to our digital future to Silicon Valley.

The Path Forward

So, where do we go from here? Most scholars agree that “doing nothing” is not an option. We are likely looking at three potential pillars for a future law:

A Data Mining Exception: Let companies train on data, but give creators a clear “opt-out” button.
Mandatory Transparency: Force AI companies to list what data they used, so creators at least know whether their work is being used.
The Royalty Model: Create a centralized way for creators to get paid without stopping the technology from growing.

What is clear is that the 1957 Copyright Act, written when India was a brand-new republic and computers were the size of rooms, cannot handle the weight of the AI revolution. Our courts are doing their best to interpret the old rules, but “judicial creativity” can only go so far. Eventually, the legislature will have to step in.

Conclusion

The question of whether AI can legally “read” everything we write is not just a boring legal debate. It is a question of who gets to profit from human creativity. The answer will decide whether the AI era is one of collaboration between humans and machines, or one where machines simply replace the very people they learned from.

India is now at the center of this global conversation. The upcoming ruling in the ANI v. OpenAI case and the government’s new legislative proposals will matter just as much as what happens in Washington or Brussels. In the age of AI, every word ever written is potentially training data. The law must now decide, once and for all, whose word that is.

Author: Mr. Prafull Tiwari, student at ILS Law College, Pune

Link to similar articles: https://jpassociates.co.in/ani-v-openai-copyright-law/

References

New York Times Co. v. OpenAI Inc., Southern District of New York (2023).
Getty Images v. Stability AI, D. Del. (2023).
Getty Images (US) Inc. & Ors v. Stability AI Limited [2025] EWHC 2863 (Ch).
Bartz v. Anthropic, Order, June 23, 2025.
Kadrey v. Meta Platforms, Order, June 25, 2025.
ANI Media Pvt. Ltd. v. OpenAI Inc., 2024 SCC OnLine Del 8120.
Authors Guild v. Google Inc., 804 F.3d 202 (2d Cir. 2015).
Directive (EU) 2019/790 on Copyright in the Digital Single Market, Articles 3 and 4.
Regulation (EU) 2024/1689 (EU AI Act).
Section 52(1)(a), Copyright Act, 1957 (India).
Civic Chandran v. Ammini Amma, Kerala High Court.
Shemaroo Entertainment Ltd. v. News Nation, Bombay High Court.
MeitY, India AI Governance Guidelines 2025.
NITI Aayog, National Strategy for Artificial Intelligence (#AIforAll), June 2018.
NITI Aayog, Principles for Responsible AI, 2021.
Digital News Publishers Association (DNPA) letter to MeitY, January 26, 2024.
