
Zuckerberg 'Personally Authorized and Encouraged' Meta's Copyright Infringement (variety.com)

Five major publishers and author Scott Turow have sued Meta and Mark Zuckerberg, alleging that Zuckerberg "personally authorized and actively encouraged" massive copyright infringement by using pirated books, journal articles, and web-scraped material to train Meta's Llama AI systems. Meta denies wrongdoing and says it will fight the case, arguing that courts have recognized AI training on copyrighted material as potentially fair use. Variety reports: "In their effort to win the AI 'arms race' and build a functional generative AI model, Defendants Meta and Zuckerberg followed their well-known motto: 'move fast and break things,'" the plaintiffs say in their lawsuit. "They first illegally torrented millions of copyrighted books and journal articles from notorious pirate sites and downloaded unauthorized web scrapes of virtually the entire internet. They then copied those stolen fruits many times over to train Meta's multibillion-dollar generative AI system called Llama. In doing so, Defendants engaged in one of the most massive infringements of copyrighted materials in history."

The suit was filed Tuesday (May 5) in the U.S. District Court for the Southern District of New York by five publishers (Hachette, Macmillan, McGraw Hill, Elsevier and Cengage) and Turow individually. The proposed class-action suit seeks unspecified monetary damages for the alleged copyright infringement. A copy of the lawsuit is available at this link (PDF). [...] the latest lawsuit alleges that Meta and Zuckerberg deliberately circumvented copyright-protection mechanisms -- and had considered paying to license the works before abandoning that strategy at "Zuckerberg's personal instruction." The suit essentially argues that the conduct described falls outside protections afforded by fair-use provisions of the U.S. copyright code.


  • by HalAtWork ( 926717 ) on Wednesday May 06, 2026 @01:19PM (#66130610)

    It goes to show you how full of themselves someone is when they straight up flout the law and personally authorize it.

    Platform engagement tactics also show his motives are not altruistic.

    It is clear he is a greedy sociopath.

    • > when they straight up flout the law and personally authorize it

      That's the law for you, not the corporate oligarchs.

      Worst case outcome for them is they pay a small fine, nobody is personally held liable.

      • Worst case outcome for them is they pay a small fine, nobody is personally held liable.

        Now's the time to put those trillion-dollar fines in the copyright infringement laws to good use.
        If Zuck personally authorized it...

    • Mostly agree with you and could cite several recent books to support the areas of agreement, but I still want to go for the joke. The root of the problem is perverse incentives, but I would argue the incentives rule and Zuck is just another fool trapped in a system that he thinks he built. (And I still think the most perverse incentives are over at YouTube and Amazon. Or maybe in the AI bubble somewhere? (We are so googled and zucked...))

      Okay, I can't resist one citation for Chaos Monkeys because it is so

    • "flout the law"

      They disregarded the rights of authors etc. to be compensated for their work. The Law just identified the transgressions.

      You can be sure if I or you shared Meta's source code and built a site with it we would be pursued to the ends of the Earth.

      Hypocrisy, and yes just one practitioner. Get caught, pay up.

      I approach this the same way I approached the Martha Stewart insider trading scandal. It's not that she made a few bucks, or avoided losing some, but, but, some other schlub suffered the loss

      • by haruchai ( 17472 )

        "Let Zuck pay a meaningful price. Or just go on..."
        Meta will pay a meaningful bribe to Trump and keep on zucking

  • by k3v0 ( 592611 ) on Wednesday May 06, 2026 @01:19PM (#66130612) Journal
    but unauthorized distribution via torrents is def not fair use
    • The training is not legal. They're using copyrighted materials for commercial purposes and may potentially be able to reproduce it for everyone who uses it even if it's in bits and pieces.

      • by Junta ( 36770 )

        I agree with you, but the big tech companies *seem* to be winning their arguments, even when the plaintiff shows output that includes even the watermark of the plaintiff's stuff on something that looks like the plaintiff's assets.

        So it's at least more pragmatic to show that the acquisition and likely redistribution of the works while torrenting were a problem without even bringing up the whole AI ingest argument.

      • by taustin ( 171655 )

        The training is not legal.

        Until the Supreme Court rules on it (and they will, eventually), the question remains open.

        After all, Google won Authors Guild v. Google at trial. Many of the same arguments will apply here.

        The issue to pursue here isn't what they did with the pirated material, it's how they pirated it.

        (Commercial use can affect the penalties for infringement, but it's a factor in whether or not it is infringement.)

      • by Whibla ( 210729 )

        The training is not legal. They're using copyright materials for commercial purposes and may potentially be able to reproduce it for everyone who uses it even if it's in bits and pieces.

        Oh come on now - the outcome of the training is literally transformative (c.f. GPT, i.e. generative pretrained transformer), hence not a copyright violation. </s>

        (Ok, so this ^ was sarcasm. If any lawyer uses this as a defence in court I apologise for potentially giving them the idea. On the plus side, perhaps I could then sue them for copyright violation??)

    • by bill_mcgonigle ( 4333 ) * on Wednesday May 06, 2026 @01:44PM (#66130648) Homepage Journal

      One might imagine that buying a million books would get a buyer a very good discount from the publisher.

      They could have paid $5 or whatever for each book they trained on. $15M or so for 3 million books - they could totally afford that but "why pay when you can steal?"

      Then at least they would be defending fair use rather than defending a 'theft' lawsuit.

      If we had Constitutional Copyright they at least would have millions of 14-year-old books to train on. That would be quite sufficient to train and refine models.

      • One might imagine that buying a million books would get a buyer a very good discount from the publisher.

        They could have paid $5 or whatever for each book they trained on. $15M or so for 3 million books - they could totally afford that but "why pay when you can steal?"

        True, though I'll bet the negotiations with dozens of publishers would have dragged on for years.

        If we had Constitutional Copyright they at least would have millions of 14-year-old books to train on. That would be quite sufficient to train and refine models.

        Although I know what you mean, there is no such thing as "Constitutional Copyright". The Constitution authorizes Congress to set terms for protecting the work of authors and inventors, but it doesn't specify the 14-year term. That was just the first term that Congress chose to enact. It happened not long after the Constitution was ratified, but it wasn't in any way part of the Constitution.

    • Even if training is legal (and I don't concede it is), you still have to have a legally authorized copy to train from to begin with. Unless Meta paid for every book they trained their AI from, they're already guilty of copyright infringement.
  • those rules are for you, not me!

  • Precedent (Score:5, Insightful)

    by Local ID10T ( 790134 ) <ID10T.L.USER@gmail.com> on Wednesday May 06, 2026 @01:36PM (#66130628) Homepage

    Precedent holds that training with copyrighted material is transformative in nature, and thus is non-infringing.

    Precedent further holds that pirating the material to train with is an incurable violation of copyright: an AI trained using a dataset that includes pirated material is tainted to a degree that can only be cured by deletion of the AI and the training set data. Purchasing valid copies of the data after the fact is not sufficient, although a new dataset can be constructed from the newly purchased data and a new AI trained with this new dataset. This is in addition to the financial liability of the copyright violations.

    Zuck is fucked.

    • The cure was to make some lawyers very rich while the copyright owners got peanuts, which also prevented those lawyers from taking it to a higher court. Even so, it ain't over till the supreme court geriatrics divine the penumbra of the law.

    • Purchasing valid copies of the data after the fact are not sufficient; although a new dataset can be constructed from the newly purchased data and a new AI trained with this new dataset.

      Any guesstimates on how much it would cost META to actually purchase all or most of those same works to re-train their bots legally?

      (They still may have to pay a fine or refund for their original transgression.)

      Zuck is fucked.

      The rich often skirt the law. They have a thousand flying monkey lawyers going through thousands of leg

    • Precedent holds that training with copyrighted material is transformative in nature, and thus is non-infringing.

      Precedent further holds that pirating the material to train with is an incurable violation of copyright

      These two statements are mutually contradictory. Is it infringing or not?

      • Training is not the same as pirating.

        Training using material legally acquired is legal.

        Training using illegally acquired material is illegal.

        Not a contradiction.

        • Ah, yes, I missed the different verb. Except "pirating" isn't really a legal concept. What is the specific precedent you're referring to?
          • I was thinking Bartz vs. Anthropic, but there are other rulings.

            I think someone summarized them further down, but I recommend reading them for yourself.

            • I was thinking Bartz vs. Anthropic, but there are other rulings.

              I think someone summarized them further down, but I recommend reading them for yourself.

              Bartz v Anthropic is not a binding precedent. It could have been binding in the 9th circuit, but they settled before the appellate court could consider it. And, of course, it's always possible that the appellate court could have reversed Alsup.

              I think this question is still very much undecided. It's trending against Meta's interests, but AFAIK isn't there yet.

              • This is true. The cases so far have settled. None have gone through the appeals process to actually generate binding precedent from a high court; they have all been rulings from a trial court judge.

                BUT in each case, the judge has ruled as a matter of law that training is inherently transformative and thus not infringing. This is precedent setting, but not binding. Another judge could decide differently, but in doing so would create a matter for appeal.

                Where the other cases have diverged is on the other iss

    • Precedent holds that training with copyrighted material is transformative in nature, and thus is non-infringing.

      This is a misunderstanding of the law.

      If you copy something substantial, then it's a copy. In court, you can hope to argue a fair use defense and hope the jury agrees with you. Transformative is only one of the factors [wikipedia.org]. Commercialization is another of the factors, and some judges have ruled that it's the most important factor.

      Copyright law is confusing and especially when it comes to fair use, the defense is designed to be a judgement call.

    • Zuck is fucked.

      Only if he does not pay enough money to the right people. He is stupid, so he may make that mistake; however, it is perfectly clear that if you give the right amount of money to the right amount of people, you can do whatever you want without concern.

  • by FudRucker ( 866063 ) on Wednesday May 06, 2026 @01:49PM (#66130670)
    ChatGPT (OpenAI), Google, Microsoft, and other unnamed AI systems too: if they are taking other people's content to feed to their AI, then they are guilty of copyright infringement.
    • by Tablizer ( 95088 )

      I believe scraping the web would only be a problem if they skirted paywalls (hacked), or violated the terms of paid subscriptions, some of which forbid bot training use.

      Scraping publicly available sites for AI training is generally not considered copyright violation, although I'm not an expert, so don't quote me.

      • by allo ( 1728082 )

        Legally, you need permission to use the content. This permission can either come from a license (possibly implicit, e.g. when someone says "I don't care, just take it") or from the usage falling under a clause of law that does not require a license. Here the companies say AI training is transformative use, and courts currently do not rule otherwise. So they use unlicensed content, but they use it in a scenario that does not require a license.

        • by Tablizer ( 95088 )

          This seems like a contradiction, I'm not following. Perhaps you mean "use" differently than I'm interpreting.

          • by allo ( 1728082 )

            Use may be a badly defined word for such things.

            Let's try it another way.
            You start out with the right to do whatever you want. Then the law takes some of these rights away to protect the (intellectual) property of others. It also defines under what circumstances you still have these rights.

            One option everyone knows is that you get permission from the creator, usually as some license which may come with some requirements and obligations. But another option is that your use qualifies as "fair use" and anot

  • Torrenting is not fair use. If an individual does this, they will face jail time and fees, and so should a business. What Facebook did is not fair.

    • Scanning a book and printing out a hundred copies for yourself is not fair use either.

      Zuck, Dario and Altman should share a cell.

      • No it's not, but if it's ok for a human to read and retain information (like someone with a photographic memory) then wouldn't it be ok for AI to do the same? I'm not arguing for this but I suppose this is an argument that could be made. You should be able to determine if your work is searchable and readable by AI. AI should at least get citations right.

        • What if the human printed out 100 complete copies of the book for separate sets of annotation while learning it?

          Training makes literal, non-DMCA-exempted copies during the process.

    • by allo ( 1728082 )

      Fair use is not the way you get it, but what you do with it.
      The point here is that *uploading* to people who probably won't just train an AI with it is not allowed.

  • I heard Zuckerberg failed to pay back a loanshark operating in the Metaverse and they took his legs.
    • Seems like less than a year passed between Zuck changing the company name to reflect the fact that they were now a Metaverse company and Zuck proclaiming "We're an AI company now!" When's the next name change? Last I heard, Meta Reality Labs was losing about a billion dollars every quarter (although they are doing some fun experiments).
  • ClippyAI: 1. Direct Precedents: AI-Specific Rulings

    As of 2026, several cases have directly addressed generative AI training, with mixed results that Meta uses to bolster its "fair use" defense.

    Kadrey v. Meta (2025): In this case, Meta successfully argued for partial summary judgment on the grounds of fair use regarding the training process of its Llama models. Judge Vince Chhabria ruled in favor of Meta, though he expressed significant doubts about whether all forms of AI training would be considered
  • by sabbede ( 2678435 ) on Wednesday May 06, 2026 @02:25PM (#66130756)
    Yes, they may be in real hot water with the torrents, but scraping publicly available websites cannot be wrong. It's public information in public view. That should probably be tossed out, and the proceedings should focus on the pirated material.
    • by Okind ( 556066 )

      [...] scraping publicly available websites cannot be wrong. It's public information in public view.

      This depends entirely on the copyright laws where you live / read the stuff.

      For example, newspapers generally publish some articles for free reading by the general public. They are in public view, and thus also public information (as in, available for the public for a specific, narrow purpose).

      But the newspaper still holds the copyright. And the (implied) licence to read their news articles most certainly does not allow you to copy them, for example. So scraping is almost certainly a copyright violation (i

      • Your computer creates a copy of the article to show on your screen when you load the article.

      • Well, in that case it's distribution that gets one in trouble. It's not illegal to photocopy a newspaper, but you can't then distribute those copies. I'm not even sure if it's technically legal for a professor to copy an article and pass it out in class, but I doubt any have gotten in trouble for it.

        Someone posted a list of recent court decisions that say scraping for AI training is fair use due to it being transformative. If that's the case, and the jurisdictions line up, that portion of the suit will

  • Sooooo, when is this guy going to go to prison for this?

  • I despise the whole concept behind Facebook.

    You're basically submitting yourself to a continuous session under the data harvesting fluoroscope.

    Sorry, I never signed up and never will sign up.

    Socializing live by going out to real physical events is much better for you.

  • So scraping Facebook of all its data, etc., is fair game "to train an AI"?

    Why do I suspect that Zuck would disagree....
    • "Garbage In, Garbage Out." Why would you want to train any AI with the garbage on Facebook? I think the whole garbage in, garbage out principle is what will be LLMs' undoing as AI is increasingly trained on AI output, creating a feedback loop of mediocrity and hallucinations.
  • I have no love for Zuck, and believe under current precedents that the conduct was illegal.

    But on the other hand, US copyright is broken and has strayed from the original purpose and social bargain. Thus I have some sympathy for the argument that training AI is the definition of promoting the useful arts and sciences.

    I'm more upset about the way publishers license ebooks to libraries and restrict academic papers.

  • It is not like it has not been obvious for quite a while what a lowlife he is.
  • I had to move pretty slowly, because everything was broken. Most frustrating? I couldn't connect to WiFi at my desk, so every morning I had to pick up my laptop, walk 100 feet away to log in, then go back to my desk. So apparently the Access Point sitting literally right over my desk was messed up, but none of the others.
    • by whitroth ( 9367 )

      Actually, I believe you. Given their "devops" involves rolling alpha-tested fixes/changes into production, then finding the problems and fixing them IN PRODUCTION...

      Meanwhile, I can't post to my own author page, because as of this week, they demand I have a smartphone and use their app to log in. (I own a flip phone.)

  • I say it is not covered by copyright law, hence only alleged.
