AI Entertainment Technology

More Than 10,500 Artists Unite in Fight Against AI Companies' 'Unjust' Use of Creative Works (aitrainingstatement.org)

More than 10,500 artists and creators -- including ABBA's Bjorn Ulvaeus, actress Julianne Moore, actors Kevin Bacon and F. Murray Abraham, as well as former Saturday Night Live star Kate McKinnon, author James Patterson and Radiohead's Thom Yorke -- signed a statement condemning AI companies' unauthorized use of creative works for training their models. The initiative, led by former AI executive Ed Newton-Rex, demands an end to unlicensed training data collection amid mounting legal challenges against tech firms. "The unlicensed use of creative works for training generative AI is a major, unjust threat to the livelihoods of the people behind those works, and must not be permitted," reads the statement.

The protest comes as major artists and publishers battle AI developers in court. Authors John Grisham and George R.R. Martin are suing OpenAI, while record labels Universal, Sony and Warner have filed lawsuits against AI music creators Suno and Udio. The signatories reject proposed "opt-out" schemes for content scraping, calling instead for explicit creator consent.

  • Congress isn't going to do shit. The EU is completely paralyzed waiting for the US to tell them which way to jump ... until this gets to the US Supreme Court the status quo will stand.

    It could get hilarious, we could see the first trillion+ dollar judgements if copying into a training set requires a license. The statutory damages on registered works alone will get there for the big boys.

    • My best guess is the courts will (eventually) rule that content creators must be paid for their work. There are already thousands of rulings supporting copyright. When AI must pay for content they have 2 choices, train LLMs with Public Domain media for free or pay for a limited set of up-to-date content. I expect Subscribers will pay only for the content they need and no more. That means the Subscriber chooses from a catalog of LLMs loosely tailored to their needs. If you have a question about Shakespe
      • by mysidia ( 191772 ) on Tuesday October 22, 2024 @03:53PM (#64885375)

        My best guess is the courts will (eventually) rule that content creators must be paid for their work.

        There's hardly any basis for it. The most likely outcome would be that downloading data and training a model with that data does not infringe upon copyright at all. You are neither reproducing and distributing nor publicly performing a work by training a model.

        On the other hand, when the trained model is used to generate images, the resulting image might be infringing. This infringement could occur when you prompt an AI model and then download and distribute the result: if the result is probatively similar to a copyrighted work, then the AI created a reproduction, and the user's act of downloading and sharing a copy of that reproduction would be copyright infringement. Also, the online service provider who provided the model that the end user prompted could be vicariously liable, because their subscription-based "AI generator tool" will most likely have created the reproduction on its own. The online AI service provider would therefore be liable for assisting the end user of their service in committing any acts of copyright infringement.

        • Re: (Score:3, Insightful)

          by gabebear ( 251933 )
          I call BS on LLMs not copying. If a model can trivially reproduce enough of a copyrighted work, then the model is a copy and is infringing if the model is used or published. Copying copyrighted work is allowed as long as it's not "a substantial part thereof", which is what I see most people trying to argue (that it's only copying little bits). How much copying happens is up to the trainer, which means the trainers/model-creators are also violating copyright when they copy too much and publish their model
          • While it's true that the information from the original work is in some sense copied into the AI model, that information is also thoroughly mixed with information from the rest of the vast training data set. The information from one work is fragmented into contributions to many numerical weights in different parts of the artificial neural network. A good analogy is that when you shine a light on an artwork, say one that is displayed outside in public, the light reflects off all the different atoms of the a
            • by jvkjvk ( 102057 )

              >While it's true that the information from the original work is in some sense copied into the AI model, that information is also thoroughly mixed with information from the rest of the vast training data set.

              So you're saying it's okay because they do it to everybody? What? So their vast training set was also mostly copyrighted, so it's fine as long as we mix it all up? Even though it can be "unmixed"? Nope. Sorry. Try again.

              • by dgatwood ( 11270 )

                >While it's true that the information from the original work is in some sense copied into the AI model, that information is also thoroughly mixed with information from the rest of the vast training data set.

                So you're saying it's okay because they do it to everybody? What? So their vast training set was also mostly copyrighted, so it's fine as long as we mix it all up? Even though it can be "unmixed"? Nope. Sorry. Try again.

                Being inspired by a copyrighted work isn't typically a copyright violation if a human does it, so logically it shouldn't be perceived as one if a computer does it, either, and the weights being mathematical tensor weights instead of changes to protein structures at the connections between neurons doesn't really change anything, IMO.

                Therefore, training the model is no more making a copy than you reading a book is making a copy. If you inadvertently write a novel and start it with "It was the best of times,

                • by jvkjvk ( 102057 )

                  If you can "untransform" the transformed work back into the original, your argument falls apart.

                  It happens you can do so.

                  • by dgatwood ( 11270 )

                    If you can "untransform" the transformed work back into the original, your argument falls apart.

                    It happens you can do so.

                    If you can "untransform" it to a meaningful amount of the original work (i.e., more than would be laughed off as obviously fair use), then maybe. But that is also a strong indication that the model doesn't have enough training data in a specific area, and that the code that runs the model isn't doing enough to reject queries that go down a path that hits nodes with limited training or whatever. In other words, it is a fundamental bug in the model or surrounding code if that can happen in practice.

        • There is no distribution or performance requirement for copyright infringement.

      • When AI must pay for content they have 2 choices, train LLMs with Public Domain media for free or pay for a limited set of up-to-date content
        Isn't that a false dichotomy that misses (rather badly) that copyright is automatic? If they use, say, appropriately licensed Creative Commons works, those are still copyrighted works if created where copyright is automatic, for instance.

        Maybe ... we REALLY need to stop using "copyrighted" as a synonym for "needs to pay" or "off limits" - when the issue is more lice
        • Seriously, the RIAA and the like have done enough to fuck up people's understanding of copyright as it is.

          To borrow from Mr. Adams: "A bunch of mindless jerks who'll be the first against the wall when the revolution comes."

      • "train LLMs with Public Domain media for free"

        So we could get the level of literacy of Jane Austen and the Brontë sisters? James Fenimore Cooper? Daniel Defoe? Jack London? Edgar Allan Poe? H.G. Wells?

        I can think of ever so many worse outcomes.

        Of course when Chandler's and Christie's works hit public domain the LLM may get just a little paranoid.

  • The difference (Score:5, Insightful)

    by bhcompy ( 1877290 ) on Tuesday October 22, 2024 @03:34PM (#64885301)
    What is the fundamental difference between asking JJ Abrams to rip off Spielberg's style and asking a computer to do it? Both were trained on the same source material. You don't need any special permission or license to use that material (legally obtained or not) as inspiration as long as your output isn't substantially the original work (which isn't the claim here). Outside of some "divine inspiration", the output of all artists is influenced/based in part on inputs they've previously digested regardless of whether those inputs were copyrighted or not. How is that any different?
    • Re: (Score:1, Insightful)

      by ebunga ( 95613 )

      It's one thing to be inspired by something, it's another to copy and paste their work into yours. "Generative AI" as it is implemented right now does not create anything. It merely copies.

      • "Generative AI" as it is implemented right now does not create anything. It merely copies.

        Point one, IMO, seems hardly substantiated if one wants to get really pedantic in that a specific combination of elements not existing before exists now - seems in that basic context it'd be created whether man or machine does it.

        Second one IMO DEFINITELY needs a citation. My understanding of LLMs and AI is rudimentary at best, but it seems like this is not how they are supposed to work, not even "as they are currently implemented." IMO if that is correct (that this notion is incorrect)... can we not?

      • Re:The difference (Score:5, Insightful)

        by dgatwood ( 11270 ) on Tuesday October 22, 2024 @04:21PM (#64885509) Homepage Journal

        It's one thing to be inspired by something, it's another to copy and paste their work into yours. "Generative AI" as it is implemented right now does not create anything. It merely copies.

        No, not really. It is more like a series of probabilities. If you train on a book that includes the line "The quick brown fox jumped over the lazy dog," it might know that there's a 1% chance that "the" is followed by quick or lazy, a 1% chance that fox is preceded by quick and brown, a 1% chance that lazy is followed by dog, etc.

        The original data is not present in the trained model, except in very limited circumstances where there are not very many items in the training set that match against the query weights, and even then, it is still a statistical description of the original data, not the original data.

        That set of weights basically turns into a beefed up version of what happens when you tap the middle button above your iPhone keyboard. If you feed it "The quick", it might sometimes follow a path in which several of those words appear in the expected order, or it might favor other contextual clues, and you might get "The quick farm tractor ate three brown eggs for middle school students in Germany."

        Calling it copying is a rather severe misnomer. To the extent that anything resembling copying occurs, it is a fluke, and means that the model is too small.
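
        The "series of probabilities" idea above can be sketched with a toy bigram model. This is only an illustration under loud assumptions: real LLMs use neural networks over tokens rather than a lookup table of word pairs, and the function names (train_bigrams, next_word_probs, generate) are hypothetical. With a one-sentence corpus the probabilities come out at 50% rather than the 1% figures used above, which were themselves illustrative.

```python
import random
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count how often each word is followed by each other word."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def next_word_probs(counts, word):
    """Turn the raw follower counts for one word into probabilities."""
    followers = counts.get(word.lower(), Counter())
    total = sum(followers.values())
    if total == 0:
        return {}
    return {w: c / total for w, c in followers.items()}


def generate(counts, start, length=8, seed=0):
    """Sample a word sequence by repeatedly drawing a likely next word."""
    rng = random.Random(seed)
    out = [start.lower()]
    for _ in range(length - 1):
        probs = next_word_probs(counts, out[-1])
        if not probs:  # no known follower, so stop early
            break
        words, weights = zip(*probs.items())
        out.append(rng.choices(words, weights=weights)[0])
    return " ".join(out)


# Toy corpus: the single sentence from the comment above.
corpus = "the quick brown fox jumped over the lazy dog"
counts = train_bigrams(corpus)

print(next_word_probs(counts, "the"))  # followers of "the" with their probabilities
print(generate(counts, "the"))
```

        With only one sentence to learn from, the sketch can do little except walk back through that sentence, which is the "too little training data" case described above. A larger and more varied corpus spreads the probability across many possible followers.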

        • Calling it copying is a rather severe misnomer. To the extent that anything resembling copying occurs, it is a fluke, and means that the model is too small.

          The thing that everyone seems to forget is that in order to train an AI model on a work, you have to have a copy of that work. If said copy was not purchased, that's the copyright infringement. Training an AI on the infringing copy is irrelevant.

          Whether the AI can produce similar output sufficient to be considered a derivative work is a separate quest

          • by dgatwood ( 11270 )

            Calling it copying is a rather severe misnomer. To the extent that anything resembling copying occurs, it is a fluke, and means that the model is too small.

            The thing that everyone seems to forget is that in order to train an AI model on a work, you have to have a copy of that work. If said copy was not purchased, that's the copyright infringement. Training an AI on the infringing copy is irrelevant.

            Nope. Copyright law provides an automatic fair use exception for reading a book, looking at a website, playing a movie, etc. The things that a user would do when consuming some types of media are technically making a copy, but not in a way that is considered unlawful infringement. And browser caches and other similar behavior are also protected. The term for these in Title 17 is an "ephemeral copy". So barring any reason to treat AI training differently, an AI model "reading" or "watching" should also

            • Copyright law provides an automatic fair use exception for reading a book, looking at a website, playing a movie, etc.

              Nope. "Fair use" is specifically and only for "quoting [or excerpting] brief segments for review." Reading a book has nothing to do with copyright. After you buy a book, whether you actually read it is irrelevant. You could use it to put under the short leg of your desk for all the publisher cares. But by purchasing, you have acquired an authorized copy, so no copyright infringement.

              "Loo

      • Re:The difference (Score:4, Insightful)

        by presidenteloco ( 659168 ) on Tuesday October 22, 2024 @04:23PM (#64885515)
        "Merely copies" reflects a misunderstanding of how artificial neural net models work. They copy and mix information from vast input datasets, representing abstracted commonalities discovered in the whole of the input. Then they output a different mix of information, depending on large parts of the intermixed and abstracted information model, and depending on the particular prompt requesting some output. Note that it's called a prompt (a prompt to take the action of touring associatively relevant parts of the overall model and synthesizing a result) rather than a (passive) query (for directly stored information).
        • Re: (Score:2, Insightful)

          by Anonymous Coward

          Remember back in the early days when ChatGPT became open to the public and we were all amused for a few weeks how inputting random hashes would bring up rather curious output? I once had it regurgitate the copyright page from a specific book, verbatim.

      • Which would mean you could sue for infringement. They aren't suing for infringement. They're challenging the learning.
    • The model is irrelevant, pretending the model was of primary importance was an early and successful disinformation campaign which is now losing its effectiveness. As they say, the primary issue is "use of creative works for training".

      Human reading and viewing is generally not considered copying, whereas copying is considered copying. JJ Abrams could view Spielberg's movies without license and without fair use, the AI companies need fair use for their training software to view Spielberg's movies ... because

      • Technically, to view a movie on the internet I need a computer to copy it as well. It's up to the courts to decide if it's fair use or not.

        • The transient copies are exempted by DMCA and before that implied license doctrine, which judges are never going to apply to allow AI training.

          They copied a ton of stuff behind terms of service and paywalls too ... fair use is their only shot. Fair use or bankruptcy.

      • Of course taking the information from the movie into our human biological neural net is copying the information (and its copyrightable patterns).

        That's not different in kind to what the computer is doing copying it to be represented as many additions/subtractions to many weights between the nodes of an artificial neural network.

        That's the crux of the issue.

        We're going to need some new understanding, in the legal system, of what is really going on with AI learning and output generation/synthesis.
        • They could change decades of jurisprudence and common sense of what constitutes a copy because it's convenient to AI companies ... or they could just not. I'm going to assume the judges will just not.

          • It's actually better if neither the human copying and fragmenting the information into their network memory, nor the computer doing essentially the same thing, is considered a copyright infringement.
    • If you ask someone to rip off someone's copyrighted work, and they do, then both you and the person who did it can be prosecuted. So you think LLMs should be held to the same standard?
      • I can't be sued for writing a book in the style of, say, Michael Crichton. I can be sued for creating output that is substantially similar enough to be considered a violation (I take Jurassic Park, rename it Jurassic Meadow and rename the characters, but keep the story). Regardless, that's suing someone based on the output they create, which isn't what's happening here. They're challenging the learning.
    • The difference is the AI can do it for less than JJ Abrams. And that's why Hollywood loves it - because JJ Abrams might accidentally get creative and threaten the profits of the Hollywood studio.

      I understand where the misunderstanding lies. While artists do learn from others, and even sometimes incorporate elements of others' work into their own, an artist is fundamentally different from a computer equivalent of Mad Libs not in the volume produced, but in the creativity of what they produce. If you scal

      • Sure, it can generate a nice painting in the style of Monet, but it can't produce something that will make the average person say, "Wow, that's something new ... and interesting." The proper understanding of AI is not in reference to copyright, because it can't produce something creative or original. It can only shuffle what it has already seen. It is more akin to an automated Mad Lib generator than anything else.

        How is this any different than Thomas Kinkade?

        Which brings us to the second point: because AI

        • Bob Ross introduced an entire generation to the notion that anyone could paint, and Kinkade showed them they could build a successful business selling mediocre art. While both were very influential in the art world, neither produced work that was regarded as exceptional.

      • by AmiMoJo ( 196126 )

        The reason it's cheaper is because it's worse. If you look at the output of these AI "artists", the guys who write the prompts and think they are creating this century's great masterpieces, it's 99.999% shit. It's derivative, obviously AI generated, and devoid of the creativity that a human artist would put into it. It's like bad CGI.

        It's not just the fact that it is produced by taking artists' work as training data without recompense, it's that they are trying to eliminate the artists entirely. "Train your replacem

  • Artists... (Score:2, Insightful)

    by DaFallus ( 805248 )
    Reduce copyright back to 15 years or fuck off.
    • Re: (Score:3, Interesting)

      by Travelsonic ( 870859 )
      Copyright being shorter again would help so much (perhaps more controversially, if retroactively applied to works based on publishing date so things that should have been public domain long ago can become public domain as intended).

      The more I think about it, I wonder how much copyright being as long as it is has contributed to the perception of a lack of creativity not just in regards to companies milking existing IPs, but also in terms of fear of being sued over some stupid thing being too similar in w
      • by sconeu ( 64226 )

        In addition, no retroactive extensions would be a good fix as well. Authors who copyrighted their stuff pre-CTEA knew what they were signing up for. There was no reason to extend their copyrights.

  • Based on the amount of money being thrown around when the buzz-bomb "AI" gets mentioned? There is no stopping the hoovering of creative work into the computer replacement for creative work. And even if *WE* stop it here, there's a whole world of countries out there that will just let their version keep chugging along, until our creative industries fade into oblivion due to the wash-over of the spilling tides, nay tsunami, of AI generated creative works. And as much as our creative industries have lowered th

  • by oumuamua ( 6173784 ) on Tuesday October 22, 2024 @03:56PM (#64885385)
    Their work was 0.00000001% of the training set, here is your 0.1 cent share of profit https://www.genolve.com/design... [genolve.com]
  • by cascadingstylesheet ( 140919 ) on Tuesday October 22, 2024 @04:09PM (#64885457) Journal

    ... should already be public domain.

    The founding fathers had it right. Fourteen years, extendable by fourteen years, once.

    • by Misagon ( 1135 )

      Copyright-infringement and plagiarism are not the exact same thing.

      If you copy something, you often copy the name of the author/artist as well.
      If you plagiarise something, you detach the attribution from the work.

      There are many movie scripts based on old literary works but updated for more modern settings and situations.
      But in no way does a serious script-writer hide the fact that their script was based on a work by Shakespeare or Jane Austen, and try to pass the work off as their own, because that would be p

  • by sunderland56 ( 621843 ) on Tuesday October 22, 2024 @04:20PM (#64885501)

    Actually fewer than 100 artists signed it. The rest were AI-generated bots.

  • "Nobody can make a movie or game until they go through us! Hope you have a $100k to $1 mil budget minimum!" - sincerely, entitled assholes who should have gotten a fallback position. This really seems like "oh no, we can't put the switchboard operators out of business!" level crap. I know art has been around since humans existed so it's not quite on the same scale but they said the same thing about computers and now anyone with a brain uses a digital pad and pen with vastly superior layers and revision hist
  • YouTube videos tracking all the movie "tributes" and paeans to what went before are some of the best viewing.

    AI just cuts out the middle man.

  • Everything is based on what people previously did, otherwise all art would look like cave paintings. To claim someone can’t use your works decades after you’ve made money off them is stupid, because it’s not unreasonable to conclude someone else may have made something similar enough within that time frame.

    • AND btw we know ABBA steals melodies too: https://m.youtube.com/shorts/f... [youtube.com]

      I notice a lot of the copyright Karens made money off public domain material. I mean, if you were around in the 90s you will recall Disney was the most vicious copyright Karen. They got laws written to extend their copyright of Mickey Mouse in spite of the fact that some of Disney’s biggest hits, such as Snow White and Sleeping Beauty, came from the public domain.

  • AI is where it currently is being dumped, and everyone wants a piece of the p.. Oh you mean that all the 'open' AI models are using content from everyone without paying the royalty kickbacks? Uh oh.

    There is a simple fix for this, you own your own identity, performance, and art until you die + 5 years. Then it is public domain. For this to work corporations can not hold copyright, which fixes a lot of shit like the 70+ year Disney copyright shit.
  • Art doesn't stand still. And it's always been a human thing since there have been human things.

    I don't think the artists have anything to worry about; if anything, the value of their work will go up, if the examples of AI art are anything to go by. It's programmer art. It looks synthetic and mass produced, a too-perfect plastic cup next to hand-painted porcelain. Imperfection is style.

    Looks like an opportunity for the lawyers.

    Looks like an opportunity to hype AI

    And an opportunity for us. Recorded music is getting cheaper, bec

  • "Allowing children to view our works is an unjust use of intellectual property. They might learn how to do this themselves!"
