Thousands of Authors Urge AI Companies To Stop Using Work Without Permission (npr.org) 118

Thousands of writers including Nora Roberts, Viet Thanh Nguyen, Michael Chabon and Margaret Atwood have signed a letter asking artificial intelligence companies like OpenAI and Meta to stop using their work without permission or compensation. From a report: It's the latest in a volley of counter-offensives the literary world has launched in recent weeks against AI. But protecting writers from the negative impacts of these technologies is not an easy proposition. According to a forthcoming report from The Authors Guild, the median income for a full-time writer last year was $23,000. And writers' incomes declined by 42% between 2009 and 2019. The advent of text-based generative AI applications like GPT-4 and Bard, which scrape the Web for authors' content without permission or compensation and then use it to produce new content in response to users' prompts, is giving writers across the country even more cause for worry.

"There's no urgent need for AI to write a novel," said Alexander Chee, the bestselling author of novels like Edinburgh and The Queen of the Night. "The only people who might need that are the people who object to paying writers what they're worth." Chee is among the nearly 8,000 authors who just signed a letter addressed to the leaders of six AI companies including OpenAI, Alphabet and Meta. "It says it's not fair to use our stuff in your AI without permission or payment," said Mary Rasenberger, CEO of The Author's Guild. The non-profit writers' advocacy organization created the letter, and sent it out to the AI companies on Monday. "So please start compensating us and talking to us."


Comments Filter:
  • sue them and demand source code in court!

    • by rsilvergun ( 571051 ) on Monday July 17, 2023 @12:25PM (#63693536)
      It's cute that you think authors can afford to sue. There's one or two really rich ones who can, but their agents will just settle out of court.

      This'll be just like how song royalties work with Spotify. Authors will get nothing and they'll like it. We've pretty much given corporations absolute power while we argued over petty moral panics and culture war distractions.
      • Reddit (the corporation) owns the rights to all of those posts and comments used as training data. Reddit has a ton of money to file a lawsuit, and can very likely get a massive war chest from its VCs who could see ripping the guts out of OpenAI, Google, etc. as a very interesting way to make Reddit massively profitable.

        • In most countries, if not all, using texts for learning doesn't legally indebt you to the author of a text in any way, at least as far as copyright is concerned. That should apply to both writers' books *and* Reddit comments. The best shot that Reddit may have here is ToS violation (I think?), but I don't see how that would help book authors.
        • Like I said, some of the big guys might get paid after settling out of court. But for 99% of the people who actually write shit, they get nothing.

          This isn't about Reddit wanting a payday for their users' posts. This is about individual authors whose work was used to train LLMs wanting payment. They're not going to get any. The courts will side with whoever has the most cash, and it sure ain't them.
          • This is about individual authors whose work was used to train LLMs wanting payment. They're not going to get any.

            I'd like to know how ChatGPT, etc. even got ahold of their works because I am not aware of any legal site where this data could be scraped, and I doubt these companies would be stupid enough to break DRM on commercial ebook products, rip the content and use it as the basis of a commercial product.

            With these authors, I feel like there is some info we are not getting about precisely how the AI comp

            • I have yet to see it confirmed that ChatGPT was actually trained on complete copyrighted works through piracy. If they did train it on a pirated library, as speculated, then it sounds like each author is owed the price of a single copy of their book and some punitive damages to prohibit the practice. If they lifted the books from legal library sites, well, then they've already been compensated for that copy. If it was trained on webpages reviewing and giving a synopsis of the book, well, then that isn't their wo
        • by r0nc0 ( 566295 )
          Reddit has made the entire corpus of posts and comments available to anyone in the past...
        • Reddit publishes those posts to the public without any requirement that you agree to any of their terms. Google's recent court victory over a song lyrics company should come into play here. Just because you publish some obscure term on a different page doesn't mean it was read and agreed to. I expect a new type of nag dialogue box that forces you to agree to their terms prior to being able to see the webpage.
      • There's such a thing as a class-action lawsuit that would allow them to jointly file. I'm sure there's a lawyer or two that will happily work for little up front for a percentage as well.

        The case would be incredibly difficult though. Any human author is to some degree inspired by what they've read, and that makes it particularly difficult to argue that an LLM is much different than they are. Most human-created works aren't terribly original, but humans seem to prefer variations on the same small set of th
    • Re: (Score:2, Troll)

      by slazzy ( 864185 )
      In what country, though? My company is based in Eritrea, for example; we don't have any copyright law.
  • by mlawrence ( 1094477 ) <martin@@@martinlawrence...ca> on Monday July 17, 2023 @12:37PM (#63693590) Homepage
    Every creative person has taken from something they have experienced "without permission". AI can do it faster.
    • by Narcocide ( 102829 ) on Monday July 17, 2023 @12:44PM (#63693636) Homepage

      The AI is using known, predetermined algorithms to harvest the content and regurgitate it in a predictable format, and is being fed copyrighted source data illegally by handlers who are essentially stealing said source data. Artists who do the same thing also get sued for plagiarism. The only difference is the AI can put enough extra steps in the process that it's less obvious.

      • "The AI is using known, predetermined algorithms" That sentence seems contradictory to me. Predetermined algorithms was the way classical computers would analyze text. AI uses unknown and dynamic algorithms that even the developers sometimes cannot understand.
        • That's a lie they are telling you to add a veil of mystery. If you ask the AI the right things you can get predictable results, which proves it's a lie.

          • Re: (Score:3, Insightful)

            by amoeba1911 ( 978485 )
            Yes, if you ask AI the right things you can get predictable results. But that means nothing! You can do the same exact thing with humans as well. https://magicmentalism.com/pow... [magicmentalism.com] So, on the contrary: a person reading a book and getting inspired to write some stuff is no different than AI reading a book and getting inspired to write some crap. Large models do the same exact thing as humans, but they just do it much faster and better. What you're thinking of are simple algorithms with fixed input and deter
            • " a person reading a book and getting inspired to write some stuff is no different than AI reading a book and getting inspired to write some crap"

              A person reading a book and producing exact copies of the contents of that book is a plagiarist.
              An AI reading a book and producing exact copies of the contents of that book is a...

              • An AI reading a book and producing exact copies of the contents of that book is a...

                That isn't what is happening. At all.

              • "producing exact copies of the contents of that book" - Yeah, I see your point, and I see it's based on a misconception of how AI works. It's not a copy-paste algorithm, it's a convolutional neural network. There is a huge difference, yet it can look the same to the untrained eye. If you don't have a good understanding of the underlying technology, I definitely see how it would lead you to conclude it is a copy-paste algorithm.
          • It's not a lie at all. One of the newest areas of AI research is figuring out how a model learned to solve a problem in order to improve the next version. They've found that machines, left to their own devices, have learned to solve problems in very inefficient ways.
          • by dfghjk ( 711126 )

            Predictable results are not proof of a "known, predetermined algorithm". AI doesn't "harvest" content either. You are a moron.

        • by Narcocide ( 102829 ) on Monday July 17, 2023 @01:06PM (#63693762) Homepage

          Make no mistake, it's not an iteration on the "classical way computers would analyze text"; it's just an iteration of how much computational horsepower they can throw behind the analysis, and how much total raw source data they can throw into it. There was a chain of AI-generated images going around early on, and there was talk of how certain chains of prompts seemed to mysteriously include a similar-looking lady. I forget what they were calling her, but there was a lot of talk about how the pictures were "haunted" by this "creepy" visage. Well, guess what. She's just a person in one of the source images, and the reason she kept showing up wasn't supernatural in origin. If they'd used more women in the source data the AI could have come up with some different women. That's all. Garbage in, garbage out. CS 101.

      • to harvest the content and regurgitate it in a predictable format

        Is that a paraphrase of "to become a writer by reading many books at a young age and learning from it"? If so, then it looks like pretty much all authors could be sued for that.

        • by dfghjk ( 711126 )

          Exactly, and authors are people who write things to be read. They monetize it by selling what they write. AI is trained by "reading" what people write. Training AI with authored text is using that text for precisely what it is meant for and for which the author has already been compensated.

        • by dryeo ( 100693 )

          OTOH, perhaps it is more like photocopying the whole library, or actually, scanning and OCRing the whole library to read later and keep a copy for reference.

      • by DarkOx ( 621550 )

        I think if the decision is rendered by people who understand how LLMs work on a technical level, even if they exclusively trained it on copyright-protected works, my guess is it's fair use.

        It really has to be. It's just collecting facts about the works: statistical frequencies of groups of tokens, statistics about the relationships between those groups, and other stuff. It's not storing the text strings.

        It's always been legally permissible to record and report on facts about a copyright-protected work. Imagine if Pa
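
        As a toy illustration of the kind of "facts about the works" described above, here is a bigram counter, purely a loose stand-in for the idea (a real LLM learns dense weights; it does not keep literal frequency tables, let alone the text itself):

            from collections import Counter

            # Count bigram frequencies in a text: statistics *about* the work,
            # not a stored copy of the work itself.
            text = "the cat sat on the mat and the cat slept"
            tokens = text.split()
            bigrams = Counter(zip(tokens, tokens[1:]))
            print(bigrams.most_common(3))  # e.g. [(('the', 'cat'), 2), ...]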

        • by dfghjk ( 711126 )

          "..even if they exclusively trained it on copyright protected works, my guess is its fair use."

          It's not fair use, it is legal use as intended. Reading copyrighted material is NOT a violation of copyright.

          Anyone who reads works is merely "collecting facts about the works", "training" their own "neural network" using that text.

      • The AI is using known, predetermined algorithms to harvest the content and regurgitate it in a predictable format, and is being fed copyrighted source data illegally by handlers who are essentially stealing said source data. Artists who do the same thing also get sued for plagiarism. The only difference is the AI can put enough extra steps in the process that it's less obvious.

        LOL, 'stealing.' Next you'll be saying that torrenting movies is theft.

      • by dfghjk ( 711126 )

        I like how you used the word "said" so that you sound informed, as opposed to the ignorant douche that you are.

      • by dvice ( 6309704 )

        Stealing = Theft, the illegal act of taking another person's property without that person's freely-given consent.
        As the owner still has their property, it has not been taken, so it is not stealing.

      • The AI is using known, predetermined algorithms to harvest the content and regurgitate it in a predictable format, and is being fed copyrighted source data illegally by handlers who are essentially stealing said source data.

        There is nothing predictable about these systems. Not only do responses from these machines depend upon the context of the question, but randomness is explicitly injected, such that even if you were to ask the very same question with the very same context you may well get a wildly different response.

        If I go to a website that has copyrighted material on it, read that content and learn from it am I doing something wrong? If not what's the difference if a machine does the same thing? Copyrights are not a grant of exclu
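
        As a minimal sketch of the injected randomness mentioned above, here is temperature-scaled sampling over a made-up next-token distribution (toy numbers, not any real model's output):

            import math
            import random

            def sample(logits, temperature=1.0):
                # Temperature rescales the scores before softmax; higher values
                # flatten the distribution and make the pick more random.
                scaled = {tok: score / temperature for tok, score in logits.items()}
                total = sum(math.exp(v) for v in scaled.values())
                r = random.random()
                cumulative = 0.0
                for tok, v in scaled.items():
                    cumulative += math.exp(v) / total
                    if r < cumulative:
                        return tok
                return tok  # guard against floating-point rounding

            # Toy next-token scores; a real model produces a fresh set at every step.
            logits = {"cat": 2.0, "dog": 1.5, "pangolin": 0.2}

            # Same "question", same context, different answers across runs.
            print([sample(logits, temperature=0.8) for _ in range(5)])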

      • The AI is using known, predetermined algorithms

        Just because an algorithm is known doesn't mean it can't do novel things. One could claim evolution is a known algorithm...

        to harvest the content and regurgitate it in a predictable format

        Mathematically it's often not possible to regurgitate the content. For example, GPT-4 is estimated to have 1T parameters (at least 32T bits) and is estimated to be trained with around 20T tokens. Assuming each token is around 8 characters, that's 1280T bits. Even with
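
        A rough back-of-the-envelope version of that argument, using the poster's own figures (all of them estimates, not confirmed numbers for GPT-4):

            # All figures are estimates taken from the comment above.
            params = 1e12                  # ~1T parameters
            bits_per_param = 32            # assuming 32-bit weights: ~32T bits of model
            tokens = 20e12                 # ~20T training tokens
            chars_per_token = 8            # rough average assumed above
            bits_per_char = 8              # plain 8-bit text

            model_bits = params * bits_per_param                  # ~3.2e13 bits
            data_bits = tokens * chars_per_token * bits_per_char  # ~1.28e15 bits

            print(f"model capacity: {model_bits:.2e} bits")
            print(f"training text:  {data_bits:.2e} bits")
            print(f"text outweighs weights by ~{data_bits / model_bits:.0f}x")

        Under those assumed figures the raw training text is roughly 40 times larger than the weights, so the model cannot be storing all of it verbatim, which is the commenter's point.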

    • The difference is the AI's owner never got permission to perform the work to the AI.

      • You don't know they didn't get permission to read the book. If they checked those out from a virtual library the author was already compensated and they had permission to read the book.
        • Checking out a book from the library doesn't give you rights to do whatever you want. Just like how checking out a book from the library doesn't give you permission to read it into a microphone connected to a broadcast radio tower, it didn't give you permission to read it into an AI training dataset.

    • Every creative person has taken from something they have experienced "without permission". AI can do it faster.

      Exactly.

      If it's publicly viewable and accessible to humans, why shouldn't it be usable by AI?

      • I can see your face walking down the street. I cannot, however, legally take a picture of you and use it on a billboard to hawk my wares. There are existing lines in the sand about acceptable uses of publicly observable things. Plagiarism can get you fired/sued.

        AI opens up the issue to the difference between stealing from a single person and allowing massive automated theft under the guise that by churning it through a black box you are no longer "stealing", but just "training".

          • I can see your face walking down the street. I cannot, however, legally take a picture of you and use it on a billboard to hawk my wares.

            But I can draw a picture vaguely similar to your face and use it on a billboard to hawk my wares.

          In any case, you're talking about a completely different area of law; your face isn't copyrighted.

    • If pedantry is all you got, you got nothing.
  • "There is no urgent need for an AI to write a novel."

    Well, no urgent need for a human to write one either.

    • by Anonymous Coward

      disagree completely

      once you move past pure survival, writing a great novel is among the most important things someone can do - you are a conscious being

    • by m00sh ( 2538182 )

      "There is no urgent need for an AI to write a novel."

      Well, no urgent need for a human to write one either.

      Except for George RR Martin.

  • I've been wondering how long services like ChatGPT could continue before copyright concerns came about.

    Since ChatGPT doesn't give a bibliography or resources used with each response, nor does it do a good job citing sources in its answers, its responses may be in copyright violation.
    • AI will soon become so common it will be like music piracy. For every case they prosecute, thousands will slip by. Creativity may soon be just your hobby instead of something you can live off.
    • We need authorship-AI, a model trained to predict the author, copyright status, credibility and originality of a piece of content, of course fine-tuned from human (expert) preferences.
      • by Anonymous Coward

        You still don't get it. While some AI may use copyrighted material in a response, the latest crop learns general concepts from examples and never, at any point, stores the source information. It builds a giant diffusion map or semantic map. It's abstracted to a broader level of generalization. Whoever restricts AI will become the new third world.

        • Wrong, they don't *learn* anything. We need new words to describe what they do, because adapting existing language also confers meaning that is not true. Like calling it "artificial intelligence" - it's not intelligent. It knows nothing. It cannot know anything.

          • it's not intelligent. It knows nothing. It cannot know anything.

            This is philosophy and will never be universally agreed upon, like all philosophy.

  • by nospam007 ( 722110 ) * on Monday July 17, 2023 @12:44PM (#63693642)

    They READ the books and learn from them.
    The AI just has a better memory than most.

    What's next, schoolbook authors suing AI because it can speak English?

    • But if you author something based on other sources, it is customary to cite your sources within the narrative, or in a bibliography.
      • Detecting copyright infringement in AI training sets and outputs will be big business.
        • I bet that AI-based companies will already include such detection into the production of their works in an adversarial fashion anyway. They'd be stupid not to.
          • I doubt it. The people who are writing this software are the same types who say "everything should be free". They basically said "well they let us scrape this data, so we're using it", with zero thoughts of consequences, assuming that nothing will be done because nothing ever has. These tech companies move so much faster than legislators and regulators that they believe they are untouchable. Perhaps this time around the copyright holders will act soon enough to stop the theft.

            • oddly enough, the people that shout, "everything should be free!" still want to get paid for THEIR work.
            • by dfghjk ( 711126 )

              You are EXTREMELY wrong about this. Large corporations are overwhelmingly concerned about the legal consequences of applying AI models.

        • It's not infringement to remember some words you read in a book. AI is not reproducing entire works verbatim.
      • "Customary"? I seem to recall that Crichton did that with some novels, but there aren't going to be many books with an appendix listing all the inspiring predecessors to the work in question.
        • aren't going to be many books

          ...I really should have added "in the belles-lettres genre", mind you -- not things like scientific papers or monographs. But certainly things like novels or short stories qualify.

        • His book Eaters of the Dead had a list of sources, as I recall.
      • by m00sh ( 2538182 )

        Did your school algebra book cite the original al-jabr book as a source?

        Does a school book even have a references section?

      • Re: (Score:3, Informative)

        by amoeba1911 ( 978485 )
        You have no obligation to cite your source of inspiration. If I read a 1000 books and learn how to write books, how to create an engaging story, how to create dialogue, how to make plot twists, and I use all that gathered knowledge to write a new book, I do not have to give any credit to any of the 1000 books. The only violation would be if I use a character from a book in my book, or dialogue from a book in my book, etc., but unless the AI is overfitted, it will not do that.
        • " If I read a 1000 books and learn how to write books, how to create an engaging story, how to create dialogue, how to make plot twists, and I use all that gathered knowledge to write a new book,"

          BTW, somebody should sue Tarantino then.

      • by dfghjk ( 711126 )

        But the discussion is about training AIs, not applying them to tasks (which is what authoring using AI would be).

        There is no copyright violation when training an AI, there could be one later if a significant portion of a copyrighted work is regurgitated. That isn't being claimed, nor are the complainants here even aware of that issue.

        • In the United States, for instance, fair use allows for limited use of copyrighted material for purposes such as criticism, comment, news reporting, teaching, scholarship, or research.

      • "But if you author something based on other sources, it is customary to cite your sources within the narrative, or in a bibliography."

        Apparently not even scientists know that rule.

    • The AI has a better memory than everyone. COMBINED. :)
  • With all that creativity, how is it no one has imagined a modern economic system besides capitalism? Well, authors, welcome to capitalism; maybe now you'll think of something ;)
    • by m00sh ( 2538182 )

      With all that creativity, how is it no one has imagined a modern economic system besides capitalism?
      Well, authors, welcome to capitalism; maybe now you'll think of something ;)

      Not sure if joking?

      In case you aren't, the reason you think this way is that there has been massive propaganda to make you think capitalism is the only possible modern economic system that just happens to match human psychology and everything else is doomed to fail.

  • Oh wait, it did. Hollywood desperately tried to stop VCRs from becoming a thing. The old way of doing things is over once a disruptive technology takes root. Same thing happened to the music industry. You know, that industry that thrived on forcing you to buy a CD with 11 songs on it that you didn't want so you could get the one song you did. MP3s and iTunes disrupted that business model. Some aspects of existing businesses are going away. Some will remain. Content creators are going to have to get

  • by ByTor-2112 ( 313205 ) on Monday July 17, 2023 @01:54PM (#63693986)

    You can't read a book and make a movie just like it without paying the author. Why should AI be able to do the same thing?

  • We already have copyright laws; apply those. If the AI are trained well, those laws won't be applicable; the AI will be producing new works, just like a person reading, learning, and writing.
  • Genius (Score:4, Insightful)

    by nospam007 ( 722110 ) * on Monday July 17, 2023 @03:02PM (#63694264)

    "The only people who might need that are the people who object to paying writers what they're worth."

    Yes, it's the same reason people use AI when they don't want to pay for stock photos, models, lighting, photographers, painters, backgrounds, actors, stuntmen ...

    Nobody wants to do it for the cultural enlightenment of the people.

  • Virtually all of human endeavour involves iteration on previous works. Why do these people think that AI should be singled out from this universal process? I've heard more than a few directors/writers/actors boasting about their use of a previous work as the basis for their own, such as "my story is [insert book/film title] in space". Virtually all of Disney's films are iterations on old folk tales. The common way to educate new writers/directors/actors is to have them read/analyze/portray vast amounts

  • In other news, the authors are also suing other authors demanding that they get permission to read their works before being influenced by them. The suit requests that other authors be enjoined from using any words the plaintiffs have used in their works.

  • Charging someone a continuous fee in order to read your text is asinine.
  • Why is it that if you go off and learn how to paint by looking at masterpieces, no one objects, but when AI learns from the top writers it's suddenly not OK?

    This just seems like a gripe fest that AIs can learn how to do what you are doing.
