Penguin Random House Underscores Copyright Protection in AI Rebuff (thebookseller.com) 40

Posted by msmash on Saturday October 19, 2024 @01:00AM from the drawing-the-line dept.

The world's biggest trade publisher has changed the wording on its copyright pages to help protect authors' intellectual property from being used to train large language models and other artificial intelligence tools, The Bookseller has reported. From the report: Penguin Random House has amended its copyright wording across all imprints globally, confirming it will appear "in imprint pages across our markets." The new wording states: "No part of this book may be used or reproduced in any manner for the purpose of training artificial intelligence technologies or systems," and will be included in all new titles and any backlist titles that are reprinted.

The statement also "expressly reserves [the titles] from the text and data mining exception," in accordance with a European Parliament directive. The move specifically to ban the use of its titles by AI firms for the development of chatbots and other digital tools comes amid a slew of copyright infringement cases in the US and reports that large tranches of pirated books have already been used by tech companies to train AI tools. In 2024, several academic publishers including Taylor & Francis, Wiley and Sage have announced partnerships to license content to AI firms.

This discussion has been archived. No new comments can be posted.

Doesn't Mean Much (Score:3)

by Bahbus ( 1180627 ) writes: on Saturday October 19, 2024 @01:23AM (#64876515) Homepage

I doubt this will stop companies (or individuals) from doing it, though. They'll just do a better job at hiding it and making it harder to prove ever happened in the first place.

- Re: (Score:3)
  
  by Z00L00K ( 682162 ) writes:
  
  Add to it that there are loopholes in the copyright law protecting parodies.
  - Re: (Score:3)
    
    by Visarga ( 1071662 ) writes:
    
    How can you steal something free to access and copy?
  - Re: (Score:2)
    
    by account_deleted ( 4530225 ) writes:
    
    Comment removed based on user account deletion
- Re: (Score:2)
  
  by NettiWelho ( 1147351 ) writes:
  
  and making it harder to prove ever happened in the first place
  It's easy to prove when your "AI" model's training data contains all the other companies' watermarks in the requested test sample.
  - Re: (Score:3)
    
    by Bahbus ( 1180627 ) writes:
    
    Watermarks aren't foolproof or perfect. Just an increase in difficulty.
- Re: Doesn't Mean Much (Score:5, Informative)
  
  by St.Creed ( 853824 ) writes: on Saturday October 19, 2024 @10:43AM (#64877219)
  
  The EU copyright law has explicit language allowing the use of copyrighted materials for training AI. The only companies this could reach are US companies.
  
Agree (Score:3)

by will4 ( 7250692 ) writes: on Saturday October 19, 2024 @01:25AM (#64876519)

The corpus of public domain books, freely available for AI training and much better written, would be a good starting point for training AI models on literature.
Even the penny dreadfuls, dime novels, and pulp magazines are better written than many modern books.
Reality: We should push the Copyright Office and Congress to limit copyright for written works, on paper or electronic, to 50 years from the earliest of:
- date of first publication - revisions, author's cut, etc. do not extend copyright on the original work
- if unpublished, then 50 years from the youngest author's 35th birthday.
- works made for hire for magazines and corporations, 50 years from creation date
And require in all published works on the copyright page a statement "This book or literary work will become public domain in the USA in the year X at the latest. If copyright law or regulations change to extend beyond year X, this book will be donated by the copyright holder at time of publication or its assignee to the public domain in year X."

- Re: (Score:1)
  
  by Anonymous Coward writes:
  
  - if unpublished, then 50 years from the youngest author's 35th birthday.
  So if an 86 year old and an 87 year old write something together, they already lost copyright one year before it was written (since it was certainly unpublished when it was not yet written).
  - Re: Agree (Score:2)
    
    by St.Creed ( 853824 ) writes:
    
    Only if they never publish. In which case, tough luck.
    - For late in life publications by authors (Score:2)
      
      by will4 ( 7250692 ) writes:
      
      Add in a 10 year copyright from the date of publication for authors older than 80 years old.
      The main points being:
      - Publishing your work requires you to legally state when the work enters the public domain. And state that if any law changes extend that copyright term, the work is freely donated to the public domain without any ability to retract such donation in the future
      - Copyright exists for a limited time after publication and the duration of copyright is based on 85 years from the birthdate of the ol
- Re: (Score:2)
  
  by Z00L00K ( 682162 ) writes:
  
  My take is 5 years after the death of the last passed creator.
  That would be enough to close the books for the creator and ensure they'll get a decent closure.
  That would mean that at least some of the works of Prince could be free to use.
  - Much simpler solution (Score:2)
    
    by Anonymous Cward ( 10374574 ) writes:
    
    Make it work like a patent with a very limited term but requiring a whole, complete implementation being deposited along with the key tools needed to reproduce it. For recorded stage plays, movies and shows, this would include not only the original source footage but also things like the script and designs or replicas of key props. For literature, it would include author's notes and early drafts, as well as references where applicable for any research materials used, and yes, for software, that would include
- Re: (Score:2)
  
  by account_deleted ( 4530225 ) writes:
  
  Comment removed based on user account deletion
- Re:Agree (Score:5, Insightful)
  
  by msauve ( 701917 ) writes: on Saturday October 19, 2024 @08:07AM (#64876923)
  
  14 years, with one 14 year extension, as was originally implemented. The purpose of copyright is "To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries."
  
  Authors don't create works if a payback is going to take more than 28 (or even 14) years, so longer terms do nothing to promote progress, in fact they impede progress. They hold our culture for ransom.
  
They can say that all they want (Score:4, Insightful)

by Mononymous ( 6156676 ) writes: on Saturday October 19, 2024 @03:03AM (#64876605)

This is a legal question. Whether or not it's allowed depends on the law, not on the message on the copyright page.
If the law restricts AI training, saying it's not allowed doesn't change that.
If the law doesn't restrict it, saying it's not allowed doesn't somehow change the law.

- Re: (Score:2)
  
  by phantomfive ( 622387 ) writes:
  
  They should reword it to be like a shrink-wrap license. If you don't agree, then return the book, etc.
- Re: (Score:2)
  
  by evanh ( 627108 ) writes:
  
  I wouldn't be so dismissive.
  A copyright disclaimer holds a hell of a lot more water than something like a TOS or EULA. Copyright is something that is well tested and repeatedly upheld by the courts. And, in the US, it has the DMCA cudgel smashing everything too.
  - Re: (Score:2)
    
    by jabuzz ( 182671 ) writes:
    
    Argh, good old USA, which spent the best part of 200 years not giving a f*%k about other people's copyright. The mind boggles as to why they would expect me to now respect their copyrights. When they apologize and pay reparations I will listen.
  - Re: They can say that all they want (Score:2)
    
    by topham ( 32406 ) writes:
    
    Copyright disclaimers don't hold particular weight without a pre-existing interpretation to give them weight. There isn't one in this case, and it'll end up irrelevant anyway. Re-encode the material in a jurisdiction that simply doesn't recognize anything complicated and you're done.

They have no right to tell us what to do (Score:5, Insightful)

by Visarga ( 1071662 ) writes: on Saturday October 19, 2024 @06:43AM (#64876813)

Authors can only control the right to copying, not other rights. Training AI on their books is not something they can control. It's not copying, so they have no rights over that. When you buy a book you can wipe your ass with it, the author can't say anything.

  - Re: They have no right to tell us what to do (Score:2)
    
    by St.Creed ( 853824 ) writes:
    
    So far, no case has been brought in the EU, because it exempts AI training from copyright. In the USA all cases I know of have been lost by the plaintiffs or delayed, or the loss has been appealed. My expectation is that this will continue to be the case. Even Republicans understand that the money of the future isn't made with copyright but with AI.
  - Re: (Score:2)
    
    by vlad30 ( 44644 ) writes:
    
    So if a person, after reading several books by an author, is inspired and then writes a new book in a similar style, maybe even with similar characters, would that be copyright infringement? That is essentially what AI is doing currently. By this logic, almost every book ever written infringes on previous books. Extend this to music: how many songwriters are inspired by previous ones, unintentionally producing songs and music pieces that sound similar?
- Re: (Score:2)
  
  by pauljlucas ( 529435 ) writes:
  
  The thing everybody gets wrong, as you correctly noted, is that the training-the-AI part is not copyright infringement. However, in order to train the AI, you need to have a digital copy. Unless the trainers made an authorized copy, the fact that they have a copy is copyright infringement. That's the illegal part. What they do with the copy is irrelevant.
- Re: (Score:1)
  
  by PleaseThink ( 8207110 ) writes:
  
  Training an AI on books is copying. The software industry settled this ages ago; it's why we need a license to run a program. The act of transferring the program from your hard drive or disk to computer memory is legally considered copying it and thus you need a license to do so. The act of reading the book into an AI system for training is an act of copying. Full stop. Now, there are exceptions to copyright law that make doing that without a license legal and the question is if training an AI counts

Re: (Score:2)

by account_deleted ( 4530225 ) writes:

Comment removed based on user account deletion
- Re: (Score:2)
  
  by allo ( 1728082 ) writes:
  
  You should be aware that this makes the license non-free (just as the CC-NC licenses are).
  I am also not sure if you are allowed to modify the CC license text. The licenses themselves are copyrighted by their creators as well, and you are often not allowed to change them, as it would be misleading to still say it's CC licensed when your new clause takes away freedoms the CC license would grant.
  - Re: (Score:2)
    
    by account_deleted ( 4530225 ) writes:
    
    Comment removed based on user account deletion

Useless (Score:2)

by allo ( 1728082 ) writes:

The clause does not change much.
When it comes to copyright, there are two ways it may work out in court:
1) AIs can learn under copyright exceptions. No license can change that, as the exceptions mean no license is needed for AI training, so all clauses in the licenses are irrelevant.
2) AI training is not allowed, e.g., because the copyright exceptions are found to be ineffective. Then current licenses would be enough to disallow AI training without clauses targeting AI in particular.

Prohibiting AI training is absurd. (Score:2)

by phozz bare ( 720522 ) writes:

Telling an AI it cannot be trained on a book's contents is tantamount to telling an aspiring author not to read a certain book to prevent the risk of that author imitating the book's style or content in his or her own future writings. By this logic the Tolkien Estate should sue George R. R. Martin, because it is certain that the latter drew inspiration (for profit!) from the works of the former.
- Re: (Score:2)
  
  by pauljlucas ( 529435 ) writes:
  
  It's not the training part that matters. Doing the training (or the reading, for humans) requires that you have an authorized copy of a work. If you're human, you obtain an authorized copy by buying a book. The folks training AI are not purchasing copies of works. That's copyright infringement. That's the illegal part. If they purchased a copy, then what they do with it afterwards is irrelevant (with the exception of making more copies).
  - Re: (Score:2)
    
    by phozz bare ( 720522 ) writes:
    
    I don't see where the prohibition says "but it's OK to train on the book's contents if you've purchased it legally". Have I missed something?
    - Re: (Score:2)
      
      by pauljlucas ( 529435 ) writes:
      
      That's because the publisher is trying to stretch copyright infringement (which has the force of law behind it) to cover a licensing agreement (which is arbitrary). IMHO, publishers are doing themselves a disservice by not focusing on the copyright infringement aspect.
