'Torrenting From a Corporate Laptop Doesn't Feel Right': Meta Emails Unsealed (arstechnica.com) 52
An anonymous reader shares a report: Newly unsealed emails allegedly provide the "most damning evidence" yet against Meta in a copyright case raised by book authors alleging that Meta illegally trained its AI models on pirated books.
Last month, Meta admitted to torrenting a controversial large dataset known as LibGen, which includes tens of millions of pirated books. But details around the torrenting were murky until yesterday, when Meta's unredacted emails were made public for the first time. The new evidence showed that Meta torrented "at least 81.7 terabytes of data across multiple shadow libraries through the site Anna's Archive, including at least 35.7 terabytes of data from Z-Library and LibGen," the authors' court filing said. And "Meta also previously torrented 80.6 terabytes of data from LibGen."
"The magnitude of Meta's unlawful torrenting scheme is astonishing," the authors' filing alleged, insisting that "vastly smaller acts of data piracy -- just .008 percent of the amount of copyrighted works Meta pirated -- have resulted in Judges referring the conduct to the US Attorneys' office for criminal investigation."
jail time (Score:5, Insightful)
And no one will receive jail time, probation, or anything.
There will be a fine of probably a fraction of a percent of Meta's daily revenue.
And a few lawyers will make a ton of money. That's it.
Re: (Score:2)
I propose the $250,000 fine per infringed work.
Re: (Score:1)
Perhaps the law is unjust, perhaps unconstitutional. In my moral universe: the information should have been free and should remain free. Freely taken, freely disseminated. The "walled garden" is the evil, not the information, nor its use.
Re: (Score:3)
So your work should be free?
Good to know, come by my house, I have some yard work that needs doing.
Re: (Score:2)
GP: "the information should have been free and should remain free"
You: "Yard work"
Such logic, much wow.
Re: (Score:2)
Is copying the same thing as enslavement?
Mark Twain planned to extend his books every seven years so people would want to buy his copy.
I'll always buy an original from the author. Creator's Mark moves this into the Fraud category which is already a crime.
That men with guns will cage people who threaten your great-grandchildren's rent seeking isn't the moral high ground you think it is.
In the case of Facebook (whom I loathe) no revenue was lost through their actions.
These are all separate situations. Confl
Re:jail time (Score:4, Insightful)
How definitive your claim about Facebook is says a lot about the lack of consideration you've given the issue before commenting. Facebook are investing billions in this area and are paying very generously for some of the data they use. If they hadn't torrented the works and their options were spend some money or not have the material they would happily have spent a large amount of money for it.
I'm a long way from happy with copyright law as it stands, but arguments entirely against it need to be a lot more persuasive than those.
Re: (Score:2)
Re: (Score:2)
Did Mark Twain have that plan at a point where ...
Dude, Mark Twain is dead, and some of his works are still under copyright. Do you think he expected that?
If they hadn't torrented the works and their options were spend some money or not have the material they would happily have spent a large amount of money for it.
That's some bullshit right there! "they would happily have spent a large amount of money for it" BULLSHIT! If that were true, they would have done that. Even putting that aside, do you have any clue as to how much material is available through LibGen, and how much of that is not available for purchase at all?
Please note, I'm not claiming that justifies Facebook's actions, but this wasn't theft (it was co
Re: (Score:2)
In the case of Facebook (whom I loathe) no revenue was lost through their actions.
If Facebook had purchased a copy (and perhaps paid a license fee from the authors) then they would have been fine to use the works for their AI. But since they torrented pirated copies, the authors were denied that revenue.
Re: (Score:2)
I doubt it.
Copyright holders have too much power in the US (Second to Japan) and the minute someone who knows their work was in the Z-library/libgen comes forward with a right to sue, every single person who has downloaded LLaMA or any other LLM from Facebook is going to be finding their LLM unusable. And I'm pretty damn sure OpenAI did the same thing.
Re:jail time (Score:4)
> And I'm pretty damn sure OpenAI did the same thing.
Too bad the whistleblower making this claim was *murdered*. I wonder who benefits...
Re: (Score:2)
Exactly this. If you're in a technology race with other companies to be the top dog in one of the only (perceived) new frontiers of business, it's way cheaper to ignore a few laws and pay some dinky fine than it is to lose the race and get left behind the competition.
Re: (Score:2)
This is why in corporate cases like this that I hope someone will take action in the EU.
Seems like over here is the only place which keeps Big Corp reasonably in check.
Re:jail time (Score:5, Interesting)
Re: jail time (Score:2)
Quod licet Iovi, non licet bovi.
Re:jail time (Score:5, Insightful)
Corporate jail time is certainly possible. Their operations can be halted for 90 days or whatever the term is.
Natural People are 100% vulnerable to jail time yet the Courts conclude that the Corporations have all of the rights of a Person and none of the liabilities (other than garnishing money).
We can't have /Citizens United/ and immortal immune psychopathic corporations.
Information wants to be free (Score:1)
Re: (Score:3)
Torrenting the works is a bit different from just "reading" and "remembering" the information and incorporating it into new creative works. As part of the torrent process they also uploaded and shared the raw files with other torrenters. So even if you give them a pass on how they use the data after having "read" it, torrenting raises the issue of sourcing the raw, copyrighted, pirated files to others. It is the aspect of illegally uploading copyrighted materials through the torrent process that is seen
Let's see (Score:3)
Well let's see how this is handled. If nothing results from this and Zuck isn't personally fined HUGE for this (or even better jailed) then that sends a clear message that piracy is an acceptable form of obtaining digital material. My gut feeling tells me a small fine (slap on the wrist) is coming for poor Zuck. Hoist the sails mate!
Re: (Score:3)
Well let's see how this is handled
Wanna take a guess?
If nothing results from this and Zuck isn't personally fined HUGE for this (or even better jailed)
He won't be.
then that sends a clear message that piracy is an acceptable form of obtaining digital material.
No, it sends the clear message that crime committed by large corps is acceptable. If you try it as the little guy you will be destroyed utterly. Remember what happened with the Sony rootkit?
But I suspect you already knew that.
Re: (Score:2)
"Remember what happened with the Sony rootkit?" - yep, that is when Sony products ceased to exist in my world.
Torrenting is not a crime. I think. (Score:2)
Re: (Score:2)
Well, it's both: it's torrenting (especially the DISTRIBUTING without permission part) of works that have copyright (that's a protected monopoly on distributing the works).
Re: (Score:2)
There are plenty of works that have licenses which allow distribution, regardless of the distribution method.
Re: (Score:2)
The phrasing here, top to bottom, seems to imply that the act of torrenting is illegal in itself, when in fact it's the content of what they torrented that makes the act illegal.
Not sure downloading copyrighted content from a torrent is even illegal. It's distributing it that is, I think.
I know for sure that companies scanning torrent traffic for the movie industry don't file any complaint against somebody until he has the complete torrent and is seeding it, thus distributing it.
Re: (Score:2)
yeah this. I hate fuckerbook as much as the next sane individual but at least where I'm from downloading isn't illegal. It's easy enough to configure your torrent client to not seed or serve anything to peers.
What's interesting is unlike, say, normal consumer movie piracy, this content isn't even being consumed in the usual sense. No one read these downloaded books. It's still a legal grey area as to whether training on such content is a copyright violation.
Information is public: source, knowledge, shared (Score:1)
Copyright is theft of the commons. AI shall be RobInfoHood.
In 1787, James Madison submitted a provision to the Framers of the U.S. Constitution to "secure to literary authors their copyrights for a limited time."
In 1790, U.S. copyright law granted authors a monopoly right over their creations for 14 years, with the option of renewing that monopoly for another 14.
Article I, section 8: "promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right
LOL "The magnitude of Meta's unlawful torrenting" (Score:3)
Double digit TBs?! Just relatively recently Google pulled the plug on the unlimited Gsuite plans, where people were having multiple PBs. Yes, PBs, like about probably all the (video) streaming content from all services and all vaguely popular BDs and DVDs ever ripped and put on p2p. Of course, all music one could find and shadow libraries are a rounding error here.
Just for kicks look for the guy using over one PB on Amazon Cloud Drive (rest in peace) all the way back in 2017.
Re: (Score:2)
There is a HUGE difference between the number of books per unit of storage space compared to the number of movies per unit of storage space. It is NOT about the number of bits that were torrented, rather it is about the number of individual works that were torrented.
Re: (Score:2)
Yes, as mentioned the books are just a rounding error TB-wise but anyone can torrent more books than a huge college library.
Re: (Score:2)
I think they mean a corporation using a torrent that large for infringement purposes. It probably happens all the time but this is the first one to get caught.
eyeroll (Score:2)
And now we have court systems in tiny countries ordering the big internet companies to make worldwide changes or face trilllyyuuuuns of dollars of fines.
Let's just sue and prosecute everyone, for everything.
Metallica (Score:3)
I just hope that there's a Metallica book in there, and Lars loses his shit like he did last time something from Metallica was pirated.
Well, hopefully they didn't leech... (Score:2)
If they made sure that their upload to download ratio was at least 1:1, then they are good... /s
Until copyright is fair, torrent on! (Score:4, Insightful)
Re:Until copyright is fair, torrent on! (Score:5, Insightful)
As an author you could easily be out-lawyered for 5 years by a large company. No, the fair length of copyright is 14 years, with the option of renewing for an additional 14 years, as established by the Copyright Act of 1790.
That gives the creator ample time to make money from their creation without publishers or Hollywood studios using delaying tactics on authors to wait until the copyright expires, and then using their work without paying a dime.
Disney's 95 year act (thanks to Sonny Bono) needs to be repealed. But there is at least poetic justice in the fact that Disney's efforts at perpetual copyright have led in no small part to the complete creative bankruptcy of the Star Wars and Marvel franchises under the Disney umbrella. At least we can enjoy the schadenfreude of seeing Disney lose hundreds of millions of dollars each year on movies and TV shows that no one is watching.
Facebook LLama is OpenSource (Score:2)
Furthermore, here is the real crime: 1% of the well known authors get 99% of the money. That is to say: 99% of those book authors are probably glad the AI might spit out a reference to their obscure work.
Pirate Sites Are Just Digital Libraries. (Score:1)
And just like a real library, you check out copyrighted works and use them. If you want it in your collection you go to the store and BUY it.
There is nothing different between a pirate site and a public library. Anyone who says different is just a greedy control freak.
I've torrented from corporate servers and laptops. All with the blessing of the owners who understand reality.
I pay for my internet connection and I pay for the library through taxes in the town I live in. There is no difference.
Now shut the fu
Re: (Score:2)
Bull. The former is taking someone's work and not compensating them for it, assuming they are not giving away their own work. The latter is borrowing an item for a limited time and returning it for someone else to use, an item which has been paid for.
If you think everything should be free you go right ahead and do that for your work. The people who make a living off writing/music/movies/etc need to make money or they won't produ
Re: (Score:1)
I can't help that you can't comprehend logic. I never said everything should be free. You just want to argue about straw men. Get the fuck outta here with that shit.
Could possibly be Fair Use (Score:3)
I have no idea what they have in mind but there are a few things that could be considered fair use here including making an index of the papers in question and calculating secure hashes like SHA-256 for all of them. I would not consider ingesting all of them into some quasi-delusional LLM AI model to be fair use though, but that question has yet to be decided. TLDR copying things even in volume is not necessarily a copyright violation.
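For what the indexing-and-hashing idea could look like in practice, here is a minimal sketch. It assumes the files sit in a local directory; the function names `sha256_of_file` and `build_hash_index` are made up for illustration, and nothing here reflects what Meta actually did. It only records a path and a digest per file and never reads the contents for any other purpose.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large files never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_hash_index(root: Path) -> dict[str, str]:
    """Map each file path (relative to root) to its SHA-256 digest."""
    return {
        p.relative_to(root).as_posix(): sha256_of_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }
```

Calling `build_hash_index(Path("some_directory"))` returns a plain dictionary, so two collections can be compared for identical files by comparing digests alone.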
Don't think Fair Use applies to commercial uses (Score:4, Interesting)
, at least generally. Four points to it, cite below:
- Purpose of the use - Commercial v. educational or not for profit
- Nature of the work used - Technical documentation v. fictional novels
- Proportion of the work used - Five lines from a sonnet differs from five sentences from LOTR
- Effect of the use on the commercial marketability of the work - Probably negligible in most cases
IANAL, which is where these things end up, but Meta's arguments on "Purpose" and "Proportion" aren't readily apparent to me, even assuming they kept careful track of what they were hoovering.
--
https://www.copyright.gov/fair... [copyright.gov]
Re: (Score:3)
But torrenting is more than just using or making a copy for whatever use you have in mind. Doesn't it involve distributing the work as well?
Re: (Score:2)
That is a good point, so the question is did Meta have good reason to believe that the other people who were participating in the torrents were acquiring the data in a way that was illegal or violated copyright law? I imagine they may have, but the government would probably have to prove that to demonstrate that they were guilty of contributory copyright infringement or some other violation. Copyright holders like to make the case that Internet Service Providers are guilty of contributory copyright infrin
Re: (Score:2)
YES! Courts have held peer to peer download against defendants because they are helping distribute not merely copying for themselves; furthermore, there was zero profit being made from infringement. Meta is doing way more; but they are a corporation, so the key is to incorporate your whole family and make everybody an employee then get a corporate defense lawyer...when you lose, just bankrupt the corporation; nobody gets hurt.
Nothing to see, he already kissed the ring (Score:1)
Isn't that the point of all this digital data stuf (Score:2)
Strange ideas? (Score:2)
This is a very strange idea that is for some reason taken for granted?
It's also easier for the author to make and sell copies. If anything, durations should be lower because of how rapidly the author can distribute.
How about we go back to 7 years.