Reddit CEO Says Microsoft and Others Need To Pay To Search the Site (theverge.com)
After striking deals with Google and OpenAI, Reddit CEO Steve Huffman is calling on Microsoft and others to pay if they want to continue scraping the site's data. From a report: "Without these agreements, we don't have any say or knowledge of how our data is displayed and what it's used for, which has put us in a position now of blocking folks who haven't been willing to come to terms with how we'd like our data to be used or not used," Huffman said in an interview this week. He specifically named Microsoft, Anthropic, and Perplexity for refusing to negotiate, saying it has been "a real pain in the ass to block these companies."
Reddit has been escalating its fight against crawlers in recent months. At the beginning of July, its robots.txt file was updated to block web crawlers it doesn't have agreements with. Then people began noticing that Reddit results were only visible in Google results -- where Reddit is paid for its data to be shown -- and not other search engines like Bing. Huffman said that Microsoft has been using Reddit's data to train its AI and summarizing its content in Bing results "without telling us" and that Reddit's data has also been sold through the Bing API to other search engines.
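For readers unfamiliar with the mechanism: robots.txt is a plain-text file that tells crawlers which paths they may fetch, and compliance is voluntary on the crawler's side. The sketch below uses Python's standard urllib.robotparser to show how an allow-list of that kind is evaluated. The rules and the bot names (PartnerBot, OtherBot) are hypothetical and are not taken from Reddit's actual file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; NOT Reddit's real robots.txt.
# One named crawler is allowed everywhere, every other crawler is refused.
ROBOTS_TXT = """\
User-agent: PartnerBot
Allow: /

User-agent: *
Disallow: /
"""


def can_crawl(user_agent: str, url: str) -> bool:
    """Return True if the rules above permit `user_agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com is a placeholder host.
    for bot in ("PartnerBot", "OtherBot"):
        print(bot, can_crawl(bot, "https://example.com/r/python"))
```

Because the file is only a request, a crawler that ignores it has to be blocked by other means, which is presumably the effort Huffman is complaining about.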
Whose data? (Score:5, Insightful)
Isn't Reddit user generated content?
Re:Whose data? (Score:5, Interesting)
but it belongs to Reddit, not the users. Didn't you read the EULA? /s
After their API change last year I abandoned the site for the ActivityPub side of social media and ended up on Lemmy. It's nice, smaller and feels more like the forums of old, once you filter out all the problematic instances and develop a block list.
Re: (Score:2)
I did read their EULA. Apparently you didn't.
5. By submitting Your Content to the Services, you represent and warrant that you have all rights, power, and authority necessary to grant the rights to Your Content contained within these Terms. Because you alone are responsible for Your Content, you may expose yourself to liability if you post or share Content without all necessary rights.
You retain any ownership rights you have in Your Content, but you grant Reddit the following license to use that Content:
Whe
Re: (Score:2)
And how is that logically distinct from my snide comment?
Ownership of the content implies that the owner can do whatever they want with it... so do you own it or do they? The data, the physical representation of your idea in code and bytes, is stored on their server; they own it. This is just a long-winded way of Reddit saying "let's share it, but it's mine".
Re: (Score:2)
I can give anyone I want the same license. So if I tell Microsoft they have the right to use my content, they have no reason to pay Reddit for it.
Re: (Score:2)
Thinking you own the data you add to commercial sites is the same as thinking you own the recent games you bought or you own your Windows (or Chrome) PC.
Internet: welcome to the network of the free and the brave*.
*limits and mandatory content may apply, no guarantees are provided
Re:Whose data? (Score:4, Insightful)
Thinking you own the data you add to commercial sites is the same as thinking you own the recent games you bought or you own your Windows (or Chrome) PC.
Internet: welcome to the network of the free and the brave*.
*limits and mandatory content may apply, no guarantees are provided
I do think this goes in the bucket with all the swirling contradictions in the discussion about "Are these platforms merely publishers of user content, and therefore have no oversight liability? Or do they control the content and therefore are liable for that content?"
If my comments become commercial property of the platform, and the platform controls the how/when/if of my content being shown or not shown, stored or not stored, resold or not resold, aggregated into algorithm training sets or not, etc. etc.... if my comments don't remain mine, but become a commercial asset of a business, then that business should be legally responsible for its property.
Re: (Score:2)
I do think this goes in the bucket with all the swirling contradictions in the discussion about "Are these platforms merely publishers of user content, and therefore have no oversight liability? Or do they control the content and therefore are liable for that content?"
If my comments become commercial property of the platform, and the platform controls the how/when/if of my content being shown or not shown, stored or not stored, resold or not resold, aggregated into algorithm training sets or not, etc. etc.... if my comments don't remain mine, but become a commercial asset of a business, then that business should be legally responsible for its property.
willful misunderstanding to push your agenda.
You approached them and asked to use the service (aka you created an account). The agreement they presented was that you could use their service in return for a non-exclusive, perpetual, worldwide, irrevocable, royalty-free license to use what you post in whatever way they choose. You accepted the agreement.
You are responsible for what you write. You are the author. You own the copyright. You agreed to give them a license to use it. This is how they are usi
Re: Whose data? (Score:2)
That's fine, but a perpetual license does not transfer copyright, thus Reddit has no claim to bar other uses.
Re: (Score:2)
Correct.
According to the agreement, it is a "non-exclusive" license. This means that the copyright holder (the person who wrote the post) can license the content to others as they wish, but Reddit controls the copy on the Reddit servers and can grant or restrict access to that copy as they choose.
Re: (Score:2)
willful misunderstanding to push your agenda.
You approached them and asked to use the service (aka you created an account). The agreement they presented was that you could use their service in return for a non-exclusive, perpetual, worldwide, irrevocable, royalty-free license to use what you post in whatever way they choose. You accepted the agreement.
You are responsible for what you write. You are the author. You own the copyright. You agreed to give them a license to use it. This is how they are using it.
Thank you for the clarification. That makes sense to me. I appreciate the insight. Your explanation improves my understanding.
No thank you for the "willful misunderstanding to push your agenda" comment. You have zero knowledge of my internal state, nor my "will", nor my "agenda". It's a discussion forum; I made a comment discussing what I thought should happen. It's unfortunate when people with insight have an aggressively adversarial mentality. It dulls the effective communication of your insights.
Re: Whose data? (Score:2)
The only guarantee is your money will be provided.
Re:Whose data? (Score:5, Interesting)
Reddit users agreed to forfeit the right to their own content and build Reddit's value for free in exchange for not paying a few dollars a month for the service. It's a choice they made. They can't complain now: it's Reddit's content to monetize as they please.
But here's a reminder: it may be their content, but the original authors still have edit rights to it, even years down the line. If you want to get back at Reddit, edit your content and stuff it full of nonsense and untruths. You'll lower the value of Reddit and you'll pollute whichever AI trains on your content at the same time. Hint hint...
Me, I deleted 80% of all my posts and what I left is utter but subtly wrong nonsense.
Re:Whose data? (Score:5, Informative)
Small raindrops make big rivers my friend.
I may be Mister Nobody and whatever I wrote on Reddit over the years may not amount to much, as you're so keen to remind me. But if everybody does what I did, AI will become ever-so-slightly less capable of displacing a human's job, and that job might be yours.
Re: (Score:2)
"No single raindrop thinks it is responsible for the flood."
Re: (Score:3)
Me, I deleted 80% of all my posts and what I left is utter but subtly wrong nonsense.
Might want to check they're still deleted. I seem to recall reading that they were restoring deleted posts?
Re: (Score:2)
Get a script that edits your posts with lorem ipsum or something, and another that deletes them. Execute them a week apart.
Reddit can't afford to manually review every restored post.
Re: (Score:2)
Reddit can't afford to manually review every restored post.
But they can detect "suspicious activity", block your API usage, and revert your content to the last "clean" version from before the "hacking" began. And then change your password for your own protection, and refuse to restore it due to you failing to provide proof you're the owner, since all signals show you're the hacker who hacked the account, not the legitimate account owner. So now your only resource is a small claims court, at which point it's discovered you are indeed the owner. And then their TOS ap
Re: (Score:2)
That's a really long fantasy.
Reality: it works until so many people do it they notice (unlikely), and then they attempt a scripted detect-and-recovery but they've cut too much staff for that. Then they half ass it and maybe ban some accounts (which hurts you how?).
Nobody's going to court or arbitration over it, you'd have to be crazy to think that worth your time.
Re: (Score:2)
Of course it's fantasy. I'm highlighting the fact there's no winning move. Most anything a prole may think of doing, a lawyer thought of before and prevented via TOS. At most people may cause a tiny bit of bad PR, as happened a few months ago, but that lasts a few weeks and then no one cares anymore.
Re: (Score:2)
There's no reason why they wouldn't store every version of every post you've ever written. If it starts to become a problem it will float up towards the top of a report and be noticed. They could store diffs to save space, or compressed copies, and then basically anything less than random bytes will be a non-problem. But even if all they do is rotate the oldest copies out to a warehouse they could probably keep anything you'll upload forever.
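As a rough illustration of the diff idea in this comment (a sketch only; nothing here reflects how Reddit actually stores edits), the snippet below keeps the original text in full and records just the changed word spans for a later version, using Python's standard difflib. The function names make_delta and apply_delta are made up for the example.

```python
from difflib import SequenceMatcher


def make_delta(old, new):
    """Record only the spans of `new` that differ from `old`.

    Each entry is (start, end, replacement): replace old[start:end]
    with `replacement` to move from the old version to the new one.
    """
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    return [
        (i1, i2, new[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def apply_delta(old, delta):
    """Rebuild the newer version from the old version plus its delta."""
    rebuilt, pos = [], 0
    for i1, i2, replacement in delta:
        rebuilt.extend(old[pos:i1])   # unchanged words before the edit
        rebuilt.extend(replacement)   # the edited words
        pos = i2                      # skip the words that were replaced
    rebuilt.extend(old[pos:])         # unchanged tail
    return rebuilt


# Made-up post text, split into words so the delta works at word level.
original = "the quick brown fox jumps over the lazy dog".split()
edited = "the quick brown fox trips over the lazy dog".split()
delta = make_delta(original, edited)

if __name__ == "__main__":
    print(delta)
    print(" ".join(apply_delta(original, delta)))
```

A one-word edit costs one small tuple rather than a second full copy, which is the space argument being made; keeping every version available this way is a storage decision, not evidence that any site does it.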
Re: (Score:2, Interesting)
But here's a reminder: it may be their content, but the original authors still have edit rights to it, even years down the line. If you want to get back at Reddit, edit your content and stuff it full of nonsense and untruths. You'll lower the value of Reddit and you'll pollute whichever AI trains on your content at the same time. Hint hint...
Have you double checked that from a different IP while logged out?
After retroactively changing their user agreement, tens of thousands of us that deleted our post history were suspended/ghost banned, and our content restored.
Some that just happened to also be a moderator were banned over this.
I deleted my history, got suspended, and two days later my posts were restored for others, and yet at home none of my posts showed up when scrolling through my subs.
I could see them from work however.
Reddit users agreed to forfeit the right to their own content and build Reddit's value for free in exchange for not paying a few dollars a month for the service. It's a choice they made.
Sorry for quoting
Re: (Score:2)
Have you double checked that from a different IP while logged out?
Yeah and the content appears to be gone. Of course, it probably isn't truly gone. I'm not deluding myself.
However, the stuff I kept and polluted on purpose, I do believe it truly was changed. The alternative would be Reddit doubting its own users and archiving every version of every post ever posted, and at some point somehow determining that a user has gone rogue and quietly starting to flag changes as unreliable, and selling only older versions of flagged posts to AI companies.
I doubt Reddit has the resou
Re: (Score:2)
This is easily solved by legislating that users cannot in fact give up certain rights over their own work. Boom, no more problem with businesses that are providing a forum but not the actual content on that forum claiming excessive rights to everything users put there. The EU already took some big steps in this direction and those laws were aimed squarely at the big social media sites.
Re: (Score:3)
Says who? Not Reddit that is for sure.
This is direct from their EULA
You retain any ownership rights you have in Your Content, but you grant Reddit the following license to use that Content:
Re: (Score:3)
"Reddit users agreed to forfeit the right to their own content and build Reddit's value for free in exchange for not paying a few dollars a month for the service."
Can we stop making bullshit claims like that?
You do not "forfeit the right to their own content"; you give Reddit a perpetual license for your content, but you retain full ownership and copyright of all your posts. The EULA is very clear on that.
Re:Whose data? (Score:5, Insightful)
it's reddit content when the topic is profit, it's user-generated content when the topic is liability
Re:Whose data? (Score:4, Insightful)
Not per their EULA, they only have a perpetual license, and not an exclusive one at that.
Re: (Score:2)
User 'submitted' content. What would happen if someone leaked, say, the source to MS Office on reddit, then bing slurps it up, and you can then find it through a bing 'cached' link.
Obviously that'd be a large cache, but in the old days people used to split files so people could download them in floppy-disk sized chunks, before the ISP cut you off at the 2-hour limit.
If Reddit gets to sell access to the content that users submit, what's in it for the users?
Good (Score:3)
The sooner we scrape reddit off the face of the earth the better society will be. Let it die on the worst search engine.
Re: (Score:3, Insightful)
This is fine. Google hasn't been my main search engine for most of a decade. Reddit continues to digg its way into irrelevance by making itself more appealing to investors at the expense of users and the rest of the Internet.
Re: (Score:2)
Yep. I've stopped using Google because there are better search engines available (read: which don't index Reddit).
If I want amateur porn, factually incorrect political invective, and technically incorrect conjecture on other topics, I'll just go to an extended family reunion.
Hey! (Score:3, Insightful)
"Only we can resell the content we grifted from our users!"
"Free information". "Royalty Free Innovation." "Profits". It's a beautiful day when Silicon Valley's favorite things all collide into an unsustainable mess.
Fuck everyone in this story.
This flies in the face of the very internet (Score:5, Insightful)
One of the core principles of the web is that linking is free.
When your business is so desperate that you ask money for linking to your content, it's undeniably proof that your business model is utterly broken.
Also, if Google agrees to pay up and set a precedent, other platforms will inevitably try to get in on the action. And Google may very well want to do that in fact, because then they'll be the only ones with enough money to index other sites, pushing other, smaller search engines out of the market and entrenching their monopoly forever.
TL;DR: fuck Spez and fuck Google. In fact, fuck all tech bros who are destroying the hopeful future I was promised as a kid in the 70s. This dystopian shit is beyond depressing.
Re:This flies in the face of the very internet (Score:4, Interesting)
Top Glassdoor reviews for Reddit:
- "The leadership has no strategic vision and hopes to sit tight and cash out." (in 9 reviews)
- "Bad management" (in 8 reviews)
Re: (Score:2)
If Joe Guy decides to post a link to a reddit page there is STILL nothing reddit can do about it - sort of - reddit can still block requests generated from that referrer URL.
Re: (Score:2)
Yes, and that core principle means that how data is displayed and what it's used for is up to the consumer. Consider the quote:
"Without these agreements, we don't have any say or knowledge of how our data is displayed and what it's used for..."
Yes, just as the web intends. And there is an "agreement": when a user requests data and you provide it, you agree that it be used as the web intends.
"...which has put us in a position now of blocking folks who haven't been willing to come to terms with how we'd lik
Re: (Score:2)
> fuck all tech bros who are destroying the hopeful future I was promised as a kid in the 70s. This dystopian shit is beyond depressing.
Wake up and smell the ashes.
Re: (Score:2)
Sure, but that has nothing to do with the case at hand. Reddit is asking to be paid for the privilege of crawling the site. If you instead manually link to the site, that's presumably still fine.
Come and scrape Lemmy instead (Score:2)
Google you idiots! (Score:2)
We will end up with a situation where you have to use multiple search engines to get a different selection of indexed sites.
Re: Google you idiots! (Score:2)
You have to pay to search the internet in the future
Re:Google you idiots! (Score:4, Insightful)
I suspect Google views it as a zero sum game. Agreeing to this causes both expenses AND revenues for Google, but supporting it makes competing with Google harder. No one should consider Google the good guy, ever.
Irrelevant (Score:2)
I only stumbled upon Reddit content via search, so if they want to give up that tiny trickle of revenue I doubt I or they will care. Adios reddit, I barely knew you and I'm OK with that.
Reddit should pay Google (Score:2)
Re: (Score:2)
Everyone running Reddit knows their days are numbered. They're just stalling on the inevitable so the right shareholders have time to cash out.
The majority of popular posts are writing exercises, trolling, account farming, post farming, etc., and it's been that way for a while.
With that track record they know they're not going to survive in the era of chatgpt and the only remaining thing of value after that is a large corpus of pre-LLM text that they can sell off.
Or they could sell it off except it's on the pu
Re: (Score:2)
This is about forbidding Google from indexing the site in the first place. Not indexed? No search results to be referred from.
Really shows how bad Google has gotten (Score:3)
Re: (Score:2)
I don't follow. What exactly shows how "bad" Google has gotten? I'm having trouble connecting Reddit asking for money to Google's search quality.
Re: (Score:1)
I don't follow. What exactly shows how "bad" Google has gotten? I'm having trouble connecting Reddit asking for money to Google's search quality.
I think they're referencing several articles I've seen recently that say things like "Google searches have gotten so bad, that users are adding 'Reddit' to the end of their search queries in order to get useful results". So, Google searches get so bad that people start using their search engine to search Reddit, now Google has the monopoly on using a search engine that way (to fix the problem they created).
Re: (Score:2)
If you're right, I still don't follow. If you can get better Reddit search results by using Google to search Reddit, how is that an indicator of Google being "so bad"? That seems to indicate that Google search is at least better than Reddit's own search!
My personal experience with Google is that I get the result I want as #1 or #2 in the search results, and usually #1. Maybe I just know how to write a search query, I don't know.
The internet is dead. (Score:2)
Pay what it is worth (Score:2)
I'm sure that Google and OpenAI can afford to pay what reddit content is worth.... gotta be, what, 10 cents per day?
Can we create a P2P/Blockchain Forum ? (Score:1)
I mean, this is only a forum system boosted with steroids and bullshit, so really, why has nobody already created a clone using P2P or blockchain to decentralize control, prevent discrimination like this, and, most important, render the current Reddit useless and worthless?
Like they need to be taken down, what they are doing is toxic and in my opinion, this sounds like an incentive for Microsoft and others to finance and fund any non-profit that would do exactly that, and take Reddit off the
Re: (Score:2)
It is trivial to start a new Internet forum. Even a scalable one.
Now try and make it popular, when there is an established player in the market and the value to users of such sites is the number of existing active users and the daily post count. That is far from trivial and I honestly believe it needs more luck than you can plan for.
But 'adding blockchain' is just nonsense.
Re: (Score:1)
That's the typical speech from people profiting from Player A, who do not want other players to exist. Not that I'm saying that is your case, but that is defeatism.
Most of today's platforms were born from an event, like a breaking point in the needs of the internet, and I think that we have reached this point. Let Reddit be the very cause of its own slow and agonizing downfall over the next decade. Who cares if it takes 10 years from now; let's just get to it, so that one day we can be done.
We need proper freedom of speech,
All your data ... (Score:2)
Reddit CEO Says Microsoft and Others Need To Pay To Search the Site
All your data are belong to the AI bros.
Block 'em all (Score:2)
Why yes, I'd rather never see Reddit, Pinterest, LinkedIn, and Quora on my search results.
Many people don't use Google search (Score:2)
Most people want to be indexed to generate traffic, and the higher the better. Hope the AI deal makes up for all the traffic they won't be getting. Somehow I don't see how that is going to work out like they think.
TL;DR: fuck Spez and fuck Google (Score:2)
--
Rosco P. Coltrane [slashdot.org]: “One of the core principles of the web is that linking is free.”
“When your business is so desperate that you ask money for linking to your content, it's undeniably proof that your business model is utterly broken.”
“Also, if Google agrees to pay up and set a precedent, other platforms will inevitably try to get in on the action. And Google may very well want to do that in fact, because then they'll be the only ones with enough m
Per Reddit's own EULA (Score:4, Insightful)
Looking at Reddit's EULA, users retain ownership of the content; Reddit merely has a license. Reddit cannot bar someone else from using your creations; only the copyright owner has that standing.
5. By submitting Your Content to the Services, you represent and warrant that you have all rights, power, and authority necessary to grant the rights to Your Content contained within these Terms. Because you alone are responsible for Your Content, you may expose yourself to liability if you post or share Content without all necessary rights.
You retain any ownership rights you have in Your Content, but you grant Reddit the following license to use that Content:
When Your Content is created with or submitted to the Services, you grant us a worldwide, royalty-free, perpetual, irrevocable, non-exclusive, transferable, and sublicensable license to use, copy, modify, adapt, prepare derivative works of, distribute, store, perform, and display Your Content and any name, username, voice, or likeness provided in connection with Your Content in all media formats and channels now known or later developed anywhere in the world. This license includes the right for us to make Your Content available for syndication, broadcast, distribution, or publication by other companies, organizations, or individuals who partner with Reddit. You also agree that we may remove metadata associated with Your Content, and you irrevocably waive any claims and assertions of moral rights or attribution with respect to Your Content.
What about /. ? (Score:2)
This made me think: at which level do AIs browse Slashdot, and how far back? Are they going to be traumatized by the less wholesome comments?
I can see where Reddit is coming from (Score:2)
If they allow Bing to crawl Reddit for its search engine, there is nothing to stop Microsoft also feeding that data into their AI and making AI profits off it without giving Reddit any of that money.