Reddit Wants To Get Paid for Helping To Teach Big AI Systems (nytimes.com) 46
Reddit has long been a forum for discussion on a huge variety of topics, and companies like Google and OpenAI have been using it in their A.I. projects. From a report: Reddit has long been a hot spot for conversation on the internet. About 57 million people visit the site every day to chat about topics as varied as makeup, video games and pointers for power washing driveways. In recent years, Reddit's array of chats also have been a free teaching aid for companies like Google, OpenAI and Microsoft. Those companies are using Reddit's conversations in the development of giant artificial intelligence systems that many in Silicon Valley think are on their way to becoming the tech industry's next big thing. Now Reddit wants to be paid for it.
The company said on Tuesday that it planned to begin charging companies for access to its application programming interface, or A.P.I., the method through which outside entities can download and process the social network's vast selection of person-to-person conversations. "The Reddit corpus of data is really valuable," Steve Huffman, founder and chief executive of Reddit, said in an interview. "But we don't need to give all of that value to some of the largest companies in the world for free." The move marks one of the first significant examples of a social network's charging for access to the conversations it hosts for the purpose of developing A.I. systems like ChatGPT, OpenAI's popular program. Those new A.I. systems could one day lead to big businesses, but they aren't likely to help companies like Reddit very much. In fact, they could be used to create competitors -- automated duplicates to Reddit's conversations.
Reddit Problems (Score:5, Insightful)
Reddit needs to fix their moderator problems before worrying about how their data is being used by OpenAI. Many of their biggest communities are moderated by children or adults with a childish attitude. Reddit needs to pay their moderators and vet them accordingly, with a transparency system that allows average users to see actions taken and by which moderator.
Re:Reddit Problems (Score:5, Informative)
Re: (Score:3)
I can't stand to go there anymore. I got banned from my hometown subreddit for who knows what, something "bad". Fuck that shit. Total lack of transparency.
Re: (Score:1)
What you complain about is a different issue.
It may be substantial, but it has nothing to do with whether they should allow free access to the content.
Re: (Score:3)
"protect" from "predators"? (Score:2, Insightful)
Re: (Score:3, Informative)
Except Slashdot moderators don't have the power to perma-ban you from the site because they are butt hurt about something you said about SBFs in r/racing and the mod is a Ford guy. It's totally reasonable he should be able to prevent you from posting about network design in r/firewall!
Reddit is a terrible site; it's a garbage design, with garbage for rules.
Re: (Score:2)
Reddit needs to fix their moderator problems before worrying about how their data is being used by OpenAI.
Why "before"?
I mean, I get that they need to fix their moderator problem. But your claim is that this should happen BEFORE they worry about how their data is used for AI-training, and I don't see why one should come before the other?
Re: (Score:2)
Sorry what? A free-to-post-on forum wants to be paid, while not owning the content on it? Sounds like Reddit wants to be paid for the same reason OpenAI wants to be paid.
Let's just put all the cards on the table. If Reddit isn't paying its contributors and moderators, then it doesn't have a leg to stand on to ask GPT developers.
It's somewhat more credible to have AI developers pay Wikipedia, because there is a legal standing there that all content on Wikipedia is public domain, and paying Wikipedia keeps W
Re: (Score:2)
Sorry what? A free-to-post-on forum wants to be paid, while not owning the content on it?
I am pretty sure they OWN content on it. I didn't read their regulations but expect to find there a clause giving them all rights to posted content.
AI Reddit Mod Turing Test (Score:5, Funny)
AI > Hello. Your comment has been removed and you have been banned from Reddit for insulting someone in my group.
Person > I replied to someone insulting me. They have not been banned and their insult is still visible.
AI > No insults, you're banned.
Person > Can I appeal?
AI > You can fill in a form that goes to
Well done AI Reddit Mod, you passed the Turing Test. You are indistinguishable from human Reddit Mods.
Re: (Score:3)
Exactly... What a shithole.
Re: (Score:2)
Re: (Score:1)
They are not going to pay, in line with their ToS...
Re: (Score:2)
Do their terms of service include a clause that I have to pay them for information I get off their website too? If not, why should they get paid?
If it's simply based on fairness, then why shouldn't the contributors get paid?
Re: (Score:2)
Re: (Score:1)
see my answer above: their ToS prohibit commercial use.
(Which is pretty common, isn't it? Do you complain about any non-commercial clause, btw?)
Re: (Score:1)
Two different issues mixed together:
1. From their ToS:
"...reddit is designed and supported for personal use only. (...)"
So you must ask Reddit for their permission for any non-personal use - and they made it clear they will require financial compensation to agree.
2. paid contributors:
Their current business model probably does not count on any income derived from content.
In the future, maybe they will share the income with their users who create the (now income-generating) content. Until then - if you don
Oh hell no (Score:5, Insightful)
Re: (Score:2)
Another source (Score:2)
Re: (Score:1)
Abuse of the TEMPORARY copyright PRIVILEGE we give them.
Plagiarizing Chatbots will get their comeuppance (Score:1)
ChatGPT is no more AI than the chatbot pretending to offer customer service on Amazon. The only difference is that ChatGPT shamelessly steals content without attribution or payment to creators. I for one am looking forward to the class action lawsuits.
Re: (Score:2)
The only difference is that ChatGPT shamelessly steals content without attribution or payment to creators
Should fit right in on a pro-piracy site.
Steals? (Score:2)
Unless it's a specific kind of software, then you have to pay.
Really?
Product liability (Score:2)
I'm thinking of not just accuracy and factual correctness, but being free of bias or libellous statements. It seems to me that AIs from rich and large organisations will become soft targets for the more predatory kind of lawyer. And if someone could demonstrate that the source of a contentious statement was sold to an AI trainer, that trainer would be more than willing to at least try to pass
Class action lawsuits (Score:2)
I'm waiting for artists or some big monied IP holder to start a class action suit against a lot of the generative AI platforms.
So where's our money? (Score:2)
If Reddit can get in on this action, what about Slashdot? I think the conversations here would be at least equally useful in training AI. For that matter, there are probably lots of other Web forums that LLMs would benefit from. Why single out Reddit?
Reddit is Garbage (Score:1)
What would Reddit teach? (Score:2)
the arrogance of toilets (Score:3)
This is like a toilet charging you for its clogging and overflowing. Years ago my long-term account was banned for unsupportable reasons, and I never went back. We know Reddit, like Twitter, was linked to government political activism. Choosing to train AIs on Reddit material introduces avoidable biases in the bots.
We need to shift to properly curated source material, maybe books that have been around and accepted for a long time.
Although that could lead to problems too:
Me: "ChatGPT, recommend a healthy dinner menu for me."
Bot: "Green eggs and ham!"
Me: No.
Bot: "Diet of Worms!"
Me: lays head down on desk.
Great. Remove Reddit from the data (Score:1)
If they based ChatGPT on SO and the official documentation only, I'd be far more willing to use it. /r/count already caused weird glitch tokens. If it's dependent on Reddit so much, then no wonder ChatGPT is such a self-confident bullshitter.
Reddit has some occasional diamond posts. But the upvote/downvote metric makes the site into a puerile popularity contest that swamps and overlooks anything that would add actual value to a machine learning data set. Extracting and validating the rare example of usef
Then don't be free. (Score:2)