OpenAI Suspends ByteDance's Account After It Used GPT To Train Its Own AI Model (theverge.com)
TikTok's parent company, ByteDance, has been secretly using OpenAI's technology to develop its own competing large language model (LLM). "This practice is generally considered a faux pas in the AI world," writes The Verge's Alex Heath. "It's also in direct violation of OpenAI's terms of service, which state that its model output can't be used 'to develop any artificial intelligence models that compete with our products and services.'" From the report: Nevertheless, internal ByteDance documents shared with me confirm that the OpenAI API has been relied on to develop its foundational LLM, codenamed Project Seed, during nearly every phase of development, including for training and evaluating the model.
Employees involved are well aware of the implications; I've seen conversations on Lark, ByteDance's internal communication platform for employees, about how to "whitewash" the evidence through "data desensitization." The misuse is so rampant that Project Seed employees regularly hit their max allowance for API access. Most of the company's GPT usage has been done through Microsoft's Azure program, which has the same policy as OpenAI.
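To make the reported practice concrete: "using the API for training" typically means harvesting completions as supervised fine-tuning data for another model, a technique usually called distillation. Below is a minimal sketch of that generic pattern in Python, assuming the current openai SDK (v1+); the prompts, model name, and output file are hypothetical, and nothing here reflects ByteDance's actual, unpublished pipeline:

    import json
    from openai import OpenAI  # official OpenAI Python SDK (v1+)

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    # Hypothetical seed prompts; a real distillation pipeline would use millions.
    prompts = [
        "Explain quicksort in two sentences.",
        "Summarize the plot of Hamlet.",
    ]

    # Collect (prompt, completion) pairs as JSONL -- the kind of synthetic
    # supervised data that, fed into a competing model, violates the terms.
    with open("distilled_pairs.jsonl", "w") as f:
        for p in prompts:
            resp = client.chat.completions.create(
                model="gpt-4",
                messages=[{"role": "user", "content": p}],
            )
            f.write(json.dumps({
                "prompt": p,
                "completion": resp.choices[0].message.content,
            }) + "\n")

API rate limits on endpoints like this are enforced per account, which is consistent with the report that Project Seed employees "regularly hit their max allowance."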
In response, OpenAI said that it has suspended ByteDance's account: "All API customers must adhere to our usage policies to ensure that our technology is used for good. While ByteDance's use of our API was minimal, we have suspended their account while we further investigate. If we discover that their usage doesn't follow these policies, we will ask them to make necessary changes or terminate their account."
AI is training other AI Bots? (Score:1)
Re: (Score:3)
Don't worry too much. "AI training other AI" can mean a lot of different things, none of which will result in a terminator situation.
The first thing most people think of, and what the summary seems to imply, is using the output of one model to train another. This will always result in a lower-quality model thanks to a well-known phenomenon recently termed 'model collapse'. An easy way to think about this is that because models will necessarily have some error (they won't perfectly model whatever it is they're trained on), each generation trained on the previous one's output inherits and compounds that error.
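A toy simulation makes the effect easy to see. In this sketch (plain NumPy; the Gaussian fit is a hypothetical stand-in for an LLM, not anyone's real training setup), each generation is "trained" only on samples drawn from the previous generation's fitted model:

    import numpy as np

    rng = np.random.default_rng(42)
    n = 50                                # small per-generation dataset
    data = rng.normal(0.0, 1.0, size=n)  # generation 0: "real" data

    for gen in range(1, 31):
        # "Train" on the current data by fitting a Gaussian.
        mu, sigma = data.mean(), data.std()
        # The next generation trains only on this model's output.
        data = rng.normal(mu, sigma, size=n)
        if gen % 5 == 0:
            print(f"gen {gen:2d}: mu={mu:+.3f}  sigma={sigma:.3f}")

Run it and, on most seeds, sigma drifts below 1 and keeps shrinking: the fitted distribution loses its tails a little more each generation, which is the same qualitative degradation "model collapse" describes for LLMs.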
Re: AI is training other AI Bots? (Score:1)
Bots training bots: human centipede for AI
Re: (Score:2)
Wait, massive bubble in the stock market making some of the largest blue chip stocks look like meme stonks, and you're telling me AI is like copying a cassette / VHS?
Not feeling any sympathy here (Score:5, Insightful)
I mean, you went around using other people's data to train your LLM without their consent. Your only defense has been arguing over whether or not it counts as copyright infringement. If you think you're safe from that same behavior just because you beat people to the punch, before anyone wrote laws explicitly saying no, think again. What's good for the goose is good for the gander; people are now doing to you exactly what you did.
They're using /your/ data to train their LLM. Fun game, isn't it?
I don't have a position either way on what counts as ethical training data for LLMs; I don't know enough about the impact, or how it affects artists and other users' data and content. But I certainly don't have any sympathy if you're upset that people are doing the same thing you did.
Re: Not feeling any sympathy here (Score:3)
"quote, quote, quote" Nuh uh, you're a dummy
You offered zero with that reply.
Heh-heh-heh. Hilarious! (Score:2)
Doesn't feel great when someone robs your shit to seed their AI model, huh?
Re: (Score:1)
I think they should be more concerned that Elon Musk swiped the source code to their entire chat application.
It's funny how easily people use the terms "steal" and "rob" in this context when /.'ers get their panties in a wad over copyright violations not being theft. They couldn't have "robbed" their shit since they still have it!
Re: (Score:1)
Yeah, well, it's a false equivalence, because back when that argument first surfaced (over the illegal copying of digitally compressed versions of commercially licensed music CDs), nobody was using those bootlegged copies to feed a giant AI that then turns around and competes commercially, on equal footing, with the original works. People were largely just exercising what the prior era of cassette tapes had established in law as "fair use": keeping rebundled copies for personal archival use.
Grok? (Score:2)
What about Grok, which comes right out and claims to be ChatGPT?
What? Unauthorized use of data at an AI startup? (Score:5, Funny)
Say it ain't so!
Faux pas? (Score:2)
"This practice is generally considered a faux pas in the AI world,"
To be a faux pas, though, whatever causes the embarrassment or insult has to be a mistake.
Chinese AI had US AI as nanny ... (Score:3)
Do they get to be friends later in life?
In other words, OpenAI stealing data is ok (Score:1)
I love how it's totally cool for OpenAI to use unlicensed data from the internet and not credit open source projects to build their own model. As soon as someone else does that to them, it's not right. What a load of hypocritical shit.
I contribute to open source and I don't mind OpenAI using it to train models. But just like a human, you better damn well cite and give credit to the projects.
China's innovation (Score:2)
Re: (Score:1)
It didn't work for Europe when they tried it against the US, and it won't work for the US trying it with China. Nor will it work for China when they inevitably try it against some other country.
Oh, the irony! (Score:2)
Silicon Valley thieves just ran into the true masters of the concept that "if it's not nailed down and protected by armed guards, it's ours. And sometimes even then."
If we discover that their usage doesn't follow (Score:2)
Fake news! (Score:2)
A company breaking terms and conditions, just so that it can make money? Ridiculous!
"Like a dog returning to it's vomet ..." (Score:2)
Oh look (Score:1)
Shithole company from shithole country doing shithole things. And not just shithole things, completely stupid things. Any model trained on GPT output is inherently extra stupid. Anyone involved at ByteDance should feel ashamed of their stupidity, from whoever had the idea all the way to everyone falling in line to work on it. The downside is that having this level of stupidity all around means no one there is capable of feeling ashamed. They're too stupid to even recognize that they are.