OpenAI Claims DeepSeek Distilled US Models To Gain an Edge (bloomberg.com) 59
An anonymous reader shares a report: OpenAI has warned US lawmakers that its Chinese rival DeepSeek is using unfair and increasingly sophisticated methods to extract results from leading US AI models to train the next generation of its breakthrough R1 chatbot, according to a memo reviewed by Bloomberg News.
In the memo, sent Thursday to the House Select Committee on China, OpenAI said that DeepSeek had used so-called distillation techniques as part of "ongoing efforts to free-ride on the capabilities developed by OpenAI and other US frontier labs." The company said it had detected "new, obfuscated methods" designed to evade OpenAI's defenses against misuse of its models' output.
OpenAI began privately raising concerns about the practice shortly after the R1 model's release last year, when it opened a probe with partner Microsoft Corp. into whether DeepSeek had obtained its data in an unauthorized manner, Bloomberg previously reported. In distillation, one AI model relies on the output of another for training purposes to develop similar capabilities.
Distillation, largely tied to China and occasionally Russia, has persisted and become more sophisticated despite attempts to crack down on users who violate OpenAI's terms of service, the company said in its memo, citing activity it has observed on its platform.
Cry me a fucking river (Score:3, Informative)
Companies worth ridiculous sums of money are complaining about unfair business. Hire some more competent people then.
Re: (Score:3, Insightful)
The same people that hoovered up the world's information without compensation.
Re: (Score:2)
> Companies worth ridiculous sums of money are complaining about unfair business. Hire some more competent people then.
They did.
It just so happened to be their own AI bot.
YMMV.
Re: (Score:3)
This is an admission that they can't compete, nothing more. They need to get the competition banned or they are screwed.
Re: (Score:2)
They screwed themselves either way. They have overpromised for years.
Unfair? (Score:5, Insightful)
What the fuck is unfair? If you don't like how they are using your service, stop them. Cut them off, feed them garbage, get clever and solve the problem.
What are lawmakers going to do about it? Order President Trump to nuke China? More tariffs? Sanctions on Russia?
Re:Unfair? (Score:5, Interesting)
It's also continually ironic that the companies that used the world's knowledge without compensation to train their models are complaining when another company uses their models to train models (though they almost certainly are paying for the privilege).
Re: (Score:3)
Then use more sophisticated methods.
If only OpenAI had access to a giant computer brain that could analyze this problem and offer some solutions...
Re: Unfair? (Score:5, Interesting)
The big AI labs are also using sophisticated methods to avoid detection. This is one of the big drivers behind web-scraping botnets that run on residential IP addresses in crapware and outright malware. Which in turn is why so many small websites are opting into DDoS protection services. Which in turn is why everyone is complaining about CAPTCHA interruptions everywhere they browse.
Re: (Score:3)
Reminds me of the times when my poor webserver was trying to cut OpenAI off, and OpenAI was using increasingly sophisticated methods to evade detection.
Re:Unfair? (Score:4, Interesting)
It's interesting, because the Chinese have a habit of this kind of reverse engineering. They're very good at it, and you often hear the West (note, I am American, so definitely a Westerner) decry Chinese practices as unfair. From their perspective, what's unfair is how China was more than 50% of global GDP for nearly 1,000 years, and their economy literally defined major events; the Silk Road was the path to wealth, but the Middle Eastern nations were in the way and getting a cut, leading to Bartolomeu Dias being the first to round the Cape of Good Hope (so Europe could bypass Middle Eastern merchants to deal directly with China), and Columbus sailing West; it was always about reaching China's riches. Then in the 1800s Europeans usurped China's position primarily through growing opium in India and selling it to China, destabilizing and corrupting the government. As far as they're concerned, that was 150 years of unfairness to them, so turnabout is fair play.
History is obviously much more complex than that, but "unfair" is a subjective term and a crybaby term. More importantly, it shows the weakness of AI spending if you have to spend billions of dollars building something and then someone can copy you for millions.
Said the companies that distilled the internet (Score:5, Insightful)
Next they will be interrogating humans in torture chambers to train their models. I'm calling it now.
Re:Said the companies that distilled the internet (Score:4, Interesting)
> Next they will be interrogating humans in torture chambers to train their models. I'm calling it now.
They already are. It's called "social media."
Re: (Score:3)
> Next they will be interrogating humans in torture chambers to train their models. I'm calling it now.
Given that the Web now runs on psychological manipulation, I would argue that they're already doing low-intensity interrogation in a mild-and-virtual torture chamber. All of us using the Web are experimental subjects. Many of us are suffering various degrees of pain and damage as a result; though most aren't aware of the fact.
Most likely along with vast amounts of copyrighted (Score:2)
.. works.
Such irony ;)
No polite response available. (Score:5, Insightful)
Dear OpenAI,
Go fuck yourselves.
For years, individuals, other companies, publishing industries across the board, and user groups have been fed the ever-loving fuck up with you stealing everything that isn't bolted down to train your fucked up vision of a future devoid of work, in a world that doesn't value humans when they don't work. And you and your ilk have proclaimed repeatedly that this is just the way things have to be, that the information belongs to you if it exists because you can access it. That there is no unfairness in you taking any information you can find because all information should go into training your new computer god.
And now that someone has done the exact fucking same god damned thing that you've been doing all along, you whine about it like a spoiled fucking child that had one tiny piece of its candy taken from a pile of ever-expanding, continually self-replenishing candy that will never end and never could end.
Fuck you, you entitled pieces of god damned shit. Combine this with your future visions of completely disrupting society for your financial benefit, while potentially causing the entire economy to collapse whether your visions come true, or you crash out in your quest and take the entire dream-o-sphere of Wall Street with you, and you are beyond disgusting. We're sick of your shit, and I hope to crap that this begins to drive home the fact that the term "corporation" does not deserve the respect it gets in society today. You are a parasite, through and through. And just because you found a tapeworm within your shit, it doesn't mean you aren't a tapeworm in society's shit yourself.
Good grief, some of us can't wait until you AI based companies stop being the 100% focal point of all world governments and all economic concerns. It's like we've handed the reins of society over to the most narcissistic, self-obsessed idiots in all of existence, and somehow they always find a way to double-down on the ugliest parts of humanity in their quest.
I found your challenge to be verbose (Score:3)
Either Arkell v Pressdram [wikipedia.org] or Cleveland Stadium Corp's response to Dale O. Cox, esq [loweringthebar.net] would suffice.
Re: (Score:2)
Well said! I especially liked:
> You are a parasite, through and through. And just because you found a tapeworm within your shit, it doesn't mean you aren't a tapeworm in society's shit yourself.
I think the fundamental fact of the corporate sector's parasitism is a point which needs to be hammered home again and again. At some point maybe a majority of folks will consciously realize the specific ways in which we're all being fucked over. Then maybe an effective rebellion can begin. I live in hope...
Re: (Score:1)
I mean, look at who runs OpenAI and look at his affinities. It's ideologically and ethnically consistent with their behavior. What do you expect? They cry out as they strike you.
Re: (Score:2)
> It's ideologically and ethnically consistent with their behavior.
You pieces of shit really are getting bold, aren't you?
Re: (Score:1)
How is this any different than saying "Black urban centers are riddled with violent crime and white people like destructive egalitarianism and uppers" ? They're true.
Re: (Score:2)
destructive egalitarianism and uppers is adorable- how about a better one- white people love their kiddy porn.
Re: (Score:2)
1. Louisiana
2. New Mexico
3. Alabama
4. Tennessee
5. Missouri
6. South Carolina
7. North Carolina
8. Mississippi
9. Arkansas
10. Maryland
States ranked by population of black people:
1. Mississippi
2. Louisiana
3. Georgia
4. Maryland
5. Alabama
6. South Carolina
7. North Carolina
8. Delaware
9. Virginia
10. Tennessee
The part that makes it racist, is that it's obvious there are confounding factors. You want to make it about race, but the evidence suggests it is not simply about r
Re: (Score:1)
"White people"? You mean jews? Everyone on the Epstein list is Jewish.
Re: (Score:2)
> "White people"? You mean jews? Everyone on the Epstein list is Jewish.
Complete lie.
And no, it's statistically impossible for the over-representation of white people in kiddy porn trafficking statistics to be accounted for by the amount of Jews in the United States, you antisemitic piece of shit.
boo fuckin' hoo (Score:5, Insightful)
They have no respect for the law or the wishes of creators. Now they want the other companies to be forced to respect those things? They can get fucked all day.
We at OpenAI have the greatest AI ever (Score:2)
What ?!
The Chinese copied our AI ?!
Aaand it's gone.
Let me translate (Score:3)
ClosedAI wants to be the only freeloaders (Score:2)
The Chinese companies are publishing open weights models at a breakneck pace, some of which people like me are using to start business ventures that wouldn't be feasible if Sam Altman had his way.
They stole from us. (Score:4, Funny)
DeepSeek is OpenWeights (Score:2)
waaaah! they stole the summary of our (Score:2)
US exceptionalism (Score:2)
Their problem. (Score:2)
The USA used to have the most Chinese PhD CS experts in the world. Guess what has been changing since 2016?
Pirated training data , invasive training data, spying on people for training data, wikipedia, is not going to be a huge problem for China and OpenAI having some of their data stolen... if any... I bet one can best them without touching their data. They seem... they want us to think it's all exclusively coming from them and it's an existential threat so they get the massive investments and loopholes e
Tiny violin time. (Score:2)
It couldn't be more obvious (Score:1)
Rich and powerful complaining to each other (Score:1)
Re: (Score:2)
Because Americans do not want a "Chinese" future, but a future based on Western "enlightenment" principles and (re)publican governance. In Abrahamic cultures, the individual is (ultimately) only answerable to G*d [ "... thou shall not have strange gods before me"] . Not so with various Eastern collectivist memes.
Re: (Score:2)
> Because Americans do not want a "Chinese" future, but a future based on Western "enlightenment" principles and (re)publican governance. In Abrahamic cultures, the individual is (ultimately) only answerable to G*d
Only because Yahweh pulled a hostile takeover to secure a monopoly (In Abraham's time he was part of a pantheon). The result is that western capitalisms get vendor lock-in with regards to their deity while eastern markets enjoy the benefits of competition.
Re: (Score:2)
That mission died when they made GPT-3 an API-only model. It took until 2025 for them to release the next open model.
But at least they managed to choose an open license, unlike, for example, Google with their long lists of things that may void your license (if that would even be enforceable).
They must be joking... (Score:1)
Why is this a concern? If all it takes to build a competitive AI is to get responses from another AI and use it to train your own then why are any of the US companies valued as much as they are? At that rate, there is no c
Nobody cares about your crocodile tears (Score:2)
While I am sympathetic to the "training is not a copyright violation" argument, because it isn't, the fact remains that you are (some think unfairly) exploiting the efforts, knowledge and ideas of everyone. You don't get to turn around and cry and whine when someone does it to you.
Most of a model's capabilities and compute costs are a result of pretraining. Distillation requires a relatively minuscule amount of compute to pull off.
What is old is new again (Score:2)
*Open*AI? (Score:2)
Maybe they should have closed it?
Re: (Score:2)
The irony being that if they stayed true to their original mission/charter, they would welcome researchers training their AI models against OpenAI's.
Well, well, well (Score:2)
Some companies scrape the entire Internet. Others download Anna's Archive. Some feed on the former two. Nothing is off limits in the brave new world of AI.
You mean (Score:2)
They are using the work of someone else in a transformative manner to train their AI?
Wow! Good for them, I guess.
Nothing that hasn't been done before, *right*?