DeepSeek Has Spent Over $500 Million on Nvidia Chips Despite Low-Cost AI Claims, SemiAnalysis Says (ft.com)
Nvidia shares plunged 17% on Monday, wiping nearly $600 billion from its market value, after Chinese AI firm DeepSeek's breakthrough, but analysts are questioning the cost narrative. DeepSeek is said to have trained its December V3 model for $5.6 million, but chip consultancy SemiAnalysis suggested this figure doesn't reflect total investments. "DeepSeek has spent well over $500 million on GPUs over the history of the company," Dylan Patel of SemiAnalysis said. "While their training run was very efficient, it required significant experimentation and testing to work."
The steep sell-off led to the Philadelphia Semiconductor index's worst daily drop since March 2020 at 9.2%, generating $6.75 billion in profits for short sellers, according to data group S3 Partners. DeepSeek's engineers also demonstrated they could write code without relying on Nvidia's Cuda software platform, which is widely seen as crucial to the Silicon Valley chipmaker's dominance of AI development.
Wait... (Score:2)
Ok, this is going to sound crazy, but what if we valued a company by actual sales and not how high you can get off hopium?
Re: (Score:3, Interesting)
Re: (Score:3)
Even if this new model does lower the computing power required for training, this could still be good news for nVidia. The big deal about this model is not just the (alleged) low cost of training, but the low cost of running it, and the fact that it's a high quality open-source model. This puts AI in the hands of the masses.
Someone likened this to Watt's invention of the condenser, which meant that steam engines suddenly used 80% less coal.
More information needed (Score:2)
So $500 million in their chips were purchased for this sole purpose, and yet the share price plummeted 17%?
Ok, this is going to sound crazy, but what if we valued a company by actual sales and not how high you can get off hopium?
I'm interested in your proposal, but would like more information.
Exactly how would you set the stock price based on sales?
Additionally, sales are usually reported every 3 months. Would your plan keep the stock price constant until the next quarterly report?
If a company wants to issue more stock, for example to fund expansion or capital improvements, how would that work? Would the total value of all stocks go down to compensate?
What advantages would come from implementing your proposal?
Whether your proposal
Re: (Score:3)
Exactly how would you set the stock price based on sales?
Ok, this is nuts, but you could look at it year over year and see if they are selling more now. Oh, maybe you could even make a projection for them to meet! But here is the killer part: we educate until the smooth-brain investing goes away and logical agency creeps into place. We are breaking ground here; by god, these people will be so much better off and the world richer in many ways.
Re: (Score:2)
Re: (Score:2)
So $500 million in their chips were purchased for this sole purpose, and yet the share price plummeted 17%?
Ok, this is going to sound crazy, but what if we valued a company by actual sales and not how high you can get off hopium?
I'm interested in your proposal, but would like more information.
Exactly how would you set the stock price based on sales?
...
Exactly how do you set the stock price based on hopes and dreams? /s
Investors looking more at the actual performance of companies rather than hype (good or bad) sounds like simply advising on classically good investment strategy.
Re: (Score:2)
Here you go, results from November [nvidia.com]. Just about every metric was up double digits from the previous year and/or quarter.
Re: (Score:2)
Here you go, results from November [nvidia.com]. Just about every metric was up double digits from the previous year and/or quarter.
So it only makes sense the valuation is lower. I think I’m starting to understand but I must consult Neurology first.
Compared to Meta and Microsoft... (Score:2)
Microsoft has said they plan to spend $80bn on AI infrastructure in 2025 alone. Meta has said they plan to spend $65bn this year. If even a modest percentage of those infrastructure costs are going towards Nvidia GPUs, it'll dwarf what DeepSeek has allegedly spent in their entire history.
Re: (Score:2)
Microsoft has said they plan to spend $80bn on AI infrastructure in 2025 alone. Meta has said they plan to spend $65bn this year. If even a modest percentage of those infrastructure costs are going towards Nvidia GPUs, it'll dwarf what DeepSeek has allegedly spent in their entire history.
Microsoft and Meta both report quarterly earnings on Jan 29. They will certainly be asked about their data center spending plans and whether they have changed. If they don't reiterate their already announced plans, then that's a bad sign for Nvidia. However, if they do reiterate, then that news outweighs the Deepseek news.
Re: (Score:2)
The Spending Doesn't Matter (Score:5, Interesting)
China has successfully introduced a whole lot of FUD into the AI world, especially in the United States, while making a sizeable dent in the financial markets at the same time.
It's actually quite brilliant
Re: (Score:2)
China has successfully introduced a whole lot of FUD into the AI world, especially in the United States, while making a sizeable dent in the financial markets at the same time.
It's actually quite brilliant
It's actually a very bad weakness of our markets.
Just like the dot-com bubble and the subprime loans, and AI will be next. A metric fuckton of money that pulls a disappearing act. Money invested in vapor.
At some point, it's just bots on steroids. Eventually, if continued, it will be bots referencing only other bots, at that point "truth" might be anything. I'm waiting for AI to eliminate the laws of physics.
Right now the trick is to refurbish shut down nuclear generation stations before the bottom drops out
Re: (Score:2)
It's actually a very bad weakness of our markets.
That's why it's brilliant. They're taking advantage of our weaknesses, and that it's open source is a plus because it's taking advantage of the overall distrust our political bickering has fostered.
Re:The Spending Doesn't Matter (Score:5, Insightful)
China has successfully introduced a whole lot of FUD into the AI world
Has it? What, specifically? The US market is wall-to-wall bullshit, spiritually led by the one-man FUD factory Altman... by comparison, what's coming out of DeepSeek is straightforward.
They aren't actually claiming anything magic (like most of the US players), they've released the model so you can check for yourself. What it appears is they have put together a bunch of techniques cleverly and carefully to achieve a big speed up, by making somewhat better use of the GPU resources they had. There's also a question of "cost": I'm not sure how the GPU cost was included.
Re: (Score:2)
they've released the model so you can check for yourself
Access to the model won't tell you anything about how much it cost to train.
What it appears is they have put together a bunch of techniques cleverly and carefully to achieve a big speed up, by making somewhat better use of the GPU resources they had
I'm going to guess that you didn't read the paper...
Not that I blame you, it's absolutely stuffed with absurd hyperbolic language, e.g. DeepSeek-R1-Zero naturally emerged with numerous powerful and interesting reasoning behaviors [...] DeepSeek-R1-Zero exhibits super performance on reasoning benchmarks. It's hard to read with your eyes rolling constantly.
From what I could stomach, I didn't see anything there that any sane person w
Re: (Score:2)
I'm going to guess that you didn't read the paper...
I believe this [arxiv.org] is the paper that's getting the US stock market to freak out, while
Not that I blame you, it's absolutely stuffed with absurd hyperbolic language, e.g. DeepSeek-R1-Zero naturally emerged with numerous powerful and interesting reasoning behaviors [...] DeepSeek-R1-Zero exhibits super performance on reasoning benchmarks. It's hard to read with your eyes rolling constantly.
this [huggingface.co] seems to be the one you're referring to. Correct?
Here's [bitrue.com] a comparison for those that are interested since they do different things.
Re: (Score:2)
It's open source, probably because they thought that if it wasn't they would get this kind of accusation. You can go and confirm it for yourself if you like.
I'm sure Meta and OpenAI have, and would be calling it out if it was fake.
Re: (Score:2)
It's open source
It's not.
probably because they thought that if it wasn't they would get this kind of accusation.
If that were the case- they'd have released the source.
You can go and confirm it for yourself if you like.
You can. You will quickly see you cannot possibly reproduce their results with their source, because it's not open source.
The weights themselves are freely available, and they did release a paper describing their training methodology, so it is possible for someone to try to reproduce their work, in general, but frankly- nothing they describe is particularly novel, there's no reason to think it doesn't work. It's an iterative improvement.
Re: (Score:2)
There is no FUD here, just irrational panic in our markets.
Re: (Score:2)
Re: (Score:2)
They published their paper (Score:5, Insightful)
Re: (Score:2)
The trade is on the idea that nVidia has a locked-in monopoly and deserves to be priced at 500x future earnings.
The details of who owned which chips and how they were allocated are less of the issue, much like in a typical MythBusters episode.
Re: (Score:3)
The trade is on the idea that nVidia has a locked-in monopoly and deserves to be priced at 500x future earnings.
A bit of hyperbole, but still an important point to consider. There are two fundamental components to NVDA pricing.
Market share and competitors is one part. Nvidia has about 80% of the market if ASICs are considered and about 98% otherwise. That hasn't changed. If someone can demonstrate better ASIC utility or efficiency (beyond what can already be done today), that would crater NVDA much more than this week's Deepseek news.
The second part is the total addressable market. This is the part that Deepseek affe
Re: (Score:2)
Formerly there was the OpenAI + NVIDIA monolith for LLM-based AI. For whatever reason investors thought this monolith was going to stand the test of time, and most investment and projects were only geared towards this. With the Deep Seek release, it's now reasonable to assume that one will not need as many high-end NVIDIA chips...maybe not even NVIDIA at all. And who really needs OpenAI for that matter.
My guess is research like this quickly advances other avenues of research and very soon someone coul
Re: (Score:3)
If you applied similar optimization to newer hardware you should see even better results.
What this shows is laziness on the part of those who have access to the latest hardware, they are content to just keep throwing more powerful hardware (and more money) at the problem rather than trying to make efficient use of the resources available.
Re: (Score:3)
If you applied similar optimization to newer hardware you should see even better results.
Not necessarily; newer hardware is the focus of new code and optimizations, and many times those optimizations are not ported back to older hardware because it is deemed obsolete for future use, so why bother with the effort.
If all they have is older hardware due to sanctions, it makes sense they focused on optimizing for older hardware because that is all they have in bulk - they may have also simply ported optimizations for current hardware back to the old hardware they were using.
Re: (Score:2)
Re: (Score:2)
It's no secret, no conspiracy, no crazy collusion to get you to think more communist. They found some tricks to take better advantage of older, less capable hardware, and those evil bastards even published their research! You too can use shittier chips to train good models! You don't even have to convert to communism to get the benefit. I shouldn't be so surprised by the way Americans are reacting to this, but I still am. How can people who are able to understand all of the intricacies of deep learning and the math behind it be so hook-line-and-sinkered by the fear porn being published by investors who simply didn't understand what role top-of-the-line chips actually play in AI? (Not as big of one as they thought.)
Differential analysis - the first rumbles of the bubble bursting
Re: They published their paper (Score:2)
A very small number of people actually understand, say, stochastic gradient descent, on the mathematical level. Of course you don't need to; you just run the ADAM optimizer.
A relatively small number of people even understand what a model is, and those people mostly didn't believe in Altman's bullshit.
Let's bring the fallacy into sharp relief by applying your complaint to another GPU intensive field: crypto. "How can someone understand the math behind ZKPs and cryptography but still lose millions trading NFT
Re: (Score:2)
A very small number of people actually understand, say, stochastic gradient descent, on the mathematical level.
You've got to be kidding. The concept is simple enough for a bright teenager to understand, and the math, well, let me put it this way: vector calc is a 200-level course, often a required prerequisite to linear algebra, which you'll need if you intend to do anything related to data science. That's millions of people.
A relatively small number of people even understand what a model is,
I would expect anyone who took a statistics course to understand what a model is.
and those people mostly didn't believe in Altman's bullshit.
I would have thought so a few years ago, but here we are... Wishful thinking is more powerful than reason, a
Re: (Score:2)
They found some tricks to take better advantage of older, less capable hardware
Tell me you didn't read the paper without telling me you didn't read the paper.
Lack of trust... (Score:2)
Re: (Score:2)
On one hand CCP would happily "invest" into destabilizing US markets, on other hand Wall Street would not hesitate to push out "consultancy firm" disinfo to protect investments.
Exactly this. Propaganda to manipulate markets as well as popular sentiment - who knew?
Some things you just cannot spin as bad (Score:5, Informative)
https://x.com/pmarca/status/18... [x.com]
Re: (Score:2)
So, define "training cost" ... (Score:2)
If I build a datacenter, or update it with some new GPUs, then use it for training some models, what should I call the "training cost"?
I don't think there is any standard definition, but including the cost of the GPUs in the training cost for one model would seem odd since you are going to use them over and over for training many models as well as for inference.
If seems that total number of FLOPs used for training would be a better way to measure cost, even if some companies have cheaper FLOPs (e.g. TPU vs
Re: (Score:2)
The bottleneck is not in the FLOPS, but in the communication channel between nodes, right?
Re: (Score:2)
Typical GPU utilization is well below 100%, but I think that's more to do with poor usage patterns than inter-GPU communication speed.
My point was that "training cost" can mean almost anything (cost to buy GPUs, electricity cost to run them, etc), but at least number of FLOPs consumed during a training run, or number of FLOPs per training token, is a meaningful concrete number, reflective of model efficiency and therefore cost to run, and could be compared between models - it would be a more useful thing to
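A common back-of-the-envelope way to put a concrete number on "FLOPs consumed" for a dense transformer is roughly 6 FLOPs per parameter per training token. The heuristic and the model sizes below are illustrative assumptions brought in for this sketch, not figures from the thread (and the estimate overcounts for mixture-of-experts models, where only a fraction of the parameters are active per token):

```python
# Rough training-compute estimate via the common ~6 * params * tokens heuristic
# for dense transformers. Model size and token count are hypothetical examples.
def training_flops(params: float, tokens: float) -> float:
    """Approximate total training FLOPs for a dense transformer."""
    return 6.0 * params * tokens

# e.g. a hypothetical 70B-parameter model trained on 15T tokens:
flops = training_flops(70e9, 15e12)
print(f"{flops:.3e}")  # 6.300e+24
```

FLOPs per training token is then just 6 times the parameter count, which is why the metric tracks model efficiency largely independently of whose hardware (or whose price list) the FLOPs ran on.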
Re: (Score:2)
The deepseek paper uses GPU hours on H800.
Re: (Score:2)
Re: (Score:2)
I believe there are two costs of an LLM. The first is training it. The second is having it answer queries. The latter is going to be much higher than the former although the conversation so far has focused on training costs.
There's a third cost, which is the subject of this thread, i.e., the R&D cost. The last training run to produce the weights is the easiest and least resource intensive part. Figuring out the architecture and training procedure, as well as procuring clean and useful data, are non-obvious and take a lot of people, hardware, and money.
Now, the question is whether Deepseek's current work has solved and eliminated the need for future R&D. If so, then Deepseek's efficiency benefits future models. If not,
Re: (Score:2)
If I build a datacenter, or update it with some new GPUs, then use it for training some models, what should I call the "training cost"?
I don't think there is any standard definition, but including the cost of the GPUs in the training cost for one model would seem odd since you are going to use them over and over for training many models as well as for inference.
Yes, definitely the GPU capital costs should be amortized over the lifetime of runs on that hardware. Deepseek (and other model companies) approximate this amortized cost by counting GPU hours and then converting that to a comparable cloud cost.
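As a rough illustration of that amortization, here is a minimal sketch; the purchase price, lifetime, and utilization numbers are hypothetical assumptions chosen for the example, not figures reported anywhere in the thread:

```python
# Hypothetical amortization of a GPU's capital cost into a per-hour rate.
# All inputs below are illustrative assumptions, not reported figures.
gpu_price_usd = 30_000.0   # assumed purchase price per GPU
lifetime_years = 4         # assumed useful life before obsolescence
utilization = 0.6          # assumed fraction of wall-clock time the GPU is busy

usable_hours = lifetime_years * 365 * 24 * utilization
capital_per_gpu_hour = gpu_price_usd / usable_hours

print(round(capital_per_gpu_hour, 2))  # ~1.43 under these assumptions
```

Add electricity, networking, and datacenter overhead on top of that, and you land in the same ballpark as a cloud rental rate, which is why GPU-hours times a rental price is a reasonable proxy for the amortized cost.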
over the history of the company? (Score:2)
Re: (Score:2)
And their expenditures don't match their claims.
Re: (Score:2)
If they spent $500m on hardware to get started, and then discovered a method that didn't require that much horsepower... Sure, on the company books there's a big 500m entry in the expense column, but you don't have to count that as part of the new method.
Is this what happened? Don't know. I suspect not, but have no real information. Regardless, their claim is not obviously false.
No fucking shit? (Score:5, Insightful)
>DeepSeek is said to have trained its December V3 model for $5.6 million, but chip consultancy SemiAnalysis suggested this figure doesn't reflect total investments. "DeepSeek has spent well over $500 million on GPUs over the history of the company," Dylan Patel of SemiAnalysis said. "While their training run was very efficient, it required significant experimentation and testing to work."
No fucking shit? You mean you have to start inefficient and work your way to greater and greater efficiency? Stop the fucking presses!
I thought people just pulled process innovations out of their asses, no prior processing needed!
God damn people are fucking stupid. Especially these so called "experts" that shitty clickbait "news" articles find.
Re: (Score:2)
>DeepSeek is said to have trained its December V3 model for $5.6 million, but chip consultancy SemiAnalysis suggested this figure doesn't reflect total investments. "DeepSeek has spent well over $500 million on GPUs over the history of the company," Dylan Patel of SemiAnalysis said. "While their training run was very efficient, it required significant experimentation and testing to work."
No fucking shit? You mean you have to start inefficient and work your way to greater and greater efficiency? Stop the fucking presses!
I thought people just pulled process innovations out of their asses, no prior processing needed!
God damn people are fucking stupid. Especially these so called "experts" that shitty clickbait "news" articles find.
You're missing the point. It's not that prior experimentation and testing was needed; it's that they excluded the cost of that in their $5.6 million figure.
Investors believe a fantasy... (Score:2)
...that future AI will be exclusively owned by big corporations and everyone will pay a lot to use it.
The knowledge will be widespread and available to all.
Expect more open source and small company advances, from labs all around the world
Nividia will add a crypto kill switch (Score:2)
The chips will be redesigned and sold for a low cost, but they will be "licensed" and require a yearly license fee to be paid or they'll stop working.
Re: (Score:2)
Re: (Score:2)
China is likely already capable of making chips at the H800 level.
This is likely not correct. SMIC currently doesn't have a working 4nm process node. They have 7nm working and claim to have found a 5nm non-EUV solution. They also don't have CoWoS capabilities.
Re: (Score:2)
China is likely already capable of making chips at the H800 level.
This is likely not correct. SMIC currently doesn't have a working 4nm process node. They have 7nm working and claim to have found a 5nm non-EUV solution. They also don't have CoWoS capabilities.
I don't see how your post contradicts mine. You will use more silicon and the chips will draw more power, which might not make a great general purpose CPU. But if you are willing to use large dies and draw more power, you should be able to get equivalent performance. That's what I mean by H800 level. Not that they have all the technologies, but that by sacrificing power and silicon area you can get similar performance with older fabrication technologies.
So? (Score:2)
AI learning is a numbers game. You can do it with fewer, more powerful processors, or with more, less powerful processors. The tradeoff favors more powerful processors if you have high labor and other costs for building around them.
let me guess (Score:2)
Technology Efficiencies are By Design and Need (Score:3)
You can look to the Telecommunications Bubble of 2001, which was powered by a race to dig and install fiber optic cable by the telecoms, niche players and speculators in the early 2000s. The entire business model was based on absurd predictions of exponential Internet data growth and on the assumption that, in general, a single strand of fiber optic cable could carry only one light-wave signal. That business model assumption drove the digging and installation of millions of miles of soon-to-be-unneeded "dark" fiber optic cable, undersea cables, and cables on the right-of-way of train tracks and other areas that didn't even have a termination connection. The industry spent billions on infrastructure, planning for a future that would not be needed for decades to come. Wavelength-division multiplexing then gave existing fiber cables 100 times more capacity just by replacing transmitter and receiver equipment, thereby blowing the original fiber optic telecom economic model out of the water and destroying several large telecommunication companies in the process.
Re: (Score:2)
But, hey, I guess that we should thank them for giving us a stock buying opportunity?
Re:Are you telling me.. (Score:5, Insightful)
More like pump and dump with stock prices.
Re: (Score:2)
Worked for Nancy Pelosi [newsweek.com]
Re:Are you telling me.. (Score:5, Insightful)
No, it's saying the Chinese company did something shocking and evidence of bad faith: they did research and development!!!1! That's cheating! They should have had the right algorithm right out of the gate and not needed to learn how to do any of this!
This is a really stupid article and needs to be binned. Especially as it'll just attract the usual US vs China crap in the comments.
Re:Are you telling me.. (Score:5, Interesting)
Was thinking the same. Reads like a pissed off loser taking pot shots at the winner.
Re: (Score:2)
Give it a year or two and the mythology will have developed into "DeepSeek stole this from a US university where it was invented!"
Re: (Score:2)
Give it a year or two and the mythology will have developed into "DeepSeek stole this from a US university where it was invented!"
You just called the prejudicial statements of others stating "Chinese lie" racist and then went right on ahead to make a similar appeal to prejudice yourself.
Re: (Score:2)
You can't understand the difference between "(people of a certain race) are dishonest" and "the current propaganda narrative"?
Explains a lot.
Re: (Score:2)
Give it a year or two and the mythology will have developed into "DeepSeek stole this from a US university where it was invented!"
You can't understand the difference between "(people of a certain race) are dishonest"
and "the current propaganda narrative"?
Both are prejudicial statements that rely on the very same type of error in reasoning and judgement.
There are well known issues with widespread scientific fraud in China, so there is an objective basis to support the statement that at least some Chinese lie. One can presumably make the same argument about whatever type of shitting on the US "DeepSeek stole this from a US university" is supposed to represent, by citing examples or trends related to the "current propaganda narrative".
In both cases whether it is presu
Re: (Score:2)
It's 2025, do you really think those tired old arguments to justify racism are going to fly?
Re: (Score:2)
It's 2025, do you really think those tired old arguments to justify racism are going to fly?
All I've done was point out hypocrisy. If you disapprove of prejudicial statements consider an apology for yours.
Re: (Score:2)
Pst mate.
Scroll down. You can find people actually doing the thing you're mad about, instead of getting mad at people who are saying it's going to happen.
Re: Are you telling me.. (Score:5, Funny)
I only read Shakespeare in the original Klingon!
Re: (Score:2)
This is a really stupid article and needs to be binned. Especially as it'll just attract the usual US vs China crap in the comments.
No worries! The article is paywalled. Problem solved!
Re: (Score:2)
No, it's saying the Chinese company did something shocking and evidence of bad faith: they did research and development!!!1! That's cheating!
Yeah, I don't get the hubub on this. Were people expecting that there were no costs in developing this? That they would whip Deepseek out of thin air? Software runs on hardware. Hardware costs money.
Re: (Score:2)
Some people likely suspected DeepSeek wasn't using NV hardware, or at least not a whole lot of it. There was the suspicion that they might have been using some homegrown solutions. Of course NV was quick to point out the opposite, and now we have some corroboration.
Re: (Score:2)
It's really indicative of the investors not really understanding the tech, perhaps damningly so.
DeepSeek's own initial press releases made it clear that v3 was trained on ~2,048 H800s. I guess maybe they think some company other than nvidia is making those??
Re: Are you telling me.. (Score:2)
Chinese copy Western IP and take as their own! (Score:2)
Chinese copy Western IP and take as their own!
Re:Are you telling me.. (Score:5, Insightful)
No, only that people can't read, which is nothing new. This is the literal quote from the deepseek v3 paper: "Note that the aforementioned costs include only the official training of DeepSeek-V3, excluding the costs associated with prior research and ablation experiments on architectures, algorithms, or data."
In other words, this is only what deepseek has been saying from the very beginning in its public tech report.
Re: (Score:2)
People can read, but reading is a lot of work. Especially if you're an important leader type. Much better to play telephone with a sea of self-referencing "thought leaders" and f1st p0st3rs.
Re:Are you telling me.. (Score:5, Interesting)
I don't think it's a lie, you can find the exact claim here: https://arxiv.org/html/2412.19... [arxiv.org]
Lastly, we emphasize again the economical training costs of DeepSeek-V3, summarized in Table 1, achieved through our optimized co-design of algorithms, frameworks, and hardware. During the pre-training stage, training DeepSeek-V3 on each trillion tokens requires only 180K H800 GPU hours, i.e., 3.7 days on our cluster with 2048 H800 GPUs. Consequently, our pre-training stage is completed in less than two months and costs 2664K GPU hours. Combined with 119K GPU hours for the context length extension and 5K GPU hours for post-training, DeepSeek-V3 costs only 2.788M GPU hours for its full training. Assuming the rental price of the H800 GPU is $2 per GPU hour, our total training costs amount to only $5.576M. Note that the aforementioned costs include only the official training of DeepSeek-V3, excluding the costs associated with prior research and ablation experiments on architectures, algorithms, or data.
So they basically calculated what it would have cost if they had rented the GPUs, as an indication of efficiency. They never really claimed that they actually spent just $5 million on it. There really isn't all that much indication that those numbers are infeasible. It's true that models like Llama and ChatGPT 4 took more time to train; however, Llama 3 is not a mixture-of-experts model (so it's hard to compare), and for ChatGPT 4, as far as I know, we don't know how many hours it took, but it's also a way bigger model, estimated at well over 1000B parameters compared to 671B for DeepSeek V3.
We don't really have enough comparison material to say anything smart about how long it 'should' take.
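For anyone who wants to check the quoted figures, the arithmetic is simple enough to redo yourself; this sketch uses only the GPU-hour counts and the $2/hour rental assumption stated in the paper quoted above:

```python
# Reproduce the training-cost arithmetic quoted from the DeepSeek-V3 paper.
# Figures are in thousands of H800 GPU hours, as stated in the paper.
pretraining_khrs = 2664    # pre-training stage
context_ext_khrs = 119     # context length extension
post_train_khrs = 5        # post-training

total_gpu_hours = (pretraining_khrs + context_ext_khrs + post_train_khrs) * 1000
rental_rate_usd = 2.0      # assumed H800 rental price per GPU hour (from the paper)

total_cost_usd = total_gpu_hours * rental_rate_usd
print(total_gpu_hours)     # 2788000, i.e. the "2.788M GPU hours" in the paper
print(total_cost_usd)      # 5576000.0, i.e. the $5.576M headline figure
```

By construction, this is exactly the rental-equivalent figure the thread is arguing about: hardware purchases, R&D, and failed experiments are outside it, as the paper itself says.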
Re: (Score:3)
There was no lie. It was clearly understood that they cited power costs. They never said they spent $5.6 million to buy hardware. Why blame them for the lack of technical literacy of readers?
Re: (Score:2)
No. The Chinese told the truth, published their methods, and released their model. The Americans jumped to conclusions.
Ot are you telling me.. (Score:2)
That Chinese lie?
No way!
That the media misrepresented a story? That's IMPOSSIBLE!
Re: (Score:2)
It doesn't seem to be a lie at all, really. $500 million over the entire history of the project seems small for a major research project, relatively speaking. Seriously, $200 million for the second season of Severance, a show mainly set in an office building. From the summary, Nvidia lost $600 billion just on the news; that $500 million is just 8% of that. And the key thing here is it's over the lifetime of the project; that research does not need to be done again. It's about the running cost of training new models.
Re: (Score:2)
Sorry, $500 million is 0.08% of $600 billion.
Re: (Score:2)
They published the cost of training R1, not what it cost to develop. Now that it's developed, they can switch to Huawei hardware and make it even cheaper.
Or did you just need to fill your racist quota?
Re: (Score:2)
They published their methodology as well as their results so anyone can verify them.
They said they used 2048 GPUs as well as some innovative programming to achieve the low cost.
If they are lying it would be simple for anyone to check the results.
Until someone gets off the couch to verify or disprove their results I think we have to take their word for it.
Re: (Score:3)
I'm pretty sure that several other companies have already replicated their results and are hastily developing their own versions. I wonder if they have any patents on the process.
Re: (Score:2)
I wonder if they have any patents on the process.
Do we honor Chinese patents? (Serious question.)
Snarky corollary: Cuz I'm pretty sure they don't care much about honoring ours.
The DeepSeek code is released under the MIT license, and opinions about how that relates to patents differ. To my knowledge, it has yet to be tried in the courts.
Re: (Score:2)
We do, and they do. In fact patents are how they have tied up a lot of automotive EV stuff.
Re: (Score:2)
I think we have to take their word for it.
Given the absolute deluge of fake scientific output from China, I very strongly disagree.
They published their methodology as well as their results so anyone can verify them. If they are lying it would be simple for anyone to check the results.
Just because they've published their methodology and ... 'results' ... doesn't mean that replication is trivial.
Re: (Score:3)
Yeah, their (Deepseek's) rock-bottom pricing is all you need to know about this sitch. If the numbers don't work w.r.t. profitability (i.e. they are lying about the upfront investment), then their investors will roast them alive and/or they will fall behind with time as their fabricated budget won't be able to keep pace with the actual (i.e. hidden) training costs.
More germanely, it is entirely plausible that their model was orders of magnitude cheaper to train than even their own preceding models (e.g. Dee
Re: (Score:2)
The CCP lies. Communists lie. It isn't racist to criticize a corrupt system of government. That you can't separate in your mind the Chinese people and the tyrannical regime that oppresses them says something about you, not anyone else.
Re: (Score:2)
Who gives a fuck what you appreciate
Most of us...
I get it, you appreciate naked racism and don't appreciate someone calling it out, but I'd rather slashdot didn't turn into the kind of place you appear to want.
Re: (Score:2)
Calling out the phrase "That Chinese lie?" as racism is not shilling for the CCP.
You are conflating the CCP with Chinese people in general, so you can be racist about the Chinese while trying to maintain plausible deniability that it's the CCP you're against. Except you aren't smart enough to pull that off. It's very very obvious.
Re: (Score:2)
Didn't you hear? Political correctness isn't cool anymore. Especially when it serves no purpose except to derail a conversation.
It's also pretty clear given the topic of conversation that "The Chinese" is not referring to individual people of Chinese descent or heritage. The company, which is based in China, where there's pervasive involvement/contribution/control by the CCP, would be one of many which intentionally misrepresented achievements in recent years.
So yes, it would be more correct to say "
Re: (Score:2)
That's a whole lot of waffle to attempt to gaslight everyone about a clearly racist statement that very clearly wasn't, in or out of context, about the Chinese Communist Party.
Why are you putting so much effort into defending boulat?
Re: (Score:2)
Re: (Score:2)
You're deeply confused. Is that why you posted that nonsense AC? AmiMoJo is one of the few users here who regularly make posts worth reading.
Re: (Score:2)