AI Technology

DeepSeek Outstrips Meta and Mistral To Lead Open-Source AI Race (semianalysis.com)

DeepSeek has emerged as the leading open-source AI model developer, surpassing Meta's Llama and Mistral, after releasing its latest model V3 with breakthrough cost efficiencies, research and consultancy firm SemiAnalysis reported on Friday.

The Chinese startup, backed by hedge fund High-Flyer, reached this milestone through innovations in Multi-head Latent Attention technology, which cut inference costs by 93.3% versus standard methods. While it offers services below cost to gain market share, its performance matches or exceeds OpenAI's GPT-4.
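For readers curious what "Multi-head Latent Attention" actually changes, the sketch below is a minimal, hedged illustration of the underlying idea of low-rank KV compression: keys and values are derived from a small shared latent vector, so only that latent needs to be cached during decoding. It is not DeepSeek's actual code; the dimensions and layer names (d_latent, w_down_kv, w_up_k, w_up_v) are illustrative assumptions, and causal masking and rotary embeddings are omitted for brevity.

```python
# Minimal sketch of low-rank KV compression in the spirit of Multi-head Latent
# Attention. Illustrative only; not DeepSeek's implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LatentKVAttention(nn.Module):
    def __init__(self, d_model=1024, n_heads=16, d_latent=128):
        super().__init__()
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        # Queries are projected as in standard multi-head attention.
        self.w_q = nn.Linear(d_model, d_model, bias=False)
        # Keys/values are first compressed into a small shared latent vector...
        self.w_down_kv = nn.Linear(d_model, d_latent, bias=False)
        # ...and only expanded back to per-head keys/values at attention time.
        self.w_up_k = nn.Linear(d_latent, d_model, bias=False)
        self.w_up_v = nn.Linear(d_latent, d_model, bias=False)
        self.w_o = nn.Linear(d_model, d_model, bias=False)

    def forward(self, x, latent_cache=None):
        # x: (batch, seq, d_model)
        b, t, _ = x.shape
        latent = self.w_down_kv(x)                      # (b, t, d_latent)
        if latent_cache is not None:
            # During decoding only this small latent is cached, which is what
            # shrinks KV-cache memory relative to caching full K and V.
            latent = torch.cat([latent_cache, latent], dim=1)
        q = self.w_q(x).view(b, t, self.n_heads, self.d_head).transpose(1, 2)
        k = self.w_up_k(latent).view(b, -1, self.n_heads, self.d_head).transpose(1, 2)
        v = self.w_up_v(latent).view(b, -1, self.n_heads, self.d_head).transpose(1, 2)
        out = F.scaled_dot_product_attention(q, k, v)   # (b, heads, t, d_head)
        out = out.transpose(1, 2).reshape(b, t, -1)
        return self.w_o(out), latent                    # latent doubles as the cache
```

A toy call such as `LatentKVAttention()(torch.randn(1, 8, 1024))` returns the attention output plus the latent to pass back in as `latent_cache` on the next decoding step; the memory saving comes from caching a 128-wide latent instead of full per-head keys and values.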


Comments Filter:
  • Sooo..who owns SemiAnalysis these days? Do they have a fitting name for this best-of-the-best AI analysis? I'm still trying to figure out what the hell "outstrips" might imply with regards to AI performance, since my first thought was copper mining. Just curious as to who is pimping "leading" AI stats these days, and how they outstripped and outdid themselves into that claim.

    If we thought dick-measuring contests were bad before, just wait until AI lubes the bullshit-stained skids.

    • by shanen ( 462549 )

      Pretty good FP. I just wrote a piece that could be adapted to continue the discussion along your lines [what I called "trust" issues], but I'm sorry to say I don't have motivation to find the time to do that for today's Slashdot, so I'm just going to paste a version of my initial reaction here:

      What is Deep Seek like? I'll call it DS for short. This is intended as my shared initial reaction based on two conversations. So far neither of my DS conversations has gone very far. That's largely because of "trust" issues, and the second conversation has already started explicit discussion of those matters.

      First of all, DS feels quite similar to ChatGPT. That includes the imbalanced verbosity issue. In normal English conversation the turns are fairly balanced. Each side speaks for a while and then the other side responds at similar length. In contrast, in today's discussion, I started with an 8-word description of a topic and DS responded with a huge essay about "life, the universe and everything". [What? No Oxford comma in the original? Color me outraged.] That did motivate me to find the "Stop" button. It's the input button while DS is "typing". [The quotes this time are scare quotes warning of anthropomorphic thinking--but if I used them everywhere they are called for, then this would be much harder to read than it already is...]

      This first session actually wound up running over 300 words from my side. One of the diverting topics was word counting. DS explicitly denied being able to count words and offered to explain how I could do it, but I noted that doing it by hand would be tedious, whereas if DS counted them it could help balance the conversation... For which suggestion DS again "sincerely thanked" me. [Faking sincerity. As always. Another recurring theme of all the GAIs I've played with.]

      So far I have not introduced any substance into either of the conversations. In the first conversation, I am concerned that the data source may "offend" the operators/owners of DS. It's a big data source that almost certainly includes some politically sensitive material. The second conversation involves some ideas that might be quite valuable. Why should I let the operators/owners of DS take the money and run? [But one of my many problems is that I don't really care much about money... However if I do help solve some problems in the real world it would be nice to get some credit?]

      That's about all I have time for now... Not sure if I'll continue later on this theme. Your conversational turn:

  • Can we get some bitcoin stories for old times' sake?

    This is nothing but a method to pump up stock quickly before the bottom falls out.

  • by larryjoe ( 135075 ) on Friday January 31, 2025 @01:06PM (#65132909)

    The linked SemiAnalysis article is actually quite insightful, but the summary is off the mark and not insightful. The article gives a broad perspective across companies/models and years. The author argues that DeepSeek's advances are "simply" part of the skyrocketing pace of progress that was already underway before DeepSeek. The article is a good read, although the last half is paywalled.

  • Llama is not open source. I don't know about DeepSeek. Maybe I need to sue Meta for unfair and deceptive advertising...
  • Engineering has a track record of incremental improvement of existing methods. It cannot be a surprise that someone, somewhere, will improve the energy efficiency of the training process. Example: ASICs for crypto mining. The entire crypto mining game changed significantly when that happened. My expectation was that someone would do the same for AI training. I don't know if an ASIC is likely in the context of AI training, but we've already seen a few attempts at this; someone said they could improve the efficien
  • Are misrepresented, they only reported the final training run and not the costs associated with massaging the data. No one knows how much they spent. I'm glad though, because maybe these idiot AI companies will start to actually think of ways to save energy rather than "throw new hardware at it."

    If you are worried about your power bill, get ready for it to double or triple; these energy guzzlers will make power scarce and drive up costs. In many data center installations, they also want the power companies to suppor

  • The race to be stupid faster and cheaper?

    • Seconded. This is like Tesla announcing that while their self-driving tech will still kill you 15% of the time, they've found a way to make it drive faster... "um, great, but you guys ARE going to make it not kill me, right?"
