Meta Sets Up War Rooms To Analyze DeepSeek's Tech (businessinsider.com)
Meta has set up four war rooms to analyze DeepSeek's technology, including two focused on how High-Flyer reduced training costs and one on what data High-Flyer may have used, The Information's Kalley Huang and Stephanie Palazzolo report. China's DeepSeek is an open-source large language model that claims to rival offerings like OpenAI's ChatGPT and Meta's models, while using a much smaller budget.
"Analyze DeepSeek's technology" (Score:5, Informative)
They literally released an open paper about it [github.com], so, I mean, wow, much analysis.
Also, everyone copies everyone else's advancements while incorporating their own new ones. That's how the field works. There's also an open project [github.com] from HuggingFace to replicate it open-source, incl. training code.
Re: (Score:2)
'Copying' vs. theft is a serious distinction. And their 'paper' doesn't necessarily mean fact.
Re: (Score:2)
That may or may not be the point of this particular white paper (I haven't read it), but in general no, that's not the definition of, nor a requirement for, something to be deemed a white paper.
Re: (Score:2)
Also, it's hosted in Meta's very own ollama repo: https://ollama.com/library/dee... [ollama.com] ahahaha
Re: (Score:3)
Good info, thanks.
According to the paper they "pre-train DeepSeek-V3 on 14.8 trillion diverse and high-quality tokens" which certainly is a goodly amount. The main innovation appears to be optimization of the training; "requires only 2.788M H800 GPU hours for its full training". And then they helpfully describe the optimization techniques. I assume that Meta will wade through their code so as to thoroughly understand it all and incorporate key features into their own products.
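As a rough sanity check on those headline numbers (my own back-of-the-envelope, not from the paper; the $2/GPU-hour rental rate is an assumption):

```python
# Back-of-the-envelope from the paper's headline numbers.
tokens = 14.8e12          # pre-training tokens, per the paper
gpu_hours = 2.788e6       # H800 GPU-hours for full training, per the paper
rental_rate = 2.0         # USD per H800 GPU-hour -- assumed, not from the paper

tokens_per_gpu_hour = tokens / gpu_hours
est_cost = gpu_hours * rental_rate

print(f"{tokens_per_gpu_hour:,.0f} tokens per GPU-hour")
print(f"~${est_cost / 1e6:.1f}M at ${rental_rate}/GPU-hour")
```

At that assumed rate you land around the widely quoted ~$5.6M training figure, which is why the cost claim got so much attention.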
This could save all the AI play
Re: (Score:2)
Even accounting for the Chinese tokens, which may be of higher entropy, that is really a lot. Llama 3 was trained on 5T tokens.
Re: (Score:3)
My guess is that the parts that really matter, aren't in the paper.
Re: (Score:2)
Yes, "analysis" will be reading the paper, discussing it, and experimenting with the methods described. I suppose describing it as a "war room" isn't helpful but what did you think it meant?
Hopefully Facebook has taken the lesson that just copying stuff verbatim and making it bigger might not be the most efficient method.
Re: (Score:2)
Meta has been doing plenty of their own research. One of the big things they've been pushing on is moving away from tokens to patches.
Re: (Score:2)
Also, everyone copies everyone else's advancements while incorporating their own new ones. That's how the field works.
That is definitely how Meta works. Well, not so much the "incorporating their own new ones" part.
DeepSeek (Score:3)
Really getting pumped this morning.
DDOS war (Score:2)
Looks like they have decided to go to war with a DDOS on DeepSeek.
Re: (Score:3)
Saw this pop up a bit ago: https://www.cnbc.com/2025/01/2... [cnbc.com]
Ran A Copy Locally (Score:2)
Nobody should be touching this.
Re: (Score:2)
How about an explanation rather than a blanket statement? Why shouldn't people touch this? What reason(s)?
Re: (Score:2)
Nobody should be touching this.
Really, you ran the 685B-parameter R1 model? You just happen to have about 700GB of VRAM available? Or did you run one of the smaller distills? Because those are not the same as running the full R1 model.
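For anyone wondering where that ~700GB figure comes from, here's a rough sketch (my own estimate; assumes FP8 weights and ignores KV cache and activation overhead, which push the real requirement higher):

```python
# Rough VRAM estimate for serving the full R1 model.
params = 685e9            # R1's reported parameter count
bytes_per_param = 1       # FP8 weights -- BF16 would double this
weight_gib = params * bytes_per_param / 2**30

print(f"~{weight_gib:.0f} GiB just for the weights")
```

That's already ~640 GiB before any KV cache or runtime overhead, which is why only multi-GPU server rigs can run the full model, and why the distills exist.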
Copyrights aren't applicable to AI.. Not like that (Score:5, Insightful)
US AI companies in 2024: "Copyrights aren't applicable to training data."
US AI companies in 2025: "Chinese AI companies are stealing our user data!"
Re: (Score:2)
The IP in question here is a patent, not copyright. https://www.reuters.com/techno... [reuters.com]
Re: (Score:2)
Right. Tooooootttttaaaallllllyyyyy different.
Re: (Score:2)
Yes, actually they are...totally different.
A copyright protects your *specific work* from duplication without your permission.
A patent protects your *methods* from duplication without your permission.
Copyright is automatic: your work is copyrighted unless you specify otherwise.
Patents must be applied for and approved, and even then, you must protect them yourself by suing anyone who infringes.
Copy Pasted other's code (Score:2)
Re: (Score:2)
Is the code open source? The source I ran across earlier (on Slashdot) said the weights were open source, but not the code. Do you have a source that claims otherwise? (Preferably a link.)
War Rooms? Why? (Score:2)
That 'war room' should already be up and running. Competitive intelligence is a crucial part of any significant business. Automakers take each other's cars apart. Sandy Munro takes everybody's apart and sells the info. High-tech companies are constantly studying each other's advancements to learn. If Meta really had to set up new war rooms for this, I'd say it's asleep at the wheel.
This is how competition is supposed to work (Score:2)
It causes stress for the companies producing the technology, but we all benefit from the competition.
Hype and Data. (Score:2)
Not really. You're thinking in terms of old-school product selling.
Tech companies in the US now are in the business of selling two things:
1) Data from their users.
2) Hype to feed speculation bubbles
Products are just a means to achieve those goals.
Re: (Score:2)
Everybody, from nonprofits to big tech, has some kind of ulterior motivation. Thankfully, competition doesn't require "pure" motives to be effective. Any kind of (non-criminal) motivation will do.
Mainframe Decentralization Metaphor (Score:2)
First, I never met a phor I didn't like.
Isn't this the old story of mainframes with dumb terminals being decentralized to distribute the workload to PCs?
Western AI relies on ridiculous centralized data centers while DeepSeek R1 pushes the load to the desktop. Look at the specs to run a local copy of it.
Let's suppose this is true (Score:3)
For the sake of discussion, let's presume DeepSeek has actually found a far more efficient (~20x) way to train and run these models. That is, they can keep up with the likes of OpenAI, Google, xAI, Meta, whoever, with 1/20th the hardware and electricity. Let's just say that's true.
What happens when OpenAI/Google/xAI/Meta reverse-engineer and implement their own version of this and then run it on their massive compute platforms? Does that mean ChatGPT/Gemini/etc. are now 20x more powerful?
I'm sure the scaling curve isn't quite that straight, but if it's even close, I'm not sure I see how this makes DeepSeek so valuable, or conversely the other players less valuable. The standard of product per unit of input just gets higher.
That's assuming this is real, and that it can scale. There's a lot of "assume" in that.