AI Technology

Nvidia Unveils $3,000 Personal AI Supercomputer (nvidia.com) 71

Nvidia will begin selling a personal AI supercomputer in May that can run sophisticated AI models with up to 200 billion parameters, the chipmaker has announced. The $3,000 Project Digits system is powered by the new GB10 Grace Blackwell Superchip and can operate from a standard power outlet.

The device delivers 1 petaflop of AI performance and includes 128GB of memory and up to 4TB of storage. Two units can be linked to handle models with 405 billion parameters. "AI will be mainstream in every application for every industry," Nvidia CEO Jensen Huang said. The system runs on Linux-based Nvidia DGX OS and supports PyTorch, Python, and Jupyter notebooks.


Comments Filter:
  • Just getting that much VRAM costs more than 3k... If it's not crippled in some way I can see these flying off the shelves.

    • by Rei ( 128717 )

      I suspect NVidia will be siphoning $3k out of my bank account this year :P

      • by Registered Coward v2 ( 447531 ) on Tuesday January 07, 2025 @11:16AM (#65069979)

        I suspect NVidia will be siphoning $3k out of my bank account this year :P

        Per TFA, the siphoning will start at $3k

        • Yes, but the difference in prices will be storage-related, because per TFPR, "Each Project DIGITS features 128GB of unified, coherent memory and up to 4TB of NVMe storage." The max price shouldn't be too much higher given the difference between a plausible lower bound on provided storage (I'd say 512GB) and the price of a 4TB NVMe SSD. Unless they go full Apple and it's soldered, I guess. But even then, why would it matter? As long as local storage is significantly larger than memory, you're not going to bo

          • by Rei ( 128717 )

            I mean, I'd just connect it via NFS to my fileserver regardless. Most storage needs for training have no performance constraints, and for inference, it's just about how long it takes to load the model when you change models.

            • by dfghjk ( 711126 )

              "Most storage needs for training have no performance constraints..."
              Perhaps for this device's intended use, but not generally true. Nothing says "sophisticated" quite like spending thousands on a dedicated AI compute module for your desk, then not caring how data is fed to it. I mean, it's there for bragging rights, not to get work done!

              The real problem here is that the device runs Python AI code bases, those code bases use python libraries to import data sets, python is slow as shit, and this Nvidia processo

          • The max price shouldn't be too much higher given the difference between a plausible lower bound on provided storage (I'd say 512GB) and the price of a 4TB NVMe SSD. Unless they go full Apple and it's soldered, I guess

            If they go full Apple, the price delta will be a small part of the price difference...

            • That's definitely where I was going with that. Soldering the storage makes it possible to overcharge for more of it.

        • Fully expect to see these evolve into general purpose compute boxes used in a typical office and cut out more expensive cloud computing.

          • by Rei ( 128717 ) on Tuesday January 07, 2025 @12:35PM (#65070249) Homepage

            A killer app here would be local inference servers for commercial coding tasks. A lot of companies refuse to let their code leave their office for queries to e.g. Claude or whatnot. A server like this could serve a really hefty local model to dozens if not hundreds of programmers at once. At ~$3k per unit (or even more), what software development company that wasn't adamantly opposed to AI wouldn't buy one?

            In addition to the open coding models out there, I could picture companies like Anthropic offering servers that are already set up for an open-weights version of one of their flagship models that's been optimized for coding, running on an inference server specifically optimized to both the hardware and the model, shipping with a full commercial license for any number of programmers, for an addon fee atop the base hardware price. Whereas those who are "on the cheap" will just install the best free open model and inference server on their own.
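            As a rough sketch of what the client side looks like, assuming the box runs an OpenAI-compatible inference server (vLLM, llama.cpp's server, or similar; nothing Nvidia has announced), with a made-up internal hostname and model name:

                # Query a hypothetical local, OpenAI-compatible endpoint over the LAN.
                # The hostname, port, and model name below are placeholders, not real products.
                import requests

                resp = requests.post(
                    "http://digits-box.internal:8000/v1/chat/completions",
                    json={
                        "model": "local-coder",
                        "messages": [
                            {"role": "user", "content": "Write a unit test for parse_config()."}
                        ],
                        "max_tokens": 512,
                    },
                    timeout=120,
                )
                print(resp.json()["choices"][0]["message"]["content"])

            Point the editor plugins at that URL instead of the public API and no code ever leaves the building.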

            • 1. Put an AI on a general-questions phone support line to answer calls
              2. Ask "Did that answer your question?" or "Did that solve your issue?"
              3. Ask "How would you rate the quality of this call today?"

              Then use the positive answers to #2 and the high ratings from #3 to up-weight those conversations in future training of the AI.

            • by dfghjk ( 711126 )

              Because the most important problem to solve when using shitty AI coding crutches is keeping it a secret.

              "what software development company that wasn't adamantly opposed to AI wouldn't buy one?"
              Ones that solve the problem in other ways?

              This is a product that gets a very specific CPU down to a price and form factor suitable for a single user at a desk. It's not a server for "hundreds of programmers". Perhaps "what software development company" would be a company that understands that reality.

      • Re: (Score:2, Insightful)

        And NVidia's AI will be siphoning out the rest...
    • Just getting that much VRAM costs more than 3k... If it's not crippled in some way I can see these flying off the shelves.

      I suspect it will as well, especially if you can run a server on it so a small company could set up its own secure AI system, free of cloud fees, to access the system's power. We are experimenting with using AI for a product and run a model on an M3 Max Mac; this would be a whole new level of capability at a bargain price.

    • But how is it at email?
    • by ClickOnThis ( 137803 ) on Tuesday January 07, 2025 @11:30AM (#65070035) Journal

      Just getting that much VRAM costs more than 3k... If it's not crippled in some way I can see these flying off the shelves.

      Some more details are in this other article. [theregister.com] An excerpt:

      Project Digits vaguely resembles an Intel NUC mini-PC in terms of size. Nvidia hasn’t detailed the GB10’s specs in full but has said the machine it powers delivers a full petaFLOP of AI performance. But before you get too excited about the prospect of a small form factor desktop outperforming Nvidia’s A100 tensor core GPU, know that the machine’s performance was measured on sparse 4-bit floating point workloads.

      Specs we’ve seen suggest the GB10 features a 20-core Grace CPU and a GPU that manages about a 40th of the performance of the twin Blackwell GPUs used in Nvidia’s GB200 AI server.

      So, 1/40th the performance of twin Blackwells. I don't suppose that counts as "crippled" but there you go.

      • by Rei ( 128717 )

        But that's like a $70k piece of kit that does 40 PFLOPS at FP4 with 384GB VRAM.

        So by that estimate, this would do 1 PFLOP at FP4, with 128GB VRAM, for $3k. I'm not complaining. And it's still on the same chip architecture, so efficiency should be just as good.

        • Thanks for the reply. The question was whether it's crippled. I'm going with "probably not" but let's wait for the independent specs.

          I might even get one myself.

    • But can it run Crysis?

  • by bugs2squash ( 1132591 ) on Tuesday January 07, 2025 @11:29AM (#65070029)

    It reminds me of the old "Byte" days of reporting progress in flops - like page 143 featuring this advert for a "Screamer 500" [vintageapple.org] from 1997.

    Running on a 500 MHz 21164 that bursts at 1 gigaflop, a dot product kernel we use for compiler testing runs at a mindboggling 940 megaflops! ! !

    They may not have been able to compete with modern performance, but "Screamer", that's a great name

  • For who? (Score:3, Interesting)

    by CEC-P ( 10248912 ) on Tuesday January 07, 2025 @11:30AM (#65070037)
    Any company would use a bunch of servers and central data storage, not a standalone "personal" AI device. Based on Gemini reactions and Copilot sales, no individuals have any interest in AI, let alone running their own LLMs or anything else related to AI. So who are they trying to sell this to and for what purpose? I think this is just "make stock go up" AI bullshit before the bubble bursts.
    • by bjamesv ( 1528503 ) on Tuesday January 07, 2025 @11:50AM (#65070115)

      So who are they trying to sell this to and for what purpose? I think this is just "make stock go up" AI bullshit before the bubble bursts.

      Well, this is a small ARM board with big GPU (Blackwell is ARM) and I used their previous $3000 small ARM board (Orin AGX) for compute-heavy operations in mobile, battery-powered robotic platforms. The AGX is several years old, had 64GB and could do 0.275 petaflop at 60W, or about half that on 15W. https://www.ebay.com/itm/22493... [ebay.com] (Used, 2.6k usd)

      This seems like an update to the years-old AGX: a small, portable package with 4x the performance and 2x the RAM at the same price point, so I imagine it targets the same developer audience that needs small mobile/distributed compute.

      • *Edit: sorry, "Grace" is the ARM CPU. Will be interesting to see if this uses the mobile/Tegra version of Nvidia CUDA or if this cut-down Blackwell GPU runs the ARM datacenter/"Titan" version of the CUDA libs. Personally I find the datacenter CUDA distribution easier to work with.
        • by night ( 28448 )

          It runs their cloud/server stack that's in DGX OS, an Ubuntu derivative. It's much more closely related to the GB200 than to Tegra-related devices like the Orin.

    • by EvilSS ( 557649 )
      It's for AI developers. The official press release makes this pretty clear: NVIDIA Puts Grace Blackwell on Every Desk and at Every AI Developer’s Fingertips [nvidia.com] . Being able to run 200B models locally for $3K is insanely good. Plus you can link two of them to run 405B models. For companies doing AI development this could be a huge cost savings vs using cloud or datacenter GPUs for their developers.
    • You can probably pip install jupyterhub and make Jupyter available on it over the network.
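      A minimal sketch of that, assuming the stock DGX OS image, existing local Unix accounts, and JupyterHub's default PAM authenticator (the user names are placeholders):

        # jupyterhub_config.py -- start with: jupyterhub -f jupyterhub_config.py
        c = get_config()  # noqa -- injected by JupyterHub's config loader
        c.JupyterHub.bind_url = "http://0.0.0.0:8000"      # listen on the LAN, not just localhost
        c.Spawner.default_url = "/lab"                      # drop users into JupyterLab
        c.Authenticator.allowed_users = {"alice", "bob"}    # existing local (PAM) accounts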

      • by night ( 28448 )

        Indeed, the entire CUDA related stack and any tools that use it should "just work".

    • Wow, have you not even tried this yourself??? Get a halfway decent video card and try ollama? Educate yourself so you don't get left behind and make silly statements.
  • by Anonymous Coward
    I have one question about this: how did they get that many hamsters to run on their wheels all at the same time?
    Now in addition to money and resources being wasted on AI nonsense, they're going to lure people into impoverishing themselves even further than they already are. People struggle to afford their rent and Nvidia wants to convince them to buy a computer that starts at $3000?
  • by necro81 ( 917438 ) on Tuesday January 07, 2025 @11:36AM (#65070063) Journal
    The summary says "up to one petaflop" but, as HotHardware points out [hothardware.com], they're really talking about 4-bit floats, not a more general 32- or 64-bit floating point operation. That's appropriate for AI workloads, but makes comparison to other systems a bit tricky.

    Still, it's a slick package and a lot of power. The available images don't show any active cooling, which is hard to fathom. They probably just omitted that (and heat sinks generally) from the press materials. Is it just me, or do the front and rear panels look like they're made of copper sponge?
    • by night ( 28448 )

      FP16 should run approximately 4x slower on the Nvidia personal AI, so about 250 teraflops. FP16 is much more commonly quoted for other CPUs and GPUs.
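      As a quick sanity check on that arithmetic, assuming the usual 2x from halving the precision and another 2x from dropping sparsity (Nvidia hasn't published dense FP16 numbers for the GB10):

        # Scale the quoted sparse-FP4 figure down; both factors are assumptions.
        fp4_sparse_pflops = 1.0
        fp16_dense_tflops = fp4_sparse_pflops * 1000 / 2 / 2   # /2 precision, /2 sparsity
        print(f"~{fp16_dense_tflops:.0f} dense FP16 TFLOPS")   # ~250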

  • 128GB max RAM? The old M2 Mac Pro can do more, and other PC hardware can go much higher.

    • by EvilSS ( 557649 ) on Tuesday January 07, 2025 @12:08PM (#65070181)
      128GB of unified RAM so the GPU has access to it. Important distinction when talking about LLMs.
      • by dfghjk ( 711126 )

        Quite a differentiation over Apple Silicon! /s

        • by EvilSS ( 557649 )
          At the 3K price point it's a good competitor to Apple in this space*. A Mac Mini with half the RAM costs about what this costs.

          *Pending third party performance testing of course.
          • by dfghjk ( 711126 )

            But the comment wasn't about cost, it was about an "important distinction" being unified RAM which all Apple Silicon has.

            But if you're gonna make stupid comments and move the goalposts, a Mac mini with half the RAM is a far more capable device over a broad range of applications. This is a very specific device for a particular application, it does not replace a general purpose computer (well). But again, to be clear, unified memory is not a special feature, or even a good one. It is well suited to this ca

      • Apple Silicon is unified RAM.

        • by EvilSS ( 557649 )
          And? I didn't say this was the ONLY device with unified RAM. If you think I did, please point out where in my post you got that idea from.
          • by dfghjk ( 711126 )

            You said it was an "Important distinction when talking about LLMs" specifically when compared to an Apple silicon Mac. You literally said that unified memory distinguished this device over a Mac, and it was "important".

            But no, you didn't say it was the ONLY device with unified RAM, and neither did the person you responded to. What you said was that unified RAM distinguished this device over an M2 Mac Pro, which is not true. And now you're making bad faith arguments. It must really suck being you.

        • by EvilSS ( 557649 )
          And to be clear, yes, it does. However, Joe also put this in the same sentence: "and other PC hardware can go much higher," which is why I responded as I did, as PCs generally do NOT have unified RAM, which tells me Joe doesn't understand the distinction. Also, that M2 Mac Pro baseline config is over 2x the cost of this and only has 64GB of RAM. Even the Mac Mini M4 maxed out costs what this does with only 64GB. The only caveat here being the NVidia product uses LPDDR5X so that's why I mentioned 3rd party
          • by dfghjk ( 711126 )

            This machine is not a general purpose computer, it is a specific function device optimized for development tasks. It has a truly shitty CPU complex compared to a Mac. Your argument is very SuperKendall-esque, cherry picking details and intentionally misrepresenting both facts and previous arguments. You sound like an old Apple fanboy.

            "PC's generally do NOT have unified RAM"
            Not true, and that feature is not new. It's generally an inferior approach used on cheap PCs, but repopularized specifically in Appl

    • VRAM has vastly higher bandwidth.
    • by night ( 28448 )

      Right. But the vast majority of x86 laptops and desktops have 128 bit wide memory and would be severely bottlenecked for AI workloads. This widget should have a faster memory system (widely speculated to be 256 to 512 bits wide). No official word from nvidia ... yet.
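      The back-of-the-envelope math, assuming LPDDR5X at 8533 MT/s (both the transfer rate and the bus widths here are speculation, not Nvidia specs):

        # Peak bandwidth ~= bus width in bytes * transfer rate
        transfer_rate_mts = 8533                  # LPDDR5X, assumed
        for bus_width_bits in (128, 256, 512):
            gb_per_s = bus_width_bits / 8 * transfer_rate_mts / 1000
            print(f"{bus_width_bits}-bit bus: ~{gb_per_s:.0f} GB/s")
        # ~137, ~273, and ~546 GB/s respectively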

      • by dfghjk ( 711126 )

        Right, and tiered memory architectures have been proven inferior, right? That's why we only have cache-less, unified memory systems today.

  • Up to 4TB of storage, so only one M.2 slot?
    Does it have SATA?
    How many PCIe lanes does it have?
    What kind of I/O does it have?
    Does it have PCIe slots?

    • It's there to do your thinking for you, dude, not to store your p0rn.

      • It's there to do your thinking for you, dude, not to store your p0rn.

        When it can store as much porn as our brains, then it can do our thinking for us. Until then, it's just a useless business toy. Sigh.

  • can I use this machine to mine Bitcoin?

  • Really?
    Nobody?
  • Look at the picture they posted to go along with this press release. Look closely. Exactly what alphabet is this obviously AI generated image using? Klingon? Hallucinations are pretty funny. AI THINKS that those shapes look like language, as though we just write shapes on paper and they have some meaning, throwing away an established alphabet.

    • Very good catch! Even the window titles, and OS-level "icons" are obviously bull.

    • by dfghjk ( 711126 )

      "AI THINKS that those shapes look like language..."
      What makes you think that AI believes those shapes are supposed to be language?

      "...as though we just write shapes on paper and they have some meaning..."
      That is exactly what "we" do.

      "...throwing away an established alphabet."
      What evidence is there that AI is even capable of that? An "established alphabet" is a fundamental component of how AI works with language, AI couldn't "throw away" its alphabet without discarding itself.

      • by alta ( 1263 )

        The AI placed that 'text' where text would go.

        WE have already agreed on alphabets to use. We don't make them up each time we write a new document.

        It's a bit of hyperbole, it didn't literally throw it away. But it didn't use an established alphabet. It didn't even use its own, as the 'letters' don't have re-use; each is unique. And image diffusion works quite differently from LLMs, including their use of language, so saying it would discard itself is a little silly.

        But I'm glad you had the time to

        • by dfghjk ( 711126 )

          "The AI placed that 'text' where text would go."
          You don't have any reason to believe AI was even involved in generating that image.

          "WE have already agreed on alphabets to use. We don't make them up each time we write a new document."
          But we did at one time. Alphabets are literally just shapes written on paper, to use your specific language. We literally write shapes on paper to create writing. You didn't say the shapes were entirely new each time, although that has historically also been true. Not all lan

  • It's all about the GB/s and memory pool.

  • "Can operate from a standard power outlet" without any specification of where in the world this standard power outlet is supposed to be amounts to between 1.5 and 3 kilowatts. So this has your virtual companion's electricity costing you roughly the same as the regular 100..200 pound human chained to the radiator in your basement is costing you in pizza rolls.
    • by dfghjk ( 711126 )

      LOL the GB200 has a TDP of 2700 watts, this is 2-3% of that processor. Not that a 3KW wall outlet is a problem, BEV chargers are commonly 3x more powerful than that.

      You might want to take a look at the size of the product, then wonder where 3000 watts could possibly go.

      • So when the electric company collectors show up looking for their money, you just explain to them that the thing isn't drawing any more power than a 12kW rack server generates in waste heat, and also you could run fully three of them out of an EV socket and get away with it maybe. Might work as a Chewbacca defence I guess.
        • | sed 's/12kW/120kW/'
        • by dfghjk ( 711126 )

          Wow stupid has no limits, huh?

          No, the argument is that you could run 30 of them out of an "EV socket" because your assumption that it consumes that much power is bullshit. Get it, short bus boy?

  • I still remember the days of the Riva 128 graphic cards that polygon-cracked whenever the player turned. From those humble beginnings I too played a part in feeding what is turning out to be quite the evil monopoly. Sigh. If only AMD would make a stand. If only 3DFX had survived their mistakes. Sigh.
  • Don't forget to add on the Trump Tariffs
    (for any components not made in the USA)

  • At $0.17/KWhour, what will it cost to run? And keep cool?

    Can it heat my house?

    • by dfghjk ( 711126 )

      Did you see the package? Where do you think a heat exchanger of your imaginary capacity could possibly go?

      What will it cost to run? Probably about as much as a desktop computer. It requires no special power source or cooling.
