
Microsoft Needs So Much Power to Train AI That It's Considering Small Nuclear Reactors (futurism.com) 113

An anonymous reader shares this report from Futurism: Training large language models is an incredibly power-intensive process that has an immense carbon footprint. Keeping data centers running requires a ludicrous amount of electricity that could generate substantial amounts of greenhouse emissions — depending, of course, on the energy's source. Now, the Verge reports, Microsoft is betting so big on AI that it's pushing forward with a plan to power its data centers using nuclear reactors. Yes, you read that right; a recent job listing suggests the company is planning to grow its energy infrastructure with the use of small modular reactors (SMRs)...

But before Microsoft can start relying on nuclear power to train its AIs, it'll have plenty of other hurdles to overcome. For one, it'll have to source a working SMR design. Then, it'll have to figure out how to get its hands on a highly enriched uranium fuel that these small reactors typically require, as The Verge points out. Finally, it'll need to figure out a way to store all of that nuclear waste long term...

Beyond nuclear fission, Microsoft is also investing in nuclear fusion, a far more ambitious endeavor, given the many decades of research that have yet to lead to a practical power system. Nevertheless, the company signed a power purchase agreement earlier this year with Helion, a fusion startup backed by OpenAI CEO Sam Altman, in hopes of buying electricity from it as soon as 2028.

This discussion has been archived. No new comments can be posted.


Comments Filter:
  • BMCoD (Score:5, Funny)

    by dskoll ( 99328 ) on Saturday September 30, 2023 @10:41PM (#63890931) Homepage

    Blue Mushroom Cloud of Death...

    • That actually makes a lot of sense. Put Skynet's training nest inside a nuclear reactor, and the USAF will think twice about blowing it up while it has the chance.....

      (.... then again, the Russians may have no such qualms)

    • Nuke the entire site from orbit--it's the only way to be sure.

  • I mean, MS could probably just write a check and buy a grid-scale-battery-tech startup if it wanted to.

    That, plus wind/solar/other-green tech that doesn't make nuclear waste means you won't have to fight the environmental lobby. Well, not as much anyway.*

    * Not all forms of renewable energy are "green" once you factor in the hit to the environment for building your plant and replacing/maintaining its parts that wear out, such as batteries. But at least you won't have the political problem that comes with nuclear.

    • by Kernel Kurtz ( 182424 ) on Saturday September 30, 2023 @11:06PM (#63890963)

      I mean, MS could probably just write a check and buy a grid-scale-battery-tech startup if it wanted to.

      They could, but they would still have to charge the batteries which means many more checks.

    • by quenda ( 644621 )

      I mean, MS could probably just write a check and buy a grid-scale-battery-tech startup if it wanted to.

      They could, and I'm sure they are looking at that too. "Less risky" means investing in multiple areas, not putting all your chips on one number. We need to be investing in nuclear, batteries, and much more.

      plus wind/solar/other-green tech

      Solar is mature and cheap, with wind getting there. They don't need help. It is the grid-level storage as you say, that needs risk-tolerant investors.
      Another approach is to make the computers cheaper, so they can build more and only run them when the sun is shining. No batteries needed.

    • by guruevi ( 827432 )

      They're trying to DECREASE their carbon footprint, not increase it.

    • You are correct that the problem with zero carbon power is political not technical, but could you please stop repeating the fake talking point about storing nuclear waste. We are only storing nuclear waste because we are too stupid to recycle it. The problem is caused by Russian subversion of the “green” movement.
  • by SuperKendall ( 25149 ) on Saturday September 30, 2023 @10:53PM (#63890945)

    For one, it'll have to source a working SMR design.

    NuScale already has a small modular reactor design certified [energy.gov]

    The waste is roughly equivalent to conventional reactors [powermag.com], but don't forget that there will be a lot less of it. Waste storage is very much a solved problem.

    • by Anonymous Coward

      Waste storage is very much a solved problem.

      That was news to me. Could you show me where there is one in operation?

    • Gates and Buffett are already putting money into a molten-salt reactor design. Proposed site is in Wyoming on land previously used for a coal-burning power plant (indirectly owned by Buffett). I actually give them a better chance of getting something up and running than the NuScale-based project in Idaho. The NuScale project is likely to run into serious financial problems before much longer. The Gates/Buffett project is, at least, backed by people who can pick up the phone and call people who might be
    • Waste storage is a solved problem? That's certainly news to me. I'm not trying to say anything about nuclear in a broad sense, but last I heard we still hadn't been able to find a place to permanently store our waste here in the US, which means this problem has not been solved.

    • by AmiMoJo ( 196126 )

      Have you looked at NuScale's design? It's... not great.

      It needs more regular refuelling, which means more waste. Long term storage is an unsolved problem in many countries, including the US. Short term they need a bigger pool for it.

      And you still need a cooling pool for the reactor, and a containment building. All weather, Earthquake, and terrorism proof. Which means surveys, more land needed...

      They aren't commercially viable. Demonstration devices at best. Unproven serial production. Very far from plug-and-play.

  • by zenlessyank ( 748553 ) on Saturday September 30, 2023 @10:55PM (#63890947)

    And Musk puts it in his rocket. AI takes it over. This is so fucking cool.

  • by paul_engr ( 6280294 ) on Saturday September 30, 2023 @10:56PM (#63890953)
    LLMs and this kind of AI is a stupid fucking idea
    • Better this than doing proof of work for random shit cryptocurrencies. At least there's an application for an LLM.

      • Is there, though? The work output I've seen is all garbage. Something trained on society can only poorly puppet society. We're a dumb species.
        • by Rei ( 128717 ) on Sunday October 01, 2023 @04:06AM (#63891219) Homepage

          I seem to go through about three ChatGPT sessions per day, with each session averaging about 4 generations. So while I can't speak for others, I do use it quite a good bit.

          Last night I was trying to figure out what to do with a kohlrabi bulb I got from the garden. I asked it for ideas. It suggested a bunch of recipe ideas. I selected the kohlrabi fritters. It made a recipe. I asked for it converted to metric. Then I had it make it egg-free, as I had no eggs in the fridge. Then I had it cut down the recipe, telling it I only had one small kohlrabi bulb instead of two large ones. Then I asked it for ideas for sides. I chose the couscous one and asked for a recipe.

          It was all delicious.

          Also in yesterday's sessions:
            * Asking it for a way to read zstd files without fully decompressing their contents first, akin to zcat.
            * Asking for code to read random blocks of zstd files
            * Asking for ideas for why a tenant might not be able to get evicted from an apartment, to employ in a snide remark
            * Generating an AI-generated response to a person complaining about AI-generated content, as meta-commentary.
            * Asking for a Google Sheets formula to selectively average a column across certain parameters
            * Asking it to generate a list of positive things about fossil fuels, since whether AIs like ChatGPT would answer that question was a topic on Slashdot yesterday.
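
For what it's worth, the "selectively average a column" item is the sort of thing Sheets' AVERAGEIFS handles directly; a hypothetical example (the column layout here is invented):

```
=AVERAGEIFS(C:C, A:A, "widget", B:B, ">=2023-01-01")
```

That averages column C over just the rows where column A is "widget" and column B is on or after the given date.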

          This is all just ChatGPT. I actually ran about 7000 LLM queries locally yesterday, but that was almost entirely generating training data to specialize a lightweight AI to a specific task.

          • by jenningsthecat ( 1525947 ) on Sunday October 01, 2023 @09:02AM (#63891573)

            All the time I was reading your answer, I was lamenting your abandonment of personal creativity and synthesis to a resource which others control and can make too expensive to use - or even deny entirely - at any time.

            I can't criticize you - I use search engines so much that it would be hypocritical of me to do so. I'm simply pointing out how much personal agency and effort we're giving up. In some sense this process is almost the very definition of civilization and its growth; but we're approaching near-verticality on that curve, and it's going to be one hell of a descent when all of these mind-surrogates collapse and leave us in free-fall...

            If mankind survives, future generations will talk about this the way we talk about the fall of the Roman empire.

            • by Rei ( 128717 )

              Do you spin your own flax too?

              • Do you spin your own flax too?

                No. If I was that dedicated to independence then I'd go to the library instead of crafting search-engine queries, and I'd walk more than I do, and so on.

                Further to the point I think you were making, I am acutely aware of how reliant I am on the work of others to ensure my own survival, never mind my own comfort and ease. How far along this interdependence path constitutes 'too far' is debatable; but given the economic hardships we're now undergoing - caused at least in part by globalization taking a couple

                • The alphabet is a crutch.

                  I read or heard recently where someone was discussing an ancient people who didn't care to use writing because it diminished their memory training. I forget now who it was, but I think it was Druids or Germans.

                  I should have written it down.

                • Considering that the US still has a major illiteracy (and by extension education) problem this is not exactly as ridiculous of a concern as some might believe. Our schools are nowhere near doing what is needed to foster the next generation of thinkers, creatives, and warriors needed to maintain the US.

                  Without proper parenting and education, LLM tech will likely end up replacing, rather than augmenting, the intelligence of our children. Kids are already using it to copy-paste answers from ChatGPT rather than

          • So, how much did they charge you? As in, if it needs a nuke to power it, it is not free or cheap
          • by AmiMoJo ( 196126 )

            Be careful with recipes. People have been getting poisoned by bad info on mushrooms from ChatGPT. I wouldn't trust it with stuff like allergy advice.

        • "AI" is incredibly good at generating hype.

      • by Misagon ( 1135 )

        Yes, breaking other people's copyright.

        Microsoft had been looking for years to crack the open source movement, until neural network technology was mature enough that they could create Github Copilot.

      • Better this than doing proof of work for random shit cryptocurrencies. At least there's an application for an LLM.

        My thought exactly. At least you can get good kohlrabi bulb recipes out of an LLM.

    • by jwhyche ( 6192 )

      What the hell are they training it to do that they need so much power?

    • It's more of a sign that we need to start planning for a future where we're looking to an energy system and power grid that can provide terawatts and petawatts cheaply and efficiently rather than megawatts and gigawatts. Minor improvements in energy conservation are nice and all, but they'll never overcome the upward trend of energy usage we really need to consider and prepare for. Real jumps in advancement that can ultimately address humanity's longest term problems (asteroid defense, mitigating supervol
    • No, it's a sign that the technology is still in its infancy. Every technology starts out being wasteful and expensive. Over time, engineers streamline it and improve its efficiency.

      With any technology, first you make it work, then you make it work better and more efficiently. A year ago, few people had even heard of LLMs. Give it time.

  • Big data is the perfect use case for SMRs, and could be just what they need to achieve critical mass (no pun intended) in other industrial uses. Don't like Microsoft but will give them props if they ever become due.
    • Big data is the perfect use case for SMRs ... Don't like Microsoft but will give them props ...

      You DO realize that it'll be controlled by Microsoft Windows, with its many perfect patches and random system reboots?

      And they'll need an additional patch to update the BSOD color to Cherenkov Radiation Blue. What's the QR error code for "Out Of Container Experience (previously: OOBE) melting in the background?"

  • Once they start growing and harvesting humans for energy, we know we're done for...

  • by ChunderDownunder ( 709234 ) on Sunday October 01, 2023 @12:35AM (#63891041)

    Look, tbh, I'm as brainwashed a NIMBY as anyone, living in a country with a moratorium on nuclear energy. So the initial challenge is selecting a jurisdiction that would allow assembly, safe operation and, ultimately, decommissioning of said resource for a for-profit US corporation without cutting corners on post-processing that delicious radioactive waste.

    I guess being in the cloud, they can host these language models anywhere. And if they're language models using anonymized training data, then there are fewer pesky regulations about storing user-sensitive information on foreign servers.

    • by AmiMoJo ( 196126 )

      The initial challenge is developing a proven, commercially viable, small reactor. They don't exist. The only company even building prototypes is NuScale, and theirs is just a demonstrator. It will need a significant amount of improvement, really a second generation, to become useful.

  • More than iron
    More than lead
    More than gold I need electricity
    I need it more than I need lamb or pork or lettuce or cucumber
    I need it for my dreams
    Racter - 1984

  • Economies of scale? (Score:5, Interesting)

    by ctilsie242 ( 4841247 ) on Sunday October 01, 2023 @01:07AM (#63891061)

    Getting companies to hop on the nuclear bandwagon might just give the economies of scale that can bring mainstream nuclear power out of the 1950s era and into a modern age of new-gen designs, thorium reactors, and designs that can be implemented, built, and inspected at the factory, trucked (or flown) to a location, used, and easily decommissioned, with breeder reactors ready to take the spent fuel and reprocess it.

    We spent fifty years in a propaganda haze where nuclear was viewed as "scary". It might be time to come out of the energy Stone Age.

    Energy is wealth, and having nuclear energy bring down energy costs can greatly help every facet of life: not just handling CO2 emissions, but having things in place to mitigate them, be it desalination plants or thermal depolymerization factories that can take waste plastic and return monomers or even mineral oil. Even things like converting CO2 to gasoline, while electrical infrastructure is brought up to par for BEVs. With nuclear, we can have that net-zero carbon emissions.

    Of course, nuclear isn't everything. Getting battery energy density by weight and volume to within an order of magnitude of diesel or gasoline would radically change energy usage as we know it... but nuclear is something we can use in an interim until fusion is viable, storage batteries are usable so PV solar can handle base load, and fuels can be synthesized from CO2 and reused.

    The more companies throwing money at making usable reactors, the better. Maybe we can get a reactor that not only powers a data center but also powers the data center's cooling. Any excess power can be thrown onto the grid. Even if it's just power at night, due to the need for cooling during the day, that will help with EV charging.

    • Energy is wealth...

      I agree with pretty much everything you said. But I would re-word this one quote as "Energy is salvation".

      I'm inherently anti-nuke because of its high and potentially terminal short-term risk when compared with other sources of energy. But I have come to believe that at this late stage of global warming, rapid rollout of nuclear energy, accompanied by a large and rapid drop in fossil fuel use, is the only path that might save our sorry asses.

  • I apologize for the mistake in my previous response. You are correct that I didn't need to access that nuclear power source.

    M5 [youtube.com]

  • Sleeping humans in pods sounds like a better power source for A.I.
  • ... who looks into that topic. The most likely result will be that unless some magical "disruptive" event happens, it'll still be by far the most expensive way to power a data center.
    However hiring someone to look into that is essentially free for Microsoft, and it frees upper management from having to do hard thing like logic or math.

  • by Walt Dismal ( 534799 ) on Sunday October 01, 2023 @02:36AM (#63891111)
    Odd, isn't it, that the human brain does all this but operates on 20 watts. Hint hint - LLMs can be useful, but they are on the wrong path. The architecture can't match the immense parallelism of the human brain. That should be clearer, but the AI field these days is a herd of stupid clever people all gobbling GPUs. Wise up -- the vector-knowledge GPU approach is flawed in some ways.
    • That. In the end, it will come to efficiency, and we'll discover that it's much more efficient to breed a smarter monkey.

    • by Rei ( 128717 ) on Sunday October 01, 2023 @04:43AM (#63891251) Homepage

      The brain has several things going for it. Sparse activation - at any point in time in the cortex, only a couple percent of neurons are firing, and in response to any given stimulus, only 10%-ish might fire at all. It rewires itself in response to stimulus, with specific tasks only having the compute capacity they need. Rest energy consumption is very low. The brain seems to train severalfold (but not orders of magnitude) more efficiently than LLMs as well.

      Perhaps the biggest is extreme synaptic efficiency. The heaviest-weight GPT-3 model (175B parameters) has 96 layers with 96x128 heads (remember that "parameters" include not just biases, but the weights on every single connection between neurons, so parameter counts will always vastly exceed "synapses"). If you're generating say 10 tokens per second per 300W card, then that's equivalent to ~120k synapses per second and ~400 synapses per watt. But the human brain does a couple quadrillion synapses per second, give or take an order of magnitude. It's not having to use matrix math with large matrices to do things like sum weights or do backpropagation for each neuron - that just emerges naturally from the chemistry. It's like the difference in energy required to pour a beaker of a liquid and measure it, vs. the amount of energy to computationally sum up the masses of all of the atoms in that liquid.

      It's not entirely lopsided, mind you - the brain has a lot of waste. There's a lot of redundancy, because neurons can die. It has to use neurons (white matter) basically as a data bus instead of just using a data bus. Per-neuron firing rates are low (though more than offset by having huge numbers of them). Etc. But overall, it's very efficient. And it has to be. Our brain is only ~2% of our body weight but ~20% of our energy consumption. As a species, we wouldn't have made it this far if our brains had been significantly less efficient.
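
A back-of-envelope version of that comparison, deliberately generous to the GPU by counting every parameter touched per token as a "synaptic event" (all figures are the rough assumptions from above, not measurements):

```python
# Rough efficiency comparison: GPU LLM inference vs. the human brain,
# in "synaptic events" per joule. Order-of-magnitude assumptions only.
params = 175e9        # GPT-3 parameter count
tokens_per_s = 10     # assumed generation rate on one accelerator
gpu_watts = 300       # assumed accelerator power draw

# Generous to the GPU: every parameter touched per token = one event.
gpu_events_per_joule = params * tokens_per_s / gpu_watts

brain_events_per_s = 1e15  # ~a quadrillion synaptic events/s, give or take
brain_watts = 20
brain_events_per_joule = brain_events_per_s / brain_watts

ratio = brain_events_per_joule / gpu_events_per_joule
# Even with this GPU-friendly accounting, the brain comes out
# thousands of times more efficient per joule.
```

Under these assumptions the ratio lands in the high thousands; shift any input by an order of magnitude and the brain still wins comfortably.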

      • Yeah, convolutional neural nets need a completely new approach in how they are implemented. At the moment we're basically simulating them, which is horrendously inefficient. It reminds me of things like the SAW filter - which is basically able to do absurdly high speed convolution operations very efficiently and cheaply compared to using a DSP. If CNNs continue to produce progress, I'd imagine this sort of area is where the big breakthroughs are going to occur.

        • by Rei ( 128717 )

          And the thing is, it's not like one needs super-high precision, esp. in inference. Like, some people are running models quantized down to as low as *two bits* per parameter. You can take a LOT of analog noise during inference without problems.
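
A toy illustration of what "two bits per parameter" means (this uniform scheme is invented for illustration; real quantizers like those used for local LLMs are considerably cleverer):

```python
# Toy 2-bit quantization: map each weight to one of 4 evenly spaced
# levels spanning the min..max range, then reconstruct.
def quantize_2bit(weights):
    lo, hi = min(weights), max(weights)
    step = (hi - lo) / 3  # 4 levels -> 3 intervals
    codes = [round((w - lo) / step) for w in weights]  # each fits in 2 bits
    return codes, lo, step

def dequantize(codes, lo, step):
    return [lo + c * step for c in codes]

weights = [-0.31, 0.02, 0.27, -0.05, 0.11]
codes, lo, step = quantize_2bit(weights)
recovered = dequantize(codes, lo, step)
# Every code is in 0..3, and every recovered weight lands within
# half a quantization step of the original.
```

The point stands either way: each stored value carries only 2 bits of information, yet inference can tolerate the resulting noise.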

    • by Njovich ( 553857 )

      Great idea - why don't you just pull one of these brain-type machines or algorithms out of your hat so we can use them. You can probably collect billions of dollars and a Nobel Prize with it too, so that would be double smart.

    • In theory, the training of these trillion+ parameter models is the most expensive aspect. Humans have a very long and expensive training process. Around a quarter of our calories are used by the brain, but I'm going to ignore that distinction since you do need to interface with the outside world to train that brain.

      It takes a human brain about 20 years to be useful. During that time we eat an average of 1500 kcal/day = 6 MJ/day, and after 20 years we've used up 45 gigajoules (12 MWh). This gives us our baseline.
      • by ThosLives ( 686517 ) on Sunday October 01, 2023 @09:50AM (#63891643) Journal

        What do you mean, 20 years for humans to be "useful"? Humans 4 years old can talk, often in multiple languages, with semantic understanding even if basic. Many can even read at that age. They can do pattern recognition. They can walk around without running into things. They can perform eye-hand coordination tasks. Adolescents used to be employed to do a huge range of complicated physical tasks. They can play sports (well!). Teenagers can operate vehicles, draw, read, sing, compose and play music. This is not "takes 20 years to be useful."

        What these companies are doing isn't "training" like education - they are simulating millions of years of evolution process to find the correct "structure" of the neural nets that demonstrates the flexible learning and wide range of capabilities of the human mind. This is why it's so "expensive."

        I agree with the grandparent that the current state of the art is "doing it wrong": trying to compute a neural net is going to be much more energy expensive and, I would say, flawed, compared to just building an analog (not digital) neural network. The breakthrough is going to be when we figure out what (analog) structure to build, not just set up a giant matrix solver.

        • I want to start this by saying, I agree with my calculations that humans take way less power to 'train' than 'AI'. But humans don't take '20 watts' to train; there's a big setup cost, which requires a surprisingly small amount of energy. Although it's really hard for a human to compete with the upper bound of ChatGPT's running expenses, so they're shockingly energy-efficient there, in the same way I can't compete with the electric costs of a pocket calculator in my energy efficiency at solving math.

          > "What do yo
    • You've identified the right problem but underestimate the extent to which it's being worked on. In your post you referenced GPUs and not CPUs because we understand it's a parallel problem. Lots of smart people are working on it and gpus are incredibly parallel compared to what was possible in such a small space 20 years ago. On the other hand we're still working with silicon because it's the best we've got. It also has some massive advantages: even if it was power efficient, no one would be very interested
    • You beat me to the 20W model. A Meat Language Model can be trained to university entrance level with 20 * 24 * 365 * 18 = 3154 kWh equivalent computing unit energy dissipation. Dissipations other than thermal give a success rate of maybe 8% but we still come out far ahead.
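
The parent's Meat Language Model arithmetic checks out as a one-liner:

```python
# Sanity check: a 20 W brain, running continuously for 18 years, in kWh.
watts = 20
hours = 24 * 365 * 18
kwh = watts * hours / 1000
print(kwh)  # -> 3153.6, i.e. the ~3154 kWh figure quoted above
```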
  • by Fons_de_spons ( 1311177 ) on Sunday October 01, 2023 @02:47AM (#63891123)
    Train the thing when there is an excess of wind or solar power. We can wait.
  • "Driver fault in reactor 2632 - Ignore, Restart, Evacuate"

    Looking forward to a bright future...

  • Come on. We've all known for 24 years that for AI purposes, human batteries are much more effective than nuclear power. Microsoft should know that.
  • Surely it would be cheaper to employ more people. Natural, not Artificial, Intelligence.

  • Sub nukes use highly enriched uranium because the design goal (many years ago) was to be small.
    Modern reactors are small and have been designed with safety in mind, without such fuels

  • The "AI" threat a planet full of still employable humans constantly worries about finds perfect justification to cross 4,725 miles of red tape that often extends across several decades to consider nuclear power....

    ...but bitcoin mining was/is somehow a corrupt waste of unjustified power consumption?

    Skynet thanks you for your support.

  • What do you think will be completed first? A nuclear fusion powerplant or GNU/Hurd?

  • by VeryFluffyBunny ( 5037285 ) on Sunday October 01, 2023 @07:45AM (#63891425)
    AI aaaaannnd nuclear power aaaannnd privatised?!! I think I just came in my pants.*

    *I say that in facetious jest. One's trousers are unsoiled.
  • If it gets us to finally go nuclear, then good.
  • Are there any other cases of a Private Nuclear Power Plant?
  • More like gather, categorise, and store vast amounts of data it arguably has no right to have. Only then do you get to training, and what does that even mean?
  • Or about how Microsoft is again demonstrating that it can't fundamentally really do anything without fucking it up for nearly 50 years?
  • Siting small reactors with a data center and eliminating grid fees brings the medium-term kWh cost of something like a NuScale reactor to roughly the same as the cheaper carbon sources, and the long-term cost to even less. Ownership has its advantages.
  • Hopefully MS doesn't insist the reactor's control systems run Windows. :-O

    (Noting that several articles indicate they typically use RTOS like QNX -- including this from 2006 Nuclear plant powers up on real-time OS [itbusiness.ca].)

  • So, Microsoft now has both the financial resources and the power need that could justify a serious research/development project to produce a production Thorium reactor. -- The original problem that the DOE had with Thorium reactors was that they were almost useless for producing nuclear weapons (an important 'side effect' of Nuclear power back in the '70s). Microsoft has no need (one would hope) for nuclear weapons, but could definitely use the thorium promise of a far smaller radiation waste footprint.
