Microsoft's CTO Hopes to Swap Most AMD and NVIDIA GPUs for In-House Chips (theregister.com) 44
"Microsoft buys a lot of GPUs from both Nvidia and AMD," writes the Register. "But moving forward, Redmond's leaders want to shift the majority of its AI workloads from GPUs to its own homegrown accelerators..."
Driving the transition is a focus on performance per dollar, which for a hyperscale cloud provider is arguably the only metric that really matters. Speaking during a fireside chat moderated by CNBC on Wednesday, Microsoft CTO Kevin Scott said that up to this point, Nvidia has offered the best price-performance, but he's willing to entertain anything in order to meet demand.
Going forward, Scott suggested Microsoft hopes to use its homegrown chips for the majority of its datacenter workloads. When asked, "Is the longer term idea to have mainly Microsoft silicon in the data center?" Scott responded, "Yeah, absolutely...
Microsoft is reportedly in the process of bringing a second-generation Maia accelerator to market next year that will no doubt offer more competitive compute, memory, and interconnect performance... It should be noted that AI accelerators aren't the only custom chips Microsoft has been working on. Redmond also has its own CPU called Cobalt and a whole host of platform security silicon designed to accelerate cryptography and safeguard key exchanges across its vast datacenter domains.
Re:Didn't microsoft make a deal with AMD? (Score:4, Insightful)
CUDA owns the AI market right now. If you have a massive number of high-quality coders and a few years, and then a few more years advertising your stuff to everyone as a good GPU supply, you may get to where Nvidia is today: being the only thing that really matters in AI hardware.
Nvidia spent the better part of a decade getting to where it is, from nurturing top talent to being the only one with the long-term vision to become what it is today. It's why even people really pressed for functional silicon, like the Chinese, don't give a toss about Nvidia alternatives. Everyone wants something that works with CUDA, because all the software is on CUDA.
Re: (Score:2)
Re: (Score:2)
Re: CUDA Emulators (Score:2)
Re: (Score:2)
The big question about such an approach: what is the performance hit for having such middleware? The main goal of CUDA is to speed things up. The more layers, the more slowdown ("We can solve any problem by introducing an extra level of indirection... except for the problem of too many levels of indirection").
It seems to me that to be competitive, the hardware itself has to be designed to align with the CUDA API - the idea that it will work equally well with all hardware is probably a pipe dream. Maybe ther
Re: (Score:2)
The actual main goal of CUDA is to provide a unified way to address Nvidia GPUs from today into the long-term future. CUDA cores across Nvidia architectures are not the same. In some cases they don't even look similar, and they have hilariously different levels of performance. I.e. "CUDA cores" are not unified. They're vastly different.
But they can all universally be addressed through CUDA. Hence the name. Compute Unified Device Architecture. Its purpose is to ensure that current and far future nvidia GPUs can be a
Re: (Score:2)
What is to stop Microsoft (or any other large company) from reimplementing the CUDA API - so software at most has to be recompiled? Are there legal constraints? Is CUDA too specific to NVidia's architecture so benefits would be vastly smaller for other GPUs? I doubt it is too complex - certainly not when billions of dollars are at stake.
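At the framework level, shims like this already exist (AMD's HIP mirrors much of the CUDA runtime API, and CuPy mirrors NumPy on top of CUDA), so "recompile and go" is not a crazy idea. Here is a toy Python sketch of the dispatch idea, with CuPy standing in for "whatever GPU backend you have"; the point is that matching the call names is the easy part, while matching the performance of the tuned kernels behind them is the hard part:

# Toy sketch (not Nvidia's or Microsoft's code): one NumPy-style API surface,
# two backends. Assumes CuPy is installed where an Nvidia GPU is present and
# falls back to NumPy otherwise.
import numpy as np

try:
    import cupy as cp   # CUDA-backed, NumPy-compatible array library
    xp = cp
except ImportError:
    xp = np             # CPU fallback with identical call signatures

def matmul(a, b):
    # The same source works on either backend.
    return xp.matmul(xp.asarray(a), xp.asarray(b))

if __name__ == "__main__":
    a = np.random.rand(512, 512).astype(np.float32)
    b = np.random.rand(512, 512).astype(np.float32)
    c = matmul(a, b)
    print("backend:", xp.__name__, "shape:", c.shape)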
Re: (Score:2)
Yes.
Competition (Score:3)
Hopefully this will lead to more competition in the GPU market. Particularly since Intel and nVidia have semi-merged and will no longer be competing with each other in the GPU segment. (Very sad scenario there; those Intel GPUs were just starting to get competitive, which is probably why nVidia had to buy them out.)
Even better if the Microsoft GPUs actually incorporate the "G" part, like with raytracing and such. If they're just AIPUs, they'll probably come on the market around the time the bubble pops, and be pretty worthless.
It could potentially be something great to put in an Xbox, but it sounds like they're winding Xbox down too.
Re: (Score:3)
Hopefully this will lead to more competition in the GPU market.
These are AI accelerators, just like Google's TPUs or AWS's Inferentia/Trainium chips. They will have little to no impact on the consumer GPU market (except, perhaps, sucking up more fab capacity and making GPUs even more expensive).
Re: (Score:3)
Competition?
The headline should read "Microsoft to replace one TSMC chip with different TSMC chip."
Re: (Score:2)
Since no consumers buy their hardware directly from TSMC, having a variety of middlemen (not just nVidia) to choose from counts as competition from the perspective of consumer pricing.
Re: (Score:2)
Then there's the whole design thing. TSMC doesn't design chips, as far as I know.
Re: (Score:2)
Not if TSMC's production capacity is maxed out.
Re: (Score:2)
I was under the impression most of nVidia's markup was artificial, and not driven by supply constraints. I could be wrong.
Re: (Score:2)
Re: (Score:2)
All swimming pools in New York City are equivalent because they all use the same water.
Re: (Score:2)
Hopefully this will lead to more competition in the GPU market.
Unlikely. These chips are only to be used by MS internally for a very niche purpose, not for general gaming and video. They will never be sold to anyone, especially not to consumers.
I don't think you'll get any competition (Score:2)
It will reduce demand for Nvidia and AMD GPUs, though, so there is that aspect. But it won't be full-on competition in the same market space, so the effect will be muted somewhat.
Pretty absurd (Score:2)
There is literally nothing to gain by doing this, even in the long run.
Re: (Score:2)
What do you mean?
At the scale they are operating at, you can probably design special-purpose chips that will be significantly cheaper than an off-the-shelf device.
Re: Pretty absurd (Score:1, Insightful)
No, Microsoft can't beat the long-time leaders in any hardware realm. Call me on your Microsoft phone with a rebuttal. Or type on your Microsoft keyboard made by Incase.
Re: (Score:2)
But producing hardware for retail is quite different from producing hardware for internal business use.
And note that MS produces/licenses all kinds of hardware for retail as well: both laptops and gaming systems.
Re: (Score:3)
"both laptop and gaming systems"
and the core components of both are made by other companies
Re: (Score:2)
No, Microsoft can't beat the long-time leaders in any hardware realm. Call me on your Microsoft phone with a rebuttal. Or type on your Microsoft keyboard made by Incase.
But MS is not trying to beat the leaders in the hardware realm. They are trying to lower costs and get more control over a single part for a very specific use case. It is the same reason Google started designing its own datacenter chips more than a decade ago: Google could have purchased Intel/AMD/ARM or whatever chips. Same reason Amazon did too. I would assume that Apple also uses its own CPUs in its datacenters. This is not some brand-new strategy here.
Re: Pretty absurd (Score:2)
By the time they get anywhere close to breaking even (either monetarily or technologically) on this endeavor, the need for these chips will be long gone.
Not news (Score:4, Interesting)
Microsoft, along with every hyperscaler and large company in the US and the entire world, hopes to replace Nvidia GPUs. They've spent many billions of dollars and years trying and hoping. Microsoft is a bit behind the other hyperscalers, so they probably need even more hope. The existence of that hoping is not news. What would be news is how much progress they're making, and more importantly whether they will be the first to crack the puzzle of making ASIC AI processors equal to or better than GPUs.
Hate windows and copilot (Score:2)
Re: (Score:2)
... and you're not the least bit concerned that the company entered the enshittification phase years ago and has been caught up in a bubble?
suuuure (Score:2)
No one's using AMD, see, so we all know they won't be dropping them.
Also, FFS AMD, can you, like, you know, stop being complete Muppets and not be objectively worse than the GPU vendor who shall not be named? If you worked with torch without being a massive ballache I'd be much obliged. K thx bai.
utterly mystifying (Score:1)
Re: (Score:2)
FML those assclowns.
I can get an NVidia GPU from anywhere, from a cheapass 1650 from CEX second hand to an H200 rented through google, pip install pytorch and it Just. Fucking. Works.
It's mental how little effort they put in. It's not like they need to replace CUDA with a compatible system. Just dealing with PyTorch and a few inference-only frameworks (onnx-infer, etc.) would be game changing.
Oh yeah and their AI chips. They have a competitor to Jetsons right now, except they just can't be arsed.
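For reference, a minimal sketch of the "it Just Works" experience described above: with a stock PyTorch install the same script runs on whatever accelerator is available, because device selection is one line. (ROCm builds of PyTorch expose AMD GPUs under the same "cuda" device name, when they work at all, which is exactly the ballache being complained about.)

# Minimal device-agnostic PyTorch sketch; nothing here is vendor-specific.
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = torch.nn.Linear(1024, 1024).to(device)
x = torch.randn(32, 1024, device=device)

with torch.no_grad():
    y = model(x)

print(f"ran on {device}: output shape {tuple(y.shape)}")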
putting money where the mouth is - bad idea (Score:2)
I actually bet against NVDA based on the expectation that DGEMM-type matrix multiplication would soon(er) be done in hardware optimized for LLMs, plus putting part of the transformer into hardware. An expensive undertaking. Given the market size and the overpayment for Nvidia products, it's incomprehensible that things have been this slow. I don't get the technical rationale for why some of the heavy GPU buyers don't put effort into making PyTorch, Transformers, and more of the stack more affine to non-CUDA accelerators. in HPC
Re: (Score:2)
I am an HPC person, and the answer to that is very simple: NVIDIA makes sure that gemm works well on their hardware, while the others don't. You can't just drop in standard gemm code and get good performance. There is a ton of engineering that needs to happen to make the operations work transparently with what comes before the gemm and what comes after the gemm.
Just look at how complex cutlass is (which is how cublas works) and you'll get a sense of the problem. Now it is not unsolvable, but it's not simple. And
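To put a rough number on that gap, here is a CPU-side sketch in plain Python/NumPy (nothing vendor-specific; on the CPU much of the difference is interpreter overhead rather than GPU-style tiling, so it's only illustrative): the textbook triple-loop GEMM and the tuned library call compute the same thing, and everything separating them is the kind of engineering that CUTLASS/cuBLAS-class libraries embody.

# Illustrative only: textbook GEMM vs. the tuned library call (NumPy/BLAS).
# The math is identical; the speed difference is pure engineering.
import time
import numpy as np

def naive_gemm(A, B):
    m, k = A.shape
    _, n = B.shape
    C = np.zeros((m, n), dtype=A.dtype)
    for i in range(m):
        for j in range(n):
            s = 0.0
            for p in range(k):
                s += A[i, p] * B[p, j]
            C[i, j] = s
    return C

N = 128  # kept small so the naive version finishes quickly
A = np.random.rand(N, N).astype(np.float32)
B = np.random.rand(N, N).astype(np.float32)

t0 = time.perf_counter(); C1 = naive_gemm(A, B); t1 = time.perf_counter()
t2 = time.perf_counter(); C2 = A @ B;            t3 = time.perf_counter()

print(f"naive loop: {t1 - t0:.3f} s")
print(f"library:    {t3 - t2:.6f} s")
print("max abs diff:", float(np.max(np.abs(C1 - C2))))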
I love in-house chips (Score:2)
Yum!
Probably inference chips (Score:2)
Even though the hyperscalers can batch a lot of queries, GPU utilization in inference is still much worse than in training. Inference needs more memory bandwidth and less compute, and most of the interconnect can just go in a nice straight pipeline.
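A back-of-envelope sketch of that bandwidth argument (assumed model size and precision, KV-cache traffic ignored): at small batch sizes every decoded token streams essentially the whole weight set once, so arithmetic intensity is only a few FLOPs per byte, far below the roughly hundreds of FLOPs per byte a big training GPU needs to keep its compute units busy. That mismatch is the opening for a leaner, bandwidth-heavy inference ASIC.

# Rough decode-phase estimate for an assumed dense 70B-parameter model in fp16.
# Each generated token does ~2 FLOPs per parameter and reads each weight once.
params = 70e9
bytes_per_param = 2  # fp16/bf16 weights

for batch in (1, 8, 64):
    flops = 2 * params * batch              # compute per decode step
    bytes_moved = params * bytes_per_param  # weight traffic per decode step
    print(f"batch {batch:3d}: ~{flops / bytes_moved:.0f} FLOP/byte")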
own homegrown accelerators..better be US made (Score:2)
Re: own homegrown accelerators..better be US made (Score:2)
And the name? (Score:1)