AI Hardware

How Amazon's Secret Weapon in Chip Design is Amazon (ieee.org) 18

In 2015 Amazon purchased chip designer Annapurna Labs, remembers IEEE Spectrum, "and proceeded to design CPUs, AI accelerators, servers, and data centers as a vertically-integrated operation."

The article argues that while AMD, Nvidia, and other big-name processor companies may also want to control the full stack (purchasing server, software, and interconnect companies), Amazon Web Services "got there ahead of most of the competition." (IEEE Spectrum interviews Ali Saidi, technical lead for the AWS Graviton series of CPUs, and Rami Sinno, director of engineering at Annapurna Labs, on "the advantage of vertically-integrated design, and Amazon-scale...")

Sinno: I was working at Arm, and I was looking for the next adventure, looking at where the industry is heading and what I want my legacy to be. I looked at two things: One is vertically integrated companies, because this is where most of the innovation is; the interesting stuff is happening when you control the full hardware and software stack and deliver directly to customers.

And the second thing is, I realized that machine learning, AI in general, is going to be very, very big. I didn't know exactly which direction it was going to take, but I knew that there is something that is going to be generational, and I wanted to be part of that. I already had that experience prior when I was part of the group that was building the chips that go into the Blackberries; that was a fundamental shift in the industry. That feeling was incredible, to be part of something so big, so fundamental. And I thought, "Okay, I have another chance to be part of something fundamental."

[...] At the end of the day, our responsibility is to deliver complete servers in the data center directly for our customers. And if you think from that perspective, you'll be able to optimize and innovate across the full stack. It might not be at the transistor level or at the substrate level or at the board level. It could be something completely different. It could be purely software. And having that knowledge, having that visibility, will allow the engineers to be significantly more productive and deliver to the customer significantly faster. We're not going to bang our head against the wall to optimize the transistor where three lines of code downstream will solve these problems, right...?

We've had very good luck with recent college grads. Recent college grads, especially the past couple of years, have been absolutely phenomenal. I'm very, very pleased with the way that the education system is graduating the engineers and the computer scientists that are interested in the type of jobs that we have for them.

It's an interesting glimpse into the unique world of designing chips at Amazon.

Graviton technical lead Saidi: I've been here about seven and a half years. When I joined AWS, I joined a secret project at the time. I was told: "We're going to build some Arm servers. Tell no one..."

"In chip design, there are many different competing optimization points. You have all of these conflicting requirements, you have cost, you have scheduling, you've got power consumption, you've got size, what DRAM technologies are available and when you're going to intersect them... It ends up being this fun, multifaceted optimization problem to figure out what's the best thing that you can build in a timeframe. And you need to get it right."
This discussion has been archived. No new comments can be posted.

  • Saidi: This might sound weird, but I’ve seen other places where the software and the hardware people effectively don’t talk. The hardware and software people in Annapurna and AWS work together from day one. The software people are writing the software that will ultimately be the production software and firmware while the hardware is being developed in cooperation with the hardware engineers. By working together, we’re closing that iteration loop. When you are carrying the piece of hardware

    • by gtall ( 79522 )

      The consequence is that if your engineering team misses something fundamental, you now have it in both hardware and software.

  • by DrMrLordX ( 559371 ) on Sunday September 15, 2024 @03:30PM (#64789195)

    Here's what Amazon's in-house efforts have bought them:

    https://www.phoronix.com/revie... [phoronix.com]

    If you like what's on offer then you'll like Graviton 4 better than 3 (or 2), but it still gets whipped by EPYC instances. However, if you read between the lines, Graviton 4 is winning on a performance per dollar basis, at least in db workloads.

    • At the scale of AWS, the savings of an in-house CPU might be large enough to instead do things like offer EC2 at-cost, which would quickly eat up a lot of marketshare of arch-agnostic code (most modern code that's not heavily optimized in assembler).
    • I assume that this is if-we-even-suspect-that-you-know-we-have-to-kill-you type commercially sensitive information; but it would be very interesting to know what the indirect savings of having an at least credible alternative in house look like: seems like the sort of thing that probably doesn't hurt when you are asking Intel or AMD if that's really the best price they can give you.

      It's certainly not the case that the program exists just to scare discounts out of the x86 vendors (anything sufficiently ina
    • winning on a performance per dollar basis, at least in db workloads.

      That's the entire point of ARM servers, so... duh?

      • Well they are also beating Intel on overall performance. There's that point to ARM server hardware. Kudos to AMD for remaining relevant even when facing Amazon's homebrew Graviton series.

  • First, it's not AMD and Nvidia and Intel that want to control the full stack, it's Amazon, Microsoft, Google, Facebook, etc. where that's trendy. Second, if every major datacenter company built their own hardware stack top to bottom, they compete with every other major datacenter company for every hardware engineer in existence. Then they have to design every part of the stack top to bottom independently. Then they have to implement it independently. This entire argument hinges on the assumption of a zero
    • Is it? Intel is in a pre-split AMD-GlobalFoundries situation, where they have been unable to turn around their fab business and now need to use external foundries to stop the bleeding of customers for cutting edge. Unless Taiwan gets invaded in the next three years, I don't see Intel getting its house in order. NVidia seems to mostly be interested in the AI play and less on pure HPC. AMD may use their position to increase margins on their EPYC line. Makes perfect sense to cut that margin with chip designs t
      • ARM realizes they've been undercharging for what they could do, especially as a public company now, and are already trying to spring out vastly higher profit margins. See the Qualcomm lawsuit as an example. Just licensing ARM designs isn't going to do anyone any favors soon enough. Which appears to be close, Zen 5 might not please gamers but the astonishing performance per watt on other workloads means the upcoming cloud focused Turin processor is going to beat Amazon's Graviton in performance per dollar as
        • ARM realizes they've been undercharging for what they could do, especially as a public company now, and are already trying to spring out vastly higher profit margins. See the Qualcomm lawsuit as an example.

          That's a bad example. According to ARM, they licensed their technology to Nuvia with the contract stating the IP could not and would not be transferred to any other party even if Nuvia was sold. Nuvia was sold and Qualcomm just ignored that clause.

          Just licensing ARM designs isn't going to do anyone any favors soon enough.

          ARM has different licenses. Companies can buy ARM designs if they want, but the most expensive licenses are the ones that Apple, Samsung, and Qualcomm have, which allow them to make their own ARM designs.

          • by Anonymous Coward
            Dude seriously just google "nuvia qualcomm answer" (which is the court filing made in response to the ARM complaint) and read what they wrote. ARM is way out of line, and ARM's original complaint is vague because they can't point to specific provisions that were violated in specific ways. I would easily bet $1000 on Nuvia winning this lawsuit. ARM is freaked out because Qualcomm showed that ARM's reference designs are pretty terrible -- Apple already showed that but Nuvia showed that it could be done again
            • Dude seriously just google "nuvia qualcomm answer" (which is the court filing made in response to the ARM complaint) and read what they wrote

              Dude, you know that Qualcomm's answer is just what they say, right? Why would I trust anything Qualcomm said?

              ARM is way out of line, and ARM's original complaint is vague . . .

              All of your points amount to believing only Qualcomm. That's it.

              Risc V is going to eat up ARM from the low end while competing ARM designs muscle out ARM's own designs at the high end

              Whatever you want to believe will be true, right? I have this bridge to sell you then.

              The fact is, CPU designs are just not that hard to do.

              Sure, go ahead and design a CPU. That's why there are dozens of chip design companies like ARM . . . oh wait there are not.

              In fact one of the Raspberry Pi designers made his own design "Hazard3" for RISC V and it works just as well as ARM.

              And the Hazard3 has replaced ARM CPUs in smartphones . . . oh wait, it is a microcontroller that does not begin to compete with the range

        • by Calken ( 646945 )

          Just licensing ARM designs isn't going to do anyone any favors soon enough.

          ARM also make considerable income from partnering with these companies to design their chips.

  • by The Cat ( 19816 )

    Amazon? Since you're not going to do anything substantial with publishing, ebooks, audiobooks or comics, could you at least tell authors and readers in advance and stop wasting their time and money?

    While you're at it, can we have the .book TLD back? Shouldn't have been allowed to buy it in the first place.

    P.S. Eat shit.

  • Is it me or does Rami Sinno look like a 3rd or 4th gen clone copy of Jeff Bezos?

  • All this talk about chip design is great, but what is needed are fabs. Problem is that TSMC will be destroyed in seconds if China decides to be aggressive (China knows there will be zero consequences if they knock out TSMC fabs in Taiwan), SMIC isn't exactly the best partner, Mikron Group is reliable... but Russian, Intel... well, watch the news. Samsung? Their plant in Texas has shed a number of workers, perhaps due to low yield at that process level.

    Amazon needs their own fab. It sucks that they have
