
IBM Says It's Been Running a Cloud-Native, AI-Optimized Supercomputer Since May (theregister.com)

"IBM is the latest tech giant to unveil its own 'AI supercomputer,' this one composed of a bunch of virtual machines running within IBM Cloud," reports the Register: The system, known as Vela, which the company claims has been online since May last year, is touted as IBM's first AI-optimized, cloud-native supercomputer, created with the aim of developing and training large-scale AI models. Before anyone rushes off to sign up for access, IBM stated that the platform is currently reserved for use by the IBM Research community. In fact, Vela has been the company's "go-to environment" for researchers creating advanced AI capabilities since May 2022, including work on foundation models, it said.

IBM states that it chose this architecture because it gives the company greater flexibility to scale up as required, as well as the ability to deploy similar infrastructure into any IBM Cloud datacenter around the globe. But Vela is not running on any old standard IBM Cloud node hardware: each node is a twin-socket system with 2nd Gen Xeon Scalable processors, 1.5TB of DRAM, and four 3.2TB NVMe flash drives, plus eight 80GB Nvidia A100 GPUs, the latter connected by NVLink and NVSwitch. This makes the Vela infrastructure closer to that of a high-performance computing site than to typical cloud infrastructure, despite IBM's insistence that it was taking a different path because "traditional supercomputers weren't designed for AI."
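As a rough illustration (this is not IBM's configuration tooling, just arithmetic on the per-node figures quoted above), the local totals for a single Vela node work out like this:

```python
# Per-node specs as quoted in the article; totals below are simple arithmetic.
node = {
    "cpu_sockets": 2,        # 2nd Gen Xeon Scalable processors
    "dram_tb": 1.5,          # 1.5TB of DRAM
    "nvme_drives": 4,        # four NVMe flash drives
    "nvme_tb_each": 3.2,     # 3.2TB per drive
    "gpus": 8,               # Nvidia A100s on NVLink/NVSwitch
    "gpu_mem_gb_each": 80,   # 80GB per GPU
}

flash_tb = node["nvme_drives"] * node["nvme_tb_each"]   # 12.8 TB local flash
gpu_mem_gb = node["gpus"] * node["gpu_mem_gb_each"]     # 640 GB GPU memory

print(f"Local NVMe flash per node: {flash_tb:.1f} TB")
print(f"Aggregate GPU memory per node: {gpu_mem_gb} GB")
```

That 640GB of pooled GPU memory per node is what puts the design closer to HPC territory than to a typical cloud instance.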

It is also notable that IBM chose to use x86 processors rather than its own Power10 chips, especially as the latter were touted by Big Blue as ideally suited for memory-intensive workloads such as large-model AI inferencing.

Thanks to Slashdot reader guest reader for sharing the story.
  • The next big thing!
    • by postbigbang ( 761081 ) on Sunday February 19, 2023 @01:46PM (#63306151)

      This is an amazing non-event: a PR release that says nothing, does nothing, and makes one wonder why IBM would tout such a thing. There are lots of GPU-laden for-rent nodes in the cloud. Calling it a supercomputer is meaningless, as it's not far from the architecture of other advanced server designs. There are no test links verifying its uniqueness, its wow factors, nada.

      Nothing to see here. Move along.

      • Of all the companies. They’re also running it on their own special “cloud” nodes that aren’t available to anyone outside a fraction of IBM employees.

        It’s called a datacenter, IBM. You built yourself a datacenter and connected it to the internet.

      • For techies, no. (Score:3, Interesting)

        by Anonymous Coward

        Nothing to see here. Move along.

        That would be misreading the fluff.

        Notice that it's not POWER10, but "Scalable Xeon". "Scalable" here probably mostly means "give us more money now so you can give us even more money later". Notably not even AMD, which currently has Intel beat. So a nice, "safe" choice for technically weak middle management.

        And the rest of the buzzword salad seems to underscore that. This is a vehicle that's "safe, proven" ("online since May"), and has plenty of buzzwords from both the latest fad and from "nobody ever got fired for buying IBM" territory.

        • Buzzword salad def describes it.

          Somewhere, something is very wrong with the detachment necessary to put out a release this superficial and this divorced from current market reality.

          GPU-based cloud instances, while not cheap, are getting loads of use from non-aligned vendors. That it's not POWER10 is a thankful blessing, but reminds us all again of how IBM, like other vendors we know, has this not-invented-here ego.... now showing it's capitulated to still another gasping-for-air vendor, Intel. Egads.

      • by fuzzyfuzzyfungus ( 1223518 ) on Sunday February 19, 2023 @02:13PM (#63306215) Journal
        Honestly, it's worse than that.

        It doesn't say much; but what it does say suggests basically nothing aside from reasonably deep pockets. 8 80GB A100s on NVswitch basically means a system based on the HGX A100 [nvidia.com] boards that Nvidia sells to a variety of partners along with a reasonably high end but not at all atypical Xeon system.

        You can get the same eight-GPU HGX baseboard paired with either Intel or AMD (quite possibly Ampere as well; I know the PCIe A100s are supported on that platform, though I'm not certain about the HGX boards) from a variety of vendors: Supermicro, Inspur, HPE, Lenovo, Dell; and major hyperscalers have their own pet variants. Such systems aren't cheap, but they're off-the-shelf stuff you can talk to your rep about and have in fairly short order. If IBM is putting out a puff piece about how it has bought some, but (while hyping it as 'cloud') only does unspecified internal research on it, then it seems reasonable to suspect the company has nothing exciting to put out a press release about.

        AWS will sell you access to essentially the same nodes right now for $41/hr on-demand [amazon.com]; and Microsoft is busy burning cycles on its own variant to run a chatbot to make Google nervous. Tell us, IBM, what did you hope to gain by talking about this?
      • Attempting to keep the stock price up. Makes IBM look really bad when it used to be the front-runner on AI but failed with Watson and is now getting lapped.
    • It is elementary, Watson: this is not IBM's first AI.
    • Imagine a beowulf cluster of beowulf clouds.

  • for anything that has "cloud-native" in the product description.

  • IBM finally realizes that the business model changed, and it's called the cloud. On-premises is not going to cut it for selling AI.
  • So....let me get this straight, you're putting AI test beds/engines/development on internet connected servers and you have the ability to scale for additional processing resources as needed AND you're also renting out parts of this to other development teams to develop AI's in different areas (AI Coding, AI Security Pen testing, AI Genomics research, AI Language research, AI Financial and Economic modeling, AI Weapons development, etc....) but it's all OK b/c like Virtual Machines can never get out of their
  • "IBM says it's been running a datacenter with spicy autocomplete since May."
  • Why do they even make headlines anymore? The next big one is "IBM Declares Bankruptcy, GBM Goes With" - They don't innovate. They don't make ThinkPads. They don't make good software anymore. Here they are pretending they're a cloud company - and they're not even using their own server architecture. What does IBM have that others don't? Age.
    • IBM is pretty big in the HPC and banking arenas because it is able to put its weight behind producing a product and delivering something that works, even if that product isn't very innovative.

      Yes, Ceph exists, but GPFS is still the product of choice, yes, you can get a large cluster of SuperMicro servers and have them fail over, but people still prefer mainframes.

      • It's funny, IBM was actually recently caught trying to hide the fact that their sole source of revenue is the mainframe business. Paleolithic!! Who knows though, maybe mainframes will make a comeback :')
        • by guruevi ( 827432 )

          Again, if you want something that works... modern mainframes are nothing but a cluster of x86 or POWER servers. It's just the management and failure modes have been carefully managed.

          Sun Solaris at one point had a similar product (FMS, HSM, ZFS) but between Oracle and the community forks the focus on reliability and manageability has made way for novelty and feature creep.
