Supercomputing: Raw Power vs. Massive Storage

securitas writes "The NY Times reports that a pair of Microsoft researchers are challenging the federal policy on funding supercomputers. Gordon Bell and Jim Gray argue that the money would be better spent on massive storage instead of ultra-fast computers because they believe today's supercomputing centers will be tomorrow's superdata centers. They advocate building cheap Linux-based Beowulf clusters (PCs in parallel) instead of supercomputers." NYTimes free reg blah blah.
This discussion has been archived. No new comments can be posted.

  • NY Times free reg?! (Score:5, Informative)

    by krisp ( 59093 ) * on Monday June 02, 2003 @10:16AM (#6095478) Homepage
    No Registration Required [nytimes.com]

    Just use the google link!
  • by soboroff ( 91667 ) on Monday June 02, 2003 @10:18AM (#6095505)
    Gordon Bell and Jim Gray are not just "a pair of Microsoft researchers". They are two of the biggest names in high-performance computing. Gordon Bell awards, anyone?
  • Beowulf?! Bah! (Score:1, Informative)

    by LilMikey ( 615759 ) on Monday June 02, 2003 @10:21AM (#6095551) Homepage
    World's fastest supercomputer: SETI@home.
  • by seangw ( 454819 ) * <seangw.seangw@com> on Monday June 02, 2003 @10:25AM (#6095592) Homepage
    Apparently I should read the article first, as I see no mention of Linux, just an allusion towards a scheme somewhat like that.

    The article mentions shifting focus towards aspects of high-speed computing other than pure processing power, such as increasing network speeds, increasing the storage pools, etc.

    Nowhere does it specifically mention "Linux", but it definitely seems to describe something a Beowulf cluster would address.
  • Partner = Slashdot (Score:5, Informative)

    by Zach Garner ( 74342 ) on Monday June 02, 2003 @10:28AM (#6095615)
    You could at least use partner=SLASHDOT [nytimes.com]
  • funny .. (Score:1, Informative)

    by teemu.s ( 677447 ) on Monday June 02, 2003 @10:32AM (#6095653)
    nothing about this is mentioned on their research team's homepage:

    http://research.microsoft.com/barc/Scaleable/

    (just in case somebody's interested)
  • by Bad Dude ( 14345 ) on Monday June 02, 2003 @10:35AM (#6095679)
    By rewriting existing scientific programs, they say, researchers will be able to get powerful computing from inexpensive clusters of personal computers that are running the free Linux software operating system

    That's pretty clearly mentioning Linux.
  • Re:(-1, Troll) (Score:3, Informative)

    by watzinaneihm ( 627119 ) on Monday June 02, 2003 @10:36AM (#6095685) Journal
    On page two of the article, there is a mention of Linux, Beowulf, etc. Moreover, x86 is not mentioned explicitly.
    From the article:
    By rewriting existing scientific programs, they say, researchers will be able to get powerful computing from inexpensive clusters of personal computers that are running the free Linux software operating system. Many scientists are now adapting their work to these parallel computing systems, known as Beowulfs, which make it possible to cobble together tremendous computing power at low cost.
    And if you are going to rewrite Unix code, it is easier to rewrite it for Linux than for Windows. And how much can an MS cluster scale anyway?
  • by Anonymous Coward on Monday June 02, 2003 @10:37AM (#6095690)
    First paragraph on page 2:
    'Dr. Gray and Dr. Bell, a legendary computer designer who oversaw the national supercomputer centers for two years during the 1980's as a director for the National Science Foundation, call their current approach to computing "information centric" and "community centric." By rewriting existing scientific programs, they say, researchers will be able to get powerful computing from inexpensive clusters of personal computers that are running the free Linux software operating system. Many scientists are now adapting their work to these parallel computing systems, known as Beowulfs, which make it possible to cobble together tremendous computing power at low cost.
    So they do in fact mention Linux.
  • Microsoft does make their own clustering software [microsoft.com]
    Of course they will suggest to use that instead of Linux because...[fill in the blank]
  • Holy Shit! (Score:2, Informative)

    by kikta ( 200092 ) on Monday June 02, 2003 @10:38AM (#6095704)
    Wow. That's the first time I've seen an attempt to RTFA result in someone correcting themselves incorrectly. You apparently didn't make it to page 2 [nytimes.com]:

    By rewriting existing scientific programs, they say, researchers will be able to get powerful computing from inexpensive clusters of personal computers that are running the free Linux software operating system. Many scientists are now adapting their work to these parallel computing systems, known as Beowulfs, which make it possible to cobble together tremendous computing power at low cost.

    "The supercomputer vendors are adamant that I am wrong," Dr. Bell said. "But the Beowulf is a Volkswagen and these people are selling trucks."
  • by afidel ( 530433 ) on Monday June 02, 2003 @10:45AM (#6095762)
    By rewriting existing scientific programs, they say, researchers will be able to get powerful computing from inexpensive clusters of personal computers that are running the free Linux software operating system.
    "The supercomputer vendors are adamant that I am wrong," Dr. Bell said. "But the Beowulf is a Volkswagen and these people are selling trucks."


    All the people who are responding saying they don't mention Linux didn't read the second page.
  • Username/Password (Score:5, Informative)

    by YeeHaW_Jelte ( 451855 ) on Monday June 02, 2003 @10:45AM (#6095765) Homepage
    I saw that it could be google too, but anyhow, I made a username/password for y'all:

    slashdot124
    slashdot

    Be wary, however: I registered as a North Korean military R&D official with a high salary.
  • by elwinc ( 663074 ) on Monday June 02, 2003 @10:47AM (#6095783)
    One of the big reasons for using supercomputers over the past decade or more has been to simulate nuclear explosions. When we (the USA) simulate weapons instead of testing them, it allows us to lead by example when we argue for a ban on nuclear tests. Because simulation is technically challenging, it slows down nuclear proliferation. It's a creative form of deterrence.

    All this for the price of a few supercomputers every year. And the market for supercomputers pushes several technologies; for example, high speed interconnect and gallium arsenide, and sets the bar for high performance silicon. Pretty good deal, doncha think?

    But now the Moron-in-Chief wants to bring back nuclear testing. [reuters.com] (pardon me, 'nookyuler.' Bush can't be wrong about something as simple as pronunciation, can he?). Farewell to deterrence. Farewell to common sense...

  • by martyn s ( 444964 ) on Monday June 02, 2003 @10:48AM (#6095797)
    I don't mean to be a language nazi here, but it's not "tow the line", it's "toe the line". In other words, if someone were to toe the party line, that means, basically, that their toes are lined up, which means that THEY are in line. It has nothing to do with dragging or pulling. Just trying to nip this one in the bud before it takes on other meanings :)
  • by im_57 ( 410403 ) on Monday June 02, 2003 @10:55AM (#6095857)
    http://www.research.microsoft.com/~Gray/talks/CSTB_SuperComputing_Study_Group.ppt
  • by goombah99 ( 560566 ) on Monday June 02, 2003 @10:59AM (#6095892)
    I don't know a lot about Gordon Bell, so I can't criticize his work, but I do know that the Bell prize is based on gigaflops per dollar. This creates computers that shortchange interconnect speed and parallelism for raw gigaflops.

    That is not what high-performance computing is about. It rewards the class of problems that are embarrassingly parallel and don't need good disk access: in short, pointless benchmarks like computing pi, rather than solving real tightly coupled physics problems like, say, asteroid impacts or molecular dynamics, or problems where processors have to access the disk a lot or share data.

  • Re:Nice (Score:1, Informative)

    by Anonymous Coward on Monday June 02, 2003 @11:06AM (#6095940)
    Management nightmare: rather than managing one computer with one operating system, you are managing hundreds or even thousands of operating systems.
    Weird OSes? IRIX isn't weird; in fact it's one of the nicest UNIX OSes out there. UNICOS and AIX are not bad either.
    Traditional supercomputers are still nice when it comes to memory-intensive apps. Remember, with clusters you need to load the data files into memory on each node, whereas on shared-memory systems they only need to be loaded once.

    I.e.:

    128 nodes * 2 GB mem per node != 256 GB usable.

    If the application has data files of, say, a gigabyte (rather typical of the size of the data files used by the oceanographic models where I work), and the files are copied to each node, then you end up using 128 GB of the total memory of the cluster.
  • Re:Nice (Score:5, Informative)

    by anzha ( 138288 ) on Monday June 02, 2003 @11:09AM (#6095958) Homepage Journal

    Mod this guy up. He's really telling the truth!

    Loosely coupled clusters like PDSF [nersc.gov] are great for work like what the high energy physics people do, like SNO [slashdot.org].

    However, some things work better on vector architectures, such as climate models and fusion work: there is a reason why the Spanish Met troops [cray.com] bought a Cray. Additionally, some chemistry, many fusion, and several other codes work best on vector architectures.

    These guys presented their global warming work at my job. They've developed their climate code as a parallel one, though. See here [ucar.edu]. One of the places they have been running is seaborg [nersc.gov], an IBM RS/6000 with over 6,000 (nearly 7,000) processors.

    Interestingly, the PCM guys presented what they wanted for an uber'puter. While it had massive amounts of storage, it was also a 500 *PETAFLOP* SUSTAINED PERFORMANCE machine.

    *clickety clack* That'd be something like 166,666,666 Athlons. IDK of any interconnects that handle that. Can you imagine being an admin? Better hope you're good on rollerblades, zipping to and fro replacing those oh-so-reliable commodity disks and CPUs... even if you have a 0.05% failure rate, that's still too damn much. As an admin, that'd be a huge waste of time. It'd also wreak havoc on the guys running stuff.

    Or is that what grad students are for? To attempt such a silly thing and then admin it? ;)

    Seriously though. To get from here to there, we're going to need some exotic techs... not just more 'attack of the killer micros'.

  • Re:Nice (Score:4, Informative)

    by RobertFisher ( 21116 ) on Monday June 02, 2003 @11:13AM (#6095984) Journal
    This poster is wrong on several counts, and should be modded down accordingly.

    Actually, if you're going to tell people to take a look at the Top 500 list, you should put actions behind your words. The top cluster is at #5 on the most recent list (LLNL's NetworX machine - http://www.top500.org/list/2002/11/), and is less than 20% behind the #2 spot. Guaranteed that within a year, Linux clusters will indeed fill the #2 spot on down.

    Second, hydrodynamic problems (which are a class of hyperbolic PDEs) deal with nothing but local communications, and scale quite well even on Linux clusters. The more challenging set of problems are non-local PDEs (elliptic and parabolic, like Poisson's equation and heat transfer). Because these problems couple every point in space to every other point at every time step, they remain tough to solve on a parallel machine no matter what platform you are on.

    The Earth Simulator is a highly special case. The Japanese government made an enormous investment (well over $500M) to purchase that machine. Even with the support of the DOE and private industry (increasingly biotech), the US just does not have the political willpower to spend that much on a single platform. It is often neglected that the current paradigms of high-performance computing are lacking in many respects; some refer to the recent move towards very large parallel machines as "a great step backwards". We have to pursue technically innovative solutions which will be both cheaper to purchase than the Earth Simulator and more efficient to use.
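To make the locality argument above concrete, here is a minimal sketch (the grid size and sweep counts are my own illustrative choices, not from the article): a Jacobi sweep for the 1-D Laplace equation moves boundary information only one grid cell per iteration, so roughly O(n^2) sweeps are needed before every point has "felt" the boundaries, and on a cluster each sweep is a round of communication.

```python
# Sketch: why elliptic problems are communication-hungry.
# Jacobi iteration for u'' = 0 on [0, 1] with u(0) = 0, u(1) = 1.
# Each sweep only averages nearest neighbours, so boundary information
# crawls across the grid one cell per sweep; convergence takes O(n^2)
# sweeps, and in a parallel run every sweep is a halo exchange.

def jacobi_laplace_1d(n, sweeps):
    """Relax u'' = 0 on an n-point grid with fixed endpoints."""
    u = [0.0] * n
    u[-1] = 1.0  # boundary condition u(1) = 1
    for _ in range(sweeps):
        new = u[:]
        for i in range(1, n - 1):
            new[i] = 0.5 * (u[i - 1] + u[i + 1])  # neighbour average
        u = new
    return u

n = 21
u = jacobi_laplace_1d(n, 2000)
exact = [i / (n - 1) for i in range(n)]  # the true solution is u(x) = x
```

A hyperbolic update, by contrast, needs only one neighbour exchange per time step, which is why it scales well on commodity clusters.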
  • Re:(-1, Troll) (Score:2, Informative)

    by jwgoerlich ( 661687 ) on Monday June 02, 2003 @11:21AM (#6096047) Homepage Journal

    > And how much can a MS cluster scale anyway?

    Windows 2000/2003 WLBS can theoretically scale to 32 nodes, but I have seen performance decreases after 16 or so.

    Windows 2000 MCS can scale up to two nodes with Advanced Server, and four nodes with Datacenter.

    Windows 2003 MCS can scale up to four nodes with the Server, and eight nodes with Enterprise.

    jwg

  • We have 2 Linux clusters here at NCSA already, with a third in progress. See:
    The Titan Cluster [uiuc.edu]
    The Platinum Cluster [uiuc.edu]
    TeraGrid Clusters Successfully Installed at NCSA [teragrid.org]
    These clusters run either RedHat or SuSE Linux and are available for researchers nationwide.

    These clusters are not Beowulf; they allow access through a general scheduler and have MPI [anl.gov] to run programs that use a group of nodes at once. This gives the greatest flexibility to the users to create a computational system that can be optimized for the size and needs of their problem. The size of a cluster that can be supported at a national center allows enough computational power to solve problems that can't be solved elsewhere. Given that a cluster of 128 nodes is now considered an institutional asset and within the purchasing power of any university, it makes sense to use federal funds to create systems to handle problems beyond the scale of a cluster that any university might own.

    Another aspect of this issue arises in the assumption that cluster computing is so easily accomplished that it might be compared to the setup of a single system. I respectfully submit that the simplest of clusters is none too easy to deploy and use as of today, not to mention the lack of support one gets for the application of their scientific research to a stock parallel computing platform. The national centers can afford to have consultants and researchers on staff who specialize in these matters, as well as full-time admins.

    Note: The opinions expressed here are my own and not necessarily representative of my employer or the federal government. In addition, given that I am employed by NCSA, a slight element of bias may be present in my statements. :)

  • Re:Nice (Score:2, Informative)

    by fitten ( 521191 ) on Monday June 02, 2003 @11:43AM (#6096201)
    A "cluster" is a broad term. A "Beowulf" cluster is one made from commodity parts connected with low-cost (100Mb - faster as the price point drops) Ethernet. A cluster can have exotic interconnects, which knocks it out of the Beowulf category. For example, IBM SP and SP2 machines are really just clusters. The Cray T3D and T3Es were really just clusters as well, if you think about it. ASCI Red and Sandia's C-Plants are also clusters.

    What usually governs what a machine is good at is the latency/bandwidth of accessing the data, rather than the architecture itself. For example, a cluster with a high-speed interconnect like Myrinet or a T3E's DMA works well on problems that require low latency and/or high bandwidth, even though they are clusters. They can be almost as effective as SMP machines on some problems that SMP machines are typically better at doing.

    Lots of work has been done lately on TCP/IP (reducing the number of copies in the stack, etc) to decrease latency but it is still a ways off from things like Myrinet.

    Also remember that some benchmarks are basically just measures of Bisection Bandwidth of a system. Given enough Ethernet routers and CAT5 cables, you can get pretty high scores on that, too.

    I guess this was kinda rambling, but basically it comes down to the fact that currently, there is no one type of architecture that is the best at everything. Sometimes we make compromises (because of cost and such) and use non-optimal architectures (Beowulfs were originally researched simply because they were a cheap alternative that gave good enough performance to do real work, not because they were the end-all, be-all of HPC).
  • by Anonymous Coward on Monday June 02, 2003 @11:47AM (#6096222)
    Well yeah, Beowulf clusters blow the pants off of so-called fast computers on any problem that is embarrassingly parallel (i.e. very low processor-to-processor communication, no vector processing, and asynchronous low-bandwidth disk access). That's why I use them in my own work in biology. (And yes, I use systems of 300+ processors, and soon maybe 2,000 processors.) But there are classes of problems, particularly ones using coupled differential equations, where this is not true.
  • woah dude (Score:1, Informative)

    by Anonymous Coward on Monday June 02, 2003 @11:53AM (#6096289)
    Okay, first of all, the really tough CFD problem du jour is incompressible flow, which is--you got it--elliptic. Resolving or modeling the turbulent scales in a time-accurate way, especially near boundaries, is the most difficult part. Fluid dynamics equations only go hyperbolic where compressibility is important, such as in supersonic flow. For incompressible flow, you'll notice that solving the pressure-Poisson equation generally requires an FFT, a non-local operation (or you can use e.g. a vortex method).

    *HOWEVER*, it is *much* easier to solve heat and Poisson equations than Navier-Stokes, for the very important reason that they are linear. I mean, really. Any old CAD/FEM package can do heat conduction, and Poisson is just an FFT away. What makes hydrodynamics hard is its nonlinearity. It's just as elliptic as the other problems you mention under the incompressible conditions most often studied.
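The "Poisson is just an FFT away" remark can be sketched directly. A naive O(n^2) DFT stands in for a real FFT to keep the example dependency-free, and the grid size and forcing function are my own illustrative choices:

```python
# Sketch: solving the periodic Poisson equation u'' = f spectrally.
# In Fourier space the second derivative is diagonal, so the elliptic
# solve is a single divide per mode; the transform itself, however, is
# a global (non-local) operation, which is the point being made above.
import cmath
import math

def dft(x):
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * math.pi * k * j / n) for j in range(n))
            for k in range(n)]

def idft(X):
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * j / n) for k in range(n)) / n
            for j in range(n)]

def poisson_periodic(f):
    """Solve u'' = f on a periodic [0, 2*pi) grid; the mean of u is pinned to 0."""
    n = len(f)
    F = dft(f)
    U = [0j] * n
    for k in range(n):
        kappa = k if k <= n // 2 else k - n  # signed wavenumber
        if kappa != 0:
            U[k] = F[k] / (-(kappa ** 2))    # u_hat = f_hat / (i*kappa)^2
    return [z.real for z in idft(U)]

n = 32
xs = [2 * math.pi * j / n for j in range(n)]
f = [-math.sin(x) for x in xs]  # u'' = -sin(x) has solution u = sin(x)
u = poisson_periodic(f)
```

On a parallel machine the divide is free, but the transform couples every grid point to every other one, which is what makes the elliptic pressure solve the hard part of incompressible CFD.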
  • by trog ( 6564 ) on Monday June 02, 2003 @12:28PM (#6096585)
    Imagine what a cluster of 700 to 1,000 blade servers running the latest Intel Xeon CPU's can do now! =)

    Actually, it would be a very crappily-performing cluster. Blade servers are designed with two major goals - CHEAP and SMALL. Blade servers are engineered for high availability applications (think webserver farm).

    Just because you CAN do something doesn't mean it's the optimal solution. It amazes me when I see vendors selling blade server clusters.

    (Disclaimer: I work as an engineer at a company which builds Linux-based clusters for universities and labs)
  • by Anonymous Coward on Monday June 02, 2003 @01:07PM (#6096910)
    Of course, they do completely separate things. Microsoft's clustering is server clustering... you know, Node A mysteriously powers off, Node B picks up its processes and apps, etc.

    Beowulf is for sending multiple processes to multiple machines for processing.

    Microsoft does make their own clustering software
    Of course they will suggest to use that instead of Linux because...[fill in the blank]
  • Re:Nice (Score:2, Informative)

    by heydrick ( 91504 ) on Monday June 02, 2003 @01:18PM (#6096988)
    That's not true at all. The MTA [cray.com] is not dead. Cray shipped [cray.com] two MTA-2 systems including a 40-processor system [cray.com] with 160GB of shared memory to the NRL [navy.mil] last year.
  • by CastrTroy ( 595695 ) on Monday June 02, 2003 @01:26PM (#6097089)
    Except for the fact that on page 2 of the article, they specifically mention Beowulf clusters running on Linux.

    RTFA
  • by Daniel Phillips ( 238627 ) on Monday June 02, 2003 @01:52PM (#6097401)
    Actually, Beowulf clusters of 800-1,000 machines running Linux can be competitive with supercomputers.

    News for you: Linux clusters are [techextreme.com] the new supercomputers. Not just Blue Gene, but probably ASCI Purple as well, which is supposed to be the fastest supercomputer ever.
  • by AlphaMaker ( 556605 ) on Monday June 02, 2003 @02:29PM (#6097825)
    Sorry, but your post shows that you have no understanding of what mainframes are used for.

    Supercomputers are used for high-performance technical computing. Mainframes, on the other hand, are used when you need high reliability/availability. When someone talks about five-nines reliability, they are saying a system is up 99.999% of the time, equivalent to only about five minutes of downtime per year. The systems that achieve this do what is called fault-tolerant computing. It is done by having integrated redundant hardware along with the appropriate specialized software to deal with it.

    You won't find any supercomputer or PC that does this. This is why there will *always* be a market for mainframes. It may not be a huge market, but it's still a market.

  • by rebelcool ( 247749 ) on Monday June 02, 2003 @03:59PM (#6099117)
    MS Research is a totally different entity from the rest of Microsoft. MS Research is Microsoft only in name, for the most part. It's essentially the best-funded and best-staffed computer science research center in the world.

    They employ the likes of Tony Hoare (who invented quicksort and the Hoare triple). They also hired most of the core developers of the functional language Haskell. And many other brilliant minds.

    Most universities could only dream of the funding that MS Research has. And they're completely free to research whatever they want. And of course they use Linux, BSD, and whatever other tools are right for the job. They're researchers, not software politicians.

"If it ain't broke, don't fix it." - Bert Lantz

Working...