Ask the Man Behind the NOAA's New Beowulf Cluster
Greg Lindahl sent in this story last September about a massive Alpha Linux cluster that's being built by HPTi for the NOAA's Forecast Systems Laboratories. What Greg forgot to mention when he submitted the original story is that he's the project's chief designer. What with all the Beowulf (and Alpha) interest around here, we figured he'd make a great interview guest, especially now that the project is well under way. Please post your questions below. Answers to 10 - 15 of the highest-moderated ones should appear within the next week.
Future Plans? (Score:1)
Are there plans for building clusters for other government agencies, or is this isolated to NOAA right now?
--cyphergirl
AppleSeed Beowulf? (Score:1)
My understanding is that Linux is preemptively-multitasked, so that one process cannot monopolize the processor. This, to my mind, does not lend itself well to a problem where you want the CPU spending all of its time calculating the answer to some question (e.g. global climate change), rather than listening to the network, etc.
One advantage of the AppleSeed [ucla.edu] project running on MacOS 9.x is that with cooperative multitasking, the program can monopolize the CPU if it wants to.
Doesn't this make MacOS9 better than linux for this type of clustered-CPU-supercomputer? (Of course, differences in hardware may compensate for the differences in multitasking models)
Significant event (Score:1)
All the IPO's, the media hype, and the whiplash publicity have shattered any hope of the Linux community remaining a small-ish group of unrecognized collaborators, but there must be Something Else out there... a new toy to play with...
-Linus Torvalds, October 5, 1991.
Anyways, congratulations to everybody who helped shape Linux into what it is today. I'm not going to claim to be an old-school kernel junkie or anything, but even those of us who got here late can appreciate the odds of succeeding at something like this.
Intra-CPU Bandwidth? (Score:1)
What is the tradeoff between the speed and number of the CPU communication links, and the power and cost of the cluster?
For example, within a given budget, would it be better to physically cluster dozens of blazingly fast SMP systems via a gigabit Ethernet (or point-to-point 100 Mbit Ethernet), or to "logically cluster" tens of thousands of "normal" CPUs via the Internet? Or something in between?
The problems to be attacked will be readily parallelized simulations, but some of them will have a cellular decomposition requiring significant communications with less computation (such as GPS constellation and receiver simulation versus a simulation of hypersonic surface ablation).
For a fixed cost, is one cluster architecture clearly "better" than another? Or does the very notion of a "cluster" negate many of the architectural differences?
My intuition says to buy all the SMP systems I can (dual processor systems are especially inexpensive), and cluster them with whatever network I can afford with the money that remains. That is, the vastly improved IPC provided by SMP will dominate the overall communication cost (allowing less expensive networking to be used), and SMP provides the additional benefit of being better able to support apps that do not partition "nicely" across a vast number of processors.
I suppose I'm really asking this question to avoid having to simulate and test the simulation environment!
-BobC
rcunning@acm.org
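One way to frame the tradeoff asked about above is a back-of-envelope model: time per simulation step is the compute work divided across the processors, plus a communication term of latency plus message size over bandwidth. The sketch below is illustrative only. The function names and every number in it (message size, latencies, bandwidths) are assumptions chosen for the example, not measurements of any real network or cluster.

```python
def step_time(work_s, n_procs, msg_bytes, latency_s, bandwidth_Bps, msgs_per_step=1):
    """Estimated wall-clock seconds for one simulation step.

    work_s is the serial compute time for the step; it is assumed to
    divide perfectly across n_procs. Communication is not overlapped
    with computation in this simple model.
    """
    compute = work_s / n_procs
    comm = msgs_per_step * (latency_s + msg_bytes / bandwidth_Bps)
    return compute + comm


def efficiency(work_s, n_procs, **net):
    """Parallel efficiency: serial time / (n_procs * parallel time)."""
    return work_s / (n_procs * step_time(work_s, n_procs, **net))


# Hypothetical network parameters, for illustration only.
GIGABIT = dict(msg_bytes=1_000_000, latency_s=100e-6, bandwidth_Bps=125e6)
INTERNET = dict(msg_bytes=1_000_000, latency_s=50e-3, bandwidth_Bps=1e6)

if __name__ == "__main__":
    for name, net in (("gigabit LAN", GIGABIT), ("Internet", INTERNET)):
        for n in (16, 64, 1024):
            print(f"{name:12s} n={n:5d} efficiency={efficiency(10.0, n, **net):.3f}")
```

Under these assumed numbers, the communication term stays fixed while the compute term shrinks with processor count, so a problem that exchanges data every step loses efficiency much faster on the slow link. A problem that rarely communicates (small msgs_per_step relative to work_s) is far less sensitive to the network.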
Coding style differences for distributed computing (Score:1)
Future of the Alpha (Score:2)
What do you see as the future for the alpha? Will Compaq let it die a slow and unfortunate death? Will it continue in its current niche of "High Performance Technical Computing", and be out of the reach (pricewise) of mere mortals? Or will Compaq ever market them to a wider audience, and hopefully bring the price down?
Many people I know want an alpha. No one I know thinks they can afford one.
--Bob
Should be: AppleSeed > Beowulf? (Score:1)
How did you choose the CPU type? (Score:1)
Pretty impressive and all..but... (Score:1)
..what?....
..you mean..?
oh....damn....nevermind
Re:Weather forecasting in general. (Score:1)
Personally, I am interested in a more general question regarding the application of the great new data this beast will churn out. Currently, NOAA forecasts are excellent. Even long-term flood and climate forecasting is getting quite reliable. However, the data isn't being well presented. For example, the weather forecaster says it will be cloudy on the nightly news, yet the next day it pours. The models, however, showed a deep low pressure just a pixel or two away. I don't think it's necessarily a case of the model being wrong; rather, we need to tell people that the resolution of the data is x and within that distance there is a good chance for heavy rain.
How can we inform the public in such a manner that they understand the data we are generating? What extra steps does NOAA need to take to help people deal with the inherent uncertainty of predicting the future?
__________________________
a NOAA employee who will get back to work now
various... (Score:1)
1. Interconnect. I assume Myrinet; if so, what is the switch fabric? Did you look at SCI? (We have found SCI has lower latency and doesn't require switches in the 2-D torus configuration.) If Myrinet, I assume MPICH, right?
2. Failover. We have had a high rate of hardware failure with intel boxes. How are the alphas faring? How do you handle node failures in the middle of a parallel run?
3. Filesystem. Are the machines set up as diskless nodes? What type of fs? NFS, CODA, etc? DHCP, bootp? I would really like to hear details about implementation too.
Technical Challenges (Score:1)
oversea's clusters... (Score:1)
regards,
Benjamin Carlson
A possible paradigm shift in Beowulf technology? (Score:3)
--
Are smaller clusters worth building? (Score:1)
I'm considering converting the network into a beowulf and seeing if I can descramble satellite traffic, without having any idea of what's up there, where to point the dish, or any prior experience with DSP. Do you think it's doable? (Ignoring for the moment my own idiocy in attempting such a thing - assume I'm Einstein on smart pills for the sake of the question.)
--Charlie
Doing it all over again... (Score:1)
how to start on building beowulf (Score:1)
Re:Why alpha - serious number crunching. (Score:2)
The alpha CPU runs circles around any 'cheaper' (read: x86) CPU while doing just that.
There are some chips that are even faster than the alpha (HP has some nice CPUs), but in relation to x86 the Alpha nodes aren't that much more expensive... Unlike HP-RISC chips, which probably cost a multiple of the Alphas.
Another reason could be that the academic users use the alpha a lot to do number crunching, which would be of help in the availability of optimised libs. Not forgetting the Compaq tools and math libraries, which simply rock.
--
Okay... I'll do the stupid things first, then you shy people follow.
Re:Who Else? (Score:1)
http://cnls.lanl.gov/avalon/
Storage for a beowulf (Score:1)
What are you currently using, and do you think it's OK, or just the best you can get for now? What would you want changed for improvements in usability, performance, and data integrity?
thanks,
-dB
Will it rain? (Score:3)
This brings to mind a more fundamental and philosophical question - Does your computer (or any one that's possible to build) have enough horsepower to out-calculate that analog computer called reality that we all know and love so very much?
Kernel 2.4 (Score:1)
The Future of Scientific Programming? (Score:5)
You have worked in research with Legion and in industry at HPTi. Do you think there is hope for some radical new programming technology that makes clusters easier for scientists to use? If so, what do you think the cluster programming environment of tomorrow might look like?
Job management (Score:4)
Is there provision for shifting jobs onto different nodes if one of them dies during a run?
everyone thinks CPU I think network ! (Score:1)
what I think makes the difference is the network hardware
what are you going for, switched..... ?
what hardware are you using ?
(I use Extreme Summit48, BlackDiamonds. Very nice)
regards
john
(a deltic so please dont moan about spelling but the content)
Re:Congratulations. But why??? (Score:2)
Don't take anything, especially life, too seriously.
BTW, I would have no reservations in taking first post on an interview with myself. Not that
My sister was bit by a moose once. (Score:1)
Management and Monitoring (Score:1)
We use VA SystemImager (www.systemimager.org) to install and update nodes, along with rsync to upload and download data and code to the cluster.
What do you use for such tasks?
It is understandable to be surly sometimes.
Re:Who are the programmers? (Score:1)
I work with parallel computers doing physics stuff. We use MPI exclusively (except on the T3E where we also use their shmem primitives in the most intensive part).
Most supercomputing centers offer free training courses a couple of times a year. In our group, most of us aren't even trained programmers, let alone trained in parallel programming. It's not that hard, although I try to avoid programming in parallel as much as possible, since it's a lot of work. MPI is actually very nice because it is so portable.
What NOAA applications are not ideal for clusters? (Score:3)
Most of the IS/IT trade publications and media usually do not fully comprehend the differences between massively multiprocessor systems with shared memory and those clusters of systems and processors with their own local memory, or supercomputing clusters. This is quite evident in a recent article regarding the TPC-D performance between clustered Compaq Wintel/MSSQL systems and a single, shared memory Sun/Oracle system where the Compaq cluster outperformed the Sun solution in 2 of the 10 standard benchmarks. Basic laws of statistics negate those results because the designs of the two systems were not of the same class -- e.g., to be fair, Microsoft-Compaq should have compared performance to an equivalent cluster of lower-costing Sun systems (let alone a Lintel cluster!).
As you and I already know (and I hope everyone reading this now knows), there are several applications where lower costing clusters cannot always do the job of more costly shared memory systems as efficiently (e.g., low-latency, real-time applications such as real-time simulations come to mind). That is why the Compaq Wintel cluster scored drastically far below the shared Sun system in many of the other 8 benchmarks in the aforementioned study.
As such, I am interested in the considerations the NOAA has had to make in evaluating shared memory versus clustered systems. Specifically:
The main reason why these engineering applications are so efficient on clusters is the nature of how they use data. They need little to start crunching, and return little. But during the run, they create and use massive amounts of data, which is all "temporary." This is in stark contrast to databases (such as those targeted by the aforementioned TPC-D benchmarks), where data, not computational results, is the focus of the application. By using supercomputing clusters for computational-driven engineering apps, we can save both money on systems and the time of our engineers waiting on results.
As such, I am interested in the overall increase in efficiency you are seeing after the introduction of supercomputing clusters. Specifically:
[ I now work in the semiconductor design industry, and we are looking at acquiring some Linux supercomputing clusters to speed up the runs of EDA (electronic design automation) tools like those for IC layout and the like. ]
I appreciate your time and wish your organization and yourself the best in our Linux and OSS endeavors.
-- Bryan "TheBS" Smith
Re:The Future of the Control Software (Score:1)
Re:The Future of the Control Software (Score:2)
There are other systems; AFAPI (dead), MOSIX (SLOW but totally transparent), etc.. I've played with them all, and I rather like MPI for dedicated clustering and MOSIX for casual 'I need a fast make World' stuff..
performance benchmarks (Score:2)
The overall performance will depend on the type of applications you are running. To that end, I am also wondering if you are planning on running any standard benchmarks and making the results public? I would be particularly interested in seeing the results from the TPC-C benchmark (http://www.tpc.org [tpc.org]). I'm not sure if it will even be possible to run this benchmark on your system since I don't know how it is configured, but it would be nice to see how your system compares in terms of enterprise computing solutions.
Re:Why alpha? (Score:2)
Did you use any sort of optimization algorithms in designing this system? Not just for the number of nodes, but also for quality vs. price, or any other areas.
--
Rolling Cluster (Score:1)
That is, all new PCs spend the first three months of their lives as a cluster member. After the three-month period they go to their rightful owner for normal use. I have set up LTSP X Window terminals using PCs at work, and getting machines to boot Linux without installing Linux is trivial. You just put a kernel image on floppy, they boot up, mount the root filesystem via NFS and off they go.
In my opinion all PCs shipped today are way too fast for general business use. Certainly current PCs are much faster than is needed to run office applications and a browser. So in other words the user would not be severely inconvenienced, and the entity would always have a kick-arse cluster for only the cost of delaying all new PC installs for users by 3 months!
So my question is "Have you considered a continuous rolling upgrade of your cluster and if not why?"
Ask The Man? (Score:2)
Clustering vs. Distributed Computing (Score:1)
Ham on rye, hold the mayo please.
What is your installation/administration software? (Score:1)
I've just set up a Beowulf cluster for parallel programming research at our CS dept., and I wonder which installation software and administration tools you have been using on your Alpha-based Beowulf.
I patched FAI for our needs so that I can get by with only a magic floppy to install a node from scratch. Is your installation developed in-house or do you prefer free software? And which tools have you found most useful?
Thanks,
What should I be reading? (Score:1)
first? (Score:4)
Screensaver? (Score:1)
Beowulf in General (Score:4)
The end for SC's in Forecasting? (Score:2)
TWW
In the beginning... (Score:5)
Who Else? (Score:4)
Are other government agencies going to duplicate your work? Have they already? If so, for what purposes?
Hardware info? (Score:3)
Oh the temptation... (Score:2)
Secondly, what kind of cooling do you use to keep all those CPU's happy?
Um. err. (Score:1)
M
The Future of the Control Software (Score:5)
Cost and application (Score:2)
Beowulf Design (Score:1)
Why Beowulf? (Score:1)
Modifications for Beowulf? (Score:1)
Weather forecasting in general. (Score:5)
As I understood it, weather models are a fairly hard thing to parallelize (how the hell do you spell that?) because of the interdependence of pieces of the model. This would seem to me to make a Beowulf cluster a tough choice, as its inter-CPU bandwidth is pretty low, right? And that's why I thought most weather prediction places chose high end super-computers, because of their custom and expensive inter-CPU I/O?
Second part: Is weather prediction getting any better? Everything I've read about dynamic systems says that prediction past a certain level of detail or timeframe is impossible. Is that true?
Disclaimer: I might be dumb.
Hotnutz.com [hotnutz.com] - Funny
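The interdependence raised in the question above is usually handled by splitting the model grid into tiles, one per processor, with each tile exchanging a thin border ("halo") of cells with its neighbours every time step. A rough sketch of why finer splitting increases pressure on the interconnect follows. It assumes a square 2-D grid, square tiles, a one-cell halo and periodic boundaries (so every tile has four neighbours), and it ignores corner cells. None of this describes the actual NOAA models; the function name is hypothetical.

```python
import math


def halo_ratio(grid_n, n_procs, halo_width=1):
    """Halo cells exchanged per cell computed, for one tile.

    The grid is grid_n x grid_n cells, split into sqrt(n_procs) x
    sqrt(n_procs) equal square tiles. Assumes periodic boundaries
    and ignores the corner cells of the halo.
    """
    side_procs = math.isqrt(n_procs)
    if side_procs * side_procs != n_procs:
        raise ValueError("n_procs must be a perfect square in this sketch")
    if grid_n % side_procs != 0:
        raise ValueError("grid_n must divide evenly into tiles")
    tile = grid_n // side_procs          # cells along one tile edge
    computed = tile * tile               # cells this processor updates
    exchanged = 4 * tile * halo_width    # one strip per edge
    return exchanged / computed


if __name__ == "__main__":
    for p in (4, 16, 64, 256, 1024):
        print(f"procs={p:5d} halo/compute ratio={halo_ratio(1024, p):.4f}")
```

In this toy model the ratio is 4 divided by the tile edge length, so quadrupling the processor count doubles the communication per unit of computation. That is one reason interconnect latency and bandwidth, not just CPU speed, limit how far a grid-based model scales.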
Re:Question about maintinance. (Score:1)
What about a dnet type client? (Score:5)
It seems to me that there are more than a few people who are willing to donate spare cpu cycles for various projects. At a minimum, you could concentrate on the client side binaries and not worry as much about hardware issues.
Finer grain parallel linux system and debugging (Score:1)
Long term predictions (Score:2)
BTW, As a Florida resident, accurate forecasting of hurricane paths could save millions of dollars. Thanks for your time. Kwan
Link to Miami Herald article from May 21, 2000 [herald.com]
When is a GNU/Linux cluster not a good choice? (Score:1)
Re:When is a GNU/Linux cluster not a good choice? (Score:1)
Your instinct is correct. I believe the discussion you are looking for is supercomputing clusters (e.g., Beowulf, etc...) versus shared memory (e.g., SGI/Cray, Sun, etc...) systems. I have a similar (albeit longer) post [slashdot.org] and question set above. Check it out if you like.
-- Bryan "TheBS" Smith
bandwidth (Score:1)
Re:Congratulations. But why??? (Score:1)
take the pickle outta yer *ss and laugh a little, might actually bring some color to your face.
moderate the parent to funny! certainly not offtopic......
Design your own BeoWulf Cluster at Home! (Score:1)
Can anyone please post information on HOWTO set up your own 2-computer cluster?
Also: What are the programming considerations, as well as pitfalls and suggestions in setting it up?
I would really like to get it going experimentally...
Thanks in advance,
Beowulfs in Business (Score:2)
Congratulations on NOAA, BTW. As a former UVA CS student, it's nice to see your work with Legion and beowulf systems continue to succeed. For people outside of the clustering community, take a look at http://legion.virginia.edu [virginia.edu].
Recently I have been seeing the beginnings of business adoption of beowulf style systems, as they are finally realizing the benefits which the scientific community has been enjoying for years ;). Up to now, however, most of the tools for beowulf work, such as schedulers, message passing APIs, administrative tools, and file systems have been geared towards scientific problems, often lacking such features as fault tolerance or security. Has there been an anti-business bias within the beowulf community? And, if so, what do you think will be needed to change it?
And, as an unrelated question, if you could see one advance in beowulf technology happen tomorrow, what would it be?
Can I have a job? (Score:1)
Reliability - general & alpha vs intel (Score:1)
In particular, on the cluster reliability side, what are you doing to maximize node MTBF? What are you doing to minimize node downtime? How long does it take you to diagnose & replace/bring up a crashed node? How many nodes are currently down? How long can the cluster be expected to run before a node crashes?
On the application side, with MPI one needs to use complicated event loops with timeouts and/or are-you-there? pings to detect node outages. This makes doing parallel fault tolerant programming very complex. Are you doing anything to reduce this complexity? For example, what about a parallel programming API with fault tolerance built in? What about using PVM, where you can at least request notification of program crashes & computer outages? Have you explored other possibilities?
Finally, although a year ago alpha price/performance #s were much better than intel's, today intel's #s are so much better that they remain higher than the alpha's even when factoring in the large fixed cost of the internal cluster high speed networking. Also, linux tends to be more stable on intel than on alpha. Not to mention the question of how much longer alphas will continue to be produced. Given these issues, how happy are you with the choice of using alphas instead of intels?
Regarding linux/alpha robustness, what are you doing to make the alphas behave more robustly? What Linux distro, distro version, kernel version, libs versions, etc. are you using? What kernel patches, system patches, internally developed tweaks/patches are you using to make the alpha systems more robust?
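The timeout and are-you-there bookkeeping mentioned on the application side can be sketched independently of any message-passing library. The code below is not MPI or PVM code and uses neither API; HeartbeatMonitor and its methods are hypothetical names for illustration. It only shows the core idea: record when each node was last heard from and flag any node that has been silent longer than a timeout.

```python
import time


class HeartbeatMonitor:
    """Tracks the last time each node reported in (hypothetical helper)."""

    def __init__(self, nodes, timeout_s, clock=time.monotonic):
        # clock is injectable so the logic can be tested without sleeping.
        self._clock = clock
        self._timeout = timeout_s
        now = clock()
        self._last_seen = {node: now for node in nodes}

    def heartbeat(self, node):
        """Call whenever any message (or ping reply) arrives from node."""
        self._last_seen[node] = self._clock()

    def dead_nodes(self):
        """Return nodes silent for longer than the timeout, sorted by name."""
        now = self._clock()
        return sorted(
            node for node, seen in self._last_seen.items()
            if now - seen > self._timeout
        )


if __name__ == "__main__":
    fake_time = [0.0]
    mon = HeartbeatMonitor(["n1", "n2"], timeout_s=5.0, clock=lambda: fake_time[0])
    fake_time[0] = 4.0
    mon.heartbeat("n1")          # n1 reports in; n2 stays silent
    fake_time[0] = 6.0
    print(mon.dead_nodes())      # nodes silent for more than 5 s
```

A real application would still have to decide what to do once a node is flagged (restart from a checkpoint, reassign its work, or abort), which is where most of the complexity the poster describes actually lives.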
Semi-Related: Clusters vs. Supercomputers (Score:1)
thanks, kristau
Re:What about a dnet type client? (Score:1)
Bryan R.
Hi Greg! (Score:1)
Ayden
details please (Score:1)
Details, details - your first post should have been a description of what you built, how you built it, what it will do and how you can make it easier for me when I build mine.
Is all your work going to be open source? Will you release your customizations to us?
I'll take my answer off the air?
Beowulf Alternatives? (Score:5)
Additionally, to what would you compare the system that you are planning to build, as far as computing power is concerned?
Thanks,
VVulfe
MPI restrictive? (Score:1)
ttyl
tpb
Biggest whack in the head? (Score:5)
What was the biggest 'WTF was I thinking' on this project? I'd imagine there was a fair amount of lateral space allowed to the designers, and freedom to design also means freedom to screw up.
Imagine ... (Score:4)
Seriously, what was the most challenging of the maintenance tasks you had to undertake? Do you anticipate a trade-off point where the number of machines makes maintenance impossible? Do you have any pearls of wisdom for those of us just involved in the initial design of such clusters, so that maintaining it in the future is less painful?
Why not rackmounted servers? (Score:2)
Essence of a Beowulf cluster (Score:1)
Applications of this cluster? (Score:3)
1) Having just graduated with a BS in Atmospheric Sciences, I have had a chance to take numerical weather prediction courses over the last five years. With this new influx of processing power, where do you see numerical models going in the future?
2) Somewhat related to 1), with mesoscale models becoming more popular (MM5 quickly springs to mind), where do you see the balance of processor time going in these models: toward getting a model out faster, or toward computing more variables to provide a more accurate forecast at the smaller scale?
3) Not knowing too much about the origins of these models, I was interested to find that a person could get the source to the MM5 and modify it as they see fit. Will models developed in the future follow this same trend? With powerful computers becoming affordable, it would not be that difficult for a university to build one and run a particular model for their area (I believe that Ohio State is doing it, again, with the MM5)?
Thanks!
Bryan R.
Wait for it... (Score:1)
Wouldn't it be great to get a beowulf cluster of him?
Why GNU/Linux ? (Score:1)
I know you couldn't call it a Beowulf if you chose otherwise, but there's got to be more to it than that (I guess). What other OS'es did you consider, and why did you pick GNU/Linux ?
Quake? (Score:1)
Will you help spread the word about Open Source? (Score:2)
Whatever your answer, I think it's fair to say that there is something about this system, which uses an open-source clustering technology, built on top of an open-source operating system, which made it best for your needs; maybe it was the reliabilty, or the ability to modify it as needed, or maybe just the lower dollar cost to your department.
My question then, is this: have you given any thought to how you can help advance open source software, to give back to the community that created this tool? Getting the word out that the U.S. Goverment uses Linux for its cutting-edge weather forecasting tool would be an enormous PR win for the folks that still have trouble convincing their management that OSS software can be trusted for "real work." I'm not suggesting putting a picture of 'Tux' on every weather forecast, (although that would be kinda cute,) but it would be great if NOAA press releases [noaa.gov] about the project gave at least passing mention to the fact that the project will be benefitting from open source software.
I realize this is not something you would normally do for, say a Cray or IBM, but those are commercial enterprises, with their own PR budgets; they don't need your help to get their word out. OSS needs all the help it can get, so that future projects like yours can continue to reap the benefits.
Re:Why not rackmounted servers? (Score:2)
Who are the programmers? (Score:2)
So who do you have doing the programming for this thing? Did they take special training, or is it easy to pick up for any programmer?
Finally, given the possible difficulty (and speciality) of using the above, has anyone considered using DIPC?
--
Have Exchange users? Want to run Linux? Can't afford OpenMail?
Question about maintinance. (Score:5)
Besides that, best of luck, and I can't wait to see the final product. ;^)
-legolas
i've looked at love from both sides now. from win and lose, and still somehow...
Why alpha? (Score:5)
Future of SuperComputing (Score:1)
Also, a related question is how much of a role do you think the free software community will continue to play in advancing super-computing towards the masses?
Which parallel programming toolkit and why? (Score:1)
For writing apps will you guys be using PVM, MPI, or something else? Why did you choose that toolkit?