Ask the Man Behind the NOAA's New Beowulf Cluster 87

Greg Lindahl sent in this story last September about a massive Alpha Linux cluster that's being built by HPTi for the NOAA's Forecast Systems Laboratories. What Greg forgot to mention when he submitted the original story is that he's the project's chief designer. What with all the Beowulf (and Alpha) interest around here, we figured he'd make a great interview guest, especially now that the project is well under way. Please post your questions below. Answers to 10 - 15 of the highest-moderated ones should appear within the next week.
This discussion has been archived. No new comments can be posted.

  • During my time at NASIRC (NASA's security incident response thingy...) this NOAA project had everyone begging for one...including the IG. :)

    Are there plans for building clusters for other government agencies, or is this isolated to NOAA right now?

    --cyphergirl

  • I risk a great deal, I realize, by posting without complete knowledge of the subject, but here goes:

    My understanding is that Linux is preemptively-multitasked, so that one process cannot monopolize the processor. This, to my mind, does not lend itself well to a problem where you want the CPU spending all of its time calculating the answer to some question (e.g. global climate change), rather than listening to the network, etc.

    One advantage of the AppleSeed [ucla.edu] project running on MacOS 9.x is that with cooperative multitasking, the program can monopolize the CPU if it wants to.

    Doesn't this make MacOS9 better than linux for this type of clustered-CPU-supercomputer? (Of course, differences in hardware may compensate for the differences in multitasking models)

  • Wow, this calls for a celebration! I can't think of anything that would be more gratifying than having Linux be chosen to power such a high-profile, "officially-condoned" cluster. Although it probably will make some of the people in the /. community sad, to see their labor of love go "establishment" so quickly, I personally think it's very cool.

    All the IPO's, the media hype, and the whiplash publicity have shattered any hope of the Linux community remaining a small-ish group of unrecognized collaborators, but there must be Something Else out there... a new toy to play with...

    1. "Do you pine for the nice days of minix-1.1, when men were men and wrote their own device drivers? Are you without a nice project and just dying to cut your teeth on a OS you can try to modify for your needs? Are you finding it frustrating when everything works on minix? No more all- nighters to get a nifty program working? Then this post might be just for you :-)"

    -Linus Torvalds, October 5, 1991.

    Anyways, congratulations to everybody who helped shape Linux into what it is today. I'm not going to claim to be an old-school kernel junkie or anything, but even those of us who got here late can appreciate the odds of succeeding at something like this.

  • by Anonymous Coward
    This topic is very timely, as I have just started talking with a colleague about building an inexpensive Linux cluster. The #1 problem is matching the problem to the architecture of the cluster, and vice-versa. The problem we were left with boils down to this:

    What is the tradeoff between the speed and number of the CPU communication links, and the power and cost of the cluster?

    For example, within a given budget, would it be better to physically cluster dozens of blazingly fast SMP systems via a gigabit Ethernet (or point-to-point 100 Mbit Ethernet), or to "logically cluster" tens of thousands of "normal" CPUs via the Internet? Or something in between?

    The problems to be attacked will be readily parallelized simulations, but some of them will have a cellular decomposition requiring significant communications with less computation (such as GPS constellation and receiver simulation versus a simulation of hypersonic surface ablation).

    For a fixed cost, is one cluster architecture clearly "better" than another? Or does the very notion of a "cluster" negate many of the architectural differences?

    My intuition says to buy all the SMP systems I can (dual processor systems are especially inexpensive), and cluster them with whatever network I can afford with the money that remains. That is, the vastly improved IPC provided by SMP will dominate the overall communication cost (allowing less expensive networking to be used), and SMP provides the additional benefit of being better able to support apps that do not partition "nicely" across a vast number of processors.

    I suppose I'm really asking this question to avoid having to simulate and test the simulation environment!

    -BobC
    rcunning@acm.org
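
One way to make the link-speed versus node-count tradeoff asked about above concrete is a back-of-the-envelope cost model: time per step is the compute work divided across nodes, plus a communication term that does not shrink as nodes are added. The sketch below is only an illustration; the function names, the latency and bandwidth figures, and the sample job are all assumptions, not measurements from any real cluster.

```python
def step_time(n_nodes, work_s, msgs_per_step, latency_s, bytes_per_step, bandwidth_Bps):
    """Estimated wall-clock time of one simulation step on n_nodes nodes.

    Compute work divides across nodes; each node still pays a fixed
    per-step cost for its messages (latency) and data volume (bandwidth).
    """
    compute = work_s / n_nodes
    if n_nodes == 1:
        comm = 0.0  # a single node has nobody to talk to
    else:
        comm = msgs_per_step * latency_s + bytes_per_step / bandwidth_Bps
    return compute + comm


def speedup(n_nodes, **params):
    """Speedup relative to running the same step on one node."""
    return step_time(1, **params) / step_time(n_nodes, **params)


# Hypothetical links (assumed numbers, for illustration only).
LAN = dict(latency_s=100e-6, bandwidth_Bps=100e6 / 8)   # switched 100 Mbit Ethernet
WAN = dict(latency_s=100e-3, bandwidth_Bps=1e6 / 8)     # nodes scattered over the Internet

# Hypothetical job: 10 s of compute per step, 6 messages and 1 MB exchanged per node.
JOB = dict(work_s=10.0, msgs_per_step=6, bytes_per_step=1_000_000)

for name, link in (("LAN", LAN), ("WAN", WAN)):
    print(name, [round(speedup(n, **JOB, **link), 2) for n in (1, 8, 32, 128)])
```

Under these assumed numbers the communication term quickly dominates on the slow link, which is the intuition behind preferring fewer, better-connected nodes for tightly coupled simulations and wide-area "logical clusters" only for work that rarely communicates.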
  • My question is a bit more on the technical side. "Clusters" seem to fall in a sort of new area between MPP neural networks and conventional computers. In general, are there any special tricks or differences in coding conventions between single-system boxes, typical custom MPP systems, and massively parallel, slow bandwidth "cluster" systems? Are there accepted new ways of efficiently utilizing the unique properties of a clustered approach (i.e.: genetic algorithms, simulated neural networks, auto-distributing MPP), or does one just "pretend it's a normal box, just lots bigger"? Further, can you point interested readers towards any material on learning to code for these bad boys?
  • I know you have been involved with Alpha/Linux for some time, I remember running into you on mailing lists two years ago, and you were still the expert then.

    What do you see as the future for the alpha? Will Compaq let it die a slow and unfortunate death? Will it continue in its current niche of "High Performance Technical Computing", and be out of the reach (pricewise) of mere mortals? Or will Compaq ever market them to a wider audience, and hopefully bring the price down?

    Many people I know want an alpha. No one I know thinks they can afford one.

    --Bob

  • Previewed, but missed the subject typo. Dammit!
  • Why Alphas and not Intel, AMD, whatever?

  • Could you imagine a beowulf clust...

    ..what?....

    ..you mean..?

    oh....damn....nevermind
  • I attended a talk on ocean circulation modelling, which in practice is quite similar to weather modelling. Essentially, you can partition the globe (or whatever subset you are concerned with) into 3-dimensional units. You can even use differently sized chunks depending on your data, modelling needs, etc. You initialize all the units, grind away for a given time slice, and then exchange boundary conditions w/ the adjacent units. Resolve the discrepancies, and grind away again. Obviously, the shorter the time slice, the greater the data-transfer needs, so inter-CPU communication would probably set a lower bound on this parameter. Otherwise, the problem lends itself well to this type of hardware configuration.

    Personally, I am interested in a more general question regarding the application of the great new data this beast will churn out. Currently, NOAA forecasts are excellent. Even long-term flood and climate forecasting is getting quite reliable. However, the data isn't being well presented. For example, the weather forecaster says it will be cloudy on the nightly news, yet the next day it pours. The models, however, showed a deep low pressure just a pixel or two away. I don't think it's necessarily a case of the model being wrong; rather, we need to tell people that the resolution of the data is x and within that distance there is a good chance for heavy rain.

    How can we inform the public in such a manner that they understand the data we are generating? What extra steps does NOAA need to take to help people deal with the inherent uncertainty of predicting the future?

    __________________________
    a NOAA employee who will get back to work now
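
The partition / grind / exchange-boundaries cycle described in the comment above can be sketched in a few lines. This is a toy, assuming a 1-D grid, a simple explicit diffusion update, and "nodes" that are just Python lists in one process; a real model would be 3-D and would swap edge values over the interconnect (for example with MPI). All function names are made up for the illustration.

```python
def diffuse(cells, left, right, alpha=0.25):
    """Advance one chunk by one time slice, given the edge values of its neighbours."""
    padded = [left] + cells + [right]
    return [
        padded[i] + alpha * (padded[i - 1] - 2 * padded[i] + padded[i + 1])
        for i in range(1, len(padded) - 1)
    ]


def step_decomposed(chunks, alpha=0.25):
    """Exchange boundary values between adjacent chunks, then update every chunk."""
    last = len(chunks) - 1
    # Each chunk learns its neighbours' edge cells. At the ends of the global
    # domain a chunk reuses its own edge value (a zero-flux boundary).
    lefts = [chunks[i - 1][-1] if i > 0 else chunks[i][0] for i in range(len(chunks))]
    rights = [chunks[i + 1][0] if i < last else chunks[i][-1] for i in range(len(chunks))]
    return [diffuse(c, l, r, alpha) for c, l, r in zip(chunks, lefts, rights)]


def split(cells, n):
    """Cut the global grid into n equal chunks (assumes len(cells) is divisible by n)."""
    size = len(cells) // n
    return [cells[i * size:(i + 1) * size] for i in range(n)]


# A hot spot in the middle of a 16-cell grid, spread over 4 pretend nodes.
grid = [0.0] * 8 + [100.0] + [0.0] * 7
chunks = split(grid, 4)
for _ in range(10):
    chunks = step_decomposed(chunks)
print([round(x, 2) for chunk in chunks for x in chunk])
```

Only two values per chunk cross a boundary each step, while the work inside a chunk grows with its size, which is why the ratio of chunk size to time-slice length decides how much interconnect the model needs.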
  • by Anonymous Coward
    These are a little selfish as I have a cluster running atmospheric code too:

    1. Interconnect. I assume Myrinet, if so, what is the switch fabric? Did you look at SCI? (we have found SCI has lower latency and doesn't require switches in the 2-D torus configuration). If Myrinet, I assume MPICH, right?

    2. Failover. We have had a high rate of hardware failure with intel boxes. How are the alphas faring? How do you handle node failures in the middle of a parallel run?

    3. Filesystem. Are the machines set up as diskless nodes? What type of fs? NFS, CODA, etc? DHCP, bootp? I would really like to hear details about implementation too.
  • What were the toughest technical challenges involved in creating this cluster? How did you solve them?
  • Having the knowledge, and experience to build this type of system, have you been contacted by individuals, or companies about helping/building another for them? If so are you concerned about the legalities of exporting a beowulf super-computer? I am inquiring in reference to the US Governments restrictions on Cray Research and others exporting systems of this processing power.
    regards,
    Benjamin Carlson
  • I have recently become gainfully employed in a capacity which will require me to administer a Beowulf cluster. My question, Mr. Lindahl, is how you feel about the various competing technologies for distribution of computation. In particular, do you feel there is much to be gained from the work of the MOSIX project at The Hebrew University of Jerusalem? Traditionally tasks for Beowulf style supercomputers have required specific programming in MPI or PVM calls. MOSIX endeavors to provide adaptive load-balancing with process migration. Essentially this allows the programmer to forgo the hassle of parallelizing his code. Rather, he can now simply fork() or create SMP threads and the OS will automatically handle distribution of those processes over the cluster. Do you feel that this is a worthwhile avenue to pursue for scientific computation or are there issues which make MPI or PVM still a substantially better choice? Thank you for your time.
    --
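
For readers unfamiliar with the "just fork()" style the MOSIX question refers to, here is a minimal sketch of a program written with plain processes and pipes and no message-passing library. It assumes a Unix-like system (`os.fork` is not available on Windows), and `fork_map` is a hypothetical helper written for this illustration. On an ordinary kernel all children run on the local machine; the claim in the comment above is that MOSIX could migrate such processes to other nodes without changes to code like this.

```python
import json
import os


def fork_map(func, items):
    """Run func(item) in one forked child per item and collect the results in order."""
    children = []
    for item in items:
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: compute, send the result back through the pipe, and exit
            # without returning into the parent's code.
            try:
                os.close(read_fd)
                with os.fdopen(write_fd, "w") as out:
                    json.dump(func(item), out)
            finally:
                os._exit(0)
        # Parent: keep only the read end and move on to the next item.
        os.close(write_fd)
        children.append((pid, read_fd))

    results = []
    for pid, read_fd in children:
        with os.fdopen(read_fd) as inp:
            results.append(json.load(inp))
        os.waitpid(pid, 0)  # reap the child
    return results


print(fork_map(lambda n: sum(i * i for i in range(n)), [10, 100, 1000]))
```

The contrast with MPI or PVM is that nothing here names a node or sends an explicit message between workers; the price is that the workers cannot cheaply talk to each other mid-computation, which is what tightly coupled scientific codes need.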
  • I've got a ten-node network in my basement that I built out of discarded Pentiums I picked up for free in various places. I recently have been offered a free satellite dish (one of the older, large TV dishes) with a motorized positioner and associated electronics.
    I'm considering converting the network into a beowulf and seeing if I can descramble satellite traffic, without having any idea of what's up there, where to point the dish, or any prior experience with DSP. Do you think it's doable? (Ignoring for the moment my own idiocy in attempting such a thing - assume I'm Einstein on smart pills for the sake of the question :^)
    --Charlie
  • Pretend for a minute some alien device, or virus, etc. wiped away all your preferences for Alphas, linux, etc. and you had to make all your decisions about your super clusters over again. Would you still pick the alpha as your processor, or would you pick an IA32, IA64, Athlon, or PowerPC processor? Why? You would probably still pick linux, because it is such a great operating system, but what other OS's would you at least consider? What would you pick for your interconnect? Would it still be Myrinet, or would it be Gigabit ethernet, or InfiniBand? Are there any technologies on the horizon that us mere mortals haven't heard of yet that would affect your decision? Don't worry, I don't plan on starting a competing company (at least not yet). --WindChild.
  • I know there exists a Linux HOWTO on building beowulf. But what would you say to someone who is a hobbyist wanting to build his own beowulf cluster?
  • My guess there is simply that this cluster will be used for sims with gazillions of floating point numbers, waiting happily to be crunched.
    The alpha CPU runs circles around any 'cheaper' (read: x86) CPU while doing just that.

    There are some chips that are even faster than the alpha (HP has some nice CPUs), but in relation to x86 the Alpha nodes aren't that much more expensive... Not unlike HP-RISC chips, which probably cost a multitude of the Alphas.

    Another reason could be that the academic users use the alpha a lot to do number crunching, which would be of help in the availability of optimised libs. Not forgetting the Compaq tools and math libraries, which simply rock.
    --
    Okay... I'll do the stupid things first, then you shy people follow.

  • actually, Avalon was built much cheaper than the NOAA computer, but I guess they didn't go for a really expensive integrator.. And they have a different purpose.

    http://cnls.lanl.gov/avalon/
  • What sort of storage solutions make sense for beowulf applications? Locally attached disks, with something like a network block device? Distributed file systems? NFS? AFS? Clusterized file systems (eg: GFS)? Fibre Channel?

    What are you currently using, and do you think it's OK, or just the best you can get for now? What would you want changed for improvements in usability, performance, and data integrity?

    thanks,

    -dB

  • by wass ( 72082 ) on Tuesday May 23, 2000 @06:17AM (#1053744)
    Okay, here's my question. What will the weather be like on September 27, 2005, in Baltimore, MD?

    This brings to mind a more fundamental and philosophical question - Does your computer (or any one that's possible to build) have enough horsepower to out-calculate that analog computer called reality that we all know and love so very much?

  • How is the 2.4 kernel going to affect teams building large scale clusters? What changes would you like to see implemented in the 2.6 kernel to aid your job?
  • by Matt Gleeson ( 85831 ) on Tuesday May 23, 2000 @06:19AM (#1053746) Homepage
    The raw performance of the hardware being used for scientific and parallel programming has improved by leaps and bounds in the past 10-20 years. However, most folks still program these supercomputers much the same way they did in the 80's: Unix, Fortran, explicit message passing, etc.

    You have worked in research with Legion and in industry at HPTi. Do you think there is hope for some radical new programming technology that makes clusters easier for scientists to use? If so, what do you think the cluster programming environment of tomorrow might look like?
  • by gcoates ( 31407 ) on Tuesday May 23, 2000 @06:20AM (#1053747)
    One of the weaknesses for beowulfs seems to me to be a lack of decent (job) management software. How do you split the clusters resources? Do you run one large simulation on all the CPUs, or do you run 2 or 3 jobs on 1/2 or 1/3 of the available CPUs?

    Is there provision for shifting jobs onto different nodes if one of them dies during a run?
  • I see from the PR that you are useing API CPU's in API motherboards

    what I think makes the differance is the network hardware

    what are you going for switched..... ?
    what hardware are you useing ?
    (I use extreame summit48, blackdiamonds Very nice)

    regards

    john


    (a deltic so please dont moan about spelling but the content)
  • This guy is a gem. Not only is he a noted designer of world-class supercomputers, he has a sense of humor.

    Don't take anything, especially life, too seriously.

    BTW, I would have no reservations in taking first post on an interview with myself. Not that /. would interview me, but...
  • No, really. Moose bites can be very dangerous. That bit of silliness out of my system, I realize that posts like this are engaged in as a sort of protest against Slashdot's sometimes Draconian anti-troll measures, but are they REALLY helping to solve the problem? More trolls means more draconian measures, and as we've all seen, more draconian measures means more trolls. SOMEONE has to show some maturity here and stop this cycle. For chrissakes, people, we aren't in high school anymore. It is no longer funny to play "interruption" until someone cries. If you don't like Slashdot, don't whine about it, and definitely don't add your own piss to the pool. DO SOMETHING ABOUT IT.
  • We have a beowulf cluster and I have some nice techniques for dealing with the management of nodes. We do not, as of yet, have any monitoring software. What do you use or suggest for monitoring and management? I've seen bWatch but didn't think it was that great for what I wanted.
    We use va systemimager (www.systemimager.org) to install and update nodes, along with rsync to upload and download data and code to the cluster.
    What do you use for such tasks?
    It is understandable to be surly sometimes.
  • Did they take special training, or is it easy to pick up for any programmer?

    I work with parallel computers doing physics stuff. We use mpi exclusively (except on the T3E where we also use their shmem primitives in the most intensive part).

    Most supercomputing centers offer free training courses a couple of times a year. In our group, most of us aren't even trained programmers, let alone trained in parallel programming. It's not that hard, although I try to avoid programming in parallel as much as possible, since it's a lot of work. mpi is actually very nice because it is so portable.

  • by BitMan ( 15055 ) on Tuesday May 23, 2000 @07:18AM (#1053753)

    Most of the IS/IT trade publications and media usually do not fully comprehend the differences between massively multiprocessor systems with shared memory and those clusters of systems and processors with their own local memory, or supercomputing clusters. This is quite evident in a recent article regarding the TPC-D performance between clustered Compaq Wintel/MSSQL systems and a single, shared memory Sun/Oracle system where the Compaq cluster outperformed the Sun solution in 2 of the 10 standard benchmarks. Basic laws of statistics negate those results because the designs of the two systems were not of the same class -- e.g., to be fair, Microsoft-Compaq should have compared performance to an equivalent cluster of lower-costing Sun systems (let alone a Lintel cluster!).

    As you and I already know (and I hope everyone reading this now knows), there are several applications where lower costing clusters cannot always do the job of more costly shared memory systems as efficiently (e.g., low-latency, real-time applications such as real-time simulations, come to mind). That is why the Compaq Wintel cluster scored drastically far below the shared Sun system in many of the other 8 benchmarks in the aforementioned study.

    As such, I am interested in the considerations the NOAA has had to make in evaluating shared memory versus clustered systems. Specifically:

    • What are some of the NOAA/NWS programs and software that will not be applicable for execution on this new cluster?
    • What [estimated] percentage do these programs make up of the total applications the NOAA uses, both quantity and in time of execution?
    • What [assuming] shared memory systems and solutions does the NOAA use for these applications?
    Of course, the lower the number in the first two questions, the more advantageous the existence of a supercomputing cluster is to an organization. For example, in the aerospace industry, the quantity of cluster-efficient applications may be small, but the total execution time of a "run" of these select applications can greatly outweigh all others. Again, speaking from my aerospace background, applications like Monte Carlo, CFD, 6DOF (six degrees of freedom) runs and simulations are extremely time consuming. Monte Carlo is an ideal application for clustering since each "run" result is completely independent from another (almost linear performance improvement when distributed in a cluster). CFD is very close to linear (~90% efficient) and 6DOF, I would guess, could be as high as 60 or 70%, if it is written to take advantage of distributed computing systems.

    The main reason why these engineering applications are so efficient on clusters is the nature of how they use data. They need little to start crunching, and return little. But during the run, they create and use massive amounts of data, which is all "temporary." This is in stark contrast to databases (such as those targeted by the aforementioned TPC-D benchmarks), where data, not computational results, is the focus of the application. By using supercomputing clusters for computational-driven engineering apps, we can save both money on systems and the time of our engineers waiting on results.

    As such, I am interested in the overall increase in efficiency you are seeing after the introduction of supercomputing clusters. Specifically:

    • By executing appropriate applications on supercomputer clusters, what price/performance efficiency do you see over execution on equivalent shared memory systems? [e.g., for CFD, we found equivalently performing supercomputing Linux clusters cost 5-10% of the cost of shared memory systems from Sun and SGI.]
    • In addition to these computational-intensive applications, do you have any data-intensive applications (if any) that are more price/performance efficient (not necessarily faster overall) on clusters than shared memory systems? [I personally have not been able to justify clusters for such uses, yet]

    [ I now work in the semiconductor design industry, and we are looking at acquiring some Linux supercomputing clusters to speed up the runs of EDA (electronic design automation) tools like those for IC layout and the like. ]

    I appreciate your time and wish your organization and yourself the best in our Linux and OSS endeavors.

    -- Bryan "TheBS" Smith
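
The point made above, that Monte Carlo runs distribute almost linearly because each run is completely independent, can be shown with a small sketch. The assumptions are loud ones: the "simulation" is a toy estimate of pi, threads in one process stand in for cluster nodes, and `parallel_efficiency` simply encodes the usual definition (speedup divided by node count) that figures such as "~90% efficient" refer to. None of this is taken from the poster's aerospace codes.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def one_run(seed, samples=10_000):
    """One independent Monte Carlo run: estimate pi from random points in the unit square."""
    rng = random.Random(seed)  # private generator, so runs share no state at all
    hits = sum(1 for _ in range(samples) if rng.random() ** 2 + rng.random() ** 2 <= 1.0)
    return 4.0 * hits / samples


def run_all(seeds, workers):
    """Farm the runs out to a pool of workers; needs little input, returns little output."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(one_run, seeds))


def parallel_efficiency(t_serial, t_parallel, n_nodes):
    """Fraction of ideal linear speedup achieved: (t_serial / t_parallel) / n_nodes."""
    return t_serial / (t_parallel * n_nodes)


estimates = run_all(range(8), workers=4)
print(sum(estimates) / len(estimates))
```

Because no run reads anything another run produced, the grouping of runs onto workers cannot change the results; only the wall-clock time changes. Codes like CFD lose some efficiency precisely because their pieces do have to exchange data during the run.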

  • Yes! This question has to be included in the interview. What message passing protocol to use is very fundamental to anyone (like me) wanting to start on a beowulf project. I've done a bit of research about it on the web but have not reached any conclusions. What are the trade-offs between MPI and PVM? What does the NOAA project use, and why? What does anyone else reading the msg use, and why?
  • MPI gives you native, transparent, and fast parallel functions at the cost of dramatically increased programmer headache. PVM gives you kinda-portable, relatively obvious parallel functions at the cost of overhead.

    There are other systems; AFAPI (dead), MOSIX (SLOW but totally transparent), etc.. I've played with them all, and I rather like MPI for dedicated clustering and MOSIX for casual 'I need a fast make World' stuff..
  • I was interested in knowing how you were handling the issue of throughput on the backplane, which in the case of a distributed supercomputer is a network connection. Obviously, a distributed supercomputer will give you more bang for the buck in terms of raw processing power than a single supercomputer. But how much of an impact does this use of network connections have on the overall performance?

    The overall performance will depend on the type of applications you are running. To that end I am also wondering if you are planning on running any standard benchmarks and making the results public? I would be particularly interested in seeing the results from the TPC-C benchmark (http://www.tpc.org [tpc.org]). I'm not sure if it will be even possible to run this benchmark on your system since I don't know how it is configured but it would be nice to see how your system compares in terms of enterprise computing solutions.

  • You can set up a nice differential equation to find the optimum # of nodes and $ per node.

    Did you use any sort of optimization algorithms in designing this system? Not just for the number of nodes, but also for quality vs. price, or any other areas.

    --
  • I have often wondered why people who buy a lot of computers, eg universities, GMH, banks etc, don't buy a rolling cluster.

    That is, all new pcs spend the first three months of their lives as a cluster member. After the three month period they go to their rightful owner for normal use. I have set up LTSP X Windows terminals using pcs at work, and getting machines to boot linux without installing linux is trivial. You just put a kernel image on floppy, they boot up, mount the root filesystem via nfs and off they go.

    In my opinion all pcs shipped today are way too fast for general business use. Certainly current pcs are much faster than is needed to run office applications and a browser. So in other words the user would not be severely inconvenienced, the entity would always have a kick-arse cluster for only the cost of delaying all new pcs installs for users by 3 months!

    So my question is "Have you considered a continuous rolling upgrade of your cluster and if not why?"
  • I don't want to Ask The Man, I want to Stick It To The Man!
  • What do you think about distributed computing? While the number of computers may vary, it has been shown (with distributed.net [distributed.net], among others) that this can be a very plentiful source of computing power. Could this be harnessed for real-time uses?
    Ham on rye, hold the mayo please.
  • Hi,

    I've just set up a Beowulf cluster for parallel programming research at our CS dept., and I wonder which installation software and administration tools you have been using on your alpha based Beowulf.

    I patched FAI for our needs so that I get by with only a magic floppy to install a node from scratch. Are your installation/maintenance tools developed in-house, or do you prefer free software? And which tools have you found most useful?

    Thanks,
  • If I am interested in working in large-scale cluster building and clustering applications, what are the best sources I could go to to pick up the related skills? What books/white papers would you recommend?

  • by Greg Lindahl ( 37568 ) on Tuesday May 23, 2000 @06:02AM (#1053763) Homepage
    first post?
  • How fast does it scan the radio waves from the Seti Project???
  • by BgJonson79 ( 129962 ) <srsmith@@@alum...wpi...edu> on Tuesday May 23, 2000 @06:05AM (#1053765)
    How do you think the new wave of Beowulf clusters will affect all of supercomputing, not just forecasting?
  • Does Beowulf mean the end of super-computers in weather-forecasting?

    TWW

  • by zpengo ( 99887 ) on Tuesday May 23, 2000 @06:06AM (#1053767) Homepage
    How did you come to be the project's chief designer? I'm curious to know the background of anyone who gets to work on such an interesting project.
  • by Alarmist ( 180744 ) on Tuesday May 23, 2000 @06:04AM (#1053768) Homepage
    You've built a large cluster of machines on a relatively pea-sized budget.

    Are other government agencies going to duplicate your work? Have they already? If so, for what purposes?

  • by matticus ( 93537 ) on Tuesday May 23, 2000 @06:04AM (#1053769) Homepage
    can you give us some information about what exactly is in this cluster? what alphas, etc?
  • Are you ever tempted to swipe some of that computer time for say raytracing (povray clusters well, since tasks can be divided by line or frame, and the input is fairly granular...) or any other sort of task.

    Secondly, what kind of cooling do you use to keep all those CPU's happy?
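
The remark above that ray tracing "can be divided by line or frame" comes down to handing each node a band of rows that needs nothing from any other band. A minimal sketch of that split follows; `split_rows` is a hypothetical helper, and actually rendering each band (with POV-Ray or anything else) is left out.

```python
def split_rows(height, workers):
    """Divide image rows 0..height-1 into contiguous, near-equal bands, one per worker."""
    base, extra = divmod(height, workers)
    bands = []
    start = 0
    for w in range(workers):
        size = base + (1 if w < extra else 0)  # the first `extra` workers take one more row
        bands.append(range(start, start + size))
        start += size
    return bands


# A 480-row image shared between 7 render nodes.
for node, band in enumerate(split_rows(480, 7)):
    print(f"node {node}: rows {band.start}..{band.stop - 1}")
```

Equal-sized bands are only a starting point: rows that contain complex geometry take longer to trace, so a real render farm would more likely hand out many small bands on demand than one fixed band per node.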
  • Can I have one?

    M
  • by PacketMaster ( 65250 ) on Tuesday May 23, 2000 @06:08AM (#1053772) Homepage
    I built a Beowulf-style cluster this past semester in college for independent study. One of the biggest hurdles we had was picking out a message passing interface such as MPI or PVM. Configuring across multiple platforms was then even worse (we had a mixture of old Intels, SunSparcs and IBM RS/6000's). What do you see in the future for these interfaces in terms of setup and usage and will cross-platform clusters become easier to install and configure in the future?

  • It is known that a cluster can be put together for a relative low cost. At least when compared to supercomputers. Do you see cost as a reason this magnitude of computing power hasn't been available to much of the commercial region? Do you think that these current successful low-cost implementations will speed development of any commercial applications for tools such as these? And what commercial situation(s), if any, do you see a cluster being applied to?
  • What exactly goes into putting a beowulf system together, especially one of this size?
  • When evaluating possible system designs, what made you choose a Beowulf cluster for this project over SGI-based systems? Was it simply a matter of cost, or were there other factors involved in that decision?
  • What types of modifications did you have to make to your legacy software in order to allow it to run on the cluster? About how many man-hours were expended in this reprogramming effort?
  • by Matt2000 ( 29624 ) on Tuesday May 23, 2000 @06:29AM (#1053777) Homepage
    Ok, a two parter:

    As I understood it weather models are a fairly hard thing to paralleliz (how the hell do you spell that?) because of the interdependence of pieces of the model. This would seem to me to make a Beowulf cluster a tough choice as its inter-CPU bandwidth is pretty low right? And that's why I thought most weather prediction places chose high end super-computers because of their custom and expensive inter-CPU I/O?

    Second part: Is weather prediction getting any better? Everything I've read about dynamic systems says that prediction past a certain level of detail or timeframe is impossible. Is that true?

    Disclaimer: I might be dumb.

    Hotnutz.com [hotnutz.com] - Funny
  • by Anonymous Coward
    After only one OS upgrade (RedHat 5.2 to 6.0), I decided that maintaining an OS on each system would be a real pain in the patootie. So, now, one system is a boot server for the others and they each mount their / and /usr file systems from that server and keep only /tmp, swap and a public file system for large data files locally. At least that part of maintaining the systems has gone away. We only have a small system (16 cpus spread among 8 PC chassis), so I don't know how well the scheme scales to large Beowulf clusters. For us, though, we haven't noticed any degradation in service.
  • by x0 ( 32926 ) on Tuesday May 23, 2000 @06:30AM (#1053779) Homepage
    I am curious as to whether (no pun intended...:)) or not you have ever done any testing to see if a distributed.net type environment would be useful for your type of work?

    It seems to me that there are more than a few people who are willing to donate spare cpu cycles for various projects. At a minimum, you could concentrate on the client side binaries and not worry as much about hardware issues.

  • Everyone seems to be building a Beowulf these days. However many scientific applications are not as easily scaleable to such a coarse grained system. Are there, or rather, are you supporting any efforts to build finer grained and shared memory clusters using linux? Also I find debugging parallel programs terribly time consuming. What kind of tools do you use?
  • A recent Miami Herald article talked about the use of an IBM RS/6000SP to process weather data. It's close to 30 times as fast as the previous machine. Though I'm curious as to how this machine compares to the NOAA supercomputer, I'm really interested in how much better predictions can get with systems like the one at NOAA. How much (statistical) confidence is in current weather prediction over 1 month? 3 months? 1 year? How much is the NOAA system expected to improve weather forecasting?
    BTW, As a Florida resident, accurate forecasting of hurricane paths could save millions of dollars. Thanks for your time. Kwan

    Link to Miami Herald article from May 21, 2000 [herald.com]
  • Being a biotech guy, I am interested in the use of Beowulf-style clusters for DNA sequence alignments and searches, etc. Incyte Corp. [incyte.com] and Volker Brendel [iastate.edu] at Iowa State already use Linux clusters, because their architecture is great for simultaneously aligning lots of different DNA sequences...I suppose forecasting gleans similar benefits. In what cases would a cluster be an inappropriate and/or inefficient solution to a massive computational problem? When would you have to use a Cray or other big monolithic vector rig?
  • Your instinct is correct. I believe the discussion you are looking for is supercomputing clusters (e.g., Beowolf, etc...) versus shared memory (e.g., SGI/Cray, Sun, etc...) systems. I have a similiar (albeit longer) post [slashdot.org] and question set above. Check it out if you like.

    -- Bryan "TheBS" Smith

  • How will you deal with network latency and traffic? You will have to use at least gigabit ethernet and the associated equipment. Copper or fiber?
  • honestly, lighten up

    take the pickle outta yer *ss and laugh a little, might actually bring some color to your face.

    moderate the parent to funny! certainly not offtopic......

  • If there is one thing that I would like to know, and if anyone has any information regarding this:

    Can anyone please post information on HOWTO set up your own 2-computer cluster.

    Also : What are the programming considerations, as well as pitfalls and suggestions in setting it up.

    I would really like to get it going experimentally...

    Thanks in advance,
  • Congratulations on NOAA, BTW. As a former UVA CS student, it's nice to see your work with Legion and Beowulf systems continue to succeed. For people outside of the clustering community, take a look at http://legion.virginia.edu [virginia.edu].

    Recently I have been seeing the beginnings of business adoption of Beowulf-style systems, as they are finally realizing the benefits which the scientific community has been enjoying for years ;). Up to now, however, most of the tools for Beowulf work, such as schedulers, message passing APIs, administrative tools, and file systems, have been geared towards scientific problems, often lacking such features as fault tolerance or security. Has there been an anti-business bias within the Beowulf community? And, if so, what do you think will be needed to change it?

    And, as an unrelated question, if you could see one advance in Beowulf technology happen tomorrow, what would it be?

  • I work down the hall and walk by your huge computer room every day. Can I have a job?
  • The more nodes you have, the more likely it is that a node is going to crash during a computation. This yields a maximum cluster size for parallel applications which aren't fault tolerant. What are you doing to address this?

    In particular, on the cluster reliability side, what are you doing to maximize node MTBF? What are you doing to minimize node downtime? How long does it take you to diagnose & replace/bring up a crashed node? How many nodes are currently down? How long can the cluster be expected to run before a node crashes?

    On the application side, with MPI one needs to use complicated event loops with timeouts and/or are-you-there? pings to detect node outages. This makes doing parallel fault tolerant programming very complex. Are you doing anything to reduce this complexity? For example, what about a parallel programming API with fault tolerance built in? What about using PVM, where you can at least request notification of program crashes & computer outages. Have you explored other possibilities?

    Finally, although a year ago alpha price/performance #s were much better than intel's, today intel's #s are so much better that they remain higher than the alpha's even when factoring in the large fixed cost of the internal cluster high speed networking. Also, linux tends to be more stable on intel than on alpha. Not to mention the question of how much longer alphas will continue to be produced. Given these issues, how happy are you with the choice of using alphas instead of intels?

    Regarding linux/alpha robustness, what are you doing to make the alphas behave more robustly? What Linux distro, distro version, kernel version, libs versions, etc. are you using? What kernel patches, system patches, internally developed tweaks/patches are you using to make the alpha systems more robust?
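
The scaling argument that opens the post above (more nodes means a node crash during a run becomes more likely) can be put into rough numbers. This is only a sketch: it assumes node failures are independent and exponentially distributed, which the post does not state, and the MTBF value, node counts, and job length are hypothetical illustrations, not NOAA figures.

```python
import math


def cluster_mtbf_hours(node_mtbf_hours: float, nodes: int) -> float:
    """Mean time between failures of the whole cluster, assuming
    independent, exponentially distributed node failures and a job
    that dies if any single node dies."""
    return node_mtbf_hours / nodes


def job_survival_probability(node_mtbf_hours: float, nodes: int,
                             job_hours: float) -> float:
    """Probability that no node fails during a job of the given length,
    under the same assumptions as cluster_mtbf_hours."""
    return math.exp(-job_hours * nodes / node_mtbf_hours)


if __name__ == "__main__":
    node_mtbf = 10_000.0  # hypothetical per-node MTBF, in hours
    job_length = 12.0     # hypothetical job length, in hours
    for n in (16, 256, 1024):
        print(n,
              cluster_mtbf_hours(node_mtbf, n),
              job_survival_probability(node_mtbf, n, job_length))
```

Under these assumptions the cluster-wide MTBF shrinks in proportion to the node count, which is why a non-fault-tolerant job has a practical upper limit on cluster size unless it checkpoints or otherwise tolerates node loss.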
  • It seems that people are confusing, or more precisely, interchanging the terms supercomputer and cluster recently. The link to NOAA's public affairs page [noaa.gov] mentions cluster only once, but supercomputer is mentioned six times. Are clusters really becoming powerful enough to be classified as supercomputers yet?

    thanks, kristau

  • When dealing with something that isn't time sensitive (i.e. SETI or distributed.net), I would imagine things wouldn't be hampered when a user's machine is connected to the distributed network or disconnected. Dealing with weather data is a completely different matter. In the development process for a model, one of the biggest factors is how long it takes for processing of the data then the post processing. If a set number of users were on at one time, then missing the next, those numbers would be fubar. Add to that how fast the atmosphere changes, on the order of hours, would make this a problem. It may not sound like a big deal, but in the field, things are extremely time dependent. Just ask a meteorologist how the fire at NCEP that took down a CRAY last year impacted their forecasting capability. Being pretty green about all of this, maybe I am off, but from what I have observed in the last five years, making sure the cpu cycles are there is very important.

    Bryan R.
  • Are you teaching La Volta at Pennsic this year? Enthusiastic Carolingians want to know!

    Ayden
  • anyone interested in developing MMORPG games to run on linux clusters?

    Details, details - your first post should have been a description of what you built, how you built it, what it will do and how you can make it easier for me when I build mine.

    Is all your work going to be open source? Will you release your customizations to us?

    I'll take my answer off the air?

  • by vvulfe ( 156725 ) on Tuesday May 23, 2000 @06:09AM (#1053794)
    Before deciding on a Beowulf cluster, what different options did you explore (Cray? IBM?), and what motivated you to choose the Beowulf system?

    Additionally, to what would you compare the system that you are planning to build, as far as computing power is concerned?

    Thanks,
    VVulfe
  • Do you use MPI? Do you find it restrictive in any way? Have you ever thought about using your own message passing routines?

    ttyl
    tpb
  • by technos ( 73414 ) on Tuesday May 23, 2000 @06:12AM (#1053796) Homepage Journal
    Having built a few small ones, I got to know quite a bit about Linux clusters, and about programming for them. Therefore, this question has nothing to do with clusters.

    What was the biggest 'WTF was I thinking' on this project? I'd imagine there was a fair amount of lateral space allowed to the designers, and freedom to design also means freedom to screw up.
  • by (void*) ( 113680 ) on Tuesday May 23, 2000 @06:13AM (#1053797)
    ... a beowulf of these babies - oh wait! :-)

    Seriously, what was the most challenging maintenance task you had to undertake? Do you anticipate a trade-off point where the number of machines makes maintenance impossible? Do you have any pearls of wisdom for those of us just involved in the initial design of such clusters, so that maintaining them in the future is less painful?

  • I've noticed that most Beowulf clusters (although admittedly, I don't know about the NOAA one) tend to use standard desktop mini tower cases. Is there any particular reason for doing so, as opposed to going for rackmount servers? I'd expect the latter to provide a much more space-efficient system, and in my experience, rack-mounted cases tend to have better cooling than their desktop equivalents. This should be particularly noticeable when using large numbers of machines in close proximity. Is it purely a cost issue?
  • What would you say makes a cluster a /Beowulf cluster/, as opposed to just a bunch of computers running linux linked by ethernet? What is it that differentiates the Beowulf cluster from the rest?
  • by BRock97 ( 17460 ) on Tuesday May 23, 2000 @06:35AM (#1053800) Homepage
    First off, from what I have gathered, it was not clear if your background is in weather or not, so I am hoping it is. Here are a couple of questions:

    1) Having just graduated with a BS in Atmospheric Sciences, I have had a chance to take numerical weather prediction courses over the last five years. With this new influx of processing power, where do you see numerical models going in the future?

    2) Somewhat related to 1), with mesoscale models becoming more popular (MM5 quickly springs to mind), where do you see the balance of processor time going in these models: the ability to get a model out faster, or computing more variables to provide a more accurate forecast at the smaller scale?

    3) Not knowing too much about the origins of these models, I was interested to find that a person could get the source to the MM5 and modify it as they see fit. Will models developed in the future follow this same trend? With powerful computers becoming affordable, it would not be that difficult for a university to build one and run a particular model for their area (I believe that Ohio State is doing it, again, with the MM5)?

    Thanks!

    Bryan R.
  • Yeah, Greg Lindahl is a pretty brilliant guy. . .

    ...

    ...


    Wouldn't it be great to get a beowulf cluster of him?
  • I've got to ask: Why GNU/Linux ?

    I know you couldn't call it a Beowulf if you chose otherwise, but there's got to be more to it than that (I guess). What other OS'es did you consider, and why did you pick GNU/Linux ?
  • Can you play Quake on it? If so what kind of framerate are we looking at... What?
  • A lot of the questions people are asking, (and some questions I'm sure your bosses asked,) are really a variation on the same theme: "you could have chosen a commercial supercomputer, or brand X proprietary clustering product, but you didn't - why?"

    Whatever your answer, I think it's fair to say that there is something about this system, which uses an open-source clustering technology, built on top of an open-source operating system, which made it best for your needs; maybe it was the reliability, or the ability to modify it as needed, or maybe just the lower dollar cost to your department.

    My question then, is this: have you given any thought to how you can help advance open source software, to give back to the community that created this tool? Getting the word out that the U.S. Government uses Linux for its cutting-edge weather forecasting tool would be an enormous PR win for the folks that still have trouble convincing their management that OSS software can be trusted for "real work." I'm not suggesting putting a picture of 'Tux' on every weather forecast, (although that would be kinda cute,) but it would be great if NOAA press releases [noaa.gov] about the project gave at least passing mention to the fact that the project will be benefitting from open source software.

    I realize this is not something you would normally do for, say a Cray or IBM, but those are commercial enterprises, with their own PR budgets; they don't need your help to get their word out. OSS needs all the help it can get, so that future projects like yours can continue to reap the benefits.

  • The original AltaVista was a big rackmounted Alpha cluster. It's all on open-frame racks bolted to a concrete floor with overhead cable trays, installed in an old telephone company building in Palo Alto. This is a telephone central office type setup, and it works well. Large server farms are dominated by cable-management problems, and that's the real driver on big cluster layout. The head of Inktomi once said that at one time their biggest reliability problem was people knocking power plugs out of machines while working around them.
  • As I understand it, MPI/PVM (which most [all?] Beowulf clusters consist of) require special programming techniques. That is, you can't just take your number crunching app, put it on the cluster and type "make" for it to work.

    So who do you have doing the programming for this thing? Did they take special training, or is it easy to pick up for any programmer?

    Finally, given the possible difficulty (and speciality) of using the above, has anyone considered using DIPC?
    --
    Have Exchange users? Want to run Linux? Can't afford OpenMail?
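
The point in the post above, that a number-crunching app has to be restructured around explicit messages before a cluster can help, can be illustrated with a toy sketch. It is not MPI or PVM: threads and `queue.Queue` objects stand in for ranks and send/receive calls, and the names `worker` and `parallel_sum_of_squares` are made up for this example.

```python
import queue
import threading


def worker(rank, inbox, outbox):
    """One 'rank': receive a chunk of work, compute on it, send the result."""
    chunk = inbox.get()                     # stand-in for a receive call
    partial = sum(x * x for x in chunk)     # the local number crunching
    outbox.put((rank, partial))             # stand-in for a send call


def parallel_sum_of_squares(data, workers=4):
    """Scatter data across workers, then gather and reduce partial sums."""
    inboxes = [queue.Queue() for _ in range(workers)]
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(rank, inboxes[rank], results))
        for rank in range(workers)
    ]
    for t in threads:
        t.start()

    # Scatter: each rank gets every workers-th element, starting at its rank.
    for rank in range(workers):
        inboxes[rank].put(data[rank::workers])

    # Gather: collect one partial result per rank and reduce them.
    total = 0
    for _ in range(workers):
        _, partial = results.get()
        total += partial

    for t in threads:
        t.join()
    return total


if __name__ == "__main__":
    print(parallel_sum_of_squares(list(range(1000)), workers=8))
```

The serial version of this is a single line; the message-passing version needs an explicit decomposition, a scatter step, and a gather step, which is the kind of rewrite the post is asking about.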
  • by Legolas-Greenleaf ( 181449 ) on Tuesday May 23, 2000 @06:14AM (#1053807)
    A major problem with using a Beowulf cluster over a single supercomputer is that you now have to administer many computers instead of just one. Additionally, if something is failing/misbehaving/etc., you have to determine which part of the cluster is doing it. I'm interested in a] how much of a problem this is over a traditional single-machine supercomputer, b] why you chose the Beowulf over a single machine considering this factor, and c] how you'll keep this problem to a minimum.

    Besides that, best of luck, and I can't wait to see the final product. ;^)
    -legolas

    i've looked at love from both sides now. from win and lose, and still somehow...

  • by crow ( 16139 ) on Tuesday May 23, 2000 @06:14AM (#1053808) Homepage Journal
    Why did you choose Alpha processors for the individual nodes? Why not something cheaper with more nodes, or something more expensive with fewer nodes? What other configurations did you consider, and why weren't they as good?
  • Now that advanced clustering systems (like Beowulf and Myrinet) have made super-computers affordable to almost any organization, how long until you see super-computer applications being off the shelf, and every midsize corporation having a super-computer in their server room?

    Also, a related question is how much of a role do you think the free software community will continue to play in advancing super-computing towards the masses?
  • For writing apps will you guys be using PVM, MPI, or something else? Why did you choose that toolkit?
