Terascale Computing System Installed
lysie writes
"The Pittsburgh Supercomputing Center, with Compaq and the NSF, has installed the Terascale Computing System. Worldwide, it's second in power only to ASCI White at Livermore. However, it's the most powerful system in the world for unclassified research--6 teraflops per second. 3,000 Compaq Alpha EV68 microprocessors, in 750 four-processor AlphaServer systems running Tru64 UNIX."
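For a sense of scale, the figures in the summary can be checked with a few lines of arithmetic. This is only a sketch: it assumes the 6 teraflops figure is an aggregate peak spread evenly across the processors, which the summary doesn't actually state.

```python
# Back-of-the-envelope check of the figures quoted in the summary.
# Assumption: the 6 teraflops is an aggregate peak, shared evenly by all CPUs.

NODES = 750                 # four-processor AlphaServer systems
CPUS_PER_NODE = 4
TOTAL_FLOPS = 6e12          # 6 teraflops = 6 trillion operations per second

total_cpus = NODES * CPUS_PER_NODE
flops_per_cpu = TOTAL_FLOPS / total_cpus

print(f"processors: {total_cpus}")
print(f"per-processor share: {flops_per_cpu / 1e9:.1f} gigaflops")
```

The node count and the processor count in the summary agree with each other, and under the even-split assumption each processor accounts for a couple of gigaflops.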
Re:Where's Linux? (Score:1)
Re:Where's Linux? (Score:5, Informative)
Obviously you've never used Digital Unix, and you are not familiar with their kick ass, highly optimizing compilers... they ain't gonna build a cluster like that to run apache+mod_php and serve crap you know, it's all about number crunching.
Not to mention (Score:1)
Re:Where's Linux? (Score:2)
It seems like there is still a small performance delta in my experimental results (on our own Alphas, not on their cluster!), in favor of Tru64. But I can't be sure about this, and the delta wasn't large.
Thus my conclusion is that 1) they liked something about Tru64 besides the compilers, and/or 2) Compaq liked having their name on the OS running the cluster, and gave them a good deal.
Somebody else talked about support from Compaq. I expect the good folks at PSC know nearly as much about these machines as the Compaq engineers do. Furthermore, linux is a supported OS for these machines taken singly, though I didn't see "750-member ES40 cluster" in the support options...
-Paul Komarek
Re:Where's Linux? (Score:2)
OW! Stop throwing things at me!
A chance to run Windows now (Score:1, Funny)
I was gonna say it... (Score:2, Insightful)
teraflops per second ? (Score:4, Funny)
a teraflop per second would be an acceleration in processing power... not what the article means I guess
Re:teraflops per second ? (Score:1)
My new desktop goes 0-60 megaflops in 0.0134 seconds! How's that for speed!
Re:teraflops per second ? (Score:1)
Re:teraflops per second ? (Score:1)
Re:Top speed? (Score:1)
I thought the Alpha was all but dead... (Score:3)
Re:I thought the Alpha was all but dead... (Score:2)
teraflops (Score:1)
wow...6 trillion floating point operations per second per second.
Re:teraflops (Score:1)
floating-point operations per second per second? (Score:2, Funny)
Re:floating-point operations per second per second (Score:1)
Scales like a real UNIX should (Score:4, Insightful)
There will probably be a lot of people here asking "why isn't this running Linux?", without really knowing what they're talking about. First of all, Linux just doesn't have the kind of scalability that a commercial UNIX, particularly Tru64, does. Secondly, Tru64 is quite well-known for its excellent clustering capabilities, and its tight integration with the Alpha platform leads to high efficiency in computing. Finally, when you are paying $43 million for a supercomputer, you most certainly are going to be running the best software out there too, and frankly, the only reason that people out there are writing free software is that no one would want to pay for their code.
When you pay for the cost of commercial UNIX systems, you are paying for the assurance that 1) you aren't going to have stupid design flaws like the one the 2.4 kernel has in its inability to use virtual memory efficiently and 2) All of your nice new custom hardware is going to be supported, and frankly, high performance drivers for high-end hardware under Linux are sorely lacking.
Re:Scales like a real UNIX should (Score:3, Insightful)
the only reason that people out there are writing free software is that no one would want to pay for their code.
This is clearly not the only reason. There are a number of philosophical & practical reasons for free software. Furthermore, there are numerous examples of people who are paid to write free software (e.g. Linus, Alan Cox); and people who are paid to write proprietary code (i.e. they are good enough programmers that someone is willing to pay them) in their job, but also are involved in free software projects in their own time.
Re:Scales like a real UNIX should (Score:2, Insightful)
I'm no expert on Tru64 scalability, but this level of flamebait makes this post highly suspect to my mind. Would someone who both knows something on the subject and can manage to comment without bad-mouthing the competition care to say whether this post is really +3, Insightful?
Comment removed (Score:4, Interesting)
Re:Scales like a real UNIX should (Score:4, Informative)
Re:Scales like a real UNIX should (Score:1)
DUnix over Linux.
1) Compilers
The compaq compilers are available under
linux.
2) SMP scalability.
This possibly is a bit of a red herring, since the SMPs are comparatively small (4 processors).
Another issue that I didn't see mentioned is
page coloring, and the VM system in general (hi Greg!). Many people find that there is a 10-20%
improvement on numerical codes.
Another issue is the familiarity of the Compaq
service organization with Tru64 vs. Linux.
Since it doesn't really cost them anything
to include the licenses, and it is probably
good publicity for Compaq's big iron sales
(remember, this was before Compaq decided to
flush all of its HPTC customers down the toilet)
why not?
Re:Scales like a real UNIX should (Score:2)
Anyway, we run linux on our ES-40 Model II, and it works great. The machine is a dream to administer (I last thought about it several weeks ago =-). However, we're not doing any parallel processing on ours, and aren't likely to anytime soon. We bought it for expandability. The ES-40 Model II will hold 32 1GB PC100 ECC weird-physical-form-factor dimms, and up to 4 cpu cards. We like to think of the cpu multiplicity as "more machines with no extra administration". The memory is great, too -- we've got folks running 15 GB processes. Maybe the program didn't need to use that much memory; but the astronomers don't have to get fancy with their source code, and hence are saving a lot of time and effort.
You haven't lived (maybe died) until you've waited for a 4GB process to finish dumping core. Thank heavens I haven't been involved in any 15GB "accidents". =-)
I will say that our Tru64 machine (a Microway dual EV67, similar to a Compaq DS-20) has held up surprisingly well under hellish loads: load averages above 5 for a week are not rare, and I was able to recover from a load avg of 40 during an administrative accident. It will be interesting to see if linux does as well for us. However, I've got a lot of gripes about Tru64, and I'd prefer running linux even if we have to ask the users to play nicer.
-Paul Komarek
Re:Scales like a real UNIX should (Score:1)
Re:Scales like a real UNIX should (Score:1)
No doubt there are people out there who work on commercial compilers and sit around and laugh at gcc the way that HP-UX kernel engineer you talked about does.
Free software is good stuff for some things, but this idea that some people have about it being superior to everything in the commercial world is laughable.
Late reply... (Score:2)
Maybe your friend (and his friends) need to get off their butts, quit passing the bottle, and help make the Linux kernel what it really could be. Same thing goes for people who "laugh" at the inefficiencies of GCC.
Why is it that they have to sit around, drink beer, and laugh at code on a big screen? That sounds as pathetic as a bunch of beer-gut guys watching football, instead of out there playing it.
Contribute! That is what is needed.
However, I bet I know the reason why your friend can do nothing but laugh - he probably sold his soul and signed an NDA. Sucks to be him.
Re: (Score:2)
Re:Late reply... (Score:2)
I understand your argument about kernels for large systems vs. small systems, and how the needs of one may cause issues with the other, if implemented (or not implemented, depending on the direction). Perhaps in this case, the kernel needs to be forked into a parallel dev effort, one side for PCs, etc - the other dedicated to larger systems.
I don't believe I have blind devotion - I use what I feel is best for my abilities - right now this is Linux. I have given thought to BSD as well, and also wondered and prodded about on various niche OS projects - but Linux seems to be the most viable, in that I don't have to worry about my hardware becoming quickly obsolete because of an OS change, and I don't have to worry about not being able to find and run a piece of favorite software because the OS no longer supports it. BSD allows this too, and I like its slower "rev" cycles - whereas Linux seems to be a frothy mess, everything being updated all the time - but it isn't something I worry about much.
I just feel it is better to give back - because I know in the end others will give back to me. It has worked for me for a while now, in many areas of my life (not just the open source community). At the end of the day, I know what I have done to help has made a difference for others. Sometimes, I am even told that it has. This to me is better than any amount of money someone could give me for my work.
Re: (Score:2)
Re:Late reply... (Score:2)
What do I do for a living? I am a software developer for a Phoenix company. Our stuff isn't open source, probably never will be (but who knows?). I don't work 90 hours a week - my time is my time. I will put in extra hours when it is the right thing to do, but I won't do it just for the hell of it (ok, sometimes I do put in extra hours - you get into that "mode", where time just flows, and code is flowing - great state to be in).
I don't think being paid to do a job is selfish - but if that is all you do with your life - working, getting money, being paid, never giving back - yeah, that is selfish. One could state that he did give back - to his employer - by working a ton of hours, but he didn't give back to the community. That is selfish. He didn't just learn programming on his own. He had teachers. You know it, he knows it, and I know it. I give back because of all the people I learned from typing in code from magazines and books from when I was a kid. I give back because of the numerous examples I have found online about ways of doing things. I give back so someone else may learn from me, and teach others along the way.
But I do this on my own time. Not my employer's, my own. I do this because I love computers and software and coding - not because these things can potentially make me money - but because of the worlds they have opened up for me. The insights, the freedom, the knowledge, the friendships - all of it!!! These things are things I cherish - and I wish to give others the chance to share in the same ways and feelings I have trod and experienced.
It seems like today companies and people only want to make money, let no knowledge out, never truly give back. But should they succeed, they will simply be causing their own ultimate demise, for where will the new information creators come from? The schools? Perhaps - but what about those who don't go, or can't afford, higher education? Should they be denied these things? Should they not be able to program computers, render 3D graphics, or build their own OS, should they so desire?
The corps are saying "Yes! THEY MUST BE DENIED!" in their mad rush to censor and restrict the flow of all information - not just copyrighted information (which I don't have a problem with, were it not that copyright extends forever anymore - ie, Sonny Bono/Disney Copyright Act, DMCA, SSMCA, etc). They want to even stop libraries, the internet itself. They are killing themselves and don't even know it, nor care.
Don't cut out Open Source yet - it isn't over. The bubble burst because of bad investment decisions, investment by VCs who would take a business plan written on a barroom napkin. They were stupid, and arrogant. Many great projects have benefitted from the open source and GPL philosophies. Maybe they haven't made money yet because they haven't found the right business model. Maybe they won't find the right business model. But they should at least try, especially now in these more "sane" (and I could argue less sane as well) times.
As far as my beliefs in OSS - could I write an operating system as you say? No, I could not - not because I couldn't learn how to do it - there are plenty of ways to learn how, and I am sure I could learn it if I was so inclined. I don't believe I could do it, for the same reasons that Linus didn't create the entire Linux kernel himself today. He created and released, from the first release, a very basic kernel. Improvements by others rolled in, and he incorporated them steadily, along with his own improvements. The thing kept getting bigger, until it is at where we are today. Kernel creation and design by near anarchy, is what it is. So no, I couldn't do it myself. I would be willing to bet, though, with a proper plan and good software design, and the release of a simple kernel that followed that design, you might be able to amass enough people to continue with development. The problem would be getting enough people who have access to the same kind of large scale architecture, which may be where and why this kind of development would fall apart. Of course, if you could get a company to "donate" a large dev machine to work on, it could be done. I believe IBM (or someone) actually has done this. Whether they did it for altruistic reasons or not is another issue...
Re: (Score:2)
Re:Scales like a real UNIX should (Score:1)
Re:Scales like a real UNIX should (Score:2, Informative)
Bearing in mind that this machine was built by Compaq under contract I would find it inexplicable if the systems programmers on site did not have the source to tweak as required.
Re:Scales like a real UNIX should (Score:3, Informative)
"When you pay for the cost of commercial UNIX systems, you are paying for the assurance that 1) you aren't going to have stupid design flaws like the one the 2.4 kernel has in its inability to use virtual memory efficiently and 2) All of your nice new custom hardware is going to be supported"
Having administered a Tru64 4.0 and 5.0 box, I can't agree with your statements about "what you pay for" when buying commercial UNIX systems. We had to upgrade because Tru64 4.0E did not support more than 8 SCSI devices on a single chain. Why on earth did we have to pay $1000 to be able to support an old SCSI standard?
We would have moved to linux, except that we have a half-terabyte of ADVFS-formatted data -- i.e. our data is "held hostage" by a proprietary file system format. If all goes well, we'll soon have 700GB of linux-readable space with which we'll rescue our data and then reformat the original array.
Oh, and let's not forget the time (before I was the admin, thank goodness) the machine was crashed by facilities to stop it from relaying spam -- turns out Tru64 ships (or shipped) with an open mail relay. linux has flaws, but at least you get the flaws for free!
Now there's the reboot cycle it got into, which corrupted the filesystem. However, the disk check ran without errors, and there's nothing unusual in the logs. Tru64 has some great features, none of which we need. We're only using it because we have to. We only paid to upgrade it because we had to.
-Paul Komarek
...and then Microsoft jumps in (Score:1, Funny)
Leave the fastest computer alone!
Gates: You will run Windows, Resistance is Futile!
Now the following scenario is possible... (Score:1, Troll)
And God said
#You have not signed on yet.
#Enter user password.
#Password Incorrect. Try again!
#Password Incorrect. Try again!
#And God signed on 12:01 a.m., Sunday, March 1.
#Unrecognizable command. Try again!
#Done.
#And God created Day and Night. And God saw there were 0 errors.
#And God signed off at 12:02 a.m., Sunday, March 1.
#Approx. funds remaining: $92.50.
#And God signed on at 12:00 a.m., Monday, March 2.
#Unrecognizable command! Try again!
#Done.
#And God divided the waters. And God saw there were 0 errors.
#And God signed off at 12:01 a.m., Monday, March 2.
#Approx. funds remaining: $84.60.
#And God signed on at 12:00 a.m., Tuesday, March 3.
and let the dry land appear and
#Too many characters in string specification! Try again.
#Done!
#And God created Earth and Seas. And God saw there were 0 errors.
#And God signed off at 12:01 a.m., Tuesday, March 3.
#Approx. funds remaining: $65.00.
#And God signed on at 12:00 a.m., Wednesday, March 4.
#Unspecified type. Try again!
#And God created Sun, Moon, Stars. And God saw there were 0 errors.
#And God signed off at 12:01 a.m., Wednesday, March 4.
#Approx. funds remaining: $54.00.
#And God signed on at 12:00 a.m., Thursday, March 5.
#Done.
#Done.
#And God created the great seamonsters and every living creature
that creepeth wherewith the waters swarmed after its kind and
every winged fowl after its kind. 0 errors.
#And God signed off at 12:01 a.m., Thursday, March 5.
#Approx. funds remaining: $45.00.
#And God signed on at 12:00 a.m., Friday, March 6.
#Done.
#Done.
#Unspecified type! Try again.
#Done.
and have dominion over the fish of the sea and over the fowl
of the air and over every living thing that creepeth upon the
earth.
#Too many command operands! Try again.
#Execution terminated. 6 errors.
#O.K.
#Execution terminated. 5 errors.
#File Garden of Eden does not exist.
#Done.
#O.K.
#Execution terminated. 4 errors.
#O.K.
#Execution terminated. 3 errors.
#Illegal parameters. Try again!
#O.K.
#Execution terminated. 2 errors.
#Done.
#And God saw man'nwoman being fruitful and multiplying in the Gard.En.
#Warning: No time limit on this run. 1 errors.
#Done.
#And God saw man'nwoman being fruitful and multiplying in the Gard.En.
#Warning: No time limit on this run. 1 errors.
#Desire cannot be undone once freewill is created.
#Freewill is an inaccessible file and cannot be destroyed.
#Enter replacement, cancel, or ask for help.
#Desire cannot be undone once freewill is created.
#Freewill is an inaccessible file and cannot be destroyed.
#Enter replacement, cancel, or ask for help.
#And God saw man'nwoman being fruitful and multiplying in the Gard.En.
#Warning: No time limit on this run. 1 errors.
#Done.
#And God saw he had created shame.
#Warning: System error in sector E95. Man'nwoman not in Gard.En.
#1 errors.
#Man'nwoman cannot be located. Try again!
#Search failed.
#Shame cannot be deleted once evil has been activated.
#Freewill an inaccessible file and cannot be destroyed.
#Unrecognizable command. Try again.
#ATTENTION ALL USERS ATTENTION ALL USERS: COMPUTER GOING DOWN FOR
REGULAR DAY OF MAINTENANCE AND REST IN FIVE MINUTES. PLEASE
SIGN OFF.
#You have exceeded your allotted file space. You must destroy
old files before new ones can be created.
#Destroy earth. Please confirm.
#COMPUTER DOWN. COMPUTER DOWN. SERVICES WILL RESUME ON SUNDAY
MARCH 8 AT 6:00 A.M. YOU MUST SIGN OFF NOW!
#And God signed off at 11:59 p.m., Friday, March 6.
#And God saw he had zero funds remaining.
Supercomputers... (Score:4, Informative)
There was a fun story apparently about a slowdown that was due to _one_ RAM dimm not seated properly... So 2999 processors were doing their job, but then waiting for the last processor to finish its job, which was taking much longer...
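The effect in that story is easy to reproduce on paper. In a lock-step parallel job every processor has to reach the synchronization point before anyone moves on, so each step takes as long as the slowest processor. A toy model, with made-up timings (the 1.0 and 5.0 below are purely illustrative, not measurements from the PSC machine):

```python
def step_time(per_cpu_times):
    """Time for one lock-step iteration: everyone waits for the slowest CPU."""
    return max(per_cpu_times)

def efficiency(per_cpu_times):
    """Fraction of total CPU time spent computing rather than waiting."""
    busy = sum(per_cpu_times)
    total = step_time(per_cpu_times) * len(per_cpu_times)
    return busy / total

N_CPUS = 3000
healthy = [1.0] * N_CPUS                 # hypothetical: every CPU needs 1.0 s per step
one_slow = [1.0] * (N_CPUS - 1) + [5.0]  # hypothetical: one CPU is 5x slower

print(step_time(healthy), efficiency(healthy))
print(step_time(one_slow), round(efficiency(one_slow), 3))
```

With these invented numbers, one bad node out of 3,000 makes the whole job five times slower, and the other 2,999 processors spend about 80% of each step waiting.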
I've seen pictures of this beast. All I can say is: wow. So many cables, so many machines...
And apparently, they're not yet completely connected. Each box is supposed to have two connections to a "fat tree" Quadrics network. Well right now they only have one... But it seems that Linpack isn't so communication oriented, so it's not too big a strain on the network.
Maan
Cheer up! (Score:2)
So, hey, maybe we'll be bidding on this bad boy in a couple of years...
Terascale Computing System Installed (Score:2, Insightful)
Re: (Score:2)
Re:Terascale Computing System Installed (Score:2)
And I don't think Compaq sold the Alpha line because they couldn't make it work financially -- I think they sold it because they wanted to make the whole company look attractive to HP. Thank goodness we still have the PowerPC from IBM. That will hold for the next 10 years, at which time Intel might finally deliver a good IA64 implementation. We've been waiting, what, 4 years already? -- with no end in sight, either.
-Paul Komarek
Re:Terascale Computing System Installed (Score:2)
At least they've still got the iPAQ. Who knows what its future is, though. Maybe we'll get lucky, and Intel will bail on IA64 and just make Alphas. I can dream, can't I?
-Paul Komarek
{sniff} (Score:1)
After reading all the specs...wow. Nice work.
"What Norman and other astrophysicists would like to learn is why star formation is so inefficient."
Hell, I could have told you that...this is "mother nature" at work, not a computer sim.
"she" does things in her own sweet time...
And look at the cooling specs...yikes. It should have been built in Alaska...outside...in winter.
Moose.
(fingers crossed this gets posted to the right forum...not like last time. Prolly serves me right...Used windows + ie + relaunch after an "internal error" + posting to
Hey, I just noticed, maybe that is what I.E. stands for..."Internal Error"?
Science Editor :) (Score:1)
"SYSTEM CAPABILITY: The 3,000 processors can perform up to 6 teraflops, or 6 trillion calculations per second. Virtually every man, woman and child on earth would have to perform a calculation each second to keep pace..." hmm... with my PC maybe
Links to more info on TCS1 (Score:2, Informative)
Here are some links to more information:
"teraflops per second"? (Score:1)
Quake Comment (Score:1)
Re:Quake Comment (Score:1)
Some Science Editor... (Score:2)
Umm... no. Designers were battling against the speed of electrons through a cable, signal attenuation and noise. The speed of light is that limit on the gigaflops/sec acceleration or something.
woof.
I'm off to a bar only 18 kg from my apartment for a 9V glass of beer.
Re:Some Science Editor... (Score:2)
I think it takes about 3.3 nanoseconds for light to travel 1 meter. A 750MHz cpu does one cycle every 4/3 of a nanosecond. In 3.3 nanoseconds, a 750MHz cpu has gone through about 2.5 cycles. Because these EV68 chips are really beautiful superscalar processors, several instructions can be consumed and retired even in that short a time.
If the time-in-flight between nodes is highly asymmetric, I expect it would be difficult to reasonably schedule work across all the nodes. With some machines 1 meter away, and others 33 meters, the nearest machines could get a signal about 80 cycles before the furthest machines did.
Even without all the computation, you can see that the nearest machines could get 33 times more work done than the furthest, when waiting for synchronization signals. What the computations were meant to show was that a 750MHz cpu, and in particular an EV68 Alpha, can do meaningful work during this time.
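The arithmetic above can be sanity-checked in a few lines. This sketch uses the vacuum speed of light, which works out to roughly 3.3 nanoseconds per meter; signals in real cable travel slower, so treat the results as lower bounds. The 750MHz clock and the 1 m and 33 m distances are the figures from this post, not measured values for the PSC cluster.

```python
C = 299_792_458.0        # speed of light in vacuum, m/s
CLOCK_HZ = 750e6         # 750MHz cpu, as in the post

def flight_time_ns(distance_m):
    """One-way time of flight in nanoseconds, at vacuum light speed."""
    return distance_m / C * 1e9

def cycles_elapsed(distance_m):
    """CPU cycles that pass while a signal covers the distance."""
    return distance_m / C * CLOCK_HZ

near, far = 1.0, 33.0    # distances in meters, taken from the post
print(f"{flight_time_ns(near):.2f} ns per meter")
print(f"{cycles_elapsed(near):.1f} cycles at 1 m")
print(f"{cycles_elapsed(far) - cycles_elapsed(near):.0f} cycles head start")
```

Whatever the exact propagation speed, the ratio between the nearest and furthest machines depends only on the distances, so the 33-to-1 figure holds either way.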
Please take note that this kind of analysis is not in my field, and furthermore this post is the first time I've thought about it (though I am familiar with these chips, as I administer and use them daily). So please go easy on the flaming corrections!
-Paul Komarek
Re:Some Science Editor... (Score:2)
I'd think this fixed-overhead incurred for each synchronization would have quite an effect for computational loads that required a lot of synchronization. Thus a cpu with a shorter (dare I say "reasonable"?
It would be great if someone who knew what they were talking about corrected me, since I really don't know what I'm talking about. It would be really great if this person wasn't as lazy as me, and could compare actual pipeline lengths instead of dredging up old and unreliable memories.
-Paul Komarek
Re:Some Science Editor... (Score:1)
Practical Uses.. (Score:2)
I would love to see some of the particle simulations this puppy could crunch, we might even see some accurate weather simulations finally..
I just wish there was a way to get that kind of power available to the universities.. Could you imagine what grad students could do if they didn't have to write a 20 page thesis and description of what they would want to run on the system? Some of the best discoveries are the middle-of-the-night AHA! sessions that need to be run at that moment.
Re:Practical Uses.. (Score:1)
Re:Practical Uses.. (Score:2, Informative)
Re:Practical Uses.. (Score:2)
The PSC's networking group spun off into the National Center for Network Engineering (NCNE) when PSC proper lost its NSF funding (due to sheer administrative stupidity, IMO). The PSC provides the internet connection to CMU, Pitt, and Penn State, but they have no other real affiliation with Penn State.
They're pretty liberal about giving out research accounts, so long as you're somehow affiliated with one of their funding sources. Currently, this mostly means that you have to be an academic of some sort in Pennsylvania (including students) or you have to be doing networking research, particularly for Internet 2.
Wow! (Score:2)
Cool!. (Score:2)
"HOT!!!" is the reaction you'll get in 2 years from now when you'll see the same thing coming from Intel.
:)
Linux stability on an Alpha Cluster (Score:1)
1 Calc/sec. per Person on Earth? (Score:1)
or 6 trillion calculations per second. Virtually every man, woman and child
on earth would have to perform a calculation each second to keep pace.
Unless the population of the Earth has increased by 1,000 times recently, every person would have to perform a calculation every 1ms to keep pace.
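Spelled out, assuming a world population of roughly 6 billion (a round figure assumed here, not something from the article):

```python
CALCS_PER_SECOND = 6e12      # "6 trillion calculations per second"
WORLD_POPULATION = 6e9       # assumption: roughly 6 billion people

per_person_rate = CALCS_PER_SECOND / WORLD_POPULATION   # calculations per person per second
interval_ms = 1000.0 / per_person_rate                  # time allowed per calculation

print(f"{per_person_rate:.0f} calculations per person per second")
print(f"one calculation every {interval_ms:.0f} ms")
```

So the press release is off by a factor of about a thousand: one calculation per person per second would only add up to a few gigaflops.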
only won use (Score:1)
SETI SETI SETI SETI SETI SETI SETI SETI SETI SETI .....?
SETI SETI SETI SETI SETI SETI
SETI SETISETI SETI c where i am going with this
Re:I cannot see the point (Score:1)
And, they want to cluster this with other clusters....mmm....Beowulf....
Re:I cannot see the point (Score:2, Informative)
> Compaq wanted to use their own Linux distribution.
Err, Tru64 as it's called now, is no Linux distribution. It's Digital/Compaq's version of Unix for the Alpha architecture (or "AXP" if you prefer that). It had the names "Digital Unix" and "DEC OSF/1" before it was renamed to Tru64.
And please: Don't feed the trolls...