Supercomputer Becomes Massive Router For Global Radio Telescope
Nerval's Lobster writes "Astrophysicists at MIT and the Pawsey supercomputing center in Western Australia have discovered a whole new role for supercomputers working on big-data science projects: They've figured out how to turn a supercomputer into a router. (Make that a really, really big router.) The supercomputer in this case is a Cray Cascade system with a top performance of 0.3 petaflops — to be expanded to 1.2 petaflops in 2014 — running on a combination of Intel Ivy Bridge, Haswell and MIC processors. The machine, which is still being installed at the Pawsey Centre in Kensington, Western Australia and isn't scheduled to become operational until later this summer, had to go to work early after researchers switched on the world's most sensitive radio telescope June 9. The Murchison Widefield Array is a 2,000-antenna radio telescope located at the Murchison Radio-astronomy Observatory (MRO) in Western Australia, built with the backing of universities in the U.S., Australia, India and New Zealand. Though it is the most powerful radio telescope in the world right now, it is only one-third of the Square Kilometer Array — a spread of low-frequency antennas that will be spread across a kilometer of territory in Australia and Southern Africa. It will be 50 times as sensitive as any other radio telescope and 10,000 times as quick to survey a patch of sky. By comparison, the Murchison Widefield Array is a tiny little thing stuck out as far in the middle of nowhere as Australian authorities could find to keep it as far away from terrestrial interference as possible. Tiny or not, the MWA can look farther into the past of the universe than any other human instrument to date. What it has found so far is data — lots and lots of data. 
More than 400 megabytes of data per second come from the array to the Murchison observatory, before being streamed across 500 miles of Australia's National Broadband Network to the Pawsey Centre, which gets rid of most of it as quickly as possible."
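For scale, here is a rough back-of-the-envelope sketch of what that rate works out to. It assumes "400 megabytes" means decimal megabytes and that the rate is sustained around the clock; the summary states neither, and the function names are made up for illustration.

```python
# Assumption: 1 megabyte = 10**6 bytes, and the stream never pauses.
RATE_BYTES_PER_S = 400 * 10**6

def gigabits_per_second(rate_bytes_per_s):
    """Convert a byte rate into a link speed in Gbit/s."""
    return rate_bytes_per_s * 8 / 10**9

def terabytes_per_day(rate_bytes_per_s):
    """Total volume accumulated over 24 hours, in decimal terabytes."""
    return rate_bytes_per_s * 86_400 / 10**12

print(gigabits_per_second(RATE_BYTES_PER_S))  # link speed needed, Gbit/s
print(terabytes_per_day(RATE_BYTES_PER_S))    # volume per day, TB
```

Under those assumptions the stream needs a bit over 3 Gbit/s of link capacity and piles up roughly 35 TB per day if nothing is discarded.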
Raijin assists with other big data tasks. (Score:3)
400 Mb per seconds (Score:2)
Re: (Score:2, Informative)
Most of it is noise you can throw away quickly. After that point it gets more and more difficult to choose, so you need to balance processing + storage + bandwidth.
CERN ran into similar problems, but at least they had part of the science done on-site. (A week in Geneva is way better than a week in the middle of the fucking desert.)
Space people have kind of the opposite problem, since they have very limited on-site storage/processing power and limitations in bandwidth/telemetry, and they can't just dump more computers
Re:400 Mb per seconds (Score:5, Informative)
Most of it is noise you can throw away quickly.
In the case of the Square Kilometer Array (named for its total collection area by the way,
not because it is "spread across a kilometer of territory", whatever that's supposed to mean),
none of it is noise.
The SKA relies heavily on processing everything, using advanced phased-array
and other "inverse beam-forming" techniques to look at multiple targets in multiple
frequency ranges at once (the final design will have continuous coverage from
70 MHz to 30 GHz!).
This is only possible with centralised processing, so none of the antenna sites can throw
anything away: They don't know what will be important.
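To illustrate why the raw per-antenna data has to be kept, here is a minimal narrowband delay-and-sum sketch. It assumes a toy uniform line array with half-wavelength spacing and a single source; this is not the actual SKA or MWA pipeline, and every name in it is hypothetical.

```python
import cmath
import math

def steering_phase(n, spacing_wl, angle_deg):
    """Phase of a plane wave from angle_deg at element n of a uniform line
    array whose elements are spacing_wl wavelengths apart."""
    return 2 * math.pi * spacing_wl * n * math.sin(math.radians(angle_deg))

def simulate_samples(num_elements, spacing_wl, source_angle_deg):
    """One complex voltage sample per antenna for a unit-amplitude source."""
    return [cmath.exp(1j * steering_phase(n, spacing_wl, source_angle_deg))
            for n in range(num_elements)]

def beam_power(samples, spacing_wl, look_angle_deg):
    """Phase-align the stored samples toward look_angle_deg, sum, return power."""
    total = sum(s * cmath.exp(-1j * steering_phase(n, spacing_wl, look_angle_deg))
                for n, s in enumerate(samples))
    return abs(total / len(samples)) ** 2

# The same stored samples can be "pointed" anywhere after the fact.
raw = simulate_samples(num_elements=16, spacing_wl=0.5, source_angle_deg=20.0)
for angle in (-40.0, 0.0, 20.0, 45.0):
    print(angle, round(beam_power(raw, 0.5, angle), 3))
```

The power peaks when the look angle matches the source angle. Once the samples have been summed for one direction, the other directions cannot be recovered, which is why the combining has to happen centrally on the full data.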
Re: (Score:2)
>This is only possible with centralised processing, so none of the antenna sites can throw
anything away: They don't know what will be important.
Even more than that, *all* of it is potentially important. As I understand it, phased arrays pretty much require the whole signal from all the antennas to get the benefit of having the antennas at all; it's not until *after* the signals are combined and processed that you can weed out the data you're not interested in. In fact based on 20s of reading on phased a
Re: (Score:2)
Traditionally, phased array was done by feeding the raw signals to a central point, where the "processing" is analog (circuits, not algorithms). The output is a single signal that contains the desired "image", which then goes into an A/D. At any time you can only look at the data in one way, since the raw data (the data from each individual antenna) is not captured.
That works great for a radar on a ship where the antennas are all next to each other and where you can just rapidly steer back and f
Re: (Score:2)
" before being streamed across 500 miles of Australia's National Broadband Network to the Pawsey Centre, which gets rid of most of it as quickly as possible."
I imagine a bunch of Indian and Chinese people pressing Shift+Delete randomly on files. Their target: 90% resolution rate on incoming data :)
Re: (Score:2)
As a matter of fact...
OK, I may be too pedantic, but a 747 full of DVDs is just large storage, wildly different from large bandwidth.
When you stream something, the data is immediately ready for processing as it comes (provided it's structured with that goal in mind). On the other hand, a 747 full of DVDs is data that must be read before it's ready for processing, and the average DVD read speed is more or less 100 Mbps, maybe a bit more than that but not by much. Throw time spent writing those DVDs into the
Re: (Score:2)
So use SSDs instead. The point being, though, that I can get maybe 1 GB/s with a high-speed data link, or umpteen PB/s with a truck full of storage media (I first heard the maxim as "...a station wagon full of floppy disks").
As for bandwidth reading the data, sure you'd need a lot of connections to get anywhere near that. Heck, you'd need a lot of computers to process data that quickly - a single PC with dual channel DDR2-800 RAM has a maximum data throughput (no processing, just reading it from memory) of o
Re: (Score:2)
It's not. It's really not much at all. For $150k, you could build a Hadoop cluster that would happily accept the data stream, process it, and make it available for consumption. If you just want to store it, you don't even need that much.
That's a waste of a Cray. Well, a Cray is a waste of money these days anyway.
Re: (Score:2)
Not really. The real-time components (aka correlation) are basically just straight-up FFTs. Custom hardware in correlators might make sense (and probably does at scale), but through ASICs or FPGAs. They're not doing that (...yet). Throw a GPU or two into each node, and you'd get far more FLOPS than you would with a Cray. This work is mostly embarrassingly parallel, so throwing money into Crays is a total waste of time.
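As a rough sketch of what "correlation is basically FFTs" means, here is a toy frequency-domain cross-correlation of two antenna streams. It uses a naive O(n²) DFT rather than a real FFT so that it stays self-contained, the signals are synthetic, and all names are hypothetical; it is not the MWA correlator.

```python
import cmath
import random

def dft(x):
    """Naive discrete Fourier transform (stand-in for an FFT)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Naive inverse discrete Fourier transform."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def circular_cross_correlation(a, b):
    """r[lag] = sum_t conj(a[t]) * b[(t + lag) % n], via conj(A[k]) * B[k]."""
    A, B = dft(a), dft(b)
    return idft([Ak.conjugate() * Bk for Ak, Bk in zip(A, B)])

def peak_lag(a, b):
    """Lag at which the two streams line up best."""
    r = circular_cross_correlation(a, b)
    return max(range(len(r)), key=lambda lag: abs(r[lag]))

rng = random.Random(0)                      # fixed seed: synthetic "sky noise"
n, true_delay = 64, 5
antenna_a = [rng.gauss(0.0, 1.0) for _ in range(n)]
antenna_b = [antenna_a[(t - true_delay) % n] for t in range(n)]  # delayed copy
print(peak_lag(antenna_a, antenna_b))       # recovers the 5-sample delay
```

Each antenna pair and each frequency channel can be handled independently, which is the sense in which the work is embarrassingly parallel.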
What a bad summary. (Score:2, Insightful)
A lot of waffling that tells me nothing about the premise. Why did they do it, why did they need to, what made that thing uniquely suitable so nothing else would do?
HEY EDITORS. DO YOUR JOB ALREADY, DAMMIT. STOP WASTING MY TIME.
400MB/s (Score:5, Funny)
More than 400 megabytes of data per second come from the array to the Murchison observatory, before being streamed across 500 miles of Australia's National Broadband Network to the Pawsey Centre
They forgot to mention the step where the 400 MB go to the NSA to be checked for signs of extra terrestrial terrorism.
Two, actually! (Score:1)
Well, I knew someone on this planet actually needed gigabit Internet if we looked hard enough.
Summer? (Score:5, Informative)
Later "this summer" doesn't start until December.
500 miles
For those of us who don't use archaic measurements, it's 800 km from the city of Perth, which makes it 800 km from the closest city. If anyone is interested, here's the Google Maps link [google.com.au] and its distance to Perth, Western Australia [google.com.au]. There's literally nothing out there; picking up an AM radio station is difficult, making it the perfect place for a telescope.
If you truly want to get lost, you need to go somewhere like Murchison; no-one will find you. Of course just about everything there is trying to kill you, from King Brown snakes to Land Sharks and Koala Drop Bears.
Re:Summer? (Score:4, Informative)
I live in Western Australia and it's winter here.
I live in South Australia, and it's winter here, too.
Later "this summer" doesn't start until December.
I would say it does, because using seasons as a unit of time is a distinctly Northern hemisphere convention. In my observation, Americans and Canadians are the main users of it (more than the British).
I often get confused talking to an American when they talk about doing something "in the summer", and it's not so much that they have a different summer, but that I'm not used to measuring time like this. (We only use it for things that are specifically related to the weather, such as sports).
In Australia we wouldn't say "later this winter", we'd just say "around August/September".
Re: (Score:2)
Yeah, "product scheduled for release this autumn", wtf does that mean?
Still, the USA uses Imperial measurements so it's not exactly hip to, you know, measurements that people can actually understand.
Re: (Score:2)
the USA uses Imperial measurements so it's not exactly hip to, you know, measurements that people can actually understand.
Don't they teach arithmetic in your country?
Re: (Score:1)
..Ok fine, so it did hit 8 degrees this morning, but it's 18-20 during the day which would be considered a warm spring day for some parts of the US.
Re: (Score:1)
18-20 isn't a warm spring day. Hell, I consider anything less than 25-35 to be a cold day during the spring, as we routinely hit 40-45 during this time of year.
Of course, if we even hit 8 during the period Dec-March we're suffering a heat wave, as it's usually closer to -8 here during that period, and the funny thing is, I'm only 300 km from Los Angeles.
Re: (Score:1)
Not true. We, Canadians, use the following seasonal measurements to indicate the time of the year: Almost winter, winter, still winter, and construction.
Re: (Score:2)
Not true. We, Canadians, use the following seasonal measurements to indicate the time of the year: Almost winter, winter, still winter, and construction.
On the plus side there is very little risk of heat stroke.
Re: (Score:2)
I live in Western Australia and it's winter here.
It's currently the middle of the night in Perth, and still 12C. Tomorrow's high is forecast to be 20C. That is not winter.
Re: (Score:2)
I live in Western Australia and it's winter here.
It's currently the middle of the night in Perth, and still 12C. Tomorrow's high is forecast to be 20C. That is not winter.
Yes it is.
Summer in Perth is 40 Deg C.
I would use this for (Score:1)
Misleading summary and first article (Score:5, Informative)
The Square Kilometer Array will have a *collecting area* of one square kilometer. That means that if you add up the area of all the detectors, you get one square kilometer. Since there is some distance between each detector, the SKA will cover a ground area *much* larger than a square kilometer.
Part of the SKA will be built in the MRO area in Australia. But it is far from finished: construction won't begin in earnest until 2016, I think. So the most powerful radio telescope in the world is not at MRO now. It is LOFAR in Europe.
Re:Misleading summary and first article (Score:5, Informative)
The article also glosses over the fact that there are different telescopes for different parts of the radio spectrum. The MWA and LOFAR are the most powerful in the MHz regime, but the VLA is still the most powerful between 1 and 50 GHz, and ALMA is the most powerful from 85 to 700 GHz.
Re:Misleading summary and first article (Score:5, Informative)
Right. And then there are the issues of resolution and survey area. Planck covers the same frequency range as ALMA, but measures the whole sky in total intensity and polarization, for example, and is much better at measuring the CMB than ALMA. So the term "powerful" is an over-simplification.
Petaflops (Score:2)
Well it sure can do a lot of floating point operations per second; how does that help for networking applications exactly?
Re: (Score:2)
Ditto. Also, in many "big data" projects, FLOPS is of little use anyway. There is a ton of textual processing and predicate matches to be done in the rest of the world. With ARM entering the HPC space, hopefully more broadly meaningful integer & IO ops numbers will be bandied about rather than just this laser-focus on vector floats.
Getting rid of data? (Score:2)
From the article:
Get rid of data? Don't you mean routing the data to its destination? And you would hope the Pawsey Centre actually DID something with the data rather than just getting rid of it.
Routing? (Score:3)
Re: (Score:3)
Good ol' Wikipedia has a decent description of the overall system: http://en.wikipedia.org/wiki/Murchison_Widefield_Array [wikipedia.org]
My educated guess is that describing it as a router is ridiculous. It's more like intelligently combining the M incoming data streams (beam forming) so that the data can be shipped at a lower bandwidth to N universities (each of which may be using a different combination of incoming data and hence looking at a different beam).
One of the nice things about phased array (electronically steered) a
Keep an axe handy.... (Score:1)
Not the National Broadband Network (Score:1)