China Bumps US Out of First Place For Fastest Supercomputer
An anonymous reader writes "China's Tianhe-2 is the world's fastest supercomputer, according to the latest semiannual Top 500 list of the 500 most powerful computer systems in the world. Developed by China's National University of Defense Technology, the system appeared two years ahead of schedule and will be deployed at the National Supercomputer Center in Guangzhou, China, before the end of the year."
Re:Clueless (Score:3, Informative)
In all, Tianhe-2, which translates as Milky Way-2, operates at 33.86 petaflop per second
First of all, it's PetaFLOPS. The "S" is not a plural marker, so there is no "PetaFLOP". FLOPS = FLoating-point Operations Per Second, so saying "PetaFLOP per second" amounts to saying "Peta-FLoating-point Operations Per Second per second".
Overwhelmingly Linux (95%) (Score:4, Informative)
It's interesting to browse this website:
http://www.top500.org/ [top500.org]
And look at the Statistics section, such as Operating System Family
http://www.top500.org/statistics/list/ [top500.org]
Operating System Family   Count   System Share (%)   Rmax (GFlops)   Rpeak (GFlops)        Cores
Linux                       476             95.2      217,913,963     318,748,391     18,700,112
Unix                         16              3.2        3,949,373       4,923,380        181,120
Mixed                         4              0.8        1,184,521       1,420,492        417,792
Windows                       3              0.6          465,600         628,129         46,092
BSD Based                     1              0.2          122,400         131,072          1,280
Re:Supercomputers are pretty useless (Score:5, Informative)
People don't build supercomputers for no reason, especially when HPC eats up a large part of their budget.
The main application of supercomputers is numerically solving partial differential equations on large meshes. If you try that with a distributed setup, the latency will kill you: the processors have to talk constantly to exchange information across the domain.
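A toy sketch of that communication pattern (plain Python with simulated "ranks" rather than a real MPI job, purely illustrative): a 1-D diffusion solver split across four chunks, where every time step each chunk must fetch halo cells from its neighbours before it can update its own interior. On a real machine, each of those exchanges pays the interconnect latency, every single step.

```python
# Toy model of halo exchange: a 1-D heat/diffusion solver whose domain
# is split across 4 simulated "ranks". Each time step, every rank must
# obtain one ghost cell from each neighbour before updating its interior.

def step(chunks):
    new = []
    for r, u in enumerate(chunks):
        # Halo exchange: fetch one ghost cell from each neighbour
        # (fixed boundary value 0.0 at the domain edges).
        left = chunks[r - 1][-1] if r > 0 else 0.0
        right = chunks[r + 1][0] if r < len(chunks) - 1 else 0.0
        padded = [left] + u + [right]
        # Explicit diffusion update on the interior of the padded array.
        new.append([padded[i] + 0.25 * (padded[i - 1] - 2 * padded[i] + padded[i + 1])
                    for i in range(1, len(padded) - 1)])
    return new

# 16 cells split across 4 ranks, with a heat spike placed in rank 1.
chunks = [[0.0] * 4 for _ in range(4)]
chunks[1][0] = 1.0
for _ in range(10):
    chunks = step(chunks)
print(sum(sum(c) for c in chunks))  # heat has spread across rank boundaries
```

The point of the sketch: the exchange happens inside the inner loop, so no rank can proceed without hearing from its neighbours. On a wide-area distributed setup, each such exchange costs milliseconds instead of microseconds, which is exactly the latency problem described above.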
As someone pointed out, modern supercomputers are like distributed computing, often with commodity processors. They look like (and are) giant racks of processors. But they have very fast, low-latency interconnects.
Coincidence? (Score:3, Informative)
Re:Two years ahead of schedule? (Score:2, Informative)
rest of their lives breaking codes for the national spy agencies. Several of the top computers, like Kraken, Jaguar, and Titan, were/are NSA cryptography machines.
The NSA has its own computers; why would it need to use the rather publicly known ones and compete with other users for time? Do you assume those computers do only one kind of science because that's all you read about in the news and PR, or did you actually bother to look at the research papers and groups using these computers on a daily basis? I know people in research groups that use those computers. What they sometimes have to compete with is not the NSA but nuclear stewardship programs. Other than that, it is other science groups getting time and/or slices of the machine.
Re:Supercomputers are pretty useless (Score:5, Informative)
If you use a hypercube, the processors on the outside edges have no one to talk to. For a one-dimensional example, imagine a series of processors in a line, each with two communication links: one to its neighbour on the left and one to its neighbour on the right. This works fine for all the processors in the middle of the arrangement. However, in a one-dimensional straight-line arrangement, each processor on an end is missing either a left or a right neighbour. The solution is to connect the processors on the ends to each other, turning the line into a circle or ring.
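The wraparound fix described above is just modular arithmetic; a minimal sketch:

```python
# With modular arithmetic, the end processors wrap around,
# so every node in the ring has both a left and a right neighbour.
def ring_neighbors(rank, n):
    return (rank - 1) % n, (rank + 1) % n

# 8 processors in a ring: node 0's left neighbour is node 7.
print(ring_neighbors(0, 8))   # (7, 1)
print(ring_neighbors(7, 8))   # (6, 0)
```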
A one-dimensional hypercube is a line. In supercomputing, it is often desirable to avoid any topology where there is a flat (unconnected) surface on the side of the cube. Connecting the opposite edges of the cube to each other produces the torus topology in higher dimensions, and the ring topology in 1-D. For a picture of this effect, see the torus interconnect article on wikipedia [wikipedia.org].
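The same wraparound generalizes to higher dimensions; a sketch (illustrative, not any particular machine's routing): each node in a d-dimensional torus has 2*d neighbours, found by stepping +/-1 along each axis modulo that axis's length. A 1-D torus is just the ring.

```python
# Neighbours of a node in a d-dimensional torus: step +/-1 along each
# axis, wrapping around via modular arithmetic so no node is on an edge.
def torus_neighbors(coord, shape):
    neighbors = []
    for axis, size in enumerate(shape):
        for delta in (-1, 1):
            n = list(coord)
            n[axis] = (n[axis] + delta) % size
            neighbors.append(tuple(n))
    return neighbors

# Corner node of a 4x4 2-D torus: four neighbours, two of them "wrapped".
print(torus_neighbors((0, 0), (4, 4)))  # [(3, 0), (1, 0), (0, 3), (0, 1)]
```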
While really high-order interconnects are theoretically preferable, in practice wiring considerations limit the maximum number of links per node. As such, most practical torus architectures are limited in the number of neighbours they can support.
FYI: The tree architecture is avoided in supercomputing for a different reason. Typically, each node gets the fastest interconnect that can be provided, since interconnect speed affects overall system speed for many algorithms. Imagine each leaf at the bottom of the tree needs 1X bandwidth. Then the parent node one level up needs 2X bandwidth, the next parent up needs 4X, and so on. With tens of thousands of nodes in the supercomputer, it quickly becomes impossible to fabricate interconnects fast enough for the parent nodes of the tree.
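The doubling argument can be put in numbers with a quick sketch (a binary tree is assumed, with bandwidth measured in units of one leaf's 1X):

```python
# Bandwidth each link needs at each level of a binary tree, leaf level
# first: every parent link must carry the traffic of both children, so
# the requirement doubles at each level up to the root.
def level_bandwidths(leaves):
    bw, levels = 1, []
    while bw <= leaves:
        levels.append(bw)
        bw *= 2
    return levels

print(level_bandwidths(16))           # [1, 2, 4, 8, 16]
print(level_bandwidths(65536)[-1])    # root link needs 65536x leaf bandwidth
```

With 65,536 leaves the root link needs 65,536 times a leaf's bandwidth, which is exactly why the fabrication argument above kills the naive tree at supercomputer scale.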
A practical application of the tree problem occurs on small Ethernet clusters. It is easy to make a 16-node 10Gb Ethernet cluster, because standard switches are readily available. As the system approaches hundreds of nodes, it becomes difficult to find fast enough switches. Even if the data communication speed to each node is reduced to 1Gb, for sufficiently large numbers of nodes, the backplane switches will be overwhelmed.
Re:Clueless (Score:5, Informative)
As a computational physicist:
"flop" is sometimes used to mean "floating point operation", when you're talking about the compute cost of an algorithm. For instance:
"The Wilson dslash operation requires 1,320 flops per site" or "The comm/compute balance of this operation is 3.2 bytes per flop".
So saying "ten flops per second" is fine -- "flops" is the plural of "flop".
Yes, "flops" is also used as the acronym "... per second", and while that's the most common use, it's not the exclusive one.
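A worked example of "flop" as a count, in the compute-cost sense described above. The 1,320 flops/site figure is the one quoted; the lattice size and sustained rate below are made-up numbers for illustration only.

```python
# "flop" as a count vs. FLOPS as a rate. Only the 1,320 flops/site
# figure comes from the thread; the rest is hypothetical.
flops_per_site = 1320            # quoted Wilson dslash cost per site
sites = 32**3 * 64               # hypothetical 32^3 x 64 lattice
total_flops = flops_per_site * sites
sustained = 100e12               # hypothetical 100 teraFLOPS sustained rate

print(total_flops)               # a flop count: no time unit attached
print(total_flops / sustained)   # seconds: flops divided by FLOPS
```

Dividing a flop count by a FLOPS rate yields seconds, which is precisely why the two usages coexist without contradiction.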