Dual-Core Processors Begin Takeover of Top 500 Super Ranking
Published: November 16, 2006
by Timothy Prickett Morgan
It's the fall, so that means it is time once again for the semi-annual ranking of the Top 500 supercomputer sites in the world. The ranking this time around is timed to coincide with the SuperComputing 2006 show in Tampa, Florida, which started on Saturday. The Top 500 ranking is a kind of leading indicator for server technology. While it is interesting to talk about different architectures, operating systems, and interconnects, perhaps the most significant statistic in this fall's ranking is the increasing use of dual-core processors inside the machines at the largest supercomputing centers in academia, government, and business.
The Top 500 rankings are based on the performance of machines using the Linpack benchmark, which is a test written in Fortran that shows how well--or poorly--a machine performs floating point math--specifically, solving a set of linear equations. The sustained performance is the measurement in teraflops (trillions of floating point operations per second) of what a machine can do with the maximum matrix size for the set of linear equations. The list is compiled by researchers at the University of Mannheim in Germany, the University of Tennessee in the United States, and the Lawrence Berkeley National Lab, also in the United States.
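For readers who want a feel for what a Linpack-style measurement involves, here is a minimal sketch in Python. It is not the actual Linpack code, which is heavily tuned; it simply solves a small dense system by Gaussian elimination with partial pivoting and converts the elapsed time into a flops rate using the conventional operation count of 2/3 n^3 + 2 n^2. The function names, the random test matrix, and the matrix size are illustrative assumptions, not anything taken from the benchmark itself.

```python
import random
import time


def solve_dense(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    a is a list of n rows of n floats, b is a list of n floats. Both are
    copied, so the caller's data is left untouched.
    """
    n = len(a)
    a = [row[:] for row in a]
    b = b[:]
    for k in range(n):
        # Partial pivoting: bring the largest remaining entry in column k up.
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        if a[p][k] == 0.0:
            raise ValueError("matrix is singular")
        a[k], a[p] = a[p], a[k]
        b[k], b[p] = b[p], b[k]
        for i in range(k + 1, n):
            m = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= m * a[k][j]
            b[i] -= m * b[k]
    # Back substitution on the resulting upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / a[i][i]
    return x


def linpack_flop_count(n):
    """Conventional operation count credited for solving an n x n system."""
    return (2.0 / 3.0) * n ** 3 + 2.0 * n ** 2


def measure_flops(n, seed=0):
    """Time one solve of a random n x n system.

    Returns (flops per second, largest absolute residual of a x - b).
    """
    rng = random.Random(seed)
    a = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
    b = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    start = time.perf_counter()
    x = solve_dense(a, b)
    elapsed = time.perf_counter() - start
    residual = max(
        abs(sum(a[i][j] * x[j] for j in range(n)) - b[i]) for i in range(n)
    )
    return linpack_flop_count(n) / elapsed, residual


if __name__ == "__main__":
    rate, residual = measure_flops(100)
    print(f"{rate / 1e6:.1f} megaflops, residual {residual:.2e}")
```

Interpreted Python will report a rate many orders of magnitude below the teraflops figures quoted in this story; the point is only to show how a time-to-solution becomes a flops number, and why the result depends on the matrix size chosen.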
In the fall Top 500 list announced this week, there are 75 supercomputers that are based on dual-core Opteron processors from Advanced Micro Devices, and 31 machines are based on Intel's "Woodcrest" Xeon 5100s, the first real dual-core chip from Intel based on its Core microarchitecture. While IBM has been shipping very powerful dual-core Power processors for five years, they were not as inexpensive as X86 processors when IBM first got into the game, and they are still not as inexpensive as 64-bit dual-core X64 chips are today. That explains, in part, why the Opteron has just passed the Power family of processors to become the number two chip in the Top 500 list.
The number one supplier of chips in the Top 500 list is, of course, Intel, which has 263 systems on this list--a little more than half of the list. That is down, however, from the 333 systems (or 67 percent of the total) that Intel had a year ago. The number of machines using AMD's Opterons has more than doubled to 113 systems in the past year, and they now account for about 23 percent of the list. There are 119 machines using 32-bit Intel chips, 109 using 64-bit Intel chips, and 35 using Itanium chips. That means Intel and AMD chips together account for 376 machines out of 500, or three-quarters of the list; setting aside the 35 Itanium machines, the X86 architecture and its successor, the X64 architecture, account for 341 machines, or a little more than two-thirds. This time around, the machines in the Top 500 list have a total of over 1 million processors, and 485,205 of them are X86 or X64 chips. The list has a total of 3.53 petaflops of sustained performance, and X86 and X64 chips together deliver about 1.8 petaflops of that aggregate performance. Just upgrading the existing X86 machines to dual-core X64 machines would add a lot more performance to the list.
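The share arithmetic in the paragraph above is easy to check. The short Python sketch below uses only the system counts quoted in this story; the variable and function names are illustrative. Note that the 376-machine figure corresponds to all Intel and AMD systems, Itanium included.

```python
TOTAL_SYSTEMS = 500

# System counts by processor family, as quoted in the article.
counts = {
    "Intel 32-bit X86": 119,
    "Intel 64-bit X64": 109,
    "Intel Itanium": 35,
    "AMD Opteron": 113,
}


def share(systems, total=TOTAL_SYSTEMS):
    """Return the percentage of the list that a number of systems represents."""
    return 100.0 * systems / total


intel_total = (
    counts["Intel 32-bit X86"]
    + counts["Intel 64-bit X64"]
    + counts["Intel Itanium"]
)
intel_plus_amd = intel_total + counts["AMD Opteron"]
# X86 and X64 only: Intel's non-Itanium chips plus the Opterons.
x86_x64_total = intel_plus_amd - counts["Intel Itanium"]

if __name__ == "__main__":
    for label, systems in [
        ("Intel, all families", intel_total),
        ("AMD Opteron", counts["AMD Opteron"]),
        ("Intel plus AMD", intel_plus_amd),
        ("X86 and X64 only", x86_x64_total),
    ]:
        print(f"{label}: {systems} systems, {share(systems):.1f} percent")
```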
Don't count the Power-based systems out, however, since these also increased their presence on the Top 500 list, with 91 systems (18 percent of the total), a gain of 20 boxes since the fall 2005 ranking. Power-based supers account for 1.2 petaflops of performance and 416,492 processors. Most of the performance and processor count for the Power platform comes from the Blue Gene/L supercomputers, which have a very large number of very modestly performing dual-core PowerPC embedded processors.
There are four Cray vector machines on the list, and three vector machines from NEC. All of the other machines are built on scalar processors. There are three Sparc systems, 20 PA-RISC systems, and three Alpha systems on the list as well.
By operating system, the current Top 500 list is dominated by Linux, although there are some hybrid Unix-Linux clusters as well as a smattering of Unix. There are 403 Linux boxes, with 92 Unix boxes and five hybrid Unix-Linux machines. Of the Unix machines, HP-UX and AIX dominate. Of the commercial Linuxes, Novell's SUSE Linux Enterprise Server seems to be more popular than Red Hat's Enterprise Linux, but 81 percent of the machines are running homegrown Linux, so that doesn't necessarily mean much. The big supercomputing labs arguably know more about Linux as it relates to high performance computing than any of the commercial Linux distributors, and they have grad students or employees to do the support work.
In terms of interconnection schemes for the clusters on the list, Gigabit Ethernet rules, with 213 of the machines, or nearly 43 percent, using that commodity interconnect. InfiniBand, which has a lot more bandwidth and a lot lower latency, is used in 78 machines, while Myrinet interconnect is used in 79 machines and Quadrics is used in 14 machines. The remaining interconnects are various kinds of proprietary, high speed interconnects.
By vendor, IBM is once again dominant, with 237 machines and about half of the aggregate number-crunching performance on the Top 500 list. Hewlett-Packard comes in second by brand, with 157 machines and about 16 percent of the aggregate power. (There is one machine built by IBM and HP together.) Silicon Graphics comes in third in the popularity contest, with 20 machines on the list, followed by Dell with 18 machines, and Cray with 15.
The top 10 on the list hasn't changed all that much, but some machines have been raised or lowered in the rankings as supercomputing centers upgrade their machines at different rates. The top machine on the list is the Blue Gene/L machine at Lawrence Livermore National Laboratory, which held its spot without any upgrades and which is rated at 280.6 teraflops of sustained performance. Cray moved from number nine to number two on the list as it doubled the performance in the "Red Storm" Opteron massively parallel machine it built for Sandia National Laboratories. Red Storm is now rated at 101.4 teraflops, and Cray this week announced that it can put a petaflops box in the field, if someone wants to pay for it. IBM has a Blue Gene machine of its own in the TJ Watson Research Center, which was formerly the number two machine but now slips to the number three position with its 91.2 teraflops of sustained performance. IBM's ASCI Purple Power-based massively parallel machine, which is based on its p5 575 servers and which sits beside Blue Gene/L, is number four on the list, with 75.8 teraflops of sustained performance.
The "Mare Nostrum" PowerPC blade server cluster that IBM built for the Barcelona Supercomputing Center in Spain is number five on the list, and uses Myrinet links to lash the blades together. This machine is now rated at 62.6 teraflops. The "Thunderbird" cluster that Dell built for Sandia is number six, and is the largest InfiniBand machine on the list. It recently was re-tested, and is now rated at 53 teraflops, holding its position on the list. The Tera-10 machine built by the French government to simulate its own nuclear weapons (that is what all of the U.S.-based labs are doing with all of this iron, of course), is number seven on the list. Tera-10 is a cluster of Bull NovaScale Itanium SMP servers linked with Quadrics interconnect, and it is rated just behind Thunderbird, with 52.8 teraflops of performance.
Number eight on the list is the "Columbia" Altix cluster made by SGI for NASA's Ames Research Center, which uses NUMAflex interconnect inside the Altix box and InfiniBand to lash servers together into a massively parallel cluster. Columbia is rated at 51.8 teraflops using single-core 1.5 GHz Itanium 2 processors. Obviously, if NASA Ames upgrades to dual-core Itanium 9000s, it can double performance in approximately the same footprint and thermal envelope. Number nine on the list is the "Tsubame" Sun Fire X4600 cluster that Sun Microsystems, NEC, and ClearSpeed have built for the Tokyo Institute of Technology. This machine uses InfiniBand interconnect, and is rated at 47.4 teraflops. And finally, number 10 on the list is the "Jaguar" XT3 massively parallel machine from Cray, which uses dual-core Opteron processors and Cray's own SeaStar interconnect married to the Opteron's HyperTransport bus. Jaguar is running at Oak Ridge National Laboratory, and is rated at 43.5 teraflops.
The ante to get into the Top 500 poker game this time around is 2.74 teraflops, compared to 2 teraflops only six months ago. That is an increase of 37 percent. To reach the top 100, you had to have a machine rated at 6.65 teraflops, which is 41 percent higher than in the last Top 500 list.
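The growth figures above follow from simple percentage arithmetic, sketched here in Python. The function names are illustrative. The prior top-100 threshold is not stated in this story, so the value computed for it is only an estimate backed out of the rounded 41 percent figure.

```python
def percent_increase(old, new):
    """Percentage growth from old to new."""
    return 100.0 * (new - old) / old


def implied_previous(current, pct_increase):
    """Back out the earlier value from the current one and a growth rate."""
    return current / (1.0 + pct_increase / 100.0)


if __name__ == "__main__":
    # Entry threshold: 2 teraflops six months ago, 2.74 teraflops now.
    print(f"Entry level growth: {percent_increase(2.0, 2.74):.0f} percent")
    # Top-100 threshold: 6.65 teraflops now, stated as 41 percent higher.
    prior = implied_previous(6.65, 41.0)
    print(f"Implied prior top-100 cutoff: about {prior:.1f} teraflops")
```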
Out of the 500 machines on the list, only 244 of them have gone into commercial or industrial sites (the latter being a super that sits at an IT vendor's own shop); the remaining 256 go into government-sponsored supercomputer centers. IBM sold 121 of those commercial machines and HP sold 116 of them, which means IBM and HP have the very high end of the commercial supercomputer market locked up.