Top 500 Supers List Dominated By Exotic Clusters
by Timothy Prickett Morgan
The semi-annual list of the Top 500 supercomputers was released last week at the International Supercomputer Conference in Heidelberg, Germany, and there is a tremendous amount of churn in the rankings. Vendors have installed various kinds of supercomputers that have been in the works for years, and academic, government, and private research facilities are gobbling up huge amounts of computing capacity. Just making the list today requires an ante of more than 1 teraflops of aggregate computing power.
As one measure of how quickly the world of supercomputing has embraced new architectures and technologies and consumed computing power, the aggregate number-crunching capacity of the entire Top 500 list when it debuted in June 1993 was just over a teraflops. That is the power of the smallest machine on the list 12 years later.
In fact, number 500 on the list, a Cray T3E1200 parallel supercomputer with 1,900 processors and a sustained performance of 1.17 teraflops on the Linpack Fortran benchmark test that is used to create the rankings, would have been number 299 on the list only six months ago and would have been the fifth most powerful machine in the world on the June 2000 list. That machine would have come in behind the ASCI Red (2.4 teraflops), ASCI Blue Pacific (2.1 teraflops), and ASCI Blue Mountain (1.6 teraflops) parallel supers from Intel, IBM, and Silicon Graphics that were funded by the U.S. government in the 1990s to fuel development of indigenous supercomputers with diverse architectures. Number four on the list at that time was a parallel super made from IBM Unix servers rated at 1.4 teraflops. It will not be long before these machines--as well as many others that were formerly thought of as powerful and exotic--disappear from the Top 500 list completely. That is how fast supercomputers come and go, which is as sensible as it is ironic.
Being at any particular position on the Top 500 list is a fleeting thing these days, as all supercomputer vendors know. Five of the 10 vendors at the top of the list have been pushed down in the past six months. IBM's Blue Gene/L super, which was built by Big Blue for Lawrence Livermore National Laboratory for the U.S. Department of Energy, has doubled in size since the November 2004 rankings and now has a sustained performance of 136.8 teraflops. This machine has an incredible 32,768 dual-core, 700 MHz PowerPC 400 processors ganged up to deliver that performance; the lab is expected to double the processor count and Linpack processing capability yet again as part of the DOE contract. The number two machine on the list is a Blue Gene machine that IBM built for its own use recently at the TJ Watson Research Center in New York, where Blue Gene was created. This machine, dubbed Blue Gene/W, has a rating of 91.3 teraflops. According to IBM, which announced the machine last week, the processing capacity in the super will be used for its own product designs and research, with a portion of it donated for free to academic researchers around the world. There are 16 Blue Gene machines in all on the Top 500 list, and five of them occupy slots in the top 10 rankings. The aggregate processing capacity of all of the Blue Genes on the list comes to 364.7 teraflops (with six out of the 16 machines being at IBM's own facilities and accounting for 36 percent of the total capacity of Blue Genes on the list). While the Blue Gene machines are not suitable for many workloads, if IBM gets 50 cents per megaflops in real money from the 10 commercial systems in the Top 500 list that are based on Blue Gene, it would have got back its $100 million initial investment in Blue Gene from 1999 and got a lot of PR for free.
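The payback arithmetic can be checked with a rough sketch. The figures are the ones quoted above; the function name is hypothetical, and the sketch assumes that revenue comes only from the 64 percent of Blue Gene capacity installed outside IBM's own facilities.

```python
# Rough check of the Blue Gene payback claim, using figures quoted in the article.
TOTAL_BLUE_GENE_TFLOPS = 364.7        # aggregate Linpack capacity of all Blue Genes on the list
IBM_INTERNAL_SHARE = 0.36             # portion of that capacity at IBM's own facilities
PRICE_PER_MEGAFLOPS_USD = 0.50        # the article's "50 cents per megaflops"
INITIAL_INVESTMENT_USD = 100_000_000  # the 1999 Blue Gene investment
MEGAFLOPS_PER_TERAFLOPS = 1_000_000


def blue_gene_revenue_usd(total_tflops, internal_share, price_per_megaflops):
    """Estimate revenue from capacity sold to customers (assumption: internal machines earn nothing)."""
    external_tflops = total_tflops * (1.0 - internal_share)
    return external_tflops * MEGAFLOPS_PER_TERAFLOPS * price_per_megaflops


revenue = blue_gene_revenue_usd(
    TOTAL_BLUE_GENE_TFLOPS, IBM_INTERNAL_SHARE, PRICE_PER_MEGAFLOPS_USD
)
print(f"Estimated revenue: ${revenue / 1e6:.1f} million")
print("Covers initial investment:", revenue >= INITIAL_INVESTMENT_USD)
```

Under those assumptions the estimate lands a little above the $100 million mark, which is consistent with the claim, though the 50 cents figure is itself hypothetical.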
IBM is significant in the Top 500 rankings for another reason: with 259 systems on the list and an aggregate of 976 teraflops, it has the most machines on the list and the most computing power as well. Blue Gene represents 37 percent of the installed IBM capacity on the list, but only 6 percent of the systems. To put that another way: Take out Blue Gene, and other machines dominate the aggregate flops.
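A quick sketch makes the lopsidedness concrete. It uses only the counts and teraflops quoted in the article; the helper name and the per-system averages are illustrative additions, not figures from the Top 500 list itself.

```python
# Blue Gene's share of IBM's presence on the list, from the article's figures.
IBM_SYSTEMS = 259
IBM_TFLOPS = 976.0
BLUE_GENE_SYSTEMS = 16
BLUE_GENE_TFLOPS = 364.7


def share(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


capacity_share = share(BLUE_GENE_TFLOPS, IBM_TFLOPS)
system_share = share(BLUE_GENE_SYSTEMS, IBM_SYSTEMS)

# What is left once the Blue Gene machines are taken out.
other_systems = IBM_SYSTEMS - BLUE_GENE_SYSTEMS
other_tflops = IBM_TFLOPS - BLUE_GENE_TFLOPS

print(f"Blue Gene share of IBM capacity: {capacity_share:.1f}%")
print(f"Blue Gene share of IBM systems:  {system_share:.1f}%")
print(f"Average per Blue Gene system:    {BLUE_GENE_TFLOPS / BLUE_GENE_SYSTEMS:.1f} teraflops")
print(f"Average per other IBM system:    {other_tflops / other_systems:.1f} teraflops")
```

The averages are simple means across very different machines, so they are only a rough indication of how much larger the Blue Gene installations are.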
If the Top 500 list is any indication, however, IBM doesn't own the supercomputing market, and even if it did, that ownership would be, by its very nature, ephemeral. The "Columbia" cluster at NASA's Ames Research Center, which comprises 10,160 Itanium 2 processors housed in SGI's Altix NUMA clusters, comes in at number three on the list, with a sustained performance on Linpack of 51.9 teraflops. Number four on the list is the Earth Simulator built by NEC as a parallel vector supercomputer, which held the top spot for quite some time and has a sustained performance of 35.9 teraflops. The new "MareNostrum" blade server cluster, built from 2,400 of IBM's two-way JS20, PowerPC 970-based blades, comes in at number five with a rating of 27.9 teraflops.
Number six is a Blue Gene box at the University of Groningen in the Netherlands rated at 27.5 teraflops, followed up by the "Thunder" Itanium 2 cluster built by California Digital for Lawrence Livermore, which is rated at 19.9 teraflops and which has the distinction of being the largest Linux-based cluster using Quadrics interconnections. (The MareNostrum cluster at the Barcelona Supercomputer Center in Spain is using Myricom interconnect; Blue Gene has its own interconnection fabric.) Numbers eight and nine on the list are Blue Gene machines rated at 18.2 teraflops installed at Ecole Polytechnique Federale de Lausanne in Switzerland and at the Advanced Institute of Science and Technology in Japan. Rounding out the top 10 is the "Red Storm" massively parallel Linux-Opteron machine built by Cray for Sandia National Laboratories, which is currently rated at 15.3 teraflops and is half-way to its completion. An almost equally powerful Cray XT3 at Oak Ridge National Laboratory is number 11, the "ASCI Q" AlphaServer cluster at the Los Alamos National Lab is number 12, and the still notable "System X" Apple Xserve-Mellanox InfiniBand cluster is at number 14 on the list.
In terms of interesting trends, the number of machines on the Top 500 list using Xeon or Itanium processors from Intel continues to creep up, with 333 systems on the June 2005 list, up from 320 on the November 2004 list and 287 on the June 2004 list. IBM's Power processors (in various incarnations, including the PowerPC 400s in Blue Genes, the PowerPC 970s in blade servers, and the Power processors in IBM's big SMP servers) racked up 77 systems, with Hewlett-Packard's PA-RISC machines (mostly at the bottom of the list) accounting for 36 systems and machines based on AMD's Opteron processors coming in with 25 systems. Some 304 of the machines on the Top 500 list are clusters (as opposed to massively parallel, shared memory, vector, or other kinds of machines).
Of the 1.69 petaflops of aggregate power on the list (up 50 percent from only six months ago), IBM has 57.9 percent (boosted somewhat by its own use of Blue Gene machines), with HP coming in second with 13.3 percent of the total capacity in 131 systems. SGI has 25 systems on the list that account for 7.5 percent of the total petaflops. No other vendor has a 5 percent share, which shows that this HPC server market is pretty tough to crack even with all of the innovation by lots of smaller vendors.
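The vendor shares can be turned back into rough teraflops figures with a short sketch. The percentages, system counts, and the 1.69 petaflops total are from the article; the function name is hypothetical, and because every input is rounded, the derived numbers are approximations rather than values taken from the list.

```python
# Convert the quoted vendor shares into approximate teraflops.
TOTAL_TFLOPS = 1690.0     # 1.69 petaflops, as quoted
GROWTH_SIX_MONTHS = 0.50  # "up 50 percent from only six months ago"

# vendor -> (share of total capacity in percent, systems on the list)
VENDORS = {
    "IBM": (57.9, 259),
    "HP": (13.3, 131),
    "SGI": (7.5, 25),
}


def implied_tflops(share_percent, total_tflops=TOTAL_TFLOPS):
    """Capacity implied by a percentage share of the list total."""
    return total_tflops * share_percent / 100.0


for vendor, (share_percent, systems) in VENDORS.items():
    tflops = implied_tflops(share_percent)
    print(f"{vendor}: about {tflops:.0f} teraflops, {tflops / systems:.1f} per system")

# Aggregate capacity implied for the November 2004 list.
previous_total = TOTAL_TFLOPS / (1.0 + GROWTH_SIX_MONTHS)
print(f"Implied November 2004 total: about {previous_total:.0f} teraflops")

# Share left over for every other vendor combined.
others_share = 100.0 - sum(share for share, _ in VENDORS.values())
print(f"All other vendors combined: {others_share:.1f}%")
```

The IBM figure implied this way comes out slightly above the 976 teraflops quoted earlier, a gap that is presumably down to rounding in the published percentages.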
On the operating system front, the use of Unix (as opposed to other more exotic and proprietary platforms) increased from 1995 through 2002, when it represented about 80 percent of installed systems. Linux came out of nowhere in 1998 and had knocked out all other non-Unix platforms by 2002, when it accounted for over 80 systems. It is now the operating system of choice for more than two-thirds of the platforms on the Top 500 list, and Unix is used on all but a few of the remaining systems.