Blade Servers Make It to the Top HPC Sites
Published: November 27, 2007
by Timothy Prickett Morgan
The 30th edition of the semi-annual Top 500 listing of supercomputers in the world was announced just after The Linux Beacon went on holiday, during the Supercomputing Conference 2007 in Reno, Nevada. And if Moore's Law is a constant in the IT racket, so is the desire by the largest supercomputer facilities in the world to consume ever-larger aggregations of computing power. The availability of multicore X64 processors that stay within a fixed thermal envelope, coupled with a more or less standard Linux application environment, cheap InfiniBand interconnect, and even cheaper Gigabit Ethernet links between nodes, is making high performance computing cheaper than ever before.
The November 2007 Top 500 list is noteworthy in that India has broken into the Top 10 for the first time with a new blade server cluster it acquired from Hewlett-Packard. IBM's Blue Gene Linux-Power supers continue to give Big Blue an edge among big systems, and exotic machines from Cray, Sun Microsystems, and Silicon Graphics have either appeared on the list or will in the next iteration.
The turnover rate in the Top 500 list is staggering--the machine listed at number 500 on this 30th list would have been halfway up the list only six months ago--and so is the diversity of interconnections and approaches to building very powerful HPC iron. The ante to be on the current Top 500 list is 5.9 teraflops, up nearly 50 percent from the 4 teraflops it took to get on the list only six months ago. The aggregate performance of the Top 500 machines in the list is now 6.97 petaflops, up 41.7 percent from the 4.92 petaflops on the 29th list six months ago, and up 96.9 percent from the 3.54 aggregate petaflops on the 28th list a year ago. This rate of increase in aggregate power on the list is twice that of Moore's Law, which in recent years has doubled the performance of processors every 24 months or so. (It used to be 18 months, but the 120-watt thermal ceiling and 3 GHz or so clock ceiling for X86 and X64 architectures have stretched Moore's Law out a bit since 2003 or so.) The good news is that, for most supercomputing workloads, moving to lots of cores all connected on the chip has no negative effect on performance. The shift to multicore processors means that supercomputer centers can put one quarter as many server nodes into a cluster as they did only three years ago and still deliver the same performance.
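The growth figures above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are made up for this example, and the doubling-time estimate assumes steady exponential growth between the two lists, which the article does not claim.

```python
import math

def percent_growth(new, old):
    """Percentage increase going from old to new."""
    return (new / old - 1.0) * 100.0

def doubling_time_months(new, old, months_elapsed):
    """Months needed to double, assuming steady exponential growth."""
    return months_elapsed * math.log(2) / math.log(new / old)

# Figures quoted in the article: teraflops to enter the list,
# and aggregate petaflops on the 28th, 29th, and 30th lists.
entry_growth = percent_growth(5.9, 4.0)
six_month_growth = percent_growth(6.97, 4.92)
one_year_growth = percent_growth(6.97, 3.54)
list_doubling = doubling_time_months(6.97, 3.54, 12)

print(f"Entry level growth over six months: {entry_growth:.1f} percent")
print(f"Aggregate growth over six months:   {six_month_growth:.1f} percent")
print(f"Aggregate growth over one year:     {one_year_growth:.1f} percent")
print(f"Implied doubling time of the list:  {list_doubling:.1f} months")
```

Under that assumption the list doubles its aggregate performance in a little over 12 months, which is where the comparison with a 24-month Moore's Law doubling comes from.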
As has been the case for many years, X86 and now X64 processors dominate the Top 500 list. X86 and X64 processors accounted for 413 of the 500 machines on the 30th ranking, or 82.6 percent of machines. Interestingly--and not surprisingly given the year-long lead that Intel has had in delivering four cores in a single processor socket over Advanced Micro Devices--Intel has gained ground, and has done so at AMD's expense. Intel X64-based machines now account for 335 machines on the list, and including the 19 Itanium-based machines, the Intel Inside logo is on 354 machines on the Top 500 list for November, up significantly from the 289 machines on the list six months ago. To say that the HPC market really, really wants quad-core processors--and is ready to easily and eagerly absorb octo-core chips as soon as anyone delivers them--is an understatement. The count for machines using AMD's Opteron X64 processors has dropped to 78 systems, down from 105 machines six months ago, and machines using IBM's Power processors have similarly dropped to 61 machines, down from 85 six months ago. It is not so much that machinery on the list is being unplugged (although this does happen eventually) as it is that labs buy lots of the fastest, coolest thing. Two years ago, that was the Opteron or Power processors. This time around, it is quad-core X64 chips.
But the top of the Top 500 is still held by IBM's Blue Gene/L massively parallel super, which is installed at the Lawrence Livermore National Laboratory. Blue Gene/L has been in the top spot since November 2004, and was recently upgraded to a sustained 478.2 teraflops on the Linpack Fortran matrix math benchmark. This sustained performance on Blue Gene/L gets IBM nearly halfway to its stated goal from back in 2000 to reach 1 petaflops with the Blue Gene project. Blue Gene/L runs a variant of the SLES 9 Linux operating system from Novell and has 212,992 of IBM's PowerPC 440 dual-core processors. The second system ranked on the November list is the Blue Gene/P machine installed at Germany's Forschungszentrum Juelich (FZJ), which is rated at a sustained 167.3 teraflops. Numbers eight and 10 on the list are also Blue Gene machines, rated at 91.3 teraflops at IBM and at 82.2 teraflops at the Brookhaven National Lab.
Number three on the Top 500 list is a new SGI Altix ICE cluster built using quad-core Xeon processors--not the Itaniums that SGI has been trying to push as aggressively as anyone could for the past six years--that has a sustained performance of 126.9 teraflops. This new cluster is installed at the New Mexico Computing Applications Center. The Altix ICE machines employ a modular blade server architecture that separates compute, memory, and I/O onto blades.
Numbers four and five on the current Top 500 list are both blade server clusters built by HP, one for an Indian industrial giant and the other for an unspecified Swedish defense agency. Tata Sons has installed a 117.9 teraflops cluster in its computational research laboratories that comprises 1,800 of HP's BL460c Xeon blade servers, which are plugged into its Blade System 7000 c-class chassis. (Can you remember when the COCOM export rules would never in a million years have allowed such a powerful computer to be exported from the United States?) This blade cluster runs Linux and uses an InfiniBand interconnect. The 2,200-node blade cluster installed at the Swedish defense agency, which ranks number five on the list, also runs Linux and uses InfiniBand to lash server nodes together; it is rated at 102.8 teraflops.
The three remaining machines in the top 10 are from Cray, and are based on the Linux-Opteron cluster design that made its debut in the "Red Storm" machine that Cray built for Sandia National Laboratories and which has been commercialized as the XT3, XT4, and XT5. In fact, number six on the list is the Red Storm machine itself, which runs Cray's Unicos Unix variant, has 26,569 Opteron cores, and is rated at 102.2 teraflops of sustained performance. Number seven on the list is the "Jaguar" XT3/XT4 hybrid installed at Oak Ridge National Laboratory, which runs a mix of Unicos and SUSE Linux and which is rated at 101.7 teraflops. The last of the three, at number nine, is a machine dubbed "Franklin," which is an XT4 cluster installed at Lawrence Berkeley National Laboratory. This machine has 19,320 Opteron processor cores and is rated at 85.3 teraflops. It also runs a mix of Unicos and Linux. Like SGI, Cray is moving to a blade-style form factor with its XT5 machines.
Linux has ruled the Top 500 list for quite some time, and the advent of a Windows alternative from Microsoft has not, as yet, changed that. Linux is the sole operating system on 426 of the 500 machines in the November ranking, or 85.2 percent of the machines, and is also usually the other operating system in the 34 hybrid platforms that make up another 6.8 percent of machines on the list. Commercial Unixes, FreeBSD, and Mac OS together make up 6.8 percent of the list, and Unix is usually the other side of any hybrid machine on the list, so Unix can be said to have a 13.6 percent share of the list--a huge fall from the early days of the Top 500 list when very few people had heard of Linux. Windows now has six systems on the list, giving it a 1.2 percent share of machines, up from two machines six months ago. But the Windows slice of the computing pie is relatively small--about 0.7 percent of the 6.97 aggregate petaflops on the list, compared to a 70.3 percent share for Linux and a 92.4 percent share of capacity if you blend in the mixed Linux-Unix platforms. If you consider Unix-only systems, then Unix has a 6.9 percent share of the total capacity on the current Top 500 list, but if you consider hybrids, then Unix has a 29 percent share of the petaflops represented on the November ranking.
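The operating system shares in this paragraph follow from the machine counts and capacity percentages quoted in it, and a short tally makes the bookkeeping explicit. This is a sketch with made-up variable names. The count of 34 Unix-family machines is inferred here from the 6.8 percent figure, and the hybrid capacity share is inferred as the difference between the two Linux capacity figures.

```python
TOTAL_SYSTEMS = 500

# Machine counts by operating system, as given (or implied) in the article.
system_counts = {
    "Linux only": 426,
    "Hybrid (mostly Linux plus Unix)": 34,
    "Unix, FreeBSD, Mac OS": 34,  # inferred from 6.8 percent of 500
    "Windows": 6,
}
system_share = {
    name: 100.0 * count / TOTAL_SYSTEMS
    for name, count in system_counts.items()
}

# Capacity shares, in percent of the 6.97 aggregate petaflops.
linux_only = 70.3
linux_with_hybrids = 92.4
unix_only = 6.9
hybrid = round(linux_with_hybrids - linux_only, 1)   # inferred hybrid share
unix_with_hybrids = round(unix_only + hybrid, 1)

for name, share in system_share.items():
    print(f"{name:34s} {share:5.1f} percent of machines")
print(f"Hybrid share of capacity:       {hybrid:.1f} percent")
print(f"Unix plus hybrids, by capacity: {unix_with_hybrids:.1f} percent")
```

The four machine categories add up to exactly 500, and six Windows systems out of 500 works out to 1.2 percent of machines.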
Windows Compute Cluster Server has a long way to go to make a dent in the Top 500 list, but this is not Microsoft's goal anyway--at least not yet. Just like Linux attacked proprietary and Unix supercomputers from the bottom up, Microsoft wants to use its considerable influence in the scientific workstation space and among smaller, newer kinds of HPC customers who have no Unix or Linux experience--and hence no prejudices against Windows--who want to graduate up to server clusters with hundreds of gigaflops to maybe a few teraflops of performance on the Linpack test. These machines won't even show up on this list--and would not have shown up even years ago. But just the same, there is a lot of money to be made here.
In terms of vendor share of the November 2007 Top 500 supercomputer list, IBM has 232 machines, giving it a 46.4 percent share of boxes and a 45 percent share of installed petaflops on the list. HP has 166 machines on the list, for a 33.2 percent share of machinery and 23.9 percent of total petaflops. Cray has only 14 machines on the list, or 2.8 percent of machines, but because of the size of its Red Storm-derived machines, it has a 7.4 percent share of capacity. SGI has 22 machines on the list and 7.3 percent of capacity, and Dell has 24 machines and a flat 7 percent of capacity. These five vendors have 458 of the 500 machines on the list, and everyone else has one, two, or a handful of boxes--Hitachi, Fujitsu, Sun Microsystems, NEC, Bull, Appro International, Linux Networx, and California Digital.
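One way to read these vendor numbers is to divide each vendor's share of capacity by its share of machines, which gives a rough measure of how large its average system is compared to the average system on the list. The sketch below does that with the figures quoted above; the ratio is an illustrative metric chosen for this example, not something the Top 500 list itself reports.

```python
TOTAL_SYSTEMS = 500

# vendor: (machines on the list, percent share of aggregate capacity)
vendors = {
    "IBM": (232, 45.0),
    "HP": (166, 23.9),
    "Cray": (14, 7.4),
    "SGI": (22, 7.3),
    "Dell": (24, 7.0),
}

def relative_size(machines, capacity_pct):
    """Capacity share divided by machine share; above 1.0 means the
    vendor's average system is bigger than the list average."""
    machine_pct = 100.0 * machines / TOTAL_SYSTEMS
    return capacity_pct / machine_pct

ratios = {name: relative_size(*figures) for name, figures in vendors.items()}
top_five_machines = sum(machines for machines, _ in vendors.values())

for name, ratio in sorted(ratios.items(), key=lambda item: -item[1]):
    print(f"{name:5s} {ratio:.2f}x the average system size")
print(f"Machines from these five vendors: {top_five_machines}")
```

By this measure Cray's systems are the largest on average among the five, while HP's many blade clusters are individually smaller than the list average.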