
News While It's Still Hot
November 14, 2005

Linux Clusters Continue to Expand in Top 500 Supers Ranking

by Timothy Prickett Morgan

The number of parallel supercomputing clusters and the aggregate performance of these machines both continued their expansion in the latest Top 500 supercomputer rankings. The Top 500 list was announced at the SuperComputing 2005 tradeshow in Seattle, which is not exactly a coincidence. Microsoft, having seen Linux take off as the core platform on cluster supercomputers in the past five years, will debut its own Windows HPC variant at SC2005 this week. The question now is: Can Windows storm the Linux stronghold?

If the Top 500 is any indication, it would seem not. In a supercomputing environment, support for Fortran, C, and sometimes Java is critical, and so is the reliability of the core operating system platform deployed on a cluster. Excepting the very high-end, exotic machines that tend to dominate the Top 500 rankings, which are basically hand-tooled, very expensive supercomputers that only government agencies and academic research institutions can afford, most big HPC server buyers are interested in using as many commodity components as possible. That means X86 and now X64 processors and usually a Linux operating system. Windows can run on the same iron, of course, which is why it has a much better shot than Unix. But all of the compilers, workload management, job scheduling, cluster management, and resiliency software that has been either ported from Unix to Linux or created out of thin air by the open source HPC community has to be moved to Windows for Windows to have a chance in the HPC market. Bill Gates, Microsoft's founder, chief software architect, and a former compiler maker in his own right, might be a highly respected individual in the IT market, but research labs are going to expect more from a Windows HPC variant than they get on their desktops. It will be very interesting to see what penetration Microsoft gets in the Top 500 list in the next two years. There is certainly potential for Microsoft to win a lot of deals, as Linux distributors have done.

But for now, the Top 500 is about Linux and Unix, and the $7-plus billion HPC server market is one where Microsoft has almost no presence except for a few token accounts. The current Top 500 list makes this very clear. For instance, there is not a single Windows machine on the list, even though there are a few experimental Windows clusters out there in the world. The reason is simple: to get on the Top 500 list, the ante is now 1.64 teraflops, which is almost double the performance of the 500th system on the list a year ago. If Microsoft wants Windows to get a toe-hold in the Top 500 list in the coming years, it is going to have to find some very big institutions to use its software. Eating its own dog food is probably the best place to start, and it seems likely that Microsoft Research, the software giant's research arm, will start moving up the Top 500 list after the company installs its own cluster; it is also possible that Microsoft will pump the money for a giant Windows cluster into the Cornell Theory Center or some other academic institution.

Right now, Linux rules the Top 500, after entering the list in 1998. In the latest Top 500 rankings, Linux machines account for 372 out of 500 machines (74 percent), 1.19 petaflops (1,190 teraflops) out of a total of 2.3 petaflops of aggregate sustained computing capacity (52 percent), and 333,373 processors out of a total of 731,069 (46 percent) across those 500 machines. Various Unixes (including AIX, Solaris, HP-UX, the open source BSDs, and the BSD-derived Mac OS X) account for 109 machines, and the remaining 19 machines on the list are mixed clusters that use both Linux and Unix.
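The percentages quoted above can be reproduced from the raw counts. The following is a minimal sketch in Python; the figures are the ones given in the article, while the `share` helper and the variable names are made up for illustration.

```python
# Figures quoted in the article for the November 2005 Top 500 list.
TOTAL_SYSTEMS = 500
TOTAL_PETAFLOPS = 2.3
TOTAL_PROCESSORS = 731_069

linux = {"systems": 372, "petaflops": 1.19, "processors": 333_373}


def share(part, whole):
    """Return part/whole as a percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


linux_shares = {
    "systems": share(linux["systems"], TOTAL_SYSTEMS),
    "petaflops": share(linux["petaflops"], TOTAL_PETAFLOPS),
    "processors": share(linux["processors"], TOTAL_PROCESSORS),
}

# The three operating system categories should account for the whole list.
os_counts = {"linux": 372, "unix": 109, "mixed": 19}
assert sum(os_counts.values()) == TOTAL_SYSTEMS

print(linux_shares)
```

The gap between the system share and the processor share suggests that the typical Linux machine on the list is smaller than the typical non-Linux machine.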

By processor architecture, there are only 14 vector machines in the whole Top 500, which shows how far clusters based on scalar, commodity processors have come in the past 12 years. In June 1993, on the first Top 500 list ever published, there were 334 vector machines accounting for about a third of the aggregate 1.1 teraflops of computing power, and the 131 other machines were exotic parallel clusters built by companies such as Kendall Square, Thinking Machines, MasPar, and a bunch of smaller vendors that no longer exist. Most of these machines had proprietary interconnects, proprietary processors, and a basically proprietary implementation of Unix. And tellingly, IBM did not have a single machine on the list.

Fast forward to November 2005's list, and most of the machines are built on commodity Intel or AMD processors: There are 206 machines built from 32-bit Xeons, 46 built from Itaniums, 81 built from 64-bit Xeons, and 55 built from Opterons. IBM's Power family of processors has a nice showing, with 73 machines.

The dominant interconnection method is off-the-shelf Gigabit Ethernet, with 249 machines in the November 2005 list, followed by 70 machines using Myricom's Myrinet interconnect, 42 using one or another variant of IBM's SP switching technology from its pSeries Unix family, 31 using Hewlett-Packard's HyperPlex, 26 using InfiniBand, and 19 using Silicon Graphics' NUMALink. There's a mix of crossbars and other interconnects in the list, but clearly people are perfectly happy to build X64 and GigE clusters, and are doing so with wild enthusiasm. On the current list, 360 of the 500 machines are clusters, with 104 being classified as massively parallel processors (which means they have a high-bandwidth interconnect that more tightly couples the machines together) and another 36 being so-called constellations (which are clustered SMP architectures that offer bigger computing nodes and bigger main memories than clusters of tiny servers).

These days, IBM systems utterly dominate the Top 500 rankings, and that is so by design. Big Blue has been very aggressive in the supercomputing market since the advent of its "Deep Blue" RS/6000 PowerParallel SP machines in 1993. (By the way, the Cornell Theory Center got the first SP box in 1994, and that might make this center, which is not the most prestigious in the world in terms of aggregate computing power, a very interesting leading indicator nonetheless. Watch the CTC, which was funded by IBM in the 1980s to build its first parallel supercomputer prototypes and which has been courted by Microsoft, very carefully.)

The biggest IBM computer on the list is also the most powerful in the world: the Blue Gene/L machine at Lawrence Livermore National Laboratory, built for the U.S. Department of Energy to simulate nuclear explosions and manage the nuclear arsenal. This machine has 131,072 32-bit PowerPC 440 processors running at 700 MHz, and it runs a cut-down Linux kernel on its processing nodes. It is rated at 280.6 teraflops, which is a bit higher than IBM expected. The number two machine on the list is also a Blue Gene machine, but one at IBM's T.J. Watson Research Center with a rating of 91.3 teraflops. Blue Gene was a research project launched in 1999 that was aimed at creating a box to do protein folding simulations and reaching a 1 petaflops performance level by 2010, but it has come out as a commercialized product that could end up making IBM money. Dave Turek, IBM's vice president of deep computing, says that 19 of the 219 systems IBM has on the list are Blue Gene boxes. The reason is simple. "Blue Gene is not just about performance, but about the cost of electricity and cooling," he explains, adding that the Blue Gene project has now more than paid for itself and that interest in the HPC marketplace for the Blue Gene box, which was just commercialized late last year, remains quite high. IBM has another big box on the Top 500 list, the ASCI Purple parallel pSeries server, also at LLNL. This RISC/Unix machine has 10,240 1.9 GHz Power5 processors (housed in p5 575 chassis) running AIX and using the "Federation" high performance switch. It is rated at 63.4 teraflops. The "MareNostrum" cluster at the Barcelona Supercomputer Center in Spain, which is a cluster of PowerPC blade servers running Linux and using Myrinet interconnect, is number eight on the list with 27.9 teraflops of power; a baby Blue Gene box at the University of Groningen in the Netherlands rated at 27.45 teraflops rounds out IBM's entries in the top 10 rankings.

Those 219 IBM supers in the list have a combined 1.2 petaflops of computing power (52 percent of the total installed capacity) and 444,654 processors, an impressive 61 percent of the total processors installed. Many of those processors are Xeons and Opterons as well as Powers.

The rest of the top 10 super sites in the world begins with the "Columbia" system at NASA-Ames, built by SGI using its Altix Linux-Itanium servers and InfiniBand interconnect from Voltaire, which is rated at 51.9 teraflops. The "Thunderbird" Linux cluster built by Dell for Sandia National Laboratories using Xeon processors and InfiniBand interconnect comes in at number five with 38.27 teraflops, and it sits on the same site as the 36.2 teraflops "Red Storm" Linux-Opteron cluster with Cray's own interconnect. Red Storm is ranked number six. The Japanese giant, "Earth Simulator," ruled the Top 500 list for some time, but at 35.86 teraflops, this parallel vector machine built by NEC has dropped to number seven on the list. Number 10 on the list is the Cray XT3 Linux-Opteron cluster at the Oak Ridge National Laboratory. The "Thunder" Linux-Itanium cluster--built for LLNL by California Digital and using Quadrics interconnect--fell to number 11 on the list after being the number two machine only 18 months ago.

By vendor rankings, HP is the second-most prolific vendor on the current Top 500 list, with 169 systems that collectively account for 431.9 teraflops. While HP has 33.8 percent of systems, it has only 19 percent of the aggregate processing capacity in the Top 500 list. While it is hard to say what HP was counting on when it acquired Compaq several years ago, one of the things that seemed obvious was that the marriage of Compaq and HP, with a merged Itanium roadmap, would be able to take their collective leadership position and get a bigger chunk of the HPC server market. There is little doubt that had Intel more aggressively delivered Itanium processors with better thermal properties and multiple cores per socket, HP would have a much more impressive showing in the Top 500 list. Itanium chips are excellent at exactly this kind of number-crunching work, but they run too hot and are too expensive compared to alternatives in HP's own product line.

Cray has a mix of Linux-Opteron and vector-based X1 systems on the Top 500 list totaling 18 machines, and SGI has 18 machines on the list as well (all of them Altix Linux-Itanium boxes). Linux Networx, which has been coming on strong in the HPC market in recent years, has almost the same slice of the pie, with 16 machines. Dell has also grown its HPC business substantially, with 17 machines on the list, but it is limited in that it does not support Itanium or Opteron processors, which offer better performance on many applications than Xeons, and it does not have a high-end RISC/Unix play on which to chase the exotic clusters that IBM, HP, Fujitsu-Siemens, and Sun Microsystems sell.

Sun has fallen the furthest in recent years on the Top 500, with only four machines with a total of 9.1 teraflops on the November 2005 list. Considering that Sun bought the supercomputing carcasses of one unit of Cray as well as Kendall Square and Thinking Machines and then Gridware (now Grid Engine) for grid software, you would think that Sun would have a fairly large supercomputer presence, which would be reflected in the Top 500 list. Perhaps the power-conscious "Galaxy" Opteron servers, which support Linux and Windows as well as Solaris, will give Sun another run at the HPC server market. Time will tell, but Sun is clearly hoping for this to happen.

The Top 500 supercomputer rankings are compiled by Hans Meuer of the University of Mannheim in Germany; Jack Dongarra of the University of Tennessee in Knoxville; and Erich Strohmaier and Horst Simon of NERSC/Lawrence Berkeley National Laboratory. The list is a leading indicator of the kinds of technologies that often get deployed in the wider HPC server market, which is obviously made up of many more than 500 sites.

Sponsored By

Clusterworx® Whitepaper

High performance Linux clusters can consist of hundreds or thousands of individual components. Knowing the status of each CPU, memory, disk, fan, and other components is critical to ensure the system is running safely and effectively.

Likewise, managing the software components of a cluster can be difficult and time consuming for even the most seasoned administrator. Making sure each host's software stack is up to date and operating efficiently can consume much of an administrator's time. Reducing this time frees up system administrators to perform other tasks.

Though Linux clusters are robust and designed to provide good uptime, occasionally conditions lead to critical, unplanned downtime. Unnecessary downtime of a production cluster can delay a product's time to market or hinder critical research.

Since most organizations can't afford these delays, it's important that a Linux cluster comes with a robust cluster monitoring tool that:
  • Provides essential monitoring data to make sure the system is operational.
  • Eliminates repetitive installation and configuration tasks to reduce periods of downtime.
  • Provides powerful features, but doesn't compromise on usability.
  • Automates problem discovery and recovery on would-be critical events.

This paper discusses the features and functions of Clusterworx® 3.2. It details how Clusterworx® provides the necessary power and flexibility to monitor over 120 system components from a single point of control. The paper also discusses how Clusterworx® reduces the time and resources spent administering the system by improving software maintenance procedures and automating repetitive tasks.

High Performance Monitoring

Each cluster node has its own processor, memory, disk, and network that need to be independently monitored. This means individual cluster systems can consist of hundreds or thousands of different components. The ability to monitor the status and performance of all system components in real time is critical to understanding the health of a system and to ensuring it's running as efficiently as possible.

Because so many system components need to be monitored, one of the challenges of cluster management is to efficiently collect data and display system health status in an understandable format. For example, let's say a cluster system has 100 nodes and is running at 97 percent usage. It's very important to know whether 100 nodes are running at 97 percent usage or whether 97 nodes are running at 100 percent usage while three nodes are down.
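The ambiguity in that example can be made concrete with a short Python sketch. It is only an illustration of why a single cluster-wide figure hides node failures, not a description of how Clusterworx® works; the `summarize` function and the use of `None` to mark a down node are assumptions made for this example.

```python
def summarize(node_usage):
    """Summarize per-node usage percentages; None marks a node that is down."""
    up = [u for u in node_usage if u is not None]
    down = len(node_usage) - len(up)
    # A down node contributes no work to the cluster-wide figure.
    cluster_usage = sum(up) / len(node_usage)
    return {"cluster_usage": cluster_usage, "nodes_up": len(up), "nodes_down": down}


# Scenario 1: all 100 nodes are up and each runs at 97 percent.
all_up = summarize([97.0] * 100)

# Scenario 2: 97 nodes run flat out while three nodes are down.
three_down = summarize([100.0] * 97 + [None] * 3)

print(all_up)
print(three_down)
```

Both scenarios report the same cluster-wide usage, and only the per-node counts tell them apart.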

Clusterworx® provides real-time analysis of over 120 essential system metrics from each node. Data is displayed in easy-to-read graphs, thumbnails, and value tables. Clusterworx® collects data from groups of nodes to spot anomalies, then drills down to single node view to investigate problems. This allows users to determine exactly what the problem is before taking corrective action.

Clusterworx® also tracks the power and health state of each node and displays its status using visual markers in a node tree view throughout the user interface. Power status shows whether the node is on, off, provisioning, or in an unknown state. The health state tracks informational or warning messages and critical errors. Health state messages are displayed in a message queue on the interface.

Clusterworx®'s comprehensive monitoring and easy-to-read charts and graphs allow users to assess the state of each node and the overall system at a glance, while providing the necessary information to make informed decisions about the cluster system.

To read the rest of this whitepaper, please visit

Editors: Dan Burger, Timothy Prickett Morgan, Alex Woodie
Publisher and Advertising Director: Jenny Thomas
Advertising Sales Representative: Kim Reed
Contact the Editors: To contact anyone on the IT Jungle Team
Go to our contacts page and send us a message.

Copyright © 1996-2008 Guild Companies, Inc. All Rights Reserved.
Guild Companies, Inc. (formerly Midrange Server), 50 Park Terrace East, Suite 8F, New York, NY 10034