NASA Taps SGI and Intel for Pleiades 10 Petaflops Super
Published: May 13, 2008
by Timothy Prickett Morgan
Forget breaking the petaflops barrier, which, by the way, no supercomputer cluster has officially done yet but probably will by the next iteration of the semi-annual Top 500 ranking of global supers this fall. Supercomputer makers are already setting their sights on 10 petaflops and beyond. Last week, NASA's Ames Research Center announced that it has picked long-time partner Silicon Graphics and chip maker Intel to build its next-generation supercomputer, called Pleiades, based on Xeon processors.
The fact that NASA Ames has chosen SGI to help it build the next big supercomputer is no surprise. The NASA facility has been a stalwart SGI customer for many years and has helped SGI scale up the global shared memory that is at the heart of the company's Altix line of Linux-Itanium servers. NASA Ames is home to the Columbia supercomputer, formerly the fastest machine in the world. Today, the Columbia machine comprises 10,160 1.5 GHz "Madison" Itanium processors and has a peak rating of 61 teraflops. Columbia was installed to get NASA back on track with the Space Shuttle program after the loss of the shuttle Columbia, and it has also been used to run various weather and cosmology models.
The Pleiades machine, named after the famous star cluster in Taurus that goes by many names (Subaru in Japanese, for instance), is being built out of clusters of SGI's most recent Altix ICE Xeon-based blade servers rather than the Itanium-based Altix servers. The initial Pleiades machine will be based on Intel's current "Harpertown" quad-core Xeon processors and will have over 40 racks linked together in a parallel cluster (not using NUMA and global shared memory). Each rack will have 512 processor cores (that's 128 sockets and, presuming these are two-socket blades, 64 blades) and a total of 512 GB of main memory (presumably 8 GB of memory per blade). That works out to 20,480 Harpertown cores, 20 TB of main memory, and 450 TB of InfiniteStorage disk arrays linked to the servers through an InfiniBand fabric, which also links the blades to each other in the cluster. The initial Pleiades configuration will be rated at 245 teraflops of peak performance, according to SGI. According to Intel, the Pleiades machine will be upgraded to future "Nehalem" Xeons, which will be based on the QuickPath Interconnect scheme and will have memory controllers integrated on the chip (resulting in higher bandwidth and greater energy efficiency per teraflops than current Xeons can deliver). After that, to get to the 10 petaflops level, Intel and SGI will work together to create a machine based on future Xeon processors.
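The rack math above checks out, and the 245 teraflops peak rating can be reconstructed from it with a quick back-of-the-envelope calculation. The rack and memory figures come from the announcement; the 3 GHz clock speed and four floating point operations per cycle per core are my assumptions for a Harpertown-class Xeon, not numbers from SGI or Intel:

```python
# Sanity check of the initial Pleiades configuration described above.
# Rack counts and memory are from the article; clock and flops/cycle
# are assumed values for a Harpertown-class quad-core Xeon.
racks = 40
cores_per_rack = 512
mem_per_rack_gb = 512

total_cores = racks * cores_per_rack            # 20,480 cores
total_mem_tb = racks * mem_per_rack_gb / 1024   # 20 TB

clock_ghz = 3.0       # assumed clock speed
flops_per_cycle = 4   # SSE: 2 multiplies + 2 adds per cycle (assumed)
peak_tflops = total_cores * clock_ghz * flops_per_cycle / 1000

print(total_cores, total_mem_tb, peak_tflops)  # 20480 20.0 245.76
```

Under those assumptions the peak comes out to 245.76 teraflops, which lines up with the 245 teraflops figure SGI quotes.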
It is important to note that the quad-core "Tukwila" Itanium is not being used in the system, despite its very good number-crunching performance and its code compatibility with the Columbia supercomputer. Intel is also not pitching its six-core "Dunnington" Xeon chips for the machine, but is jumping from Harpertown straight to Nehalem, which will deliver a factor of 16 times more performance, or 3.92 petaflops. Intel has said publicly that the Nehalem chips will have at least eight cores per socket, so that is 2X of the performance right there. The extra memory bandwidth and chip-to-chip links of the QuickPath Interconnect could account for a lot of the jump in performance, too. HyperThreading is being put back into the Xeon chips as well, and that is worth maybe another 50 percent in performance. The shrink from 45 nanometer to 32 nanometer processes in 2009, after the Nehalems have been in the field for a year, could account for another 50 percent bump in performance over the Harpertown chips, since Intel could crank up the clocks and still stay within the Harpertown thermal envelope; it could also move to 16 cores and do the same. The rest of the performance gain to that 3.92 petaflops will probably come from adding nodes. My best guess is that SGI can double the number of racks to 80 and hit the target with the Nehalems.
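The scaling factors in that paragraph can be multiplied out to see how close they get to the promised 16X. The multipliers below are this article's speculation, not anything from Intel's roadmap:

```python
# Rough tally of where a 16X jump from the Harpertown-based Pleiades
# to a Nehalem-based one could come from, using the speculative
# factors discussed above (not official Intel or SGI figures).
core_count     = 2.0   # 8 cores per socket versus 4
hyperthreading = 1.5   # maybe 50 percent from SMT
process_shrink = 1.5   # maybe 50 percent from the 45 nm to 32 nm shrink
more_racks     = 2.0   # doubling the rack count from 40 to 80

combined = core_count * hyperthreading * process_shrink * more_racks
print(combined)              # 9.0
print(round(16 / combined, 2))  # ~1.78X left over
```

These factors compound to 9X, leaving a residual of roughly 1.8X to be made up elsewhere, which is consistent with the suggestion that QuickPath's memory and interconnect bandwidth accounts for a lot of the remaining jump.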
Going up by another factor of about 2.5 to hit 10 petaflops with the kicker to the Nehalem Xeon in the 2012 timeframe is a bit harder to speculate about.
To date, SGI's largest Altix ICE cluster is the 14,336-core box named Encanto installed at the New Mexico Computing Applications Center, which has a peak rating of 172 teraflops.
Like the Columbia Altix machine, the Pleiades machine will run Novell's SUSE Linux Enterprise Server 10 operating system and will presumably be upgraded with kickers as SLES 11 and 12 make it into the field over the next four years.