tlb
Volume 4, Number 41 -- November 6, 2007

Cray Revamps Supercomputers with XT5 Designs

Published: November 6, 2007

by Timothy Prickett Morgan

Supercomputer maker Cray moves one step closer to its long-term goal of converging its various high performance computing machines today with the launch of the Linux-based XT5 family of machines. The XT5 and XT5h machines are the fourth generation of massively parallel machines based on the "Red Storm" design that Cray sold to the U.S. government in 2002 as part of the U.S. Department of Energy's ASCI program to use computers to manage the country's stockpile of nuclear weapons and design new ones without detonating actual warheads.

The Red Storm supercomputer, which is installed at Sandia National Laboratories, takes Opteron processors and their HyperTransport links and marries them to a high-bandwidth, low-latency interconnect called SeaStar, which Cray designed to put thousands and thousands of processors into a single complex. The XT3, the first commercialized product based on the Red Storm design, was in volume production in early 2005 and was followed in late 2006 by the XT4, with its upgraded Opterons and SeaStar-2 interconnect.

With the XT5 family of machines, Cray is tweaking the blade-style packaging for its compute and I/O blades; its rival in the HPC market, Silicon Graphics, has also moved to blade packaging in its recent Itanium and Xeon cluster designs. The XT5 family is still based on Opterons, however, and has to be, since Intel does not have anything resembling HyperTransport for the SeaStar interconnect to link into easily. This could, of course, change when Intel's future "Nehalem" Xeons and "Tukwila" Itaniums come to market with Intel's QuickPath interconnect and integrated memory controllers, which have an architecture that is suspiciously parallel to that of the Opteron-HyperTransport scheme.

The XT5 comes in two flavors: the XT5, which is based solely on Opteron blades, and the XT5h, a hybrid that supports blades with either vector processors or field programmable gate arrays (FPGAs) alongside Opteron blades.

The XT5 supercomputers can employ the existing XT4 blades--customers do not have to upgrade to the latest dual-core and quad-core Opteron processors from Advanced Micro Devices to get the new chassis and its improved SeaStar-2+ interconnect. The XT4 blades, which have four Opteron sockets and four SeaStar-2+ interconnect chips, are aimed at workloads where customers need a balance of compute and interconnection bandwidth, while the new XT5 blades have eight processor sockets and four times the main memory sharing the same four SeaStar-2+ interconnects. The XT5 blades are aimed at workloads that are memory intensive or compute intensive, or both, but which do not need to communicate with other blades as much. Customers can mix and match the blades in a single XT5 chassis to get a mix of board styles that matches their particular workloads. The XT5 also supports an SIO blade, which as the name suggests handles I/O requests into and out of the SeaStar-2+ interconnect, linking disk arrays and other peripherals to the machines.
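To make the compute-versus-interconnect tradeoff between the two blade styles concrete, here is a small Python sketch using only the socket and SeaStar-2+ counts quoted above. The 12-and-12 blade mix in the usage note is a hypothetical example, not a configuration Cray has described, and the function names are made up for illustration.

```python
# Socket and interconnect chip counts per blade, as quoted in the article.
BLADE_SPECS = {
    "XT4": {"sockets": 4, "seastar_chips": 4},
    "XT5": {"sockets": 8, "seastar_chips": 4},
}


def sockets_per_seastar(blade_type):
    """Opteron sockets sharing each SeaStar-2+ chip on one blade."""
    spec = BLADE_SPECS[blade_type]
    return spec["sockets"] / spec["seastar_chips"]


def mixed_chassis(xt4_blades, xt5_blades):
    """Totals for a hypothetical mix of XT4 and XT5 blades.

    Returns (total sockets, total SeaStar-2+ chips, sockets per chip).
    """
    sockets = (xt4_blades * BLADE_SPECS["XT4"]["sockets"]
               + xt5_blades * BLADE_SPECS["XT5"]["sockets"])
    chips = (xt4_blades * BLADE_SPECS["XT4"]["seastar_chips"]
             + xt5_blades * BLADE_SPECS["XT5"]["seastar_chips"])
    return sockets, chips, sockets / chips
```

On these numbers an XT4 blade has one socket per interconnect chip and an XT5 blade has two, which is why the denser blade suits work that communicates less. Calling `mixed_chassis(12, 12)` shows how an assumed half-and-half mix lands between the two ratios.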

With the XT5h, Cray is throwing in a kicker to its X1 vector processor, which is itself a virtual vector processor, called a MultiStreaming Processor, that is made up of four vector chips that act like a bigger and more powerful single chip. The blade sporting the new X2 vector processor is the X2 vector blade, which has four vector processors that link into the SeaStar-2+ interconnect, allowing up to 1,024 vector processors to be linked into a single shared memory system. (These blades can support legacy Cray vector applications.) Each processor on the blade is rated at 25 gigaflops, so 1,024 of them yield a vector machine that tops out at 25.6 teraflops.

Cray is also integrating FPGAs, which can be programmed to handle specific applications, in its XR1 blade. This blade has four SeaStar-2+ chips, two Opteron processors, and four FPGAs from Xilinx on it. According to Jan Silverman, Cray's senior vice president of corporate strategy and business development, this XR1 blade does not implement the hybrid FPGA-Opteron technology that Cray got through its acquisition of Canadian supercomputer maker OctigaBay in March 2004 for $115 million. But you can bet some ideas were heavily borrowed from those machines, which Cray renamed the XD1.

The XT5h also supports a global address space for vector processing nodes that allows applications written in Unified Parallel C and Co-Array Fortran to run on the boxes. This supplements the Message Passing Interface (MPI) method of parallel processing, which breaks applications and data sets into small chunks and runs them in parallel, and which is the only option on the Opteron blades. These new C and Fortran compilers try to mask some of the parallelism in the machine and allow programmers to code more like they would on an SMP box.
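For readers who have not worked with message passing, the chunk-and-combine pattern described above can be sketched in a few lines of Python. This is a single-process illustration only: it uses no MPI library, the "ranks" are just list positions, and the function names and the sum-of-squares workload are assumptions made for the example, not anything from Cray's software stack.

```python
# Single-process illustration of the decomposition pattern behind
# message-passing programs. No MPI library is used; each "rank" is
# simply one chunk in a list (an assumption so the sketch runs anywhere).


def split_into_chunks(data, n_ranks):
    """Divide data into n_ranks contiguous chunks of near-equal size."""
    base, extra = divmod(len(data), n_ranks)
    chunks = []
    start = 0
    for rank in range(n_ranks):
        # The first `extra` ranks take one additional element each.
        size = base + (1 if rank < extra else 0)
        chunks.append(data[start:start + size])
        start += size
    return chunks


def local_work(chunk):
    """Work one rank does on its own chunk: here, a sum of squares."""
    return sum(x * x for x in chunk)


def run_decomposed(data, n_ranks):
    """Scatter the data, compute per-chunk results, then combine them."""
    chunks = split_into_chunks(data, n_ranks)
    # On a real machine these would run concurrently on separate nodes.
    partials = [local_work(chunk) for chunk in chunks]
    # Stands in for the reduction step that gathers partial results.
    return sum(partials)
```

In a real message-passing program the programmer writes the scatter and the reduction explicitly, which is the bookkeeping that global address space languages aim to hide.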

The XT5 and XT5h machines run a variant of Novell's SUSE Linux Enterprise Server 10. Cray is also peddling a variant of the open source Lustre file system to serve all of the nodes in the XT5 and XT5h machines.

An XT5 cabinet supports up to 192 Opteron sockets, or a maximum of 768 cores, and using the new "Barcelona" quad-core Opterons from AMD, that works out to about 7 teraflops per cabinet. Silverman says that a typical XT5 cabinet costs around $500,000, with a box with lots of memory and I/O having a price tag north of $1 million.
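The cabinet figures can be cross-checked against the blade counts given elsewhere in this story. The Python sketch below uses only numbers quoted in the article (24 compute blades per cabinet, eight sockets per XT5 blade, four cores per Barcelona chip, about 7 teraflops per cabinet); the per-core figure it derives is an implied average, not a number from Cray or AMD.

```python
# Figures quoted in the article.
BLADES_PER_CABINET = 24        # compute blades in one XT5 cabinet
SOCKETS_PER_XT5_BLADE = 8      # Opteron sockets on an XT5 blade
CORES_PER_SOCKET = 4           # quad-core "Barcelona" Opteron
CABINET_PEAK_TERAFLOPS = 7.0   # "about 7 teraflops per cabinet"


def cabinet_summary():
    """Return (sockets, cores, implied gigaflops per core) for one cabinet."""
    sockets = BLADES_PER_CABINET * SOCKETS_PER_XT5_BLADE
    cores = sockets * CORES_PER_SOCKET
    # Derived value: the rounded cabinet peak spread evenly across cores.
    gigaflops_per_core = CABINET_PEAK_TERAFLOPS * 1000 / cores
    return sockets, cores, gigaflops_per_core
```

The blade arithmetic reproduces the 192 sockets and 768 cores cited above, and the rounded cabinet peak works out to a little over 9 gigaflops per core.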

Two of the things that Cray will be pushing with the XT5 designs, aside from the various processing elements they embody, are density and power efficiency, which are the mantras of all server makers these days. The upgrade to the Red Storm machine at Sandia in 2005 needed 120 cabinets to hit its 43 teraflops of performance, but an XT5 machine will be able to do the same job with only six cabinets. That's a factor of 20 reduction in floor space in under three years. None of the blades in the Cray cabinet has a fan; instead, cold air is pulled directly from ducts in the floor and blown up through the 24 compute blades by a single, large, high-efficiency axial turbofan that sits beneath them. (This fan is a lot more reliable than the muffin fans used in servers today, and is also a lot quieter than a zillion of them humming away.) The XT5 cabinet also has a 400/480 volt power distribution unit in its base, which feeds a bank of modular power supplies that in turn supply power to each blade. The PDU runs at the same voltage that comes into the data center, so the power does not need to be stepped down, a conversion that wastes some energy and generates heat.
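The density comparison with the 2005 Red Storm upgrade is easy to verify. The following Python sketch takes the three figures from the paragraph above (43 teraflops, 120 cabinets, six cabinets) and derives the per-cabinet performance on each side; the function name and dictionary keys are illustrative.

```python
def density_comparison(total_teraflops, old_cabinets, new_cabinets):
    """Compare two machines that deliver the same total performance."""
    return {
        "old_tf_per_cabinet": total_teraflops / old_cabinets,
        "new_tf_per_cabinet": total_teraflops / new_cabinets,
        "cabinet_reduction_factor": old_cabinets / new_cabinets,
    }


# Figures quoted in the article: 43 teraflops, 120 cabinets versus six.
red_storm_vs_xt5 = density_comparison(43.0, 120, 6)
```

The reduction factor comes out at 20, matching the article, and the implied new-cabinet figure of a bit more than 7 teraflops is in line with the roughly 7 teraflops per cabinet quoted for Barcelona-based XT5 machines.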





Copyright © 1996-2008 Guild Companies, Inc. All Rights Reserved.
Guild Companies, Inc., 50 Park Terrace East, Suite 8F, New York, NY 10034