IBM's Blue Gene/L Shows Off Minimalist Server Design

by Timothy Prickett Morgan

IBM recently gave the world its first look at the Blue Gene/L supercomputer, a scaled-down version of the original million-processor, 1 petaflops Blue Gene supercomputer that Big Blue's IBM Research division announced, to much fanfare, in 1999. While Blue Gene/L doesn't have a lot in common with the current iSeries, some of the minimalist ideas in the machine could very well find their way into the iSeries and other Power-based servers in the not-so-distant future.

Back in November 2002, IBM won a $290 million contract from the U.S. government's Department of Energy to build two machines: a puppy Blue Gene machine that runs a stripped-down version of the Linux operating system, and the ASCI Purple massively parallel supercomputer, which is based on IBM's forthcoming "Squadron" line of Power5 servers. Blue Gene/L represents about $100 million of that contract, which calls for IBM to deliver a 360 teraflops machine composed of 65,536 compute processors. ASCI Purple will be composed of 196 64-way Squadron servers using dual-core Power5 processors, probably running at 2 GHz. This machine will have a total of 12,544 processor cores and will use a daisy chain of IBM's new "Federation" High Performance Switch (HPS) interconnection switches to link them all together.

When the ASCI Purple deal was announced last year, some information about the Blue Gene/L machine also leaked out, even though IBM didn't intend for this to happen. IBM seems to have crammed a lot more electronics onto the Blue Gene/L custom processors than it expected to. According to Paul Coteus, one of the managers of the Blue Gene/L project, the custom processors used in the Blue Gene/L machine are just about complete systems on a chip, and the system components are based on well-tested and well-understood IBM technologies.

"Blue Gene/L is a poster child for what is possible with the system-on-a-chip concept," says Coteus. "More than 90 percent of this chip was built from standard blocks in our technology library."

Specifically, Blue Gene/L processors are based on the 32-bit PowerPC 440 embedded processor cores, each of which has two floating point units. Like the Power4, Power4+, and Power5 processors, the Blue Gene/L processors have two cores on a single die; each core has its own L1 cache, and the two share the L2 and L3 caches. All three levels of cache (L1, L2, and L3) are embedded on the chip, and so is the main memory controller for each chip. The chip has 4 MB of L3 cache, which is pretty hefty. In addition, three different network interfaces (one of them circuitry to provide Gigabit Ethernet links) are right there on the chip, too. The two other interfaces support a three-dimensional torus interconnect that runs the Message Passing Interface (MPI) protocol commonly used in parallel supercomputer clusters.

Running at the 700 MHz clock speed expected when Blue Gene/L is delivered to Lawrence Livermore National Laboratory in the first quarter of 2005, these special chips dissipate only 1.5 watts per core, and the whole system-on-a-chip dissipates only 12 watts while delivering 5.6 gigaflops of floating point performance. That is very impressive, particularly since the chip is made using only IBM's 130 nanometer CMOS 8-Cu11 chip process.

The prototype Blue Gene/L machine that IBM is showing off today will have 256 of these processors in a 21U half-rack, delivering 512 compute nodes (running at 500 MHz instead of 700 MHz) and a total of 2 teraflops of aggregate computing power. Coteus says that IBM has run the Linpack Fortran benchmark on the prototype, and that it has run at just under 70 percent efficiency. This is about as good as any other parallel supercomputer does, and IBM's own pSeries-AIX parallel machines often run much less efficiently.
Linux clusters are generally very inefficient on many workloads because they use commodity interconnection technology that forces processors to spend most of their time waiting for something to do. By slowing down the processors, as IBM has done with Blue Gene/L, and by putting the interconnection technology on the chip, IBM has been able to dramatically reduce system latencies and thereby improve the efficiency of a Linux cluster.

Slowing down processors is counterintuitive at first, but remember that memory speeds and processor speeds are diverging because memory technology has not evolved as fast as processor technology. By slowing the processors down (and using many more of them to get work done), each processor can be better fed by its memory subsystem. Having the L1, L2, and L3 caches and the main memory controller on chip helps in this regard, too, as does having on-chip connections to other processors and data storage devices. Faster is not always better.

The full Blue Gene/L machine that will be delivered to Lawrence Livermore in early 2005 will support 16 TB or 32 TB of main memory (that's lower than expected, but IBM seems to have doubled the amount of L3 cache to balance performance). Blue Gene/L will occupy 64 racks, take up 2,500 square feet of floor space, and consume 1.5 megawatts of power.

This sounds big, but let's compare it to other behemoths in the supercomputing world. The current goliath of supercomputing is Earth Simulator, built by NEC Corp. for the Japanese government, which is rated at 40 teraflops. Earth Simulator is a parallel cluster of vector processors that run at 500 MHz. It is very, very big, occupying 34,000 square feet of floor space. It consumes 5 megawatts of power, and it cost $350 million, or about $8.75 per megaflops.
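The ratios behind this kind of comparison are simple divisions of cost and power by peak rating. Here is a minimal sketch that uses only figures quoted in this article: Earth Simulator at 40 teraflops, $350 million, and 5 megawatts, and Blue Gene/L at 360 teraflops, about $100 million of the DOE contract, and 1.5 megawatts. The `Machine` class and its method names are illustrative, not anything from IBM or NEC.

```python
from dataclasses import dataclass

@dataclass
class Machine:
    name: str
    peak_teraflops: float      # rated performance
    cost_million_usd: float    # purchase price
    power_megawatts: float     # electrical power drawn

    def dollars_per_megaflops(self):
        # 1 teraflops = 1,000,000 megaflops
        return (self.cost_million_usd * 1e6) / (self.peak_teraflops * 1e6)

    def megaflops_per_watt(self):
        # 1 megawatt = 1,000,000 watts
        return (self.peak_teraflops * 1e6) / (self.power_megawatts * 1e6)

# Figures as quoted in the article.
earth_simulator = Machine("Earth Simulator", 40, 350, 5)
blue_gene_l = Machine("Blue Gene/L", 360, 100, 1.5)

for m in (earth_simulator, blue_gene_l):
    print(f"{m.name}: ${m.dollars_per_megaflops():.2f} per megaflops, "
          f"{m.megaflops_per_watt():.0f} megaflops per watt")
```

On these inputs Earth Simulator works out to $8.75 per megaflops and 8 megaflops per watt, while Blue Gene/L works out to about 28 cents per megaflops and 240 megaflops per watt.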
ASCI Purple will cost $1.90 per megaflops, and while this is a big improvement over the ASCI White parallel supercomputer IBM built for DOE a few years back (which is rated at 12.3 teraflops and which cost a little over $8 per megaflops), Blue Gene/L will cost only 28 cents per megaflops. That's one hell of a big improvement in price/performance. Blue Gene/L takes up one quarter of the floor space of ASCI White, too.

Perhaps most significant is power efficiency. ASCI White delivered 12,300 kiloflops per watt. Earth Simulator delivers only 8,000 kiloflops per watt. (This is a terrible ratio, but vector applications run very fast on this machine, and that is important, too.) ASCI Purple will consume 4.7 megawatts of power, and deliver only 21,275 kiloflops per watt. Blue Gene/L will beat them all, delivering a stunning 240,000 kiloflops per watt. That means Blue Gene/L will take a lot less money to run and cool than these other monsters.

Coteus says that the core processors in the Blue Gene/L machine are running a very bare-bones Linux kernel, and that an additional 1,000 processors in the cluster (which are not counted in that 65,536 processor count) will run a fuller implementation of Linux, since they will need to coordinate with I/O devices. IBM has ported GNU Fortran, C, and C++ compilers to the machine, and is using a Linux cluster called "Mambo" that simulates Blue Gene/L to tune these compilers.

While all of this is interesting, Coteus and his colleagues at IBM are not quite ready to commercialize Blue Gene/L. While he is optimistic that the performance of Blue Gene will scale, there is a big difference between testing a 512-node cluster and testing a 65,536-node cluster. However, he says that IBM is talking to a number of research institutions about getting beta Blue Gene systems, and IBM is trying to figure out if, when, and how to commercialize Blue Gene/L.
This project could upset whatever plans IBM has for Power5 and Power6 servers, especially after the numbers from the IBM Research team behind Blue Gene/L become widely known. Even customers who only want a few hundred gigaflops or a few teraflops of power are going to want the kind of bang for the buck that the designers of the Blue Gene/L prototype are bragging about.
Editor: Timothy Prickett Morgan
Managing Editor: Shannon Pastore
Contributing Editors: Dan Burger, Joe Hertvik, Kevin Vandever,
Shannon O'Donnell, Victor Rozek, Hesh Wiener, Alex Woodie
Publisher and Advertising Director: Jenny Thomas
Advertising Sales Representative: Kim Reed
Contact the Editors: To contact anyone on the IT Jungle Team, go to our contacts page and send us a message.
Copyright © 1996-2008 Guild Companies, Inc. All Rights Reserved.