IBM’s First Power6 Box: A Glimpse Into System i 2008 Edition
May 29, 2007 Timothy Prickett Morgan
As the rumors swirling around for the past few weeks suggested it would, IBM last week launched the Power6 processor and the first server to make use of it. IBM had been hinting rather strongly that the first machine to get the Power6 chip would not be a System i, which got the Power5 chip first in 2004, but rather a System p box. That machine gives System i shops a glimpse of the midrange/high-end server that awaits them in the future, most likely in 2008.
The System p 570 machine that IBM announced last week as the first server to employ the dual-core Power6 chip will almost certainly come out the door as a System i with the same hardware and a modified i5/OS software stack. In fact, it is possible to run AIX 5.3 (the current AIX release, which has been tweaked to support the Power6 chip because AIX 6.1, formerly known as AIX 5.4, is running late) and i5/OS V5R4 or OS/400 V5R3 side by side on the System p 570 machine when it ships on June 8. But IBM limits the number of i5/OS or OS/400 partitions that can be installed on a System p box, so don’t think you can just put a single AIX partition with one-tenth of a processor allocated to it and then put i5/OS or OS/400 on the remaining capacity of the box to make yourself a Power6-based System i 570. (If you are really desperate to add processing capacity, IBM might be willing to let you test it out. It can’t hurt to ask.)
As I have explained since IBM began talking about the Power6 chip in early 2006, the Power6 is breaking with some of the trends in the chip industry, and IBM is doing so for what it believes are good reasons. The chip is a dual-core chip, like the Power4 and Power5 processors that preceded it to market in 2001 and 2004, but with the shift to 65 nanometer manufacturing techniques, IBM is using the transistor shrink to add more components, such as a decimal floating point unit and vector math units, while cranking up the clock speeds on the chip. Because IBM is shrinking from 90 nanometers to 65 nanometers, it can boost the speed on the processors and keep the Power6 systems in the same thermal envelope as the Power5+ systems that preceded them. Other server chip makers are using the chip shrink process to double the number of cores in a single socket from two cores 18 months ago to four now to eight next year, in the case of Intel and Advanced Micro Devices; Sun Microsystems has created a 16-core Sparc chip called “Rock” based on 65 nanometer technologies, which is expected next year.
By going with clock speed increases and high memory bandwidth with the Power6 design, IBM is hoping to entice customers who have not begun the task of multithreading their applications to opt for its RISC/Unix platform instead of those from competitors, which have much lower clock speeds. For certain kinds of batch work, which is serial and single-threaded in nature, having the fastest possible processor is the key to performance. And with over 300 GB/sec of aggregate bandwidth, the Power6 chip has plenty of memory and I/O bandwidth to keep processors that run at up to 4.7 GHz fed with data.
“We’ve got bandwidth coming out of our ears,” says Brad McCredie, the IBM fellow within IBM’s Systems and Technology Group who led the Power6 design. He also says that all of the speculation that IBM could not get yields or volumes on the Power6 chip is bunk. “There has been some noise about manufacturing issues, and we found this very interesting. The fact is, we have lots of chips, and we are filling up the top sort bucket as fast as the lower buckets.”
IBM has been disclosing the technologies within the Power6 chip design for more than a year, so many of the features of the chip are not a surprise. What IBM is not disclosing is what the thermal design point (TDP) is on the Power6 chips, and how that compares with the Power5 and Power5+ chips. McCredie says that a System p 570 system using Power6 chips and the new memory and I/O subsystems is within 5 percent of the same thermals as a p5 570 server using Power5+ processors running at 2.2 GHz.
The Power6 chip has around 790 million transistors, and has an area of 341 square millimeters. Each Power6 core has simultaneous multithreading (SMT) electronics on it, and rather than shrink the length of the instruction pipeline as the clock speed was increased, IBM’s chip engineers figured out ways for the pipeline to stay the same length and become more efficient, allowing the clock speed to be doubled from the Power5 design. Each Power6 has 8 MB of L2 cache memory (4 MB for each core) on the chip and the design allows for 32 MB of off-chip but on-package L3 cache memory to be added to each Power6 module. Main memory and L2 cache memory controllers are also embedded on the chip. Each core has a VMX vector processor math unit, which will make the Power6 very fast at floating point math, and the decimal floating point unit, which does base 10 or “money” math, can speed up decimal math operations by a factor of 2 to 7 in IBM’s initial tests.
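To see why a base 10 unit matters for “money” math, consider this minimal sketch. It uses Python’s software `decimal` module purely as an illustration; it does not exercise the Power6 hardware unit, and the function names and the 0.10 amount are made up for the example.

```python
from decimal import Decimal

def sum_binary(amount: float, count: int) -> float:
    """Add a binary floating point amount to itself `count` times."""
    total = 0.0
    for _ in range(count):
        total += amount
    return total

def sum_decimal(amount: str, count: int) -> Decimal:
    """Add a base 10 amount to itself `count` times, with exact decimal digits."""
    total = Decimal("0")
    step = Decimal(amount)
    for _ in range(count):
        total += step
    return total

# Ten dimes should make a dollar.
binary_total = sum_binary(0.10, 10)
decimal_total = sum_decimal("0.10", 10)

print(binary_total == 1.0)               # False: 0.10 has no exact binary form
print(decimal_total == Decimal("1.00"))  # True: base 10 arithmetic is exact here
```

Binary floating point cannot represent 0.10 exactly, so the rounding error accumulates; decimal arithmetic keeps the cents exact, which is why commercial workloads do this kind of math in base 10, whether in software or in hardware.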
According to McCredie, the Power6 architecture also includes a two-tier star configuration for SMP clustering of processors, which is more scalable than the ring architecture of the Power4 and Power5 server designs and which provides much lower latency than the prior Squadron servers. (IBM has not discussed this feature yet with the press.) The Power6 chip also provides multiple power supply rails for different segments of the chip, which allows unused components to be put to sleep to save energy, as well as voltage and frequency slewing, which can cut the electricity usage of the Power6 chip by 50 percent without sacrificing performance. Finally, Power6 was designed from the get-go to scale up and down in clock frequencies, which means IBM can crank the clocks down to make a Power6-based blade server, which is indeed the plan as the Power6 chip is rolled out into various IBM product lines.
The Power6 chip is being released at 3.5 GHz, 4.2 GHz, and 4.7 GHz clock frequencies, and rather than rolling out a whole new server design, IBM is tweaking the current “Squadron” server design to get the Power6 chip into the System p 570 system. The System p 570 is one of the original workhorses of the Power5 lineup, and is composed of four two-socket servers that are lashed together with fiber optic links to create a single system image that can bring up to 16 processor cores to bear on a single workload. McCredie says that to make the System p 570 ready for the Power6 processor, IBM created a new processor card, a new I/O hub, and a new memory subsystem that provides 12 memory slots per CPU socket, compared to the eight memory slots that Power5-based p5 570 servers had. The current System p 570 machine has roughly two times the performance of the original System p5 570s from 2004.
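The arithmetic behind those figures is simple enough to sketch. The constant and function names below are illustrative only, and the total slot counts are derived from the per-socket numbers rather than quoted by IBM.

```python
# Figures taken from the description of the System p 570.
CHASSIS_PER_SYSTEM = 4         # four two-socket servers lashed together
SOCKETS_PER_CHASSIS = 2
CORES_PER_CHIP = 2             # Power6 is a dual-core chip
SLOTS_PER_SOCKET_POWER6 = 12   # new memory subsystem
SLOTS_PER_SOCKET_POWER5 = 8    # Power5-based p5 570

def total_sockets() -> int:
    """Sockets in a fully configured four-chassis system."""
    return CHASSIS_PER_SYSTEM * SOCKETS_PER_CHASSIS

def total_cores() -> int:
    """Cores in a fully configured system, one dual-core chip per socket."""
    return total_sockets() * CORES_PER_CHIP

def total_memory_slots(slots_per_socket: int) -> int:
    """Memory slots across all sockets (a derived figure, not an IBM spec)."""
    return total_sockets() * slots_per_socket

print(total_cores())                                # 16
print(total_memory_slots(SLOTS_PER_SOCKET_POWER6))  # 96
print(total_memory_slots(SLOTS_PER_SOCKET_POWER5))  # 64
```

If the per-socket numbers hold across all four chassis, the new memory subsystem gives a full system half again as many memory slots as its Power5-based predecessor.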
What IBM did not do last week, obviously, is roll out a completely revamped set of motherboards for the Power6 servers, which is what it needed to do with the Power4 and Power5 machines because the respective “Regatta” and “Squadron” server designs were so different from their predecessors. IBM first plans to roll out the Power6 chips across the System p and System i product lines, and McCredie hints that it may take until early 2008 to get the job done even in the System p Unix line. The System i line, which runs predominantly i5/OS, IBM’s proprietary operating system, is not expected to get Power6 processors until early 2008, but market conditions could force IBM to move that announcement up.
The Power6-based System p 570 server announced last Monday can support AIX 5.2 in a limited fashion (without logical partitioning), but really requires AIX 5.3 at technology level 6 to be useful. AIX 6, which was known as AIX 5.4 until this morning, is designed to take full advantage of the Power6 chips, and is now slated for delivery in November. Only two months ago, IBM was expecting this update to AIX to be available in October, so there is still more slippage on the software front. Incidentally, AIX 5.3 has not been recompiled for Power6, and neither has IBM’s DB2 database or the SAP ERP application suite that IBM has used in benchmark tests to prove the mettle of the Power6 chip. When AIX 6 is out and systems software is recompiled, you can bet IBM will run tests and show considerably more performance gains.
Novell’s SUSE Linux Enterprise Server 10 is certified to run on the new System p 570 server with the Power6 chip, and Red Hat is expected to certify RHEL 4 Update 5 and RHEL 5 by the third quarter of this year.
The amount of main memory that customers can put in the new System p 570 server using the Power6 processor varies depending on the kind of DDR2 main memory customers use. Using slow 400 MHz DIMMs, main memory scales from 256 GB to 768 GB, but using faster 533 MHz DIMMs drops the capacity the box can hold by half. And using 667 MHz DIMMs cuts it in half again, to a maximum of 192 GB. Each chassis in a System p 570 server has four PCI-Express 8x slots and two PCI-X slots, and each can hold six SAS drives for up to 1.8 TB of disk space. (A fully loaded 16-core System p 570 would have four of these chassis lashed together.) Memory and processors can be turned on and off as needed in the system, as workloads demand and under various utility-style pricing schemes from IBM.
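The tradeoff between DIMM speed and maximum capacity can be worked out in a few lines. This is only a sketch of the halving rule as described here; the names are hypothetical, and the 384 GB figure for 533 MHz DIMMs is derived from the other two numbers rather than stated by IBM.

```python
# Maximum capacity with the slowest DIMMs, in GB.
MAX_MEMORY_GB_AT_400MHZ = 768

# DDR2 DIMM speeds, slowest to fastest; each step up halves the maximum capacity.
DIMM_SPEEDS_MHZ = [400, 533, 667]

def max_memory_gb(speed_mhz: int) -> int:
    """Return the maximum main memory for a given DIMM speed."""
    steps = DIMM_SPEEDS_MHZ.index(speed_mhz)  # raises ValueError for other speeds
    return MAX_MEMORY_GB_AT_400MHZ // (2 ** steps)

for speed in DIMM_SPEEDS_MHZ:
    print(f"{speed} MHz DIMMs: up to {max_memory_gb(speed)} GB")
```

Run as written, this gives 768 GB, 384 GB, and 192 GB, so shops have to decide up front whether memory capacity or memory speed matters more for their workloads.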
The base System p 570 server with the Power6 chip comes with two 3.5 GHz cores activated, 16 GB of main memory, and two 73.4 GB SAS disk drives spinning at 15K RPM. This configuration costs $60,000. A midrange configuration with eight 4.2 GHz cores, 32 GB of main memory, and two SAS disks is not given a price on the IBM site, and neither is a larger 16-core configuration with 128 GB of main memory.
The System p 570 server with the Power6 chip will be available on June 8.