The System iWant, 2010 Edition: Big Boxes
January 18, 2010 Timothy Prickett Morgan
With the Power7 processors not expected until later this year, now is as good a time as any to think about what these boxes might look like. In the past several months, I have walked you through all of the details I could find about the upcoming Power7 processors and the very sparse technical details about the servers that will make use of them. In the absence of real data, it is always a good idea to do a thought experiment about what these machines should look like to compare and contrast them with the Power Systems that IBM eventually does deliver and the competitive landscape into which they are launched.
This week, I want to start at the high end of the future Power7 lineup, which I am calling the System iWant, 2010 Edition, just to be a thorn in the side of IBM’s marketeers, who just can’t seem to give this box its own identity and a name that has some pizzazz or at least makes sense. Power Systems i is really atrocious, and so is calling OS/400 IBM i, which very few people do. No one knows what to call this machine and its software platform any more, and as I have said many times, this is the first thing that should change.
Anyway, this week, I want to start at the high end with the big boxes, what are known as the Power 595s in the current Power6/6+ generation. The Power 595s never got a Power6+ variant of the processor and have been based from the beginning on 4.2 GHz and 5 GHz versions of the Power6 chip. The Power6+ was supposed to provide more oomph to the Power Systems family as clock speeds were cranked up, but that never happened. In entry and midrange machines, IBM doubled up the processors with the Power6+ generation at the end of 2008 and in early 2009 and that is how it delivered more bang (and more bang for the buck). But the Power 595 was relatively stagnant for more than two years–something that is only possible because Sun Microsystems has massively screwed up its own UltraSparc servers and Fujitsu was late with quad-core processors in the boxes that Sun resells as the Sparc Enterprise M line, and Hewlett-Packard has been hamstrung by Intel’s delays with its quad-core “Tukwila” Itanium processors.
The Power 595 came in two flavors, spanning from eight to 64 cores using the 4.2 GHz Power6 engines and spanning from 16 to 64 engines using the 5 GHz variant. Main memory spanned from 16 GB to 4 TB, with 32 DDR2 memory slots per processor book (what we might call a cell board in another platform); the machine could have from one to eight of these books. Instead of using multichip module packaging, as it did during the Power5/5+ generation, putting multiple chips and L3 caches on a ceramic substrate and wiring it up on a board, with the Power6 version of the Power 595, IBM put a single dual-core Power6 chip and its 32 MB L3 cache onto a single chip module, and then put four of these onto a cell board, much as you would with a plain vanilla four-socket X64 box. The resulting machine came in a 24-inch-wide non-standard chassis, had 64 cores, 128 threads, 1 GB of L3 cache, and 256 memory slots topping out at 4 TB (if you used 533 MHz main memory instead of 667 MHz memory, which topped out at 1 TB).
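The tallies for the fully loaded Power 595 follow directly from the per-book figures above. Here is a minimal sketch of that arithmetic in Python; the constant names are made up for illustration, and the two threads per core is inferred from the 64-core, 128-thread totals rather than stated separately.

```python
# Power 595 building blocks, as described in the paragraph above.
BOOKS = 8              # processor books in a full system
CHIPS_PER_BOOK = 4     # single-chip modules per book
CORES_PER_CHIP = 2     # dual-core Power6
THREADS_PER_CORE = 2   # inferred from 64 cores / 128 threads
L3_MB_PER_CHIP = 32    # L3 cache packaged with each chip
SLOTS_PER_BOOK = 32    # DDR2 memory slots per book

chips = BOOKS * CHIPS_PER_BOOK
cores = chips * CORES_PER_CHIP
threads = cores * THREADS_PER_CORE
l3_gb = chips * L3_MB_PER_CHIP / 1024
slots = BOOKS * SLOTS_PER_BOOK

print(f"{cores} cores, {threads} threads, {l3_gb:g} GB L3, {slots} memory slots")
```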
IBM has been very vague about the performance of the future Power7-based systems, and has only said that using the eight-core chips, the future Power Systems machines will be able to deliver two to three times the performance in the same power envelope as the existing Power6/6+ versions of the lineup. Let’s take the high end of that estimate and see where it leads us. IBM will be reducing its clock speeds even as it cranks the cores and boosts the threads on the Power7 chips. We already know that IBM can hit 4 GHz with the special Power7 multichip modules used in the “Blue Waters” petaflops supercomputer being installed at the University of Illinois, and I already estimated that just one slice of this very interesting design–a 2U rack with eight MCMs, each with four eight-core chips on it–with 256 cores would weigh in at about 800,000 CPWs of aggregate OS/400-style computing performance. The 64-core Power 595 is rated at 294,700 CPWs using the 5 GHz Power6 chips (that’s two 32-core partitions, since i 6.1 can’t span more than 32 cores, apparently), and that gives us a factor of 2.7 in aggregate performance improvement.
So I think IBM will deliver a top-end commercial machine with 256 Power7 cores, and with four threads per core, that works out to 1,024 threads. That is eight times the number of instruction threads compared to the Power 595, and even if IBM drops the clock speeds of the Power7 cores as low as 3 GHz, that still works out to something like 600,000 CPWs of online transaction power.
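For anyone who wants to check the math, here is a rough sketch of the estimate in Python. It assumes CPW scales linearly with clock speed, which is a simplification and not something IBM has said; the function name `scaled_cpw` is just an illustrative helper.

```python
# Figures quoted in the article.
P595_CPW = 294_700          # 64-core, 5 GHz Power 595
P595_THREADS = 128          # 64 cores, two threads each
POWER7_CPW_4GHZ = 800_000   # estimate for 256 Power7 cores at 4 GHz

def scaled_cpw(clock_ghz, base_cpw=POWER7_CPW_4GHZ, base_clock_ghz=4.0):
    """Scale the 4 GHz estimate to another clock speed.

    Assumes performance is proportional to clock speed, which is
    only a first-order approximation.
    """
    return base_cpw * clock_ghz / base_clock_ghz

cores, threads_per_core = 256, 4
threads = cores * threads_per_core
thread_ratio = threads // P595_THREADS
speedup = POWER7_CPW_4GHZ / P595_CPW

print(f"{threads} threads, {thread_ratio}x the Power 595")
print(f"aggregate speedup at 4 GHz: {speedup:.1f}x")
print(f"estimated CPW at 3 GHz: {scaled_cpw(3.0):,.0f}")
```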
By the way, the PowerVM hypervisor will probably support one virtual machine per thread at its finest granularity starting with Power7 iron, but could do 10 VMs per processor core up to a maximum of 1,000. IBM artificially capped the top-end number of virtual machines per system at 254 on Power5 and Power6 iron (for reasons that have never been clearly explained) while offering a maximum granularity of 10 VMs per core. I think that if the PowerVM hypervisor can support 10 VMs per core now, it should push it to 20 with i 7.1 and offer as many as 5,120 VMs on the biggest Power7 box. This is what it will take to be the basis of cloudy-style infrastructure, and this is something that no other platform can deliver. (That would work out to somewhere between 117 and 156 CPWs per VM, I realize, which is not much.)
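The VM counts and the CPW-per-VM range work out as follows. This is a back-of-the-envelope sketch in Python, not a model of how PowerVM actually allocates capacity; `max_vms` is a hypothetical helper, and the 600,000 and 800,000 CPW figures are the 3 GHz and 4 GHz estimates from earlier in the article.

```python
def max_vms(cores, vms_per_core, system_cap=None):
    """Upper bound on VMs: cores times per-core granularity, optionally capped."""
    total = cores * vms_per_core
    return total if system_cap is None else min(total, system_cap)

# Power5/Power6 era: 10 VMs per core, capped at 254 per system.
p595_vms = max_vms(64, 10, system_cap=254)

# The wish for the big Power7 box: 20 VMs per core, no artificial cap.
power7_vms = max_vms(256, 20)

# CPW per VM across the 3 GHz and 4 GHz performance estimates.
cpw_per_vm_low = 600_000 / power7_vms
cpw_per_vm_high = 800_000 / power7_vms

print(f"Power 595: {p595_vms} VMs; Power7 box: {power7_vms} VMs")
print(f"CPW per VM: {cpw_per_vm_low:.0f} to {cpw_per_vm_high:.0f}")
```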
I do not think IBM will use MCM packaging, as it does with the Blue Waters Power7 IH supercomputer nodes, and as it did in the Power5/5+ and prior generations of i and p boxes. Instead, I think it will plunk Power7 chips into sockets on processor books as it did with the Power6-based Power 595s. This is a lot cheaper to manufacture. The advent of the on-chip 32 MB of embedded DRAM (eDRAM) in the Power7 chip design means IBM doesn’t even have to put the L3 cache and the chip on a single package before slapping it onto the book. Just solder it on there and be done with it. I would guess further that IBM will once again put four chips onto a book and eight books into a system, which is the way IBM can argue that a lot of the guts of the Power 595 machine remain as it is upgraded to a future Power7 box, and therefore, as Big Blue has promised, it can offer upgrades that preserve serial numbers. In fact, it is the preservation of serial numbers that makes it an upgrade rather than a box swap, and there are accounting rules that govern this.
It is hard to say what remote I/O technology IBM will use in the future high-end box, but my guess is that IBM will goose the InfiniBand interconnect that is at the heart of the 12X I/O links from its current 10 Gb/sec speed to 40 Gb/sec. It is remotely possible that IBM will take the exotic interconnect used in the hub/switch module embedded in that Power7 IH supercomputer node. (See How Does 800,000 CPWs in a 2U Server Grab You? for the details on this.) This switch module marries IBM’s “Federation” supercomputer switch and InfiniBand technologies together and allows for 2,048 Power7 IH drawers to be linked together in a relatively flat, low-latency network with 1,128 GB/sec (I said gigabyte, not gigabit) of aggregate bandwidth for the nodes. Just a fraction of this hub/switch could be used to link the central processor complex in a 595-class machine to a lot of I/O drawers.
I think the future big, bad Power7 box will have more main memory per core than the Power7 IH super node, which offers only 1 TB for 256 cores, or 4 GB per core. Using slow DDR2 memory speeds, the Power 595 could offer 16 GB to 64 GB of memory per core. Provided IBM can pack in enough memory slots, it seems reasonable that this ratio will stay about the same, but to be honest, 4 TB seemed a bit high compared to other boxes. I would not be surprised if IBM topped out the initial high-end Power7 machines at 32 GB per core, which is 8 TB of main memory. That is a lot of memory, and even at the lower prices IBM started charging for memory in November 2009, it would still work out to $3.36 million for maximum memory at current prices for DDR2 modules (40 cents per MB). On the fat memory card used to get main memory above 2 TB on the Power 595, IBM charges $1.05 per MB, and that would work out to $8.8 million just for memory.
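The memory ratios and price tags above can be reproduced with a few lines of Python. This sketch assumes binary units (1 TB = 1,024 GB = 1,048,576 MB), which is what makes the quoted dollar figures come out; the helper names are illustrative only.

```python
MB_PER_TB = 1024 * 1024
GB_PER_TB = 1024

def gb_per_core(memory_tb, cores):
    """Main memory per core, in GB."""
    return memory_tb * GB_PER_TB / cores

def memory_cost(memory_tb, dollars_per_mb):
    """List price of a memory configuration at a flat per-MB rate."""
    return memory_tb * MB_PER_TB * dollars_per_mb

print(gb_per_core(1, 256))   # Power7 IH node: 4.0 GB per core
print(gb_per_core(1, 64))    # Power 595 at 1 TB: 16.0 GB per core
print(gb_per_core(4, 64))    # Power 595 at 4 TB: 64.0 GB per core
print(gb_per_core(8, 256))   # guessed Power7 box: 32.0 GB per core

standard = memory_cost(8, 0.40)   # DDR2 modules at 40 cents per MB
fat_card = memory_cost(8, 1.05)   # fat memory cards at $1.05 per MB
print(f"8 TB at $0.40/MB: ${standard / 1e6:.2f} million")
print(f"8 TB at $1.05/MB: ${fat_card / 1e6:.1f} million")
```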
The big question is what IBM will charge for processing capacity. In mid-2008, when the Power 595 first started shipping, the chassis cost $91,000, and with the full complement of 5 GHz processor cores, GX bus I/O hubs, and 12X interconnects for I/O (but not the I/O drawers or any disks), the base machine cost $2.64 million and 4 TB of memory cost–you guessed it–$8.8 million. I doubt very much that IBM is going to cut the cost per unit of CPW processing capacity down with the biggest Power7 machine. It cost just under $9 per CPW at list price for a Power 595 in June 2008, and when the Power7 machines come out around two years later in 2010, IBM will probably try to charge as much as it can for processing capacity. With current deal making, IBM could be giving big discounts on Power 595 iron–if so, customers are keeping tight-lipped about it. But with IBM offering two to three times the performance in the same footprint, charging two to three times the price for that same footprint minus a little shaved off for Moore’s Law advances seems likely.
At a constant price per CPW, the 256-core Power7 box would cost around $7.2 million for 800,000 CPWs using 4 GHz processors and around $5.4 million for 600,000 CPWs using 3 GHz chips. Now, assuming Moore’s Law, where price/performance doubles every two years or so these days, cut those prices in half since it will be about two years between launches for the big boxes. That puts the top-end 4 GHz Power7 box at somewhere around $3.6 million with all of its 256 cores activated if Moore’s Law prevails. Provided it doesn’t take the full 8 TB of memory I expect to get decent performance–say half, or 4 TB, will do–then you are talking $8 million for 800,000 CPWs, or about $10 per CPW for a base system, not including the cost of storage. That is a hell of a lot better than a 64-core, 4 TB box with no storage, which listed for $11.4 million, or $39 per CPW.
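Here is the pricing chain as a short Python sketch. Two assumptions are mine: the $9 per CPW is the rounded version of the Power 595's list price per CPW, and the $8 million total is taken to be the halved processing price plus 4 TB of memory at the $1.05 per MB fat-card rate, which is the combination that reproduces the quoted figure. The variable names are illustrative.

```python
MB_PER_TB = 1024 * 1024

# Power 595 at June 2008 list prices (from the article).
P595_BASE = 2_640_000      # chassis, 64 cores at 5 GHz, I/O hubs and links
P595_MEMORY = 8_800_000    # 4 TB of main memory
P595_CPW = 294_700

PRICE_PER_CPW = 9.0        # "just under $9 per CPW", rounded up

def projected_base_price(cpw, moores_law_factor=0.5):
    """Constant price per CPW, then halved for two years of Moore's Law."""
    return cpw * PRICE_PER_CPW * moores_law_factor

constant_4ghz = 800_000 * PRICE_PER_CPW       # about $7.2 million
constant_3ghz = 600_000 * PRICE_PER_CPW       # about $5.4 million
base_4ghz = projected_base_price(800_000)     # about $3.6 million

# Assumption: 4 TB of memory priced at the $1.05 per MB fat-card rate.
memory_4tb = 4 * MB_PER_TB * 1.05
power7_total = base_4ghz + memory_4tb
power7_per_cpw = power7_total / 800_000

p595_total = P595_BASE + P595_MEMORY
p595_per_cpw = p595_total / P595_CPW

print(f"Power7 box: ${power7_total / 1e6:.1f} million, ${power7_per_cpw:.0f} per CPW")
print(f"Power 595:  ${p595_total / 1e6:.1f} million, ${p595_per_cpw:.0f} per CPW")
```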
We’ll see if my hunches are right. And once I know more about the future X64 machines due this quarter that compete with the low end of the Power 595s and the future Power7 boxes, I will be better able to gauge what IBM might charge.
Now, that brings me all the way back to naming. Maybe it would be best to call this class of machines the Power Systems/7000. And when they are running the i 7.1 operating system, maybe we can call them the Power Systems/7000i, or PS/7000i for short. And let’s get crazy and make the numbers mean something. So a PS/7256i means a 256-core box running i 7.1, a PS/7128i means 128 cores, and a PS/7064i means 64 cores.
One last crazy thing: I want those prices to include i 7.1, which is bundled for free. (You can stop laughing now.) This would sure beat having to shell out $53,000 per core, or $13.6 million, for i 7.1 on this box.