Resurrect Dead Blue Waters Power7 Supercomputer As IBM iCloud
August 15, 2011 Timothy Prickett Morgan
You need a supercomputer to predict the future. Unfortunately, when you are trying to predict what kind of supercomputer you will need to build to predict that future, you don’t know enough about the future to make the prediction easily. And therefore, sometimes a supercomputer project backed by governments and key IT players gets taken out behind the barn and given the Old Yeller treatment.
Such is the case with the technically impressive “Blue Waters” massively parallel machine that was to be built this year by IBM for the National Center for Supercomputing Applications (NCSA) at the University of Illinois. Blue Waters, which is built using a super-dense packaging of water-cooled Power7 processors, memory modules, and an array of hub/switch chips code-named “Torrent,” is a bit ahead of its time in terms of manufacturing, apparently. (I told you all about the Blue Waters server node, formerly known as the Power IH node and launched in July as the Power 775 server, back in November 2009 when I saw it on display at the SC2009 supercomputing show.)
The Power 775 servers can cram 256 Power7 cores and 2 terabytes of main memory to feed them into a server node that is 2U high, 30 inches wide, and six feet deep. Back in November 2009, I estimated that if this Power 775 node was running the IBM i operating system, it would deliver about 800,000 CPWs of performance. I assumed that the clock speed would be a bit lower than on the standard Power 770 and 795 machines, and I didn’t know then about the embedded DRAM cache on the Power7 chip and its dramatic effect on performance. As it turns out, the Power 775 drawer has something more like 1.53 million aggregate CPWs of IBM i oomph.
The Torrent hub/switch is something you will see more of, albeit probably in a shrunken form and maybe on a chip in some future Power processor. This hub/switch module delivers a total of 1,128 GB/sec of aggregate bandwidth. The host connection between the Power7 multichip modules inside a single drawer is rated at 192 GB/sec, with another 336 GB/sec of connectivity to the seven other local nodes on the drawer. There is also 240 GB/sec of bandwidth between the nodes in a four-drawer supernode, and 320 GB/sec dedicated to linking nodes to remote nodes in the entire Blue Waters machine. And because there needs to be a way to talk to disks and such, IBM tossed in another 40 GB/sec of general purpose I/O bandwidth inside a drawer.
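As a quick sanity check on those figures, here is a minimal Python sketch that adds up the five bandwidth numbers quoted above and shows each one's share of the total. The dictionary labels are my own shorthand for the links described in the paragraph, not IBM's names for them.

```python
# Torrent hub/switch bandwidth figures as quoted in the article, in GB/sec.
# The labels are informal shorthand, not official IBM terminology.
torrent_bandwidth_gbs = {
    "host (Power7 MCM) connection": 192,
    "other local nodes in the drawer": 336,
    "nodes within a four-drawer supernode": 240,
    "remote nodes across the machine": 320,
    "general purpose I/O": 40,
}

total_gbs = sum(torrent_bandwidth_gbs.values())

# Print each link with its share of the aggregate bandwidth.
for link, gbs in torrent_bandwidth_gbs.items():
    print(f"{link:40s} {gbs:5d} GB/sec  ({gbs / total_gbs:6.1%})")
print(f"{'total':40s} {total_gbs:5d} GB/sec")
```

The five pieces do add up to the 1,128 GB/sec aggregate that IBM quotes for the module.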
Up to 2,048 of these Power 775 drawers can be linked together into a behemoth using the Torrent interconnect. The Power 775 super has storage and compute drawers, and a balanced configuration would have 1,365 compute nodes and 342 storage nodes, with 2.7 petabytes of memory and 26.3 petabytes of disk and flash storage. It would cost about $1.5 billion at IBM list price for an aggregate of 349,440 cores that could deliver around 2.1 billion CPWs. To give you some perspective, that is probably about half of the aggregate IBM i processing capacity in the world. That price doesn’t include the IBM i software licenses. Even at the modest $2,245 per core that IBM charges for the entry PS700 blade server (with 90 days of Software Maintenance), across those 349,440 cores you would be talking about another $1.5 billion for the operating system software and three years of maintenance. Call it $3 billion for 2 billion CPWs.
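The arithmetic behind those totals can be sketched in a few lines of Python. This assumes that each of the 1,365 compute nodes is one 256-core, 2 TB drawer rated at 1.53 million CPWs, which is the reading that reproduces the core count above. The constant names are mine, and the three-year maintenance charge is not broken out in the figures above, so the sketch only computes the entitlement portion of the software bill.

```python
# Per-drawer figures quoted in the article for the Power 775.
CORES_PER_DRAWER = 256
MEMORY_TB_PER_DRAWER = 2
CPW_PER_DRAWER = 1_530_000

# Balanced configuration; assumes one compute node == one drawer.
COMPUTE_DRAWERS = 1_365

# Pricing quoted in the article.
HARDWARE_LIST_USD = 1.5e9
IBM_I_PER_CORE_USD = 2_245      # PS700 entitlement with 90 days of SWMA
SOFTWARE_WITH_SWMA_USD = 1.5e9  # article's figure, includes three years of maintenance

total_cores = CORES_PER_DRAWER * COMPUTE_DRAWERS
total_memory_pb = MEMORY_TB_PER_DRAWER * COMPUTE_DRAWERS / 1000
total_cpw = CPW_PER_DRAWER * COMPUTE_DRAWERS

# Entitlement only; the maintenance rate is not stated, so it is not modeled.
entitlement_usd = IBM_I_PER_CORE_USD * total_cores

usd_per_cpw = (HARDWARE_LIST_USD + SOFTWARE_WITH_SWMA_USD) / total_cpw

print(f"cores:            {total_cores:,}")
print(f"memory:           {total_memory_pb:.2f} PB")
print(f"aggregate CPWs:   {total_cpw / 1e9:.2f} billion")
print(f"IBM i entitlement: ${entitlement_usd / 1e9:.2f} billion")
print(f"hardware + software per CPW: ${usd_per_cpw:.2f}")
```

On these inputs the entitlements alone come to a little under $0.8 billion, so roughly half of the $1.5 billion software figure would be the three years of maintenance.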
And that, it seems, is the problem. That’s a fair price for a CPW of oomph, but it is a terrible price for a floating point operation per second (flops). IBM designed and built three racks of the Blue Waters machines based on the economics that, way back in 2007, it thought it would be dealing with in 2011, and then its top brass pulled the plug on the contract, presumably because IBM could not make money after doing the manufacturing and support job. While supercomputers have been a loss leader for IBM and help drive its overall server technologies (the supercomputer interconnect from a decade ago gets tweaked to be the SMP backplane in tomorrow’s machines), Big Blue was not going to take a big loss for Blue Waters, for which it was to be paid $208 million, mostly from the National Science Foundation.
“The University of Illinois and NCSA selected IBM in 2007 as the supercomputer vendor for the Blue Waters project based on projections of future technology development,” IBM and NCSA said in a joint statement. “The innovative technology that IBM ultimately developed was more complex and required significantly increased financial and technical support by IBM beyond its original expectations. NCSA and IBM worked closely on various proposals to retain IBM’s participation in the project but could not come to a mutually agreed-on plan concerning the path forward.”
I talked to John Melchi, head of the administration directorate of the NCSA, and he pinned the killing of Blue Waters on IBM’s upper management. He said further that, as far as NCSA could tell from its early tests on three racks of Power 775s, the machine would work as expected and eventually deliver at least a petaflops of sustained performance.
IBM has to return $30 million to NCSA, and the University of Illinois is out some dough it spent getting grad students to tune code for the box. IBM sources say the company is still selling the Power 775 servers, which are available at the end of the month but which are, as you can see, quite expensive.
I think it would be interesting to get IBM i running on these machines and turn it into an IBM iCloud. Tell Sam Palmisano, Big Blue’s president, chief executive officer, and chairman, and Tom Rosamilia, general manager of the Power Systems and System z line. I just did. A buck and a half per CPW is a manageable price if you can spread it across enough users. That assumes, of course, that IBM is charging $2,245 per core (an IBM i entitlement on a PS700 core with 90 days of SWMA) instead of the $53,000 per core plus $6,000 per core per year for SWMA that it charges on a Power 795. On a balanced Power 775 superserver, Power 795 pricing with three years of SWMA would be $24.8 billion for 2.1 billion CPWs just for the software, plus another $1.5 billion for the hardware. That’s a truly stupid price, and it is exactly what Big Blue charges (at list price) for the IBM i software on its largest Power systems iron.
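Here is a minimal Python sketch of how that $24.8 billion works out. It assumes the 349,440 cores of the balanced configuration and three years of SWMA at the Power 795 rate, which is the term that reproduces the figure. The variable names are mine.

```python
# Balanced Power 775 configuration, as quoted in the article.
TOTAL_CORES = 349_440
TOTAL_CPW = 2.1e9
HARDWARE_LIST_USD = 1.5e9

# Power 795 list pricing for IBM i, per core.
LICENSE_PER_CORE_USD = 53_000
SWMA_PER_CORE_PER_YEAR_USD = 6_000
SWMA_YEARS = 3  # assumption: the term that matches the $24.8 billion total

per_core_usd = LICENSE_PER_CORE_USD + SWMA_PER_CORE_PER_YEAR_USD * SWMA_YEARS
software_usd = TOTAL_CORES * per_core_usd
total_usd = software_usd + HARDWARE_LIST_USD

print(f"software per core: ${per_core_usd:,}")
print(f"software total:    ${software_usd / 1e9:.1f} billion")
print(f"with hardware:     ${total_usd / 1e9:.1f} billion")
print(f"per CPW:           ${total_usd / TOTAL_CPW:.2f}")
```

At Power 795 rates the bill comes to more than $12 per CPW, or roughly eight times the buck and a half you get with PS700 pricing.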
You Young i Professionals want to have some fun? Get IBM to let you build this cloud with the three returned Power 775 racks from the failed Blue Waters super and run it as a business unit inside of IBM Global Services. You can probably rent space from NCSA.