IBM Releases CPW Ratings on Power 595 for i 6.1 Early
June 23, 2008 Timothy Prickett Morgan
As part of the 20th birthday celebrations for the AS/400 platform today in IBM’s Rochester, Minnesota, facility, the company will be announcing the Commercial Performance Workload (CPW) performance ratings on the top-end Power6-based Power 595 running the new i 6.1 operating system.
While the i5/OS V5R4 and i 6.1 operating systems are not going to be supported on the Power 595s, which span from eight to 64 cores running at either 4.2 GHz or 5 GHz, until September 9, the company wants customers to start planning their acquisitions now, since the Power 595 iron started shipping on May 6, and thereby give Big Blue a strong finish to the third quarter with the Power Systems line and perhaps a much stronger fourth quarter. No customer is going to buy a Power Systems machine, whether it runs i, AIX, or Linux, without some kind of relative performance metric, which is why IBM can’t wait until September to put out the CPW ratings for the machine.
IBM has tested the Power 595 with 64 of the 5 GHz cores to come up with the CPW rating, and used i 6.1 instead of i5/OS V5R4 to come up with its ratings. As it turns out, the top-end machine is being given a rating of 300,000 CPWs. The top-end, 64-core System i 595 using 1.9 GHz Power5+ processors was rated at 184,000 CPWs, so this translates into a 63 percent performance boost relative to that machine. IBM also shipped a variant of the System i 595 using 2.3 GHz Power5+ chips, which was rated at 216,000 CPWs with 64 cores, which means the jump from the fastest Power5+ box to the fastest Power6 box in the 595 class of machines only results in a 38.9 percent bump up in performance.
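For anyone who wants to check the arithmetic behind those two percentages, here is a minimal sketch in Python. The CPW ratings are the ones quoted above; the function and variable names are my own and are only illustrative.

```python
def percent_gain(new_cpw, old_cpw):
    """Return the percentage increase of new_cpw over old_cpw."""
    return (new_cpw / old_cpw - 1) * 100

# CPW ratings for 64-core machines, as quoted in the article
power6_595_5ghz = 300_000     # Power 595, 5 GHz Power6, i 6.1
power5p_595_19ghz = 184_000   # System i 595, 1.9 GHz Power5+
power5p_595_23ghz = 216_000   # System i 595, 2.3 GHz Power5+

gain_vs_19ghz = percent_gain(power6_595_5ghz, power5p_595_19ghz)
gain_vs_23ghz = percent_gain(power6_595_5ghz, power5p_595_23ghz)

print(f"vs 1.9 GHz Power5+: {gain_vs_19ghz:.1f} percent")  # 63.0 percent
print(f"vs 2.3 GHz Power5+: {gain_vs_23ghz:.1f} percent")  # 38.9 percent
```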
One interesting thing I learned about the CPW rating: On all of the 64-core boxes IBM has tested to date, the CPW ratings have been for two 32-core partitions. I am not sure what the scalability limit is–perhaps in the operating system, but maybe in the database–but a 32-core single system image is all that OS/400, i5/OS and even the new i 6.1 can deliver. AIX has no such limit, as far as I know. (It is my guess that it is a limit of 64 threads or 32 cores supported in the DB2/400 or DB2 for i database that is integrated inside the i family of operating systems.) In any event, those CPW ratings are not for a single DB2 for i image, but rather are the sum of two partitions, each doing half the work implied by the CPW rating. According to Ian Jarman, manager of Power Systems software at IBM these days and formerly one of the key System i product marketing managers, IBM does provide single system image support beyond 32-core setups on a special request basis, since it requires special tuning and patches. (There are apparently a number of such customers in the world.) And knowing what I do about systems, there is no way a single image running on 64 of IBM’s 5 GHz Power6 cores is going to get anything anywhere near 300,000 CPWs of performance. My guess is that moving from 32 to 64 cores, half of the implied extra performance goes up the chimney as SMP overhead, so you are probably talking something more like 225,000 CPWs.
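The 225,000 figure is a back-of-the-envelope guess, and the reasoning behind it can be sketched as follows. The 50 percent loss to SMP overhead is my assumption, not an IBM number, and the function name is hypothetical.

```python
def single_image_estimate(rated_cpw, lost_fraction=0.5):
    """Estimate single-image CPW on 64 cores from a rating that is
    the sum of two 32-core partitions.

    lost_fraction is the assumed share of the second 32 cores' implied
    performance that is lost to SMP overhead (a guess, not a measurement).
    """
    per_partition = rated_cpw / 2          # what one 32-core partition delivers
    implied_extra = per_partition          # what the second 32 cores would add if scaling were perfect
    return per_partition + implied_extra * (1 - lost_fraction)

estimate = single_image_estimate(300_000)
print(f"{estimate:,.0f} CPWs")  # 225,000 CPWs
```

Change lost_fraction to see how sensitive the estimate is to the overhead assumption; at zero overhead the function simply returns the rated 300,000 CPWs.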
As I reported in last week’s issue, IBM has just released TPC-C online transaction processing performance benchmarks for the same machine running AIX 5.3 (not the newer AIX 6.1) and a forthcoming DB2 9.5 release, which is due at the end of this year. That box was able to do just under 6.1 million transactions per minute and had an AIX Relative Performance (rPerf) rating of 553.01. When you do some math, that works out to a CPW rating of about 611,500, provided there is some commonality between CPW, rPerf, and TPC-C (which there most assuredly is, since they are all based on the TPC-C code base).
Based on the AIX rPerf ratings and past performance gains, I was expecting the i 6.1 platform to do a bit better on the Power 595 iron–about 331,000 CPWs, if you will recall. I also pointed out that the disparity I think exists between the OLTP performance on i and AIX platforms for high-end boxes is intolerable. IBM has been mum about this–it hasn’t said it is doing anything about it and it has not said my estimates are wrong. If I had to guess, I would guess that the i platform performance numbers are closer to reality and that the AIX and DB2 team have been–how shall I put this delicately?–tuning the living daylights out of those bits of software to get performance that may, in the end, not reflect what actual end users see. (Tuning software over-aggressively for a benchmark is like buying back your stock to prop up per share earnings each quarter: once you start, there is no way to stop without paying a penalty.)
It is hard to say exactly what the deal is, and IBM is not helping to clear the matter up. I have asked for some documentation to prove that my estimates for iSeries, System i, and now Power Systems are wrong and no one has provided it yet. Until that happens, I will assume TPC-C OLTP is CPW times 9.95, the historical average in years when IBM did run TPC-C tests on OS/400 platforms, and that TPC-C OLTP is rPerf times 11,400 or so on the big boxes.
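To make those two rules of thumb concrete, here is a rough sketch of the conversions in Python. The multipliers are my estimates from the paragraph above, not IBM-published factors, and the function names are hypothetical. The 6.1 million input is a rounded stand-in for the "just under 6.1 million" TPC-C result, so the CPW equivalent it produces lands slightly above the roughly 611,500 figure cited earlier.

```python
# Rule-of-thumb multipliers (estimates, not official IBM figures)
TPMC_PER_CPW = 9.95       # historical average when IBM ran TPC-C on OS/400
TPMC_PER_RPERF = 11_400   # approximate, for the big boxes

def cpw_to_tpmc(cpw):
    """Estimated TPC-C transactions per minute implied by a CPW rating."""
    return cpw * TPMC_PER_CPW

def tpmc_to_cpw(tpmc):
    """Estimated CPW rating implied by a TPC-C result."""
    return tpmc / TPMC_PER_CPW

def rperf_to_tpmc(rperf):
    """Estimated TPC-C transactions per minute implied by an rPerf rating."""
    return rperf * TPMC_PER_RPERF

i_tpmc = cpw_to_tpmc(300_000)          # Power 595 on i 6.1
aix_cpw = tpmc_to_cpw(6_100_000)       # rounded AIX TPC-C result
aix_tpmc = rperf_to_tpmc(553.01)       # from the AIX rPerf rating

print(f"i 6.1 implied TPC-C:   {i_tpmc:,.0f}")
print(f"AIX result as CPW:     {aix_cpw:,.0f}")
print(f"rPerf implied TPC-C:   {aix_tpmc:,.0f}")
```

By these estimates, the 300,000 CPW rating implies roughly 3 million transactions per minute, about half of what the same iron posted under AIX.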