Bang for the Buck: Raising the System iQ
August 28, 2006 Timothy Prickett Morgan
Direct competition, at least as far as customers are concerned, is a beautiful thing. It can make vendors react quickly and decisively, and in ways that they had not anticipated with their technology and marketing. Because one vendor figures out some trick, its competitors have to figure out how to counter or copy that trick. A good example is IBM’s invention of the dual-core processor, and another example is the advent of the quad-core processor. IBM has used the first in the iSeries line, but for some reason, it has neglected adding the second to the i5 line. This needs to change.
Every mainstream processor–Power5+, UltraSparc-IV+, Itanium 9000, Opteron RevE and RevF, Xeon DP 5100 and next week the Xeon 7100 MP–is available in true dual-core variants. By this, I mean that a single piece of silicon has two processor cores, their caches (sometimes shared, sometimes not), and interconnections to the outside world. In the iSeries and i5 world, as well as in the pSeries and p5 worlds and the realm of the mainframe, IBM treats these processor cores as whole processors, in terms of software functionality and pricing. Every software provider has a different way of dealing with dual-core processors. Some only look at the processor socket for pricing and don’t care how many cores are in there; others make you do some math to reckon how much computing you have and therefore what you should pay for software.
Not everyone has a quad-core processor. In fact, very few companies offer them. Sun Microsystems is the first vendor to deliver a multi-core processor to market, the “Niagara” Sparc T1 chip, which has eight cut-down Sparc cores on a single chip, each with four threads. This is a very efficient processor, keeping about 75 percent of the threads active at one time. (I have been covering the Sparc T1 like white on rice in The Unix Guardian, our Unix newsletter.) Movidis is using a 16-core MIPS processor in a server it launched at LinuxWorld two weeks ago (which I reported on in The Linux Beacon).
Since last fall, when IBM moved to a 90 nanometer chip making process with the Power5+ chips, it has offered a neat little option on the System p5 line called the quad-core module, or QCM. This is not a true quad-core chip, with four cores on a single die. The QCMs cram two dual-core Power5+ chips onto a single chip package–the ceramic and pin structure that you think of when you think chip–allowing IBM to double the core count per chip socket in the System p5 servers. This might be a cheater way of getting quad-core chips, but I don’t care. I think the System i5 line should have them, and have them immediately. Moreover, I think that IBM should provide the QCMs in the i5 520 and i5 550 line for free, and not double the core count in terms of software pricing. I’ll explain why in a bit.
Opterons and the Need for Power5+ QCMs
IBM was, as we all know, the pioneer in dual-core processors, starting with the Power4 chip back in the fall of 2001. But IBM’s chip and system designers did not think five years ago, when the “Squadron” Power5 chip design project was started, that moving to four cores on a single chip die was necessarily the right architectural move. In fact, the future Power6 chip, due in 2007, is also a dual-core chip, like the Power4 and Power5 chips. But, with power and cooling becoming big issues in the data center in the past two years and with AMD and Intel working on quad-core processors, IBM had to do something to boost the performance of the Power server line last year to stay ahead of the price/performance curve.
So, IBM took a page out of Hewlett-Packard’s playbook and crammed two dual-core Power5+ chips into one chip module, creating a quasi-quad-core chip. HP did this years ago, in fact, with IBM’s help. IBM Microelectronics was the chip foundry for the PA-8800 chip and the dual-core PA-8900 chip, which wasn’t a true dual-core, single-die chip but rather two PA-8800 cores with a baby chipset and some cache all on a single package. HP also created the mx2 dual Itanium modules when Intel was caught flatfooted by IBM’s Power4 chips, putting two “Madison” single-core Itaniums on a single package.
To cram two chips–single core or dual core, it doesn’t matter–in the same package that a single chip uses requires some sacrifices. To double the core count, which is important for some workloads, you have to slow down the clock speed so the chips don’t melt. In IBM’s case, the initial QCM chips ran at 1.5 GHz compared to the 1.9 GHz of the Power5+ chips last year. This summer, IBM has ramped the speed of the dual-core module (DCM) Power5+ chips to 2.1 GHz and 2.2 GHz, and the QCMs are scaled back to 1.65 GHz. (The DCM is what is used in the entry and midrange i5 line, while the multichip module, or MCM, is used in the i5 595.)
The difference in performance can be dramatic comparing DCM and QCM machines, but the real goal IBM has with the QCM machines is to meet or exceed the per-core bang for the buck of any AMD Opteron-based server on the market–including its own. For instance, a p5 550 box with four 2.1 GHz cores (that’s two DCMs) has an IBM relative performance (rPerf) rating of 24.86, while the same box with two 1.65 GHz QCMs (for a total of eight cores) is rated at 38.34 on the rPerf scale. That’s a 54 percent increase in performance, for very little extra cost to customers. For transaction processing and infrastructure workloads where clock speed is not a big issue, the System p5 Q variants will offer the best bang for the buck.
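The arithmetic behind that 54 percent figure is worth spelling out, because it shows the tradeoff the QCM makes. The sketch below uses only the two rPerf ratings and core counts quoted above; the function and variable names are illustrative and not anything IBM publishes.

```python
# Figures quoted in the article for the p5 550 (IBM rPerf ratings).
DCM_CORES, DCM_RPERF = 4, 24.86   # two 2.1 GHz dual-core modules
QCM_CORES, QCM_RPERF = 8, 38.34   # two 1.65 GHz quad-core modules


def percent_gain(baseline: float, upgraded: float) -> float:
    """Relative increase of `upgraded` over `baseline`, in percent."""
    return (upgraded / baseline - 1.0) * 100.0


system_gain = percent_gain(DCM_RPERF, QCM_RPERF)
dcm_per_core = DCM_RPERF / DCM_CORES
qcm_per_core = QCM_RPERF / QCM_CORES

print(f"System-level gain: {system_gain:.0f} percent")
print(f"rPerf per core, DCM: {dcm_per_core:.2f}")
print(f"rPerf per core, QCM: {qcm_per_core:.2f}")
```

Each QCM core does less work than a DCM core because of the slower clock, but doubling the core count more than makes up for it at the system level.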
Intel is, of course, getting ready to launch its dual-core “Tulsa” Xeon MP processors next week, promising about 70 percent more performance thanks in large part to large cache memory. Intel says it will have its own quasi-quad, the “Clovertown” Xeon DP, to market before the end of the year–right smack in the same part of the market where IBM’s new entry System p5 boxes are aimed. And AMD is saying that it will get its “Deerfield” Opterons in the field by the middle of next year, which are true quad-core processors. Beyond that, Intel is working on “Tukwila” quad-core Itaniums, too, due in 2008.
Because the Unix market is very tough, IBM had to get QCMs into the field to stop Sun’s “Galaxy” Opteron machines in their tracks. HP can also sell Opteron machines against the System p5 line running Windows or Linux. The QCMs give IBM a significant performance and price/performance advantage, turning a relatively low volume product into something that can compete with a higher volume product.
In the IT industry, this is called smart.
The System p5 and QCMs
The reason I got to thinking about the Power5+ QCMs at all was that IBM boosted the speed of the DCMs to 2.1 GHz and added 1.65 GHz QCMs in the entry System p5 line last week. There are three single-socket boxes–the p5 505, p5 510, and the p5 520–which come in 1U, 2U, and 4U rack-mounted cases. And there is the p5 550, which is a 4U two-socket box. The i5 520 and i5 550 are essentially the same as the p5 520 and p5 550, except for one thing: the i5 models do not support the QCMs, and only use the DCMs.
Here’s why this matters. First, for whatever reason, as best as I can figure, i5/OS and DB2/400 cannot do as much work on a Power5+ box as AIX and DB2. It could be tuning, it could be benchmark gaming. I do not know the cause. But as I pointed out in last week’s iteration of the Bang for the Buck series, the p5 line does about 65 percent more OLTP work than an i5 with the same hardware. This is intolerable, and it needs to be fixed. And when you can’t fix something with software, you throw hardware at it, as all system engineers know full well.
Hence, the idea to create the System i5 Q line of machines, iQ for short. Because it is smart.
IBM is already doing this for the System p5 machines. I’ll give you one example, but this is true across the p5 line where QCMs are available. The p5 520 supports the same 1.9 GHz or 2.1 GHz Power5+ DCMs or 1.5 GHz or 1.65 GHz QCMs as the p5 510. With one 2.1 GHz core activated, 1 GB of memory, and two disks, the p5 520 costs $5,699. Activating a second 2.1 GHz core and adding 1 GB of memory boosts the price to $11,896. With four 1.65 GHz cores using the QCM, the price only rises to $12,999 on a machine with 4 GB of memory and two disks. The four-core model using the 1.65 GHz QCMs is rated at 20.25 on the rPerf scale, which is around 207,600 transactions per minute on the TPC-C online transaction processing benchmark test. The two-core model using the 2.1 GHz DCM is rated at 12.46 on the rPerf scale, which is 127,715 TPM. That’s a 63 percent performance boost. But look at the prices. System p5 520 buyers are getting that boost for a mere $1,103–which really only covers the cost of the memory upgrade from 2 GB to 4 GB. That performance is absolutely free.
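For anyone who wants to check the p5 520 math, here is a minimal sketch using only the prices, rPerf ratings, and TPM estimates quoted above. The variable names are made up for illustration, and the dollars-per-TPM values cover just the hardware configurations described here, not a full audited TPC-C system price.

```python
# p5 520 configurations as quoted in the article.
dcm = {"cores": 2, "price": 11_896, "rperf": 12.46, "tpm": 127_715}
qcm = {"cores": 4, "price": 12_999, "rperf": 20.25, "tpm": 207_600}

# Extra money paid and extra throughput gained by choosing the QCM.
extra_cost = qcm["price"] - dcm["price"]
tpm_gain_pct = (qcm["tpm"] / dcm["tpm"] - 1.0) * 100.0

# Hardware dollars per estimated TPM for each configuration.
dcm_usd_per_tpm = dcm["price"] / dcm["tpm"]
qcm_usd_per_tpm = qcm["price"] / qcm["tpm"]

# TPM per rPerf point implied by the article's own DCM figures
# (an inferred ratio, not an official IBM conversion factor).
implied_tpm_per_rperf = dcm["tpm"] / dcm["rperf"]

print(f"Extra cost for QCM: ${extra_cost:,}")
print(f"Throughput gain: {tpm_gain_pct:.0f} percent")
print(f"DCM: ${dcm_usd_per_tpm:.3f} per TPM, QCM: ${qcm_usd_per_tpm:.3f} per TPM")
print(f"Implied TPM per rPerf point: {implied_tpm_per_rperf:,.0f}")
```

The TPM estimates in this comparison appear to be scaled from rPerf at a roughly constant ratio, which is why the rPerf and TPM gains come out the same.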
IBM’s aggressive pricing on the QCMs in these four System p5 machines would seem to suggest that the company can make 1.5 GHz and 1.65 GHz Power5+ chips in much higher volumes than it can 1.9 GHz and 2.1 GHz chips. This stands to reason, since this is how it is in the chip business–the lower the clock speed, the higher the yield; the higher the yield, the lower the unit cost. Moreover, IBM’s pricing also suggests that even with the extra cost of putting two chips in the packaging where one chip used to be, Big Blue can do so at such a price that it is basically giving customers the QCM (which has about 65 percent more oomph) for the price of a DCM. It is reasonable to expect that IBM will do the same thing with the future Power6 chips, starting with DCMs and adding QCMs as yields improve and customers demand more bang for the buck.
I was curious how the use of QCMs might affect the System i5 line. AIX, at $150 per core, is basically free on the p5 line. And databases cost only a few thousand a core, too, and in some cases, doubling the cores only adds 50 percent to the software price (as is the case with Oracle 10g on the entry p5s). If IBM put QCMs into the entry i5s to get their price/performance more in line with the competition, it could not count the cores for the purposes of i5/OS, or it would have to cut the cost per core in half on these machines. Otherwise, the cost of the software, at $21,000 or $40,000 per core, would far outstrip the savings in hardware and the bang for the buck that the QCMs provide.
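To see why per-core pricing would swamp the hardware savings, consider a rough sketch of the i5/OS bill for one socket under the options described above. It assumes the $21,000 per-core price purely as an example; the scenario names are hypothetical, and this is not IBM’s actual price list.

```python
PRICE_PER_CORE = 21_000   # example i5/OS per-core price from the article
DCM_CORES = 2             # cores in one dual-core module
QCM_CORES = 4             # cores in one quad-core module

# Today's situation: a DCM with both cores licensed.
dcm_cost = DCM_CORES * PRICE_PER_CORE

# A QCM with every core counted at the full per-core price.
qcm_all_cores_counted = QCM_CORES * PRICE_PER_CORE

# Option 1: count all four cores, but halve the per-core price.
qcm_half_price = QCM_CORES * (PRICE_PER_CORE // 2)

# Option 2: keep the price, but ignore the doubling of cores.
qcm_doubling_ignored = (QCM_CORES // 2) * PRICE_PER_CORE

for label, cost in [
    ("DCM, both cores licensed", dcm_cost),
    ("QCM, all cores counted", qcm_all_cores_counted),
    ("QCM, half price per core", qcm_half_price),
    ("QCM, doubling ignored", qcm_doubling_ignored),
]:
    print(f"{label}: ${cost:,}")
```

Either remedy leaves the operating system bill where it is today, while counting every QCM core at full price doubles it.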
Now, here is what happens when you do what I have suggested. Check out this comparison table I built, which compares the hypothetical QCM i5s, which I call the iQ machines, to alternatives. Then read on.
Let’s look at an iQ 520 with two 1.65 GHz cores activated. A machine with 2 GB of main memory, two 35.2 GB disks, and the i5/OS Standard Edition, DB2/400, and Virtualization Engine hypervisor (the latter two are bundled in) will cost $37,757 with an SLR 60 tape drive and a RAID 5 disk controller thrown in. This machine will do nearly 62,400 TPM, and will cost 61 cents per TPM. That is a big improvement in price/performance, down from $1 per TPM. By comparison, a Hewlett-Packard DL380 G5 server with a dual-core, 1.86 GHz “Woodcrest” Xeon 5120 processor running Windows Server 2003 Enterprise Edition, SQL Server 2005 Standard Edition, and VMware ESX Server 3 Enterprise Edition will cost $25,162, but only does 40,800 TPM, for a bang for the buck of 62 cents.
Same price/performance, IBM.
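The cents-per-TPM figures come from dividing the configured system price by the estimated throughput. A minimal sketch of that calculation, using the two configurations quoted above (the function name is made up for illustration, and the TPM numbers are the article’s estimates rather than audited results):

```python
def cents_per_tpm(system_price_usd: float, tpm: float) -> float:
    """Price/performance: system price divided by throughput, in cents."""
    return system_price_usd / tpm * 100.0


# Hypothetical iQ 520 versus the HP DL380 G5, as configured in the article.
iq_520 = cents_per_tpm(37_757, 62_400)
dl380_g5 = cents_per_tpm(25_162, 40_800)

print(f"iQ 520:   {iq_520:.0f} cents per TPM")
print(f"DL380 G5: {dl380_g5:.0f} cents per TPM")
```

The iQ 520 costs about half again as much as the HP box, but it also does about half again as much work, so the two land within a cent of each other.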
As I pointed out before, you can make a cheaper Windows stack by moving to Windows Server 2003 Standard Edition, SQL Server 2005 Workgroup Edition, and the freebie Virtual Server 2005 hypervisor for partitioning, which drops the cost to 35 cents per TPM. But the iQ 520 at least gets the System i5 into the right ballpark and playing the same game. (This is football, IBM. Not water polo.)
The price/performance of the hypothetical iQ 550 models is not nearly as good as this iQ 520 machine, mainly because IBM charges too much for i5/OS Standard Edition, even if we ignore the doubling of the cores using the QCMs. On the i5 550 running Standard Edition, i5/OS represents 75 to 80 percent of the cost of a base configuration; with Enterprise Edition, it is nearly all of the cost. The iQ 550 with the same software pricing ($40,000 per pair of cores) is in the range of 90 cents per TPM. That’s too high. If IBM did knock down the price of a core to $21,000 and then ignored the doubling of the cores in the QCMs (which is what my table shows), then the iQ 550 is much more competitive. In fact, an eight-core iQ 550 would do nearly 230,000 TPM at a cost of 61 cents per TPM. That meets most of the Windows and Unix configurations, excepting IBM’s own p5 550+ Express box, which has an insanely low price of 29 cents per TPM using the 2.1 GHz DCMs. Throwing the QCM in this box and then doing the math would just about make every other server vendor cry. Which is why IBM is using the QCM as a competitive weapon.
Now, it is the System i5’s turn. So, IBM, get up off the ground, pick the grass out of your helmet, and compete. Your jersey says i5, not Wall Street. Start playing to win, and start playing for your own team.