Power vs. Nehalem: Time to Double Up and Double Down
April 13, 2009 Timothy Prickett Morgan
The first TPC-C online transaction processing benchmark tests are in for the new two-socket “Nehalem EP” Xeon 5500 processors from Intel, and as I suggested would be the case in last week’s coverage of that chip announcement, IBM had better get its fingers out of its ears and get to work goosing the two-socket and four-socket servers with some Power6+ chips, or be prepared to cut prices on the i 6.1 software stack to compete against Windows on these Nehalem machines.
The news is not, perhaps, as bad as it could be. First, let’s take a deep breath and go over the first TPC-C OLTP test, and then I will do some comparisons based on the iLoyalty promotion configurations that IBM has been touting for the past month. (More on those in this story from early March, if you missed it.)
The Nehalem EP processors, which have four cores running at between 2 GHz and 2.93 GHz, two threads per core, and an 8 MB L3 cache for the cores to play in, plus on-chip memory controllers and a point-to-point interconnect for processors, memory, and I/O, have roughly twice the performance of the Xeon DP chips they replace. Hewlett-Packard tested a rack-mounted, 4U DL370 G6 server with the fastest X5570 Nehalem EPs installed, and slapped 144 GB of main memory on the box, plus four RAID disk controllers and a whopping three racks of disk arrays for a total of 1,210 disk drives. This machine was able to support 500,000 simulated TPC-C end users, and cranked through 631,766 transactions per minute (TPM) on the test, at a cost of $678,231 after a 16 percent discount, which works out to a price/performance of $1.08 per TPM. This machine was configured with Oracle’s cheapo clone of Red Hat’s Enterprise Linux, which is called Oracle Enterprise Linux, as well as its cheapo 11g Standard Edition One database for two-socket servers. That is astounding performance for such a little box, and that is pretty impressive bang for the buck, too. At least as far as the TPC-C test goes.
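A quick sanity check on that price/performance figure: the metric is simply the discounted system cost divided by throughput. The Python sketch below uses only the two numbers quoted above. The function name and the step that rounds up to the next cent are assumptions made for illustration, not something taken from the benchmark disclosure.

```python
import math


def price_per_tpm(total_cost_usd: float, tpm: float) -> float:
    """Return system cost divided by throughput, in dollars per TPM."""
    return total_cost_usd / tpm


# Figures quoted in the article for the HP DL370 G6 result.
raw = price_per_tpm(678_231, 631_766)

# Assumption: the published figure is rounded up to the next whole cent.
rounded_up = math.ceil(raw * 100) / 100

print(f"raw quotient: ${raw:.4f} per TPM")
print(f"rounded up to the next cent: ${rounded_up:.2f} per TPM")
```

The raw quotient lands a shade under $1.08, so the quoted figure is consistent with rounding up rather than rounding to nearest.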
Now, that is a ridiculously lopsided server configuration, with all of those disk drives needed to drive the I/O levels required to saturate the TPC-C databases running on that HP box. Real machines are not configured like this, of course, and if you take away a lot of those disks, you lose a lot of the performance. The good news is, you also drop the price of the configured system by a whole lot, too.
So how does the Power Systems 520, IBM’s entry Power6 server, stack up against the Nehalem EP machines, which are in roughly the same performance class? It is very difficult to say without making a lot of estimates and assumptions. But, I am not one who is afraid to do that, and so I ginned up what I think are three representative comparisons, which you can look at here.
On the Power Systems i side, I took the three Power 520 comparisons IBM has been using in its iLoyalty marketing campaign and tweaked them just a little so every core in each configuration has i 6.1 installed and doing useful work. IBM only activated half the cores in each machine for i 6.1 to make the iLoyalty discounts–activating a single core for i 6.1 for free and activating end users for half price–seem bigger than they would be on a machine fully dedicated to running i 6.1 workloads. That’s fair if you are pitching one or two cores of i 6.1 for legacy work and the remaining cores in the box for Linux infrastructure apps, which Big Blue is clearly doing. But that is not what you do when you are using the machine as a database server for the TPC-C OLTP test. So I activated all the cores for i 6.1 and added Software Maintenance on the cores as well.
I wasn’t any easier on the ProLiant DL380 G6 configurations. The hardware is as much like-for-like as I can make it to the iLoyalty Power 520 setups, including the same amount of main memory, the same number of disk drives and roughly the same capacity, plus a respectable tape drive and any expansion boxes necessary for peripherals. In terms of software, I went upscale in the Windows stack, opting for Windows Server 2008 Enterprise Edition for the operating system and SQL Server 2008 Standard Edition for the database. And instead of using the freebie Hyper-V, I went all out and put in VMware’s Virtual Infrastructure 3 stack, which is based on the ESX Server 3.5 hypervisor. The entry machine got the Standard Edition, and the two bigger configurations got the Enterprise Edition. I then added in all the HP management goodies–Insight Control Environment, Insight Power Manager, iLO Advanced, iLO Power Management Pack, and Virtual Machine Manager–that bring this X64 box up to par with the management tools that come with i 6.1. None of this software is cheap, by the way. As best I could, I tore apart software license fees and maintenance costs for this software stack. In some cases, I had to make estimates because the first year of maintenance is bundled into licensing fees.
Each machine was configured with the number of users that IBM suggested in its iLoyalty setups–25 for the small configuration, 150 for the medium configuration, and 320 for the large configuration. On the Power 520, once you pay for 320 users, you are in unlimited user land. Windows offers per-server licensing fees with unlimited users, but to be fair, I configured both the Power and the X64 servers with base plus per-user licenses. In terms of estimating OLTP performance for the boxes, I worked backward from IBM’s Commercial Performance Workload (CPW) performance ratings to get an estimate of the TPC-C throughput of the box. As for the Nehalem EP setups, I punted. The only reason these boxes can do so much TPC-C work on the real tests is that there is a ridiculous number of disk arms a-whirring, so I configured like-for-like and took my best guess as to where an I/O-constrained machine (at least compared to having 1,210 disk arms) would be. Something akin to a CPW test configuration, which has a bunch of disks, but not a crazy amount.
Even with that caveat, I think the Nehalem EPs can best the entry Power 520 configuration, and do so using only a single dual-core Xeon E5502 running at 1.86 GHz. That chip has HyperThreading, which means it has four threads to play with, compared to the two threads in a single-core Power6. Even though the Power6 core runs at more than twice the clock speed, I think this baby Nehalem EP setup can do about 54 percent more OLTP work than the Power 520 with a single core. That said, the price per user, thanks to the iLoyalty discounts, is not all that different even if the cost per TPM is. As you can see from the table, the baby Power 520 costs $811 per user after a 16.9 percent discount, and the ProLiant DL380 G6 setup costs $719 per user after a 16 percent discount.
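To see why the per-user gap looks small while the per-TPM gap does not, the two quoted per-user prices can be combined with the estimated throughput advantage. The sketch below is a back-of-the-envelope calculation only. It assumes that cost per user is total system cost divided by the 25 configured users on both machines, and it takes the 54 percent throughput estimate at face value; the function and variable names are hypothetical.

```python
USERS = 25  # users in the small configuration on both machines

power_cost_per_user = 811.0      # Power 520, after discount
xeon_cost_per_user = 719.0       # ProLiant DL380 G6, after discount
xeon_relative_throughput = 1.54  # "about 54 percent more OLTP work"


def relative_cost_per_tpm(cost_a: float, cost_b: float,
                          throughput_b_over_a: float) -> float:
    """Return how many times more system A costs per unit of throughput than B.

    (cost_a / tpm_a) / (cost_b / tpm_b) == (cost_a / cost_b) * (tpm_b / tpm_a)
    """
    return (cost_a / cost_b) * throughput_b_over_a


# Assumption: total system cost is the per-user figure times the user count.
power_total = power_cost_per_user * USERS
xeon_total = xeon_cost_per_user * USERS

per_user_gap = power_cost_per_user / xeon_cost_per_user - 1
per_tpm_multiple = relative_cost_per_tpm(power_total, xeon_total,
                                         xeon_relative_throughput)

print(f"Power 520 costs {per_user_gap:.1%} more per user")
print(f"Power 520 costs {per_tpm_multiple:.2f}x as much per unit of throughput")
```

Under those assumptions the Power 520 is roughly 13 percent more expensive per user but on the order of 1.7 times as expensive per unit of estimated throughput.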
The chart below shows how the two platforms stack up in terms of cost per user, which is the metric we really care about:
Now, doubling up the power and adding a bunch of disk drives shows a slightly different story and, as I have complained before, that is because IBM charges too much for memory and disk capacity and for i 6.1 activation on cores. If the iLoyalty discount were the street prices for all cores and all end users using i 6.1, not just during the promotion, that would go a long way toward helping the i cause. Ditto for cutting memory, processor, and disk drive prices. On the 150-seat setups, the Nehalem EP machine delivers about twice the bang for the buck in terms of throughput and about 60 percent better value on a per-user basis.
And on the bigger machine, comparing a four-core Power 520 with an eight-core Nehalem EP box, the spread is even larger. In terms of throughput price/performance, the Nehalem EP box offers six times the value–it is about one third the price and has twice the throughput in a reasonable (and hence constrained) configuration, as best as I can figure. That iLoyalty discount sure helps on the per-user costs, but the i 6.1 core activation and Software Maintenance charges are just too expensive compared to the Windows-SQL Server-VI3 combination.
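The price/performance multiple follows directly from the two rough ratios quoted for the bigger machines: divide the throughput ratio by the price ratio. The sketch below takes "about one third the price" and "twice the throughput" as exact values purely for illustration; the function name is hypothetical, and exact fractions are used to avoid floating-point noise.

```python
from fractions import Fraction


def value_multiple(throughput_ratio: Fraction, price_ratio: Fraction) -> Fraction:
    """Return the throughput-per-dollar advantage of one system over another.

    Both arguments are ratios of the first system to the second.
    """
    return throughput_ratio / price_ratio


# Nehalem EP box relative to the four-core Power 520, per the rough figures.
nehalem_vs_power = value_multiple(Fraction(2), Fraction(1, 3))

print(f"Nehalem EP throughput per dollar: {nehalem_vs_power}x the Power 520")
```

Since both inputs are approximations, the result is best read as "roughly six times" rather than a precise figure.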
As I said last week, IBM has two options. Double up the processing capacity in the Power Systems 520s and 550s to compete, or cut its software activation prices in half. I think it would be wise to do both and maybe gain a few customers, in fact. And having done that, I think the Rochester Lab should get out 1,210 disk drives, slam them onto a Power 520 with double the cores and maybe 256 GB of main memory, and kick the Nehalem EP box right in the gut. An economic meltdown is no time to be hosting a tea party–unless you are talking about the kind they had in Boston in the 18th century.