IBM Ships Fat Memory for Power 770 and 780 Systems Early
August 23, 2010 Timothy Prickett Morgan
When the Power 770 and 780 servers, which span up to 64 Power7 cores in a single system image, were announced back in February, one of the things these two boxes needed to be useful was a set of dense memory features that would let them scale up to their full 2 TB.
That fat memory, which is known as feature 5602 and which is actually composed of four 32 GB DDR3 memory modules running at 1.07 GHz, was expected to be delivered on November 19. That was pretty far out from the March 16 ship date of the Power 770 and Power 780 machines, and it also meant that IBM’s benchmark tests for the Power 770 and 780 boxes had to be run at less than their CPU potential, in terms of core count, because IBM could not balance out the CPUs with an appropriately large amount of memory.
But in the August 17 announcement blitz, IBM said that it can now get its hands on these dense memory modules and will actually start shipping them for Power 770 and 780 systems starting September 17.
As far as I know, the hot node add, hot memory upgrade, and hot node repair features of the Power 770 and 780 machines are still scheduled to come out on November 19.
The feature 5601 memory group (four 16 GB DDR3 modules) and the feature 5602 group both have the same price when you do the math, at $365.63 per GB. That’s $7,720 for the physical feature 5601 memory and $15,440 for the physical feature 5602 memory, plus an additional $245 per GB to activate the memory on either feature. The 32 GB feature 5600 memory (four 8 GB memory sticks) costs a lot less, at $1,960 for the physical memory and $245 per GB to activate it, or $306.25 per GB when all the costs are added in. These prices are exactly the same as those on the new Power 795, by the way.
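For readers who want to check that math, here is a minimal Python sketch of the per-GB calculation. The feature numbers, capacities, and prices come from the paragraph above; the dictionary layout and the `cost_per_gb` helper are illustrative names, not anything from IBM's price list.

```python
# Prices and capacities as quoted in the article; the structure is illustrative.
ACTIVATION_PER_GB = 245.00  # dollars per GB to activate memory on any feature

FEATURES = {
    "5600": {"capacity_gb": 32, "hardware_usd": 1960.00},    # four 8 GB sticks
    "5601": {"capacity_gb": 64, "hardware_usd": 7720.00},    # four 16 GB sticks
    "5602": {"capacity_gb": 128, "hardware_usd": 15440.00},  # four 32 GB sticks
}


def cost_per_gb(capacity_gb, hardware_usd, activation_per_gb=ACTIVATION_PER_GB):
    """All-in price per GB: physical memory spread over capacity, plus activation."""
    return hardware_usd / capacity_gb + activation_per_gb


for feature, spec in FEATURES.items():
    print(f"feature {feature}: ${cost_per_gb(**spec):.3f} per GB")
```

The two denser features both work out to $365.625 per GB, which is the $365.63 cited above after rounding, while the 8 GB sticks come in at $306.25.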
The effect of the fatter memory and its price on performance and bang for the buck can be dramatic or not, depending on the workload. In April, IBM tested a four-chassis Power 780 with only one-quarter of its processor complexes activated using the cheaper 8 GB DDR3 sticks because the 32 GB sticks were not available. On that test, a Power 780 in TurboCore mode (meaning half the cores in the box were turned off while the others ran at 4.14 GHz instead of 3.86 GHz) with only eight cores activated and with 512 GB of the cheaper memory, plus a slew of flash drives, was able to handle 1.2 million transactions per minute (TPM) on the TPC-C test at a cost of 69 cents per TPM.
Last week, as part of the final phase of the Power7 rollout, and to get a better clustered system number than Oracle has for its clustered T5440 Sparc servers, IBM took a cluster of three Power 780 machines, stepped them back to MaxCore mode with all 64 cores turned on, loaded up the full 2 TB of memory, and was able to crank through 10.4 million TPM on the TPC-C online transaction processing test. Assuming no overhead for clustering, each single Power 780 handled roughly 3.5 million TPM. Again, there is a huge amount of flash on these machines, which means the number of disk arms can be cut way down. But that fatter memory came at a cost: the price of the cluster came to $1.38 per TPM. Some of that was the cost of extra processors, but some of it was memory.
The SMP overhead of scaling from 8 to 64 cores is unavoidable, of course, which is why the fully loaded Power 780 doesn’t do 7.5 times the work (eight cores at 4.14 GHz divided into 64 cores at 3.86 GHz). Quadrupling main memory yielded a little more than a factor of three performance improvement, assuming you adjust to make the clock speeds the same. I wonder if IBM could have pushed the performance of the Power 780 up by a lot more simply by having main memory scale to 4 TB on these machines? It certainly looks like memory, not CPU, is the bottleneck.
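The ratios in that comparison can be reproduced with a few lines of Python. This is a back-of-the-envelope sketch using only the figures quoted in this story. It assumes zero clustering overhead, as above, and it reads "adjust to make the clock speeds the same" as scaling the per-node result by 4.14/3.86, which is one plausible interpretation. The variable names are illustrative.

```python
# Figures quoted in the article.
SMALL_TPM = 1.2e6        # April test: eight cores, TurboCore, 512 GB
SMALL_CORES, SMALL_GHZ, SMALL_MEM_GB = 8, 4.14, 512

CLUSTER_TPM = 10.4e6     # three-node cluster, MaxCore, 2 TB per node
NODES = 3
BIG_CORES, BIG_GHZ, BIG_MEM_GB = 64, 3.86, 2048

# Per-node throughput, assuming no clustering overhead.
per_node_tpm = CLUSTER_TPM / NODES

# Raw compute ratio: aggregate core-GHz of the big box over the small one.
compute_ratio = (BIG_CORES * BIG_GHZ) / (SMALL_CORES * SMALL_GHZ)

# Memory ratio between the two configurations.
memory_ratio = BIG_MEM_GB / SMALL_MEM_GB

# Observed speedup, then normalized as if both ran at the same clock.
observed_speedup = per_node_tpm / SMALL_TPM
clock_adjusted_speedup = observed_speedup * (SMALL_GHZ / BIG_GHZ)

print(f"per-node TPM:           {per_node_tpm / 1e6:.2f} million")
print(f"compute ratio:          {compute_ratio:.2f}x")
print(f"memory ratio:           {memory_ratio:.0f}x")
print(f"observed speedup:       {observed_speedup:.2f}x")
print(f"clock-adjusted speedup: {clock_adjusted_speedup:.2f}x")
```

On these inputs the compute ratio lands just under 7.5, memory grows by a factor of four, and the clock-adjusted speedup comes out slightly above three, which is the gap the paragraph above is pointing at.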