Power7+ Launches In Multi-Chassis Power 770+ And 780+ Systems
October 8, 2012 Timothy Prickett Morgan
As an avid reader of The Four Hundred, you were thinking that IBM was going to start rolling out the Power7+ processors at the high end of the Power Systems server line, where volumes are lowest and where the company can extract the most money possible out of each chip that comes off the line. And as usual, you were right. And if you are by any chance in the middle of a Power 770 or Power 780 deal, you need to stop and take a hard look at the new Power 770+ and Power 780+ machines that were announced last Wednesday.
As I explained in last week’s issue, I cannot remember the last time that IBM did a big bang upgrade, putting a new processor into the Power Systems line or its AS/400 and RS/6000 predecessors across the entire product line at once. The Power7+ generation will be normal in this regard, with a “rolling thunder” upgrade across the line rather than a big bang. As it turns out, the Power7+ is only going into two relatively low-volume products, and that will be it for 2012. So the rolling thunder is way up in the sky for now.
“The rest of the products will get it next year with the exception of the Power 795,” Steve Sibley, director of worldwide product management for IBM’s Power Systems division, explained to me ahead of the launch, which coincided with the OpenWorld keynote from John Fowler, senior vice president of hardware at Oracle, where that company did not launch the 16-core Sparc T5 processors and related systems. “Just like with the Power 595, we already built the fastest processor and I/O into that machine,” Sibley explained.
That said, I think there are probably some Power 795 shops that would like to get some extra performance out of the much larger L3 cache memories that the Power7+ chip has, considering that it is 10 MB per core instead of 4 MB with the Power7, as well as the several accelerators. Of course, that is predicated on the idea that IBM could deliver a Power7+ running at 4 GHz or 4.25 GHz to match the current clock speeds. But, apparently, Power 795 customers won’t see a refresh until the Power8 chips ship. That is a very long time away, but then again, it isn’t like IBM has a lot of competition in the big iron area. Fujitsu’s Sparc Enterprise M8000 and M9000, which are resold by Oracle and which run Solaris, are looking a bit long in the tooth with their 3 GHz Sparc64-VII+ chips spanning 64 sockets. So is Hewlett-Packard’s Superdome 2 sporting 32 sockets of Itanium 9300s. Fujitsu is working on a 16-core Sparc64-X that will come out in 64-socket servers code-named Athena (but no one knows when to expect them), and the Superdome 2 machines are expected to get an upgrade to the eight-core “Poulson” Itanium 9500s before year’s end.
So as part of the October 3 announcements, IBM didn’t feel like it had to push the envelope with the Power 795s. That said, the company has stacked up the 4 gigabit memory chips and worked with its memory card suppliers to come up with a big ole 256 GB memory card that sports 64 GB DDR3 memory modules running at 1.07 GHz. And with these cards, the Power 795 can now have its memory doubled up to 16 TB. If you are an AIX shop and you use the Active Memory Expansion memory compression algorithm that is part of the systems software, you can make it look like 32 TB to the operating system and the logical partitions.
The existing Power 795 machines can now sport up to 20 logical partitions per core using PowerVM 2.2.2, up from 10 per core maximum with the original releases of PowerVM and its predecessors, which had other names. The PowerVM hypervisor can dial down capacity for an LPAR to as small as 5 percent of a core’s CPU capacity, but the system still tops out at a mere 1,000 LPARs for some reason instead of the 5,120 you would expect in a 256-core system. With 32 TB of compressed memory capacity, such a box sliced into 5,120 partitions would be able to allocate 6.4 GB of virtual memory to each LPAR, which strikes me as enough to be useful as the basis of a cloud providing IBM i and AIX capacity to users on a utility basis. The problem is that each slice would not have very much CPU, about 300 CPWs on a version of the box sliced up and running IBM i. Anyway, that’s all theoretical. Let’s talk about what’s real, and that’s the Power 770+ and Power 780+ servers.
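The partition arithmetic above is easy to check. The following is a minimal sketch that only restates the figures quoted in this article; the variable names are mine, and treating 1 TB as 1,024 GB is an assumption that happens to reproduce the 6.4 GB figure.

```python
# Figures quoted in the article for a fully loaded Power 795.
CORES = 256
LPARS_PER_CORE = 20         # PowerVM 2.2.2 per-core limit
SYSTEM_LPAR_CAP = 1_000     # system-wide ceiling
COMPRESSED_MEMORY_TB = 32   # 16 TB physical, as seen through Active Memory Expansion
GB_PER_TB = 1024            # assumption: binary units

# Partitions you would expect if only the per-core limit applied.
theoretical_lpars = CORES * LPARS_PER_CORE

# Smallest slice of a core implied by 20 partitions per core.
min_core_share = 1 / LPARS_PER_CORE

# Memory per partition at the theoretical count and at the actual cap.
gb_per_lpar_theoretical = COMPRESSED_MEMORY_TB * GB_PER_TB / theoretical_lpars
gb_per_lpar_at_cap = COMPRESSED_MEMORY_TB * GB_PER_TB / SYSTEM_LPAR_CAP

print(f"theoretical LPARs:       {theoretical_lpars}")
print(f"minimum core share:      {min_core_share:.0%}")
print(f"GB per LPAR (5,120):     {gb_per_lpar_theoretical:.1f}")
print(f"GB per LPAR (1,000 cap): {gb_per_lpar_at_cap:.1f}")
```

At the actual 1,000-partition ceiling, each LPAR would have roughly five times as much memory to play with as in the theoretical 5,120-partition case.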
These new machines look very much like the Power7′ machines (that’s power seven prime) that Big Blue announced a year ago, and I thought then and I still think today that IBM intended for the Power7+ chips to go into these boxes. That lineup included revamped Power 710, 720, 730, and 740 systems with doubled up memory and a shift to PCI-Express 2.0 peripheral slots, and the same changes for the Power 770 and 780 machines. (These are the boxes that IBM has called Power7 Prime in its presentations.)
In some of the documents I have seen relating to this launch, these latest boxes are called the Power 770+ and the Power 780+, and I am going to stick with that nomenclature just so we don’t all go nuts here.
The Power 770+ is a four-enclosure machine using a homegrown chipset that allows for multiple server nodes to be linked to a shared memory system using NUMA clustering. We have seen this architecture for IBM enterprise-class machines since the Power5 generation. This way of making machines is easier and cheaper than making a big bad box like the Power 595 or Power 795, which has more cores, fatter memory, and buckets more I/O for truly large compute jobs. Known as the 9117-MMD in the IBM catalog, the Power 770+ has two processor cards per enclosure and each card, as in the Power 770′ machine from last year, has two processor sockets. So that is four sockets per enclosure and up to 16 sockets in a single system image. Another way of putting that is that in terms of socket count, that is half of a Power 795 or the same as a Power 595.
If you think you might need to start out modestly but scale fast, then a Power 770, 770′ or 770+ is the best box for you. Say, for instance, you are an SAP shop in China. Say, you are 1,000 SAP shops in China. Someone like that.
The Power 770+ doesn’t use eight-core Power7+ chips, so the scalability of the machine might not be as high as you might be thinking. In fact, presumably because yields on the 32 nanometer process that IBM is using to etch the Power7+ chips are not all that great (and perhaps a reason why these chips did not come out last year), IBM is doing the smart thing (as all chip makers do) and recycling partial duds into boxes and deallocating and isolating parts of the chips that are faulty. In this case, IBM is activating three cores running at 4.2 GHz or four cores running at 3.8 GHz on the Power7+ chips used on the Power 770+ processor cards; each core is allocated only 10 MB of L3 cache.
With 32 GB memory sticks, the main memory of the Power 770+ can be pushed up to 4 TB, the same as the Power 770′ boxes, and I assume that with those 64 GB fat memory sticks that are also available in the Power 795, the memory on the Power 770+ could be doubled up again to 8 TB, or half of the Power 795, if IBM wanted to do it. But this card has not been made available on the Power 770+ because IBM needs to give customers a reason to buy a Power 780+ or upgrade the memory on their Power 795.
Using the Commercial Performance Workload (CPW) benchmark test that IBM created to gauge the relative performance of the OS/400 and IBM i server families, a Power 770+ with four three-core 4.2 GHz processors is rated at 90,000 CPWs, or about 7,500 CPWs per core. The machine with four four-core Power7+ chips spinning at 3.8 GHz is rated at 110,000 CPWs, or about 6,875 CPWs per core. As you add processor cards to these systems, pushing the core counts up to 48 or 64 cores, the SMP and NUMA overhead of keeping the memory and caches coherent eats up an increasingly large part of the aggregate raw performance, just as happens in any other multi-socket server. The 48-core Power 770+ (that’s the three-core chip running at 4.2 GHz) is rated at 306,600 CPWs, while the 64-core version (using the four-core chip running at 3.8 GHz) is rated at 379,300 CPWs.
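To put a number on that SMP and NUMA overhead, here is a rough sketch that computes CPWs per core at one enclosure and at four, using only the ratings quoted above. The function names and the "scaling efficiency" ratio are my own shorthand, not an IBM metric.

```python
# CPW ratings quoted in the article: (cores, aggregate CPWs).
POWER_770_PLUS = {
    "3-core 4.2 GHz": {"one_enclosure": (12, 90_000), "four_enclosures": (48, 306_600)},
    "4-core 3.8 GHz": {"one_enclosure": (16, 110_000), "four_enclosures": (64, 379_300)},
}


def cpw_per_core(cores, cpw):
    """Aggregate CPW rating divided by the number of cores."""
    return cpw / cores


def scaling_efficiency(small, large):
    """Per-core CPW of the large config as a fraction of the small config."""
    return cpw_per_core(*large) / cpw_per_core(*small)


for chip, cfg in POWER_770_PLUS.items():
    small, large = cfg["one_enclosure"], cfg["four_enclosures"]
    print(
        f"{chip}: {cpw_per_core(*small):,.0f} -> {cpw_per_core(*large):,.1f} "
        f"CPWs per core, efficiency {scaling_efficiency(small, large):.1%}"
    )
```

On these figures, a fully loaded box keeps somewhere around 85 percent of the per-core throughput of a single enclosure with either chip.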
As I have said before, with IBM i shops, you always get the fastest cores you can afford in any given box because you are paying for the software licenses based on the activated core, not the aggregate performance in the box. Moreover, IBM i shops tend to have a lot of batch work, which is monolithic in nature and likes the fastest clocks possible. (If IBM had a low speed 3 GHz Power7+ chip with all 80 MB of L3 cache turned on all cores, I might think this could really help batch workloads because of all that cache. But IBM doesn’t, so it doesn’t matter.)
If you bought a Power 770′ machine last year, you are not going to a Power 770+ this year, so this box is really aimed at Power 770 shops. The Power 770+ machine uses three-core Power7+ chips running at 4.2 GHz in twice as many sockets as the original Power 770 from 2010, which used six-core 3.5 GHz Power7 chips, and it delivers from 23.1 to 23.4 percent more aggregate performance across the same number of cores. That’s another way of saying you can get about a quarter more work done with the same core count, or you can get the same work done with about a fifth fewer software licenses (1/1.23 is about 0.81).
The Power 770+ has six 2.5-inch disks in each enclosure, which can be equipped with up to 21.6 TB of internal disk capacity using 900 GB drives inside the server skins. The Power 770+ has eight 12X I/O loops for attaching remote I/O drawers to the system through InfiniBand links to the Power GX++ system bus. (This makes external disks look like internal disks to the system, which is clever.) Each enclosure also has six PCI-Express 2.0 peripheral slots, and you can hang up to 16 remote I/O drawers off the 12X loops to boost the disk capacity of the machine to a pretty impressive 3 PB. These I/O stats are the same as on the Power 770′ box from a year ago. The Power 770+ running the latest PowerVM 2.2.2 hypervisor can also support as many as 20 LPARs per core and a maximum of 1,000 LPARs, just like the Power 795. That’s a lot closer to the maximum theoretical number of partitions, which is 1,280 across a 64-core machine.
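For anyone who wants to check the disk and partition numbers, here is a small sketch. It assumes all four enclosures are populated with 900 GB drives and uses decimal terabytes; the helper name is hypothetical.

```python
# Internal disk capacity of a four-enclosure Power 770+.
DISKS_PER_ENCLOSURE = 6
ENCLOSURES = 4
DRIVE_GB = 900  # assumption: every bay holds a 900 GB drive

internal_tb = DISKS_PER_ENCLOSURE * ENCLOSURES * DRIVE_GB / 1000  # decimal TB


def cap_fraction(cores, lpars_per_core=20, system_cap=1_000):
    """Share of the theoretical partition count that the system cap allows."""
    return system_cap / (cores * lpars_per_core)


print(f"internal disk: {internal_tb} TB")
print(f"64-core Power 770+:  cap is {cap_fraction(64):.1%} of theoretical")
print(f"256-core Power 795:  cap is {cap_fraction(256):.1%} of theoretical")
```

The 1,000-partition ceiling covers about 78 percent of what 64 cores could theoretically host, compared with under 20 percent on a 256-core Power 795.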
IBM likes to pretend that the Power 780 is a lot different from the Power 770, whether it has a prime or a plus or a nothing. But the fact is, they are all based on the same hardware and IBM just tweaks a few things here and there to make the Power 780 different from the Power 770.
In the past, with the original Power 780s from 2010, IBM had a Turbo Core mode that let customers turn off half the cores and run them at a slightly higher clock speed with all of the L3 cache on the chip dedicated to half the cores. In regular mode, the chips ran at 3.86 GHz and in Turbo Core mode they ran at 4.14 GHz, and the effect on performance on a fully loaded system was to boost the CPW rating of a core by around 35 percent. Again, when you pay for your operating system and database on a per-core basis, this extra performance per core can be important in the overall system cost. But, unfortunately for IBM i shops, Oracle didn’t go for it, and no matter how many cores companies turned on or off, Oracle charged customers for the full number of normal cores in the box that were activated. You can see Oracle’s point. You could flip to Turbo Core mode, report your database and middleware usage, and then flip back to normal mode and boost the throughput of the machines.
IBM i customers would want a permanent Turbo Core mode, I think, which is why I suggested a month ago that IBM provide IBM i shops with seriously overclocked versions of the Power7+ machines with a half or three-quarters of their cores turned off and all that delicious L3 cache there to pump up the system.
Anyway, the Power 780+ has the same memory, disk, and I/O capacity as the Power 770+ machine, and the machine has the same two-socket system boards that came out in the Power 780′ machine a year ago. The big difference is that customers buying this machine will get Power7+ chips with four cores running at 4.4 GHz or eight cores running at 3.7 GHz. In the case of the eight-core chips, that is actually a lower clock speed than the original Power 780, which was humming at 3.86 GHz, but there is 2.5 times the L3 cache and all of those wonderful memory compression, hashing, and encryption accelerators plus the random number generator that could boost performance on the system in countless ways that are not obvious if you are just looking at clock speeds. On the Power 780+ system using four-core chips (equivalent to Turbo Core mode, and permanent so Oracle can’t complain about software licenses), the Power7+ chip runs at 4.4 GHz, slightly faster than the 4.14 GHz of the original Turbo Core mode.
Fully loaded with 128 cores spinning at 3.7 GHz, the Power 780+ delivers 829,800 aggregate CPWs, which is 2.4 times the oomph of the Power 780 from two and a half years ago. With 64 cores spinning at 4.4 GHz, the Power 780+ is rated at 424,400 CPWs, or about 1.85 times the CPW capacity of the Power 780 in Turbo Core mode.
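Here is a quick sketch of what those ratings imply per core, along with the original Power 780 baselines you get by dividing through by the quoted multiples. The baselines are back-calculated from rounded multiples, so treat them as approximations rather than IBM's published ratings; the names are mine.

```python
# Power 780+ ratings quoted in the article:
# (cores, aggregate CPWs, multiple over the original Power 780).
POWER_780_PLUS = {
    "8-core 3.7 GHz": (128, 829_800, 2.4),
    "4-core 4.4 GHz": (64, 424_400, 1.85),
}


def per_core(cores, cpw):
    """Aggregate CPW rating divided by the number of cores."""
    return cpw / cores


def implied_baseline(cpw, multiple):
    """Approximate rating of the older machine implied by a rounded multiple."""
    return cpw / multiple


for chip, (cores, cpw, multiple) in POWER_780_PLUS.items():
    print(
        f"{chip}: {per_core(cores, cpw):,.1f} CPWs per core, "
        f"implied original Power 780 rating ~{implied_baseline(cpw, multiple):,.0f}"
    )
```

On a fully loaded system, the 4.4 GHz cores come out only about 2 percent ahead of the 3.7 GHz cores per core, so the payoff of the four-core chips lies mainly in licensing half as many cores.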
Imagine, if you will, how much computing power IBM will be able to cram into the Power 780 when it double stuffs the sockets.
The Power 770+ and Power 780+ machines will be available on October 19. IBM i 7.1 at the Technology Refresh 5 level, also announced on October 3, will be supported on the Power7+ machines on first shipment date, and on November 9 IBM will offer a patched version of IBM i 6.1.1 that will support the new iron as well. AIX 6.1 TL8 and AIX 7.1 TL2 are supported on the machines also, and IBM has put out a statement of direction that AIX 5.3 will eventually be supported on the boxes. Red Hat Enterprise Linux 6.3 and SUSE Linux Enterprise Server 11 SP2 will also run on the boxes and will be available on October 9 as well. PowerVM 2.2.2 ships on November 9.
I will be going through IBM i 7.1 TR5 and all of the announcements in great detail. Fear not.