Wanted: Power 745 M3 For IBM i SMBs
March 19, 2012 Timothy Prickett Morgan
Call it a case of server envy. While I think IBM has the right server design for small and medium businesses that employ its workhorse IBM i operating system and database double whammy, it has unfortunately put the wrong processor in that machine. I am, of course, talking about Intel's long-awaited "Sandy Bridge-EP" Xeon E5-2600 processor, which debuted two weeks ago and started shipping in System x machines last Friday.
The Xeon E5-2600 processors, also known by their internal code-name "Jaketown," are designed for two-socket servers in the same power class as the Power 740 machine in the Power Systems lineup. The E5-2600, just like the Power7 chips used in the Power 740s, is available with four, six, or eight cores, and packs up to 20 MB of L3 cache onto the chip thanks to Intel's ability to cram 2.26 billion transistors onto the die using 32 nanometer chip-etching processes. The Intel cores run at anywhere from 1.8 GHz to 3.3 GHz, depending on the number of cores and the amount of cache activated, and energy consumption ranges from a low of 60 watts for a six-core part running at 2 GHz to 135 watts for an eight-core part running at 2.9 GHz. The chips include a new ring interconnect that debuted in the higher-end Xeon E7 processors and will make its way in modified form to the "Poulson" Itaniums later this year. A new I/O hub has been put on this ring, alongside the ports for the cores, the DDR3 main memory controller, the L3 cache segments, and two QPI links. That I/O hub includes links out to the "Patsburg" C600 chipset and also sports two integrated PCI-Express 3.0 I/O controllers, right there on the chip. The Xeon E5-2600 is not just the first chip to support this much-faster I/O scheme--which has twice the I/O bandwidth of the PCI-Express 2.0 ports that IBM just put into some of its Power Systems last fall--it is also the first chip to bring a PCI-Express 3.0 controller onto the die. (Other manufacturers have put PCI-Express 1.0 or 2.0 controllers on their chips, but mainstream processor makers have not done so.)
The Power7 chips, by comparison, weigh in at 1.6 billion transistors and have a crossbar interconnect linking the L3 cache segments and the cores together. They run at higher clock speeds, ranging from 3 GHz to 4.25 GHz, and generally speaking, a Power chip can do more work than an Intel chip, clock for clock, based on various public benchmark tests. (I think the gap might be closing as Intel really tweaks its X86 microarchitecture.)
The I/O functions and PCI-Express controllers are off chip with the Power7, residing in the Power chipsets, while the DDR3 memory controllers are on the chip. With the October 2011 partial revamp of the Power Systems line, the Power Systems Gen 2 machines, as I call them, had their I/O boosted to the PCI-Express 2.0 level (which was used by prior Xeon server generations) and had their memory capacity doubled up by supporting fatter 16 GB memory sticks. In a Power 740 server, the base machine has four PCI-Express 2.0 slots with eight lanes apiece (designated x8 in PCI-speak) and an option for four more low-profile PCI-Express 2.0 x8 slots. That's a total of 64 lanes of traffic, and each lane has a raw bandwidth of 500 MB/sec per direction. So you get 64 GB/sec of bandwidth max out of those PCI slots. PCI-Express 3.0 slots use a more efficient encoding method (128b/130b instead of the 8b/10b used by PCI-Express 2.0), which allows each lane to deliver close to 8 Gb/sec of usable bandwidth even though the raw signaling rate only increases from 5 gigatransfers per second (GT/sec) to 8 GT/sec. The upshot is that with PCI-Express 3.0 links, IBM could, in theory, double the raw I/O throughput of the eight x8 slots in a Power 740 to around 128 GB/sec.
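The arithmetic behind those figures is simple enough to sketch. Here is a back-of-envelope calculation, assuming the lane counts above and counting both directions of traffic, which is the convention the article's totals use:

```python
# Back-of-envelope PCI-Express bandwidth math for the Power 740's slots.
# Per-lane throughput = signaling rate x encoding efficiency.

def lane_gbps(gt_per_sec, payload_bits, total_bits):
    """Usable Gb/sec per lane per direction after encoding overhead."""
    return gt_per_sec * payload_bits / total_bits

gen2 = lane_gbps(5.0, 8, 10)     # PCIe 2.0: 5 GT/sec, 8b/10b  -> 4.0 Gb/sec
gen3 = lane_gbps(8.0, 128, 130)  # PCIe 3.0: 8 GT/sec, 128b/130b -> ~7.88 Gb/sec

lanes = 8 * 8  # eight x8 slots in a Power 740 = 64 lanes

# Convert Gb/sec to GB/sec (divide by 8) and count both directions (x2).
gen2_total = lanes * (gen2 / 8) * 2   # 64.0 GB/sec
gen3_total = lanes * (gen3 / 8) * 2   # ~126 GB/sec, i.e. "around 128"

print(f"PCIe 2.0, 64 lanes: {gen2_total:.0f} GB/sec")
print(f"PCIe 3.0, 64 lanes: {gen3_total:.0f} GB/sec")
```

The 128b/130b encoding is why PCI-Express 3.0 nearly doubles throughput on a signaling rate that is only 60 percent higher.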
To be fair, IBM has this genius thing called 12X I/O loops, which implement double data rate (20 Gb/sec) InfiniBand links between the Power7 processor, its I/O subsystems, and remote I/O drawers. The Power 740, which is the machine I am concerned about in this story, can have two of these 12X loops and up to four I/O drawers hanging off them, for a total of 44 PCI-Express 2.0 slots. I have no idea how hard the I/O subsystem inside the Power 740 can push and pull all that I/O. But there sure is a lot of peripheral expansion.
What I can tell you is that a two-socket Xeon E5-2600 server has a total of 80 lanes of base I/O, mostly x8 slots, coming directly off the on-chip PCI-Express controllers, with even more possible coming off the C600 chipset if server makers want to add more I/O peripherals. (That's 40 lanes per socket.) An older Xeon 5600 server configured with a mix of PCI-Express 2.0 peripherals had 32 lanes running at 500 MB/sec per direction, or 32 GB/sec of total bandwidth. Now, with the integrated PCI-Express 3.0 controllers on the Xeon E5-2600, Intel has 80 lanes good for 160 GB/sec--in theory. In practice, according to internal Intel benchmarks I have seen, using a read/write I/O benchmark Intel could drive about 17.7 GB/sec on the old PCI-Express 2.0 slots with the Xeon 5600s and can now drive 81.6 GB/sec on the new Xeon E5-2600 with 80 lanes. It is early in the PCI-Express 3.0 cycle, and with tuning the effective real bandwidth will only go up.
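It is worth asking how close those cited benchmark numbers come to the theoretical peaks. A rough check, assuming 32 lanes of PCI-Express 2.0 at 500 MB/sec per direction on the Xeon 5600 box and 80 lanes of PCI-Express 3.0 on the E5-2600 (the benchmark figures are the ones quoted above):

```python
# Measured vs theoretical I/O bandwidth for the two Xeon generations.
# Theoretical totals count both directions; benchmark numbers are from
# the Intel tests cited in the text.

pcie2_theoretical = 32 * 0.5 * 2             # 32 lanes x 0.5 GB/sec/dir x 2 = 32 GB/sec
pcie3_per_lane = 8.0 * 128 / 130 / 8         # ~0.985 GB/sec per direction per lane
pcie3_theoretical = 80 * pcie3_per_lane * 2  # ~157.5 GB/sec, rounded to 160 above

measured_old = 17.7   # GB/sec on Xeon 5600 with PCIe 2.0
measured_new = 81.6   # GB/sec on Xeon E5-2600 with PCIe 3.0

print(f"Xeon 5600:    {measured_old / pcie2_theoretical:.0%} of peak")
print(f"Xeon E5-2600: {measured_new / pcie3_theoretical:.0%} of peak")
print(f"Speedup:      {measured_new / measured_old:.1f}x")
```

Both generations land in the same rough 50-to-55 percent efficiency band on this benchmark, so the 4.6X jump in delivered bandwidth comes almost entirely from the extra lanes and the faster encoding, not from better tuning--which is why there is still headroom as the PCI-Express 3.0 cycle matures.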
If you don't think this I/O bandwidth is important, you're wrong. What you really want is to run as much of your IBM i operating system and DB2 for i database in main memory as you can, then run the rest on flash, with disk reserved for only the coldest of data. And perhaps more importantly, entry IBM i customers want a mix of flash solid state disks and hard disk drives all inside the skins of the server, directly attached to the system and hot-pluggable into the box, without having to buy HSL loops, external I/O drawers, and all that jazz.
As it turns out, I think that what a lot of entry IBM i customers want is something that looks a whole lot less like a Power 720 or a Power 740 and a whole lot more like the two-socket System x3500 M4 that IBM started shipping last week for small and medium business customers.
The System x3500 M4 is a bit bigger box, eating up 5U of space when it is configured for a rack. It is also available as a tower server, which is important to many IBM i shops.
The Power 740 could be the workhorse IBM i server if its software tier were set at P05 or P10 instead of P20. Like the Power 720--the current workhorse, which sits in the P05 tier for entry machines so long as you only use four cores; activate six or eight cores and you jump to the P10 tier, which is a big jump in software costs for both IBM and ISV software--the Power 740 only has six disk drives. The Power 740 has eight DDR3 memory slots per socket, and the System x3500 M4 has 50 percent more, at a dozen per socket. So the Power 740 tops out at 256 GB of main memory using 16 GB sticks--new since last October--while the System x3500 M4 tops out at 384 GB using 16 GB memory modules; and if you want to move to load-reduced DDR3 main memory and 32 GB memory modules, you can push that two-socket box all the way up to 768 GB. The new Xeon E5 workhorse from IBM has more I/O capacity, more memory capacity, and, with 32 2.5-inch disk bays, a heck of a lot more room for disks and SSDs.
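Those memory ceilings are just slots times module size. A quick check, using the DIMM counts from the text (eight per socket on the Power 740, a dozen per socket on the System x3500 M4, two sockets in each machine):

```python
# Maximum main memory = sockets x DIMM slots per socket x module capacity.

def max_memory_gb(sockets, dimms_per_socket, module_gb):
    return sockets * dimms_per_socket * module_gb

power_740 = max_memory_gb(2, 8, 16)    # 256 GB with 16 GB sticks
x3500_16g = max_memory_gb(2, 12, 16)   # 384 GB with 16 GB modules
x3500_32g = max_memory_gb(2, 12, 32)   # 768 GB with 32 GB LR-DIMMs

print(power_740, x3500_16g, x3500_32g)  # 256 384 768
```

The 50 percent DIMM-slot advantage turns into a 3X memory-capacity advantage once the fatter load-reduced modules enter the picture.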
If IBM would just pop out the Xeon E5-2600 processors and C600 chipset from Intel and slap in two eight-core Power7 processors and whatever it calls the Power7 chipsets--I heard they were named after planets orbiting Sol, but could never confirm that--I think it would have a real Belgian draft horse of a server for IBM i shops. Add in support for PCI-Express 3.0 peripherals and it would be future proof as well as backward compatible with PCI-Express 2.0 and 1.0 peripherals. If there are indeed Power7+ chips in the works, those would be a good idea, too, particularly if the clock speeds are higher and the per-thread performance is better.
And if this theoretical machine--let's call it the Power 745 M3--were at the P05 tier, maybe it would get a slew of shops who are running on older iron with OS/400 V5R3 or i5/OS V5R4 to finally upgrade to IBM i 7.1 and even add new workloads to the box. Three years of support for IBM i costs essentially the same as the base processors and core activations on the machine, and the IBM i licenses activated across the four, six, eight, 12, or 16 cores in the machine cost 10 times as much as the processor cards and core activations. This two-socket box is simply not a P20 machine, no matter what IBM or ISVs want to believe. And all you have to do is compare it to Intel iron to see that.
What I want--and what IBM needs, and needs to understand--is a larger, sustainable IBM i business built on competitive hardware and software pricing. You can make X dollars over 10,000 customers who are grumpy about high software prices and somewhat higher hardware prices and who are the only ones who keep current, or you can make maybe 2X dollars over 100,000 customers who are very happy with the platform and who not only believe, but absolutely know, that they are getting a competitive deal they can defend to any bean counter who kicks up a fuss.
And so I want this Power 745 M3 for you. Maybe the third time will be a charm.