Impending Xeon Blades and Racks Offer Flexible SMP, Memory
March 8, 2010 Timothy Prickett Morgan
Just when you think you have IBM figured out, it does something unexpected–and interesting. In previewing its upcoming rack and blade servers based on Intel's eight-core "Nehalem-EX" Xeon processors and Big Blue's own eX5 chipset, IBM took a chip that is aimed at high-end servers with many sockets and tweaked it to make midrange machines that offer processor and memory scalability independent of each other.
The Nehalem-EX processors, which sport eight processor cores with HyperThreading, on-chip DDR3 main memory controllers, and point-to-point QuickPath Interconnect links, were designed, conceptually at least, to put the Xeon family of servers into the same big iron class as Itanium, Power, mainframe, and Sparc alternatives. But IBM has a Power and mainframe base to protect, and so it is twisting the Nehalem-EX processors and its own eX5 chipset to make two-socket and four-socket boxes. At least at first.
The eX5 chipset and its predecessors in the xSeries and System x lineup are based on SMP/NUMA clustering technologies that IBM got when it bought Sequent Computer Systems back in 1999 for $810 million. Sequent figured out a way to use a special ASIC to glue together multiple X86 servers into something that looked approximately like an SMP box to Windows, and IBM used some of this technology to glue multiple Power5, Power6, and now Power7 boxes together to create the 570-class of machines. The System x and Power 570 machines lash two, three, or four multiple-socket servers together into a single system image with a single main memory for the operating system and applications to play in. The chipset implements special SMP ports on the system boards of the servers, allowing the servers to be linked to each other in a cache coherent manner through external fiber optic cables.
With the eX5 chipset that IBM previewed last week and that will no doubt play a starring role in Intel's Nehalem-EX processor launch later this month, IBM could have put four or eight sockets of Nehalem-EXs in a box and linked four boxes together in a single system image, to create a 128-core or 256-core behemoth with 256 or 512 threads. (For perspective, the kicker to the Power 595, presumably to be called the Power 795, will sport 256 cores and 1,024 threads.) But instead, according to Tom Bradicich, vice president of systems technology at IBM's Systems and Technology Group, Big Blue decided that making malleable servers that compete better against other X64-based servers was more important–and will no doubt be more profitable.
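The core and thread counts for those hypothetical glued-together configurations fall out of simple multiplication; here is a quick sketch using the per-chip figures from the article (the function name and structure are mine, for illustration only):

```python
# Back-of-the-envelope totals for hypothetical multi-node eX5 boxes.
# Per the article: each Nehalem-EX chip has 8 cores, and HyperThreading
# gives 2 hardware threads per core.
CORES_PER_CHIP = 8
THREADS_PER_CORE = 2

def totals(sockets_per_node, nodes):
    """Return (cores, threads) for a cluster of cache-coherent nodes."""
    cores = sockets_per_node * nodes * CORES_PER_CHIP
    return cores, cores * THREADS_PER_CORE

print(totals(4, 4))  # four 4-socket nodes -> (128, 256)
print(totals(8, 4))  # four 8-socket nodes -> (256, 512)
```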
The upcoming eX5 machines will come in three flavors at first: a two-socket rack machine, a four-socket rack server, and a two-socket blade server. Using a feature of the chipset called FlexNode, two of these units can be glued together in an SMP configuration, giving you what amounts to a four-socket rack, an eight-socket rack, and a four-socket blade. The interesting bit is that FlexNode, which will be controlled by a plug-in for IBM's Systems Director systems management tool, will allow this fatter SMP to be created on the fly, or a fat SMP, once created, to be broken back into two physically distinct servers. It would be interesting to see how far IBM can take this idea.
Another feature, called Max5, takes one of those fiber optic pipes coming out of the eX5 chipset and, instead of putting a server node at the end, puts a node with nothing but memory slots. A lot of workloads are not CPU bound, but memory bound, and with the Max5 feature, System x shops will be able to scale the memory in their machines beyond the basic number of slots in the machine. Perhaps more importantly, if you add more memory slots to a machine, you can use less dense, and a lot less expensive, memory DIMMs for a given amount of capacity.
According to Bradicich, the two-socket rack server will have 32 DDR3 memory slots on its system board and the four-socket rack server will have 64 slots; a single 1U memory expansion box with 32 memory slots can be linked into these rack servers through the eX5 chipset, doubling up the memory on the two-socket box and boosting it by 50 percent on the four-socket machine. On the blade server, the two-socket Nehalem-EX blade server will have 16 memory slots, and its companion Max5 blade will have 24 additional slots, offering 150 percent more memory capacity. Or, as I said, the extra memory chassis or blade can be used to reduce the price of a given capacity. On the street, 8 GB DDR3 DIMMs cost over $1,000, and 4 GB DIMMs cost one quarter of that amount. So switching to 4 GB DIMMs for a given capacity can cut the memory price in half. The same rough proportion holds for 2 GB DIMMs compared to 4 GB DIMMs.
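The DIMM economics above are easy to check with a little arithmetic; this sketch uses the article's rough street prices (real prices vary, and the function name and slot counts plugged in below are illustrative):

```python
# Rough street prices from the article: 8 GB DDR3 DIMMs at roughly
# $1,000, 4 GB DIMMs at about a quarter of that. Illustrative only.
DIMM_PRICE = {8: 1000, 4: 250}  # capacity in GB -> price in dollars

def memory_cost(capacity_gb, dimm_gb, slots):
    """Cost of reaching capacity_gb with DIMMs of size dimm_gb,
    if that many DIMMs fit in the available slots."""
    needed = capacity_gb // dimm_gb
    if needed > slots:
        raise ValueError("not enough slots; add a Max5 expansion node")
    return needed * DIMM_PRICE[dimm_gb]

# 256 GB in a two-socket rack using its 32 on-board slots:
print(memory_cost(256, 8, 32))   # 32 x 8 GB DIMMs -> $32,000
# With a Max5 box doubling the slots to 64, cheaper 4 GB DIMMs fit:
print(memory_cost(256, 4, 64))   # 64 x 4 GB DIMMs -> $16,000
```

As the article says, doubling the slot count lets you halve the memory bill for the same capacity by dropping down one DIMM density.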
The FlexNode and Max5 features can be combined in any given system, but for now, IBM is only supporting one Max5 feature per SMP setup, whether it has one or two nodes in it. Why this is the case, Bradicich would not say. It would be mathematically appealing to have everything pair up evenly.
It is not clear if IBM's own chipsets for its Power7 processors and systems could support features like FlexNode and Max5, but such features would no doubt come in handy. IBM has added 2:1 memory compression on Power7 systems running AIX with a feature called Active Memory Expansion. This memory compression is implemented in the memory controller on the Power7 chips, but it is not supported by i 6.1.1, and it is not clear if it will be supported with the impending i 7.1.