IBM Puts Future Power Chip Stakes In The Ground
April 12, 2016 Timothy Prickett Morgan
Two important things happened at the OpenPower Summit last week. One of them concerns IBM i shops directly and the other one affects them indirectly and perhaps more importantly. The first is that IBM rolled out an official roadmap for Power chip development for the next five-plus years. The second is that search engine giant Google and cloud operator Rackspace Hosting said that they were collaborating on a design of a future system based on the Power9 chip.
Google has been experimenting with Power8 chips for at least a couple of years, and we suspect (without any corroborating evidence at all except that Google is a curious, methodical, thorough, and innovative engineering company) that it has been ripping apart Power-based machines for quite a bit longer than that to see how its search engine, data analytics, mail hosting, video streaming, and other services might run on Power-based systems compared to the Xeon-based iron it has been deploying for years. The news last week came from Maire Mahoney, an engineering manager at Google and a director of the OpenPower Foundation, who leads Google’s participation now that her boss, Gordon McKean, senior director of server and storage systems, has stepped down as chairman of the OpenPower effort. Mahoney announced that Google had actually ported its infrastructure software stack to the Power platform and that, with a rejigger in a config file, Google’s engineers could write code and deploy that infrastructure software onto Power8 machinery instead of systems based on Intel Xeon processors.
You can read my full report on the “Zaius” server and Google’s thinking behind adopting Power-based systems in its datacenters over at The Next Platform, where I follow the hyperscale, HPC, and cloud markets. I am not going to repeat everything here in The Four Hundred but I will give you some high-level information that is important to you.
The first thing is that the hyperscalers and cloud builders are driving the low end of the Power chip product line, and Brad McCredie, the former president of the OpenPower Foundation and an IBM Fellow and vice president of development in its Power Systems division, was perfectly blunt with me about this. IBM wants to drive down the cost of Power chips to compete with Intel and it wants to drive up the use of all kinds of accelerators to give a collection of Power, FPGA, GPU, and flash accelerators a performance edge over the Xeon-based systems that dominate the datacenter these days.
The Power9 chip that IBM announced as the next big stop on the Power chip roadmap will be implemented in a 14 nanometer process from Globalfoundries and will have a very respectable 24 cores, twice the core count of the current Power8 processors and, we think, overkill for the vast majority of customers who are running IBM i workloads. As we have explained before, the vast majority of IBM i shops have maybe 1 to 4 cores on their Power processor activated and running this grandchild of OS/400 that is being updated this week with a 7.3 release. Yes, there are companies that have larger IBM i systems, and yes, IBM is committed to creating Power9 chips that are aimed at NUMA-style shared memory systems like the Power E870 and Power E880 that are the biggest, baddest boxes that Big Blue makes and that also support IBM i.
Here’s the funny bit to me. The Power9 machine that Google and Rackspace are working on will have two sockets with 24 cores each, plus 32 DDR4 memory slots that will, using 64 GB memory modules, be able to support a maximum of 2 TB of memory. This is a very large machine in its own right, in terms of capacity, and to give you a sense of just how much processing we are talking about, assuming that the clock speeds stay around the same at 3.5 GHz, this two-socket Power9 machine would be rated at about 525,000 CPWs, and with microarchitecture improvements, it might be boosted to 600,000 CPWs. By comparison, the entry Power S814 with four cores on a single Power8 chip activated, and with cores running at 3 GHz, is rated at an aggregate of 39,800 CPWs and tops out at 64 GB of main memory (an artificial limit imposed by IBM). So that is a factor of as much as 15X more compute capacity and a factor of 32X more memory capacity than a typical midrange customer has need of.
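For readers who want to check the arithmetic, here is a minimal sketch in Python that recomputes the memory capacity and the capacity ratios from the figures quoted above. The variable names are made up for illustration, and the CPW numbers for the Power9 machine are the estimates given in this article, not published IBM ratings.

```python
# Figures as quoted in the article; the Power9 CPW values are estimates.
MEMORY_SLOTS = 32            # DDR4 slots in the two-socket Power9 machine
MODULE_GB = 64               # capacity of each memory module, in GB

S814_CPW = 39_800            # four-core Power S814 at 3 GHz
S814_MAX_MEMORY_GB = 64      # memory cap on that entry machine

POWER9_CPW_BASE = 525_000    # estimate at roughly the same clock speed
POWER9_CPW_BOOSTED = 600_000 # estimate with microarchitecture improvements

# Maximum memory: 32 slots x 64 GB = 2,048 GB, or 2 TB.
power9_memory_gb = MEMORY_SLOTS * MODULE_GB
power9_memory_tb = power9_memory_gb / 1024

# Ratios against the entry Power S814.
memory_ratio = power9_memory_gb / S814_MAX_MEMORY_GB
cpw_ratio_base = POWER9_CPW_BASE / S814_CPW
cpw_ratio_boosted = POWER9_CPW_BOOSTED / S814_CPW

print(f"Max memory: {power9_memory_gb} GB ({power9_memory_tb:.0f} TB)")
print(f"Memory ratio: {memory_ratio:.0f}X")
print(f"CPW ratio (base estimate): {cpw_ratio_base:.1f}X")
print(f"CPW ratio (boosted estimate): {cpw_ratio_boosted:.1f}X")
```

Run this way, the 15X compute figure lines up with the boosted 600,000 CPW estimate, while the 525,000 CPW estimate works out to a bit more than 13X.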
The Google and Rackspace machine, which is nicknamed “Zaius” after the orangutan scientist in the movie Planet of the Apes, might have been called King Kong as far as IBM i workloads are concerned. And in fact, we think that IBM will keep the Power8 line alive for quite a long time to service the needs of IBM i shops with more modest needs than those running search engines and deep learning algorithms. We can guess the Power S814, S822, and S824 machines will be in the lineup a long, long time.
This is particularly true because there is not going to be a Power8+ kicker, as many roadmaps had been showing last year. Like, for instance, right here:
Instead, IBM has tweaked the existing Power8 chip to add NVLink interconnects for linking Nvidia Tesla GPU accelerators very tightly to the Power compute and memory complex. So there is no process shrink down to 14 nanometers. This is called the Power8 With NVLink chip now, and the “plus” has been removed from those roadmaps that showed it and conveyed the wrong impression of the chip. (We are not sure if IBM ever intended for there to be a Power8+ or not, but the roadmaps certainly said that and implied, as a plus variant does, that a microarchitecture tweak and a process shrink were coming.)
It doesn’t much matter for IBM i shops, for which a 6-core or 12-core Power8 chip packs more punch than they need for their online transaction processing workloads. That is never why I care about any plus upgrades to existing processors. A process shrink, which happens with most of the plus chips (but not the Power6+ in 2008), usually means that the cost of manufacturing the chip goes down just as IBM is able to make some performance tweaks, and therefore it usually means an improvement in bang for the buck for a new generation. So a Power8+ could have been a chance to make the Power Systems line running IBM i a bit more competitive against Xeon server alternatives out there in the field. This will not happen now, unless IBM cuts prices on existing Power8 gear, which I strongly suggest that it do to keep Power Systems sales growing.
Unfolding The Latest Roadmap
The roadmap that IBM unveiled at the OpenPower Summit last week has a bit more detail on it than the one that I was able to get my hands on last summer, which is shown below:
This fuzzy roadmap (meaning the quality of the image, not the thinking) had not been made public yet and was given out to customers in the HPC market where IBM is trying to peddle hybrid machines that mix Power processors and Tesla accelerators and offload heavy duty, massively parallelized floating point calculations from the CPUs to the GPUs. There is not much need of this sort of thing at IBM i shops yet, but there is talk of better parallelizing Java so it can offload sorts and other routines to GPUs and thereby goose the overall performance of Java applications.
What we learned this week is that IBM will be delivering two different Power9 chips, one aimed at machines with one or two sockets, which IBM calls “scale out” machines, and the other aimed at larger NUMA systems with four or more processor sockets sharing memory, which IBM calls “scale up” machines. The future chips are called Power9 SO, due in the second half of next year, and Power9 SU in the table below:
Both Power9 chips will support DDR4 memory, plus enhanced versions of IBM’s Coherent Accelerator Processor Interface (CAPI) and of the NVLink interconnect, which is used to hook Nvidia Tesla coprocessors directly to the Power processor complex, allowing them to share memory. (NVLink is also used to hook multiple GPUs together in a manner that allows them to share data very quickly without going all the way to NUMA clustering across GPUs.) The Power9 SO chips will allow for non-buffered memory to link directly to the memory controllers on the Power9 die, rather than requiring the external “Centaur” memory buffer chip that current Power8 machines have. The Power9 SU machines will use buffered memory because they will presumably have a much larger memory capacity that will be helped by more memory and the buffering of a follow-on Centaur chip. These chips will be used in the kickers of the current Power E850, Power E870, and Power E880 systems.
McCredie said he was keeping IBM’s options open beyond these two Power9 SO and Power9 SU chips, and he did not divulge the core count or the process technology for the Power9 SU. I think it will probably be 14 nanometer processes in 2018 for the Power9 SU, with that chip having fewer cores and higher clocks for big iron jobs, and that there will be a set of Power9+ kickers that implement the 10 nanometer processes from Globalfoundries maybe starting in 2019 for the Power9+ SO and maybe in 2020 for the Power9+ SU. Out past 2020 comes the Power10, with a 10 nanometer process according to the earlier roadmaps shown above, which will sport both a new microarchitecture and a new process. I think, as I have said before, that a Power11 chip should be expected using 7 nanometer processes by around 2023 or so, and that there will be pluses and other tweaks between 2020 and 2023 for the Power10 to fill in the gap.
All of this presupposes that IBM and its OpenPower partners get hyperscalers, HPC shops, and cloud builders to adopt Power chips over Xeon chips, and the addressable market for new customers is really only for Linux, which represents about a third of shipments in datacenters and on the public cloud these days. IBM basically needs to eat half of the Linux market to reach its goals of being a player again in the server racket.
That is a tall order, and putting out a roadmap like the one above is the next step after founding the OpenPower Foundation two and a half years ago. The Google and Rackspace endorsements are a big deal, and now it is China’s turn, perhaps followed by Facebook and Apple.
The thing to remember is that all of this Power activity benefits IBM i. Anything that makes Power stronger makes IBM i live longer.