More Thoughts On A Hybrid System Of Systems
March 14, 2016 Timothy Prickett Morgan
If IBM is going to take on the hegemony of Intel in the datacenter, it is going to have to do a lot more than just crank up the core count of the Power processors and license them to partners so they can make their own chips and machines. That is a good start, but what IBM and its OpenPower partners need to do is make a new kind of system that brings all of the current workloads in the datacenter together and also allows for new workloads to be run natively and in an accelerated fashion on top of a Power-centric platform.
To a certain extent, the current OpenPower strategy of mixing GPU and FPGA accelerators as well as directly attached flash memory with Power8 and follow-on processors into a relatively tightly coupled hybrid system represents the kind of machine I think Big Blue needs to bring to market. But it is missing one element, and possibly two. In addition to Power processors, this hybrid system needs something so obvious it is ridiculous. And that is the ability to run X86 code natively, and for the future, the ability to run ARM code as well.
The simplest way to do this, of course, is to add literal X86 and ARM cores to the Power8 processing complex, using something like the Coherent Accelerator Processor Interface (CAPI) that is part of the Power8 processors. With CAPI, IBM adds what is essentially a “hollow core” to the processing complex, which allows other kinds of computing elements to be added to the system and address the cache and main memories on the Power8 or a collection of Power8 chips linked by NUMA electronics. By making these coprocessors a part of the coherent memory space of the Power8 processor, the programming model for hybrid applications can be radically simplified and the performance substantially boosted.
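The payoff of that coherence is a simpler programming model: the accelerator works on the host's data in place rather than through staged copies into a separate device address space. Here is a toy Python sketch of the idea, with a plain worker thread standing in for a CAPI-attached device; the names and structure are illustrative, not IBM's actual API.

```python
# Toy sketch: a "coherent accelerator" (here just a worker thread standing
# in for a CAPI-attached FPGA or GPU) operates on the host's buffer in
# place -- no staging copies into a separate device address space, which
# is the programming-model simplification that coherence buys.
import threading

def accelerator(buf):
    # Stand-in for the device: reads and writes host memory directly.
    for i in range(len(buf)):
        buf[i] *= 2

buf = list(range(8))          # "host" data
worker = threading.Thread(target=accelerator, args=(buf,))
worker.start()
worker.join()

# The host sees the accelerator's results with no copy-back step.
print(buf)                    # [0, 2, 4, 6, 8, 10, 12, 14]
```

Contrast this with the copy-in, compute, copy-out cycle of a non-coherent offload model, where the host and device each hold their own version of the data and the programmer shoulders the synchronization.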
I am not suggesting that Power and X86 cores should be run in a hybrid fashion, per se, but rather that they would operate in parallel and side by side as they do in the datacenter, but with very fast data sharing between the two different kinds of processors. This is, as I pointed out last week, precisely how the original AS/400 with its I/O processors, based on Motorola 68000 processors, functioned. This was what is called an asymmetric multiprocessor, or AMP, and I happen to think this is a good model and, frankly, something that others such as Nvidia have recreated with the CUDA programming environment for GPU accelerators. The memory model was a little looser on these IOPs than on the Power8 plus CAPI accelerators, to be sure, but the analogy still holds.
The one thing we want is for IBM to create a system that actually delivers the hybrid processing that is native in IBM i and AIX shops, which have boatloads of X86 processors surrounding their mission critical systems. Rather than trying to port Windows Server natively to Power chips–which we think would be tough for IBM to get Microsoft to do at this late juncture, since Microsoft pulled the plug on that 13 years ago after a brief shining moment when Windows NT was available on PowerPC systems from Big Blue–IBM could license custom Opteron processors from Advanced Micro Devices and create a modern version of the File Serving I/O Processor that Big Blue debuted in 1994 to run Windows, NetWare, OS/2, and Linux workloads.
The trick to adding X86 engines to the Power processing complex would be to make them essentially invisible, to make it look as much as possible like Windows Server and Linux were running natively on the Power processor even though they were actually running on the X86 modules. That would mean making the PowerVM or OpenKVM hypervisor span two different architectures–and three once ARM processors were added into the complex. The easiest thing to do would be to have all three types of processing elements running the KVM hypervisor and put the OpenStack cloud controller on them to provide a unifying framework. IBM already does this with the PowerVC implementation of OpenStack, so this is not much of a stretch. I am merely suggesting that IBM do this across multiple processors within a machine as opposed to across server nodes. Or, more precisely, within machines and across machines.
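OpenStack already has the machinery for this kind of architecture-aware placement in its scheduler filters, which match a workload's required CPU architecture against what each host offers. A minimal Python sketch of that placement logic follows; the host names, core counts, and workload shapes are hypothetical, and the real Nova scheduler is considerably more elaborate.

```python
# Minimal sketch of architecture-aware placement across a mixed pool of
# Power, X86, and ARM compute elements, in the spirit of OpenStack's
# scheduler filters. Host names and workloads are hypothetical.
hosts = [
    {"name": "frame1-power", "arch": "ppc64le", "free_cores": 24},
    {"name": "frame1-x86",   "arch": "x86_64",  "free_cores": 64},
    {"name": "frame1-arm",   "arch": "aarch64", "free_cores": 48},
]

def place(workload, pool):
    """Pick the host matching the workload's required architecture with
    the most free cores, or None if nothing in the pool fits."""
    candidates = [h for h in pool
                  if h["arch"] == workload["arch"]
                  and h["free_cores"] >= workload["cores"]]
    if not candidates:
        return None
    best = max(candidates, key=lambda h: h["free_cores"])
    best["free_cores"] -= workload["cores"]   # claim the capacity
    return best["name"]

print(place({"arch": "x86_64",  "cores": 8}, hosts))  # frame1-x86
print(place({"arch": "ppc64le", "cores": 4}, hosts))  # frame1-power
```

The point of the sketch is that whether the "hosts" are separate server nodes or processor cards inside one machine is invisible to the placement logic, which is exactly the unification being proposed here.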
The key thing is to put processors on similar types of processor cards that already fit into Power Systems machines, or future ones, and allow customers to mix and match them as they see fit. If the current mix of Power to X86 processors is 1 to 10 or 1 to 20, then this hybrid system has to be able to reflect this. We want to be able to take racks and racks of servers down to a few enclosures and have the Power, X86, and ARM processors share as many components as possible. This is what blade server designs originally promised but did not take far enough.
The idea would be to radically lower the cost of X86 processing on this hybrid machine compared to actually using real two-socket Xeon servers. And in fact, I think that most workloads that are running on two-socket Xeon servers today only need something on the order of 64 GB to 256 GB of main memory–look at Facebook’s server configurations to know this is true–and even an older generation Opteron processor can easily do this. Future “Zen” Opteron processors will be able to address a lot more. IBM could easily license ARM cores and design its own chip or simply use one from Applied Micro, Cavium, Broadcom, Qualcomm, or AMD. At this point, AMD’s “Seattle” Opteron A series ARM chips are not all that impressive compared to those from Applied Micro and Cavium, and we won’t see what Broadcom and Qualcomm are up to until the end of this year or so.
The issue for IBM is whether or not it can add such hybrid computing in an affordable fashion for customers and in a way that is profitable for itself. The other thing it does not want to do is dilute its marketing message that it is interested in moving as many workloads as possible over to Power processors. The thing is, IBM is not trying to attack the entire market with the hybrid machine I am proposing, but rather the 125,000 or so customers that have IBM i as their primary platform and that have a slew of Windows Server machines doing other work. My presumption is that there is value in Power, X86, and ARM processors being able to share data quickly over high speed buses, much as logical partitions do over the memory bus that underpins PowerVM networking. Maybe these workloads do not share data that much, or maybe they would if it was easier to do so. I also presume that future workloads might be less siloed than they currently are, and that companies would like to have a mix of compute elements to have the right type of processor or coprocessor to fit the workload. It should be easy to swap out an X86 chip for a GPU or FPGA in such a system, but the Power chip remains at the center, providing memory access and communication across processing elements.
This is a packaging and interconnect problem more than anything else, I think. Fairly modest machines would cover the needs of the IBM i and AIX markets, and from that established base of happy customers, IBM could grow from there. The systems would have to have converged and virtualized SAN storage, which is the modern way to do things. Forget external disk arrays; that is the past. It would also have to employ technologies like NVM-Express links over PCI-Express to talk to flash and disk storage for the best possible storage performance. There is no reason to cut back on the networking, either, so 100 Gb/sec Ethernet and InfiniBand should be available from the get-go to appeal to both enterprise and supercomputing workloads.
If I had my way, such a system would mask NUMA interconnect differences and memory access methods and present all compute elements as peers, and you could build a box with all X86 processors if you wanted. That would be too much like IBM getting back into the X86 server business, of course. But maybe it should have designed such an asymmetrical machine from the beginning, and it would not have put itself in the bind it did. IBM, of all the system vendors, knew better.
But, hope springs eternal. And IBM can still create a new kind of system–what it once called a system of systems–but one that doesn’t put its overly expensive but still elegant and powerful System z mainframes at the heart of that cooperative computing complex. Maybe, just maybe, all compute should be peers in a system of systems, sharing memory and work as needed. Like all of the processing elements in an AS/400 before IBM decided to rearchitect it with dumb peripherals and centralized processing to reflect the dominant system design in the Unix market in the late 1990s and early 2000s. IBM brilliantly vanquished Oracle/Sun and Hewlett-Packard from the Unix space, but it lost the server war. It can, however, engineer a new kind of system that does something that no one else is doing.
That is what we would expect International Business Machines to be able to do.