A Hypothetical Future IBM i System
August 31, 2015 Timothy Prickett Morgan
A few weeks ago, in the main story in this newsletter, I showed you the Power processor roadmap running out past the Power10 chip in 2020 and beyond, and I talked about the contrast between the huge amount of processing capacity that IBM is delivering in its Power Systems line and the relatively modest amount of oomph that the vast majority of IBM i shops need to do their daily work and make their daily bread. The gap, as I showed, is quite large, and it will stay that way if current trends in usage growth and capacity growth continue.
Not everyone agreed with my assessment, and as I believe I pointed out, I never meant to suggest that there are not IBM i shops with very large processing and memory capacity needs today and even larger ones in the not-so-distant future. I know there are such customers, who are generally members of the secretive and influential Large User Group, which we told you about a few years back. There are others who are not quite so large and influential who nonetheless need more capacity than an entry machine in the IBM i P05 or P10 software groups. One such reader responded to the story this way:
I think your reasoning in the article is way off! We use every bit of a Power7+ processor and wish we had three times (or more) the speed. Our database on Power i is the core database for a moderate size insurance company. I am constantly pushed by Windows app developers who are retrieving DB2 data from us, and loudly complaining of the lack of speed (not as fast as going direct to their Microsoft SQL Server database). This leads to a fracturing of the data when they copy parts of the data to their servers to then provide instant retrieval in their apps (the core apps are Power i, the warehouse and customer apps are C#-based web apps).
While IBM may have a higher box count at the low end, I suggest you look at revenue at the higher end–and also consider that concentrating on the low end of the market will only enhance the slide of customers moving apps to other servers. IBM i wouldn’t be a contender in SAP apps if only looking at the low end. The low end servers turn over much more slowly than the high end, so a comparison of low end customers with high end customers seems way too simplistic.
Even though our insurance company is in the low end of the larger servers, if we saw IBM pull back from its development path at all, management would jump elsewhere.
Fair enough, but I never said that IBM should–or will–pull back on hardware development. I merely pointed out that for the vast majority of the base, the machines it was building were overkill. By all means, build the big iron and sell it, and I agree that this probably generates a lot more hardware and software revenue. But a product line needs a large feeder base of customers, most of which grow modestly while a few get very large as their businesses take off. The lack of feeders is why the mainframe base has shrunk from tens of thousands of shops to maybe 5,000 or 6,000. We still have somewhere north of 125,000 customers, and I want IBM to keep them–and keep them all happy. Not just the big shops.
As I said in the prior story, according to the experts I talked to, the average IBM i customer has somewhere between 1,000 CPWs and 1,500 CPWs of capacity supporting IBM i–and this is on a single core that can deliver somewhere around 9,900 CPWs of compute capacity. That works out to somewhere between 10 percent and 15 percent of peak CPU. Looking ahead five years to 2020, given trends that have persisted for the past 10 years, the average IBM i shop will need maybe 3,500 CPWs of oomph, but a single core of a Power10 chip will probably deliver something more like 15,000 CPWs. And IBM could deliver a 20-core chip using 10 nanometer processes in conjunction with fab partner Global Foundries, which would yield maybe 575,000 CPWs after some overhead is taken out.
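You can work that arithmetic yourself. Here is a quick sketch that just runs the numbers quoted above; the 2015 figures are the estimates from the experts, and the 2020 figures are projections, not measured data:

```python
# Back-of-the-envelope check of the CPW figures in the text.
# All of the constants come from the article; none are measured data.

PEAK_CPW_PER_CORE_2015 = 9_900      # one current-generation core
AVG_SHOP_CPW_2015 = (1_000, 1_500)  # typical IBM i shop today

low = AVG_SHOP_CPW_2015[0] / PEAK_CPW_PER_CORE_2015
high = AVG_SHOP_CPW_2015[1] / PEAK_CPW_PER_CORE_2015
print(f"2015 utilization of one core: {low:.0%} to {high:.0%}")   # 10% to 15%

# Projected 2020 numbers from the article.
AVG_SHOP_CPW_2020 = 3_500
CPW_PER_POWER10_CORE = 15_000
CPW_PER_20_CORE_CHIP = 575_000  # after overhead, per the article

core_frac = AVG_SHOP_CPW_2020 / CPW_PER_POWER10_CORE
chip_frac = AVG_SHOP_CPW_2020 / CPW_PER_20_CORE_CHIP
print(f"2020 share of one Power10 core: {core_frac:.0%}")         # 23%
print(f"2020 share of a 20-core chip: {chip_frac:.2%}")           # 0.61%
```

The last line is the one that matters: the average shop would be consuming well under 1 percent of a full Power10 chip.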
This is moving in the wrong direction for the customer base. I think if IBM wants to make a killer database engine, it needs to look for some inspiration from supercomputers and graphics cards, oddly enough. I spend a fair amount of time analyzing such machines in my other job over at The Platform, and I keep seeing a design principle that could deliver a machine that is more appropriate perhaps for the kind of work that IBM i shops are doing. Such a machine would have a different balance of compute, memory, and I/O and frankly would look more like a game console or a supercomputer node than a database engine that also runs RPG, COBOL, Java, and PHP code.
Here is the basic concept, which I am learning from studying high performance computing. Processing power is not the limiting factor in performance anymore. Memory capacity (both volatile and non-volatile), memory bandwidth, and I/O bandwidth from the network and storage subsystems are the real issues. We can cram all the cores we want onto the dies, but it is getting harder and harder to keep them fed.
There are probably a couple of ways of dealing with this, but the basic shape of the solution looks the same. Rather than trying to cram as much compute in a node as possible, the idea is to create smaller units of compute (which nonetheless have lots of single-threaded integer and parallel floating point performance) that have 3D stacked, high bandwidth memory right next to the processor. This 3D memory is a bit like the GDDR5 frame buffer memory in a graphics card, conceptually, meant to provide very high bandwidth at relatively low power. But in this case it is proper DRAM that runs in parallel and offers many hundreds of gigabytes per second of bandwidth into and out of the caches of the processor.
In many designs, a chunk of slower and more capacious DRAM memory is off the processor package to boost the local, byte-addressable memory capacity, and in other cases the network interconnect is either on the processor die or in the chip package. Intel’s “Knights Landing” Xeon Phi parallel processors are one example, and Oracle’s future “Sonoma” Sparc T series chips are another, just to name two. The presentation by Intel Fellow Al Gara, one of the creators of the BlueGene massively parallel machines at IBM, is also illustrative. The basic idea Gara has is to go even further, breaking these monolithic chips with lots of cores into smaller processors with more high bandwidth local memory and doing away, more or less, with DRAM that sits a bit further from the processor. The idea is to gang up multitudes of these smaller (but very powerful) compute elements using high speed interconnects. Let me remind you that by 2020, InfiniBand and Ethernet will be running at 200 Gb/sec speeds with latencies in the dozens of nanoseconds.
By the way, such processing elements cannot be lashed together using NUMA coupling like Power8 and earlier chips. Their memory bandwidth is so high that they would swamp whatever ports might be used to link multiple processor complexes together. This is a side effect that you cannot get around except by slowing down the main memory, which defeats the purpose. But we don’t care about that, see. Because we only need one of these compute elements to satisfy the computing needs of 95 percent of the IBM i shops in the world. For those who need more shared memory capacity for their databases and applications, IBM can make big NUMA servers with all of their inherent engineering challenges. My guess is that companies will learn to parallelize their databases and not pay the NUMA hardware tax. So IBM may have to blow the dust off that DB2 Multisystem clustering software from the 1990s to make DB2 for i parallel again. (I would love this, as I have said many times in the past.)
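To make the shared-nothing idea concrete, here is a minimal sketch of how a Multisystem-style parallel database splits work: rows are hash-partitioned across nodes, each node computes an aggregate over its local partition, and a coordinator combines the partial results. The table, column names, and node count are invented for illustration; this is the general technique, not DB2 Multisystem’s actual implementation:

```python
# Sketch of a shared-nothing parallel aggregate, in the spirit of
# DB2 Multisystem. Rows are hash-distributed across nodes, each node
# aggregates locally, and the coordinator sums the partial results.
from collections import defaultdict

def partition(rows, key, n_nodes):
    """Hash-distribute rows across n_nodes by the given key column."""
    nodes = defaultdict(list)
    for row in rows:
        nodes[hash(row[key]) % n_nodes].append(row)
    return nodes

def parallel_sum(nodes, column):
    """Each node computes a local sum; the coordinator adds them up."""
    partials = [sum(r[column] for r in rows) for rows in nodes.values()]
    return sum(partials)

# Invented sample data: insurance policies and their premiums.
policies = [
    {"policy_id": "P-1001", "premium": 1200},
    {"policy_id": "P-1002", "premium": 950},
    {"policy_id": "P-1003", "premium": 2100},
    {"policy_id": "P-1004", "premium": 1750},
]

nodes = partition(policies, "policy_id", n_nodes=3)
print(parallel_sum(nodes, "premium"))  # 6000, same answer as one big node
```

The payoff is that the answer is identical no matter how many nodes the rows are spread across, which is exactly why companies can scale out instead of paying the NUMA hardware tax.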
To build a system out of such a compute node, IBM would layer in several tiers of non-volatile memory: perhaps the new 3D XPoint memory that Intel and Micron Technology have just announced and will ship next year, sitting close to the CPU complex, with cheaper NAND flash memory further away. The whole shebang could be solid state, with no spinning disk drives at all.
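The tiering logic itself is simple in concept. Here is a toy sketch of the placement decision: frequently touched data lands in the fast non-volatile tier near the CPU, cold data in the cheaper flash tier further out. The tier names, the access-count threshold, and the sample blocks are all invented for illustration:

```python
# Toy sketch of tiered data placement across non-volatile memory.
# Tier names and the hot/cold threshold are invented for illustration.

HOT_THRESHOLD = 100  # accesses per interval; arbitrary for this sketch

def place(block_access_counts):
    """Assign each data block to a tier based on how often it is read."""
    placement = {}
    for block, count in block_access_counts.items():
        # Hot blocks go to the 3D XPoint-style tier near the CPU,
        # cold blocks to cheaper NAND flash further away.
        placement[block] = "xpoint_nvm" if count >= HOT_THRESHOLD else "nand_flash"
    return placement

counts = {"orders_index": 5_000, "archive_2009": 3}
print(place(counts))  # {'orders_index': 'xpoint_nvm', 'archive_2009': 'nand_flash'}
```

Real storage management would of course migrate blocks dynamically as access patterns change, but the principle of matching data temperature to memory cost is the same.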
If that doesn’t make the DB2 for i database spit fire and shoot sparks, I don’t know what will.