The Application RISC Machine System/500
February 27, 2012 Timothy Prickett Morgan
Let’s have a little fun here. Those of us who have made a living in the market for AS/400 systems and their progeny on down to the current Power Systems-IBM i platforms spend a lot of time looking back at the past and pointing out all of the innovations that IBM cooked up in the System/38 and AS/400 product lines since the dawn of the commercial computing era in the midrange. We spend a lot of time talking about keeping legacies alive, but we don’t spend enough time talking about building a new machine that will be worthy of being a legacy three or four decades from now.
That’s kinda stupid. And short-sighted. I learned this week that I am 20/25 in the right eye and 20/60 in the left, with a whole lot of astigmatism and a low-resolution retina–talk about good software and hardware, look at what your brain and a little glass can compensate for–and even I can see that with so many new kinds of workloads and so many different hardware and software technologies, there has never been a better time to create a sequel to the AS/400.
And if IBM doesn’t build it, well maybe someone else will.
The great thing about the AS/400 and its daddy, the System/38, is that it took the most advanced ideas of the time and wove them into a coherent and sophisticated computing platform that mere mortals could use. Experts in business with some programming skills could make sophisticated applications that rode atop a relational database–way high tech back in 1978 when the System/38 came out and still high tech a decade later when the AS/400 debuted–and do so in such a manner that you didn’t need to be a relational database expert to make it all work. And, you didn’t need a database administrator to constantly tweak and tune the database and manage the schemas and data structures, either. That was all done by the database management system itself. To a certain extent, we could have called it the Automated System/400 as much as the Application System/400.
Here’s some other smarts that people outside of the AS/400 world still don’t appreciate enough. The first, of course, was single-level storage, where main memory and disk memory were treated like one giant address space and the operating system watched data access paths and decided automagically what portions of data should be stored in main storage and what should be on hard disk drives. You didn’t have your programmers messing around in the areas between the operating system and their code, moving data around in a basically manual fashion. You let the operating system do the work, as it should do.
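To make the idea concrete, here is a toy sketch in Python of what single-level storage looks like from the program’s point of view: one flat address space, with a paging layer quietly deciding which pages sit in main storage and which get pushed out to disk. All the names here are invented for illustration, and this is nothing like the real AS/400 internals:

```python
from collections import OrderedDict

class SingleLevelStore:
    """Toy single-level storage: one flat address space backed by a small
    pool of RAM page frames, with overflow transparently spilled to 'disk'."""

    def __init__(self, ram_pages, page_size=4096):
        self.page_size = page_size
        self.ram_pages = ram_pages   # how many page frames fit in main storage
        self.ram = OrderedDict()     # page number -> bytearray, in LRU order
        self.disk = {}               # page number -> bytearray

    def _page_in(self, pno):
        """Return the page, faulting it in from disk (and evicting) as needed."""
        if pno in self.ram:
            self.ram.move_to_end(pno)            # mark most recently used
            return self.ram[pno]
        data = self.disk.pop(pno, None)
        if data is None:
            data = bytearray(self.page_size)     # brand-new, zero-filled page
        while len(self.ram) >= self.ram_pages:
            old, old_data = self.ram.popitem(last=False)  # evict the LRU page
            self.disk[old] = old_data                     # write it back to disk
        self.ram[pno] = data
        return data

    def write(self, addr, payload):
        for i, byte in enumerate(payload):
            page = self._page_in((addr + i) // self.page_size)
            page[(addr + i) % self.page_size] = byte

    def read(self, addr, length):
        return bytes(self._page_in((addr + i) // self.page_size)[(addr + i) % self.page_size]
                     for i in range(length))
```

The program just reads and writes addresses; the store handles eviction and page-in behind its back, which is exactly the division of labor the AS/400 enforced between applications and the operating system.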
Ditto for the original concept of the file system on the AS/400, where everything was stored in the database. Until 1995, there was no such thing as an external ASCII file system. If it got stored on the AS/400, whether it was an image file or a sound sample or ASCII data from a PC, it was stored in DB2/400 (what we now know as DB2 for i with IBM i 7.1). In 1995, concurrent with the shift to big and fast 64-bit PowerPC processors, IBM grafted a file system derived from the OS/2 High Performance File System onto OS/400–what we now know as the Integrated File System–and suddenly there were many different ways to store data. The trouble with storing data in DB2/400 was that the original CISC processors of the 1980s and early 1990s were terribly slow, and file system functions performed by the database were not fast enough for the client/server approach to keeping Windows-based PCs fed with data. Rather than try to figure out ways to improve the performance of the database–and therefore keep all data in the system reachable through any programming language and indexable by the database–Big Blue threw up its hands and basically converted the AS/400 into an RS/6000 running an OS/400 database and runtime.
It’s true. Think about it. The original AS/400s were very modestly powered CISC processors surrounded by an army of smart I/O controllers–remember these, I/O Processors, or IOPs?–that ran a portion of the operating system and did actual processing or preprocessing for the CPU as data flowed in and out of the system. This was a form of asymmetric processing, and IBM used Motorola 68000 series processors in most of its IOPs and, I have long suspected, a licensed version of the 68K chip as the foundation of the CISC CPUs at the heart of the systems. I could never prove it definitively, but it makes sense that the instruction set and microcode running on the CPU would be compatible with that running on the IOPs, and the easiest way to ensure that would be to use the same processor architecture throughout the system.
The other innovation–and a key one, at that–was the System Licensed Internal Code (SLIC) microcode that sits atop the hardware in an AS/400 and implements the Technology Independent Machine Interface (TIMI), which presents system APIs and an abstracted view of the hardware (I hesitate to say virtualized, but that is what it really is) to OS/400. These layers of microcode ate up some performance, but they allowed application portability across many generations of underlying CISC and then RISC hardware. Yes, every decade or so we have to do a true program conversion, but it doesn’t happen every time apps move to a new system or a machine gets a processor upgrade. And that bit of lost performance saved companies a huge amount of grief compared to alternative platforms.
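You can sketch the TIMI trick in a few lines of Python. Applications ship in an abstract instruction format, and a one-time translation step–my crude stand-in for what SLIC and program conversion actually do, with every instruction name invented–maps them to whatever native instruction set the current hardware generation speaks:

```python
# Toy TIMI: programs are stored as abstract instructions; a SLIC-like
# translation step maps them, once, onto the native instruction set of
# the hardware generation underneath. All names here are invented.

TIMI_PROGRAM = [
    ("load", "a", 6),     # put 6 in slot a
    ("load", "b", 7),     # put 7 in slot b
    ("add", "a", "b"),    # a = a + b
    ("emit", "a"),        # report the value of a
]

# Two imaginary hardware generations, each with its own native mnemonics.
CISC_OPS = {"load": "LODR", "add": "ADDN", "emit": "PRTV"}
RISC_OPS = {"load": "ld", "add": "add", "emit": "out"}

def translate(program, native_ops):
    """The one-time 'program conversion' onto a new processor generation."""
    return [(native_ops[op], *args) for op, *args in program]

def run(native_program):
    """A stand-in for the hardware actually executing the native code."""
    regs, output = {}, []
    for op, *args in native_program:
        if op in ("LODR", "ld"):
            regs[args[0]] = args[1]
        elif op in ("ADDN", "add"):
            regs[args[0]] += regs[args[1]]
        elif op in ("PRTV", "out"):
            output.append(regs[args[0]])
    return output
```

The same TIMI_PROGRAM produces [13] whether it is translated for the imaginary CISC or RISC generation, which is the whole point: the application above the TIMI never notices that the iron underneath it changed.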
So here’s the thought experiment for the day. You have been tasked with building a modern AS/400. I don’t just mean a machine that will run RPG and COBOL programs, but one that will be based on modern system and application concepts and use architectures that are more flexible than the current monolithic systems. Guess what? If you build such a machine, I think you would have to undo a lot of the things that IBM has done in the past decade with its Power Systems. Here’s what I would seriously think about doing if I were going to build a machine for the future. And yes, I realize that I am not a systems designer, but I also have my doubts that the people at IBM, Hewlett-Packard, Dell, Oracle, and Fujitsu are thinking outside of the box very much these days. They are interested in making incremental improvements in current monolithic server designs and introducing as little change as possible.
Where’s the fun in that?
I think it would be far more interesting to go back to the future and see what kind of AS/400 we could throw together using today’s commodity hardware and software. Call it the Application RISC Machine System/500, or ARM System/500 for short and ARMS/500 for even shorter. So here’s how I would do it, and by all means, hit that Contact page above and tell me how you would do it. The more ideas, the merrier. Maybe we’ll get some venture capital together and just do this damned thing ourselves.
Go ARM: Let’s start with the obvious. The AS/400 was founded on the idea that you take a collective of relatively cheap processors and use some fatter ones as central processing units and a slew of smaller and cheaper ones–or specialized ones–to operate in an asymmetric fashion across the I/O bus to do work.
In my other life over at The Register, I spend a lot of time looking at hybrid supercomputer architectures that mix and match CPUs, GPUs, and field programmable gate arrays (FPGAs), and I think any modern system will want to leverage the licensable ARM processor architecture. These chips are used in smartphones, tablets, and a slew of embedded devices and are more energy efficient than either X86 or Power chips for any given amount of work. And you are allowed to license the ARM design and make your own variations.
Calxeda has a fine quad-core, 32-bit ARMv7 variant called EnergyCore that is shipping in low volumes now. Applied Micro Circuits, a maker of embedded PowerPC processors, has crafted a multicore, 64-bit ARMv8 chip called X-Gene that will run at 3 GHz and puts L1 and L2 caches on the cores, shared L3 caches that span the cores, integrated DDR3 memory controllers, two 10 Gigabit Ethernet interfaces, PCI-Express peripheral controllers, and SATA storage ports all onto a single system-on-chip (SoC).
The X-Gene chip will also pack a fully non-blocking processing interconnect rated at 1 Tb/sec and providing nearly 80 Gb/sec of aggregate bandwidth between processor sockets. This interconnect allows the X-Gene to scale from two to 128 cores in a cache coherent fashion. That means you will be able to create SMP slices on the fly or keep processor sockets electrically isolated, as the needs suit you. The X-Gene also has on-chip server virtualization circuits, and Samsung is championing a Xen hypervisor for ARM chips and Columbia University is making a KVM variant for ARM, too. You will be able to get an X-Gene processor for a few hundred bucks and it will have the performance of a Xeon chip and much better performance per watt, if Applied Micro is to be believed.
Go asymmetric multiprocessing and clustering: So I will say it again. Use a mix of symmetric and asymmetric multiprocessing in the ARM System/500 machine, and build clustering, grid computing, and high speed messaging between server nodes into the system from the get-go. Allow customers to mix and match different means of lashing machines together to get work done so they can play off latency demands against costs. This machine should be as comfortable running clustered databases, Hadoop data munching jobs, parallel supercomputer applications accelerated by GPU and FPGA co-processors, and traditional single-threaded work like the big batch jobs running at corporations. You pick the right processor nodes and interconnects to do the job, and the machine has modular system board designs so you can pop out an Ethernet or InfiniBand port and pop in an X-Gene interconnect port to shift it from cluster to SMP configurations. (I/O and other co-processors will hook into the PCI-Express bus, and may eventually have their own switching, too, so they can communicate directly without involving the CPUs.)
Go multi-operating system and database skinning: By this, I mean something different than you might be thinking. Sure, with a hypervisor and a recompiled version of the operating system for ARM, you can have multiple operating systems running on this hypothetical machine. It is certain there will be Ubuntu Linux Server from Canonical and we presume Microsoft will get Windows Server 8 on ARM machines–although it has made no promises as yet. IBM i and PowerVM are just a bunch of C++ and Java code (for the most part) and could similarly be ported to ARM architectures, and if IBM can find the old OS/400 system programmers, it might even be able to tune it up for the old asymmetric approach.
But let’s think about this another way. What if you started with a Linux kernel, which is plenty rugged these days, and skinned all the other popular operating systems on top of it? What if you embedded .NET, Java, PHP, RPG, COBOL, Ruby, and any other runtime you can think of inside this kernel and then skinned the operating system commands from Linux, Windows, IBM i, AIX, or whatever on top of them, translating them into Linux functions where they existed and creating new commands where they didn’t? Why not make the ARMOS/500 operating system pretend to be any OS you are comfortable with and just remove the OS zealotry?
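A command skin could start out as nothing fancier than a translation table. Here is a minimal Python sketch of how IBM i CL and Windows commands might be mapped onto Linux equivalents, with a None result flagging the commands a skin would have to implement from scratch. The mappings are rough illustrations, not a complete or faithful compatibility layer:

```python
# A command "skin": familiar IBM i CL and Windows commands mapped onto
# rough Linux equivalents where they exist. Illustrative only.

SKINS = {
    "ibmi": {
        "WRKACTJOB": "top -b -n 1",  # work with active jobs ~ process list
        "CPYF": "cp",                # copy file
        "DLTF": "rm",                # delete file
    },
    "windows": {
        "dir": "ls -l",
        "copy": "cp",
        "type": "cat",
    },
}

def translate_command(skin, command, *args):
    """Return the Linux command line for a foreign command, or None when
    the skin would have to implement that command natively."""
    native = SKINS.get(skin, {}).get(command)
    if native is None:
        return None
    return " ".join((native,) + args)
```

So translating CPYF with two file arguments hands back a cp command line, while anything unmapped comes back None and lands on the to-do list for new native code.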
IBM and EnterpriseDB are already skinning Oracle databases atop their respective DB2 and PostgreSQL databases, so why not the important functions of each operating system?
Yeah, it is a lot of work. But how many commands really change in any operating system? And what percentage of them are really used by system administrators?
And it goes without saying–but I will say it anyway–that IBM i commands and DB2 for i databases will have to be skinned in this future system.
Go embedded database and file systems: IBM had the right idea a long time ago. I don’t know what the correct database and file system should be, but out there in the hyperscale Web world, it is getting hard to tell what is a database and what is a distributed file system with funky features. We’ll need relational database access and all the ACID properties they afford, but modern “big data” workloads also require columnar databases, hybrid SQL-like queries of the Hadoop Distributed File System, and all kinds of offshoots in the NoSQL world. Customers should be able to use all of these new Web-driven technologies as well as relational databases and NFS file systems, and I/O processors and other co-processors should be designed to boost the specific performance of each database and file system. If you could do this with reprogrammable FPGAs inside the IOPs, that would be just perfect, allowing you to buy one piece of iron but radically change its personality. Obviously flash has to be embedded into the system for database and file system performance, and embedded more deeply than putting it behind a RAID controller far away from the CPU.
Go programming languages–all of them: I’ll make this simple. Any programming language that has some unique advantage in the market and an established customer base should be supported on this hypothetical platform. That goes for RPG, COBOL, and Java as much as Node.js, PHP, and Ruby. The key is embedding the application frameworks and runtimes for languages into the architecture from the get-go, and creating a framework that allows for new languages to be snapped in easily. I have no idea how to do this, but I know it needs to be done.
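For what it’s worth, one plausible shape for such a snap-in framework is a plain registry behind a single dispatch interface. This Python sketch is entirely hypothetical, and the runtimes in it are stubs that just acknowledge the source they were handed rather than real interpreters:

```python
# A hypothetical snap-in framework for language runtimes: each runtime
# registers itself behind one common interface, and the system dispatches
# by language name. The runtimes here are stubs, not real interpreters.

RUNTIMES = {}

def runtime(name):
    """Decorator that snaps a runtime into the platform's registry."""
    def register(handler):
        RUNTIMES[name] = handler
        return handler
    return register

@runtime("rpg")
def run_rpg(source):
    return f"[RPG runtime] executed {len(source)} chars"

@runtime("php")
def run_php(source):
    return f"[PHP runtime] executed {len(source)} chars"

def execute(language, source):
    """Hand the source to whichever runtime is registered for the language."""
    if language not in RUNTIMES:
        raise ValueError(f"no runtime snapped in for {language}")
    return RUNTIMES[language](source)
```

Adding a new language is then one decorated function, which is about as close to "snapped in easily" as a sketch can get.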
Go from small single systems to large clouds: This system has to scale up and down. If IBM did any one thing wrong, it is that it ceded the low-end market to X86 machines instead of trying to figure out how to beat Intel at the volume processor game. ARM chips represent a second chance for any server vendor to create a single, scalable platform that ranges from baby servers all the way up to full-scale private and public clouds supporting tens of thousands of workloads. The key is to provide a single architecture–hardware and software–that scales from the smallest to the largest customers and that charges reasonable fees for the hardware and software that customers buy, whether they acquire it or rent it on a cloud.
Yeah, I know this sounds crazy. But so were the IBM Fort Knox and Future Systems projects, too.