The System iWant, 2010 Edition: Clustered Boxes
February 22, 2010 Timothy Prickett Morgan
With only a few machines launched in the actual Power7-based Power Systems lineup and additional machines not expected for a while, I have plenty of time to continue the conversation with you about my theoretical and completely hypothetical System iWant, 2010 Edition boxes. I have been through small machines, midrange machines, big iron, and blades and cookie sheet servers. I have wrestled with Windows and its place in i/OS shops and a few other issues worth thinking about as the Power7 machines were on the horizon. That leaves us with clustered machines.
There are a number of different kinds of clusters, many of which are useful to i/OS shops today and that could be more useful to a larger number of them in the future, provided IBM and its systems and application software partners came up with a coherent strategy for clustering. What I am thinking about is a bit different from what you are used to seeing, but includes some familiar elements, too. This is a thought experiment, and I leave it up to you and the techies at Big Blue to tell me how silly I am in what I am proposing.
There are only two reasons people cluster machines together: to get more performance than they can otherwise get or to have spare copies of servers around in case production machines die for some reason. In some cluster architectures–such as Oracle’s Real Application Clusters extensions for its eponymous databases and IBM’s Parallel Sysplex for mainframes–the cluster extensions are meant to provide both application and database scalability as well as disaster recovery. But ultimately, as our pal, Dan Olds of Gabriel Consulting, likes to put it, virtualization is about making small ones (meaning servers) out of big ones and clustering (in its many guises) is often about making big ones out of small ones for both economic and technical reasons.
Some architectures, like the NonStop operating system and its same-named database from Hewlett-Packard (formerly the free-standing Tandem Computers), are designed to provide not just scalability and disaster recovery, but also fault tolerance. Other clusters, at both the system and application level, provide data and application replication and failover from a primary production machine to a backup machine with a short failover time in between.
Such two-system clustering is common in the AS/400 and Unix bases where customers need absolute uptime, for legal as well as business reasons. Banks have to give you access to your money, hospitals have to have access to medical records, emergency responder systems (police, fire, ambulance, etc.) communication networks have to stay up and route help to your location. There is no “we’ll get back to you later, the system is down” in these applications.
With the vast majority of current i/OS shops able to get by with a Power 520-class system–though certainly not all of the revenue from the business comes from this class of machine, so don’t get the wrong idea about the importance of big boxes to the i side of the Power Systems biz–you would think there would not be much opportunity for clustering among these customers on their tight budgets.
Yes and no. It all depends on how you slice up the box. As we learned from last week’s issue of The Four Hundred, the Power 750 has a new small form factor split disk backplane (feature 8430) that allows the eight disk drives inside a Power 750 chassis to be broken into two physically and electrically isolated units. The System p variant of the Power 520 (8203-E4A) has two processor cards, unlike the earlier Power6 versions bearing the System i label. So in an entry Power 520 machine–and presumably in the forthcoming Power 720 and definitely in the just-announced Power 750–you have multiple processor cards (each with its own main memory) and multiple disk backplanes. To a certain way of thinking–well, mine at least–that means the Power 520, Power 550, Power 720, and Power 750 could have mostly electrically isolated components and, with some tweaking of the PowerVM hypervisor and the i 6.1.1 operating system, could be preconfigured as a ready-to-go cluster. Granted, such a machine would be fairly limited in terms of the number of disk arms, but you could use the GX++ ports on the Power7 chipset on each system board to hang paired 12X remote I/O drawers to add more disks or other peripherals if necessary.
I guess that, in the Olds lingo, this would be making mirrored ones out of one.
In my scenario, you don’t buy a barebones Power 520 server to run a modest set of RPG or COBOL applications for around $25,000 or so, wish you had another $20,000 so you could mirror the boxes, and then scrounge for some extra budget for high availability clustering software to manage the data replication and application failover. Instead, you spend maybe an additional $7,500 to add a second disk backplane and processor card that IBM treats like its reduced-price Capacity BackUp (CBU) cluster machines. And then you pay a modest fee to Vision Solutions, Maximum Availability, Bug Busters, Trader’s, or IBM to put cluster management tools on both halves of this clusterized box. The second half of the machine would only be available for clustering and whatever workloads the CBU terms allow (such as tape backups). And it would be clustered over virtual LANs running in main memory, presumably much faster than any other method.
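To make the failover half of that picture concrete, here is a minimal sketch, in Python, of the kind of heartbeat-and-promote logic the backup half of such a box might run. Everything here–the class name, the timeout, the injected clock–is my own illustrative choice, not anything from IBM or the high availability vendors, whose products do far more (journaling, data replication, role swaps, and so on).

```python
import time

class FailoverMonitor:
    """Backup-side monitor: promotes itself if heartbeats stop arriving.

    A hypothetical sketch. In the cluster-in-a-box scenario, heartbeats
    would arrive over the in-memory virtual LAN between the two halves.
    """

    def __init__(self, timeout_secs=5.0, clock=time.monotonic):
        self.timeout_secs = timeout_secs
        self.clock = clock                 # injectable for testing
        self.last_heartbeat = clock()
        self.role = "backup"

    def heartbeat(self):
        # Called whenever a heartbeat arrives from the production half.
        self.last_heartbeat = self.clock()

    def poll(self):
        # Check the primary's health; promote to primary on timeout.
        if self.role == "backup" and self.clock() - self.last_heartbeat > self.timeout_secs:
            self.role = "primary"          # take over the production workload
        return self.role
```

The injected clock is just there so the timeout logic can be exercised without actually waiting; a real cluster manager would also have to worry about split-brain, which this toy ignores.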
The beauty of doing this my way is that if you need extra capacity for your workloads, you can take this Power box and convert it to a true SMP through software tweaks and then plunk a new box next to it to mirror it. Or, you could upgrade from, say, a Power 720 with a split personality to a Power 750 with a similar Sybil setup and just double up each half to add capacity.
Obviously, this CBU-inside-a-single-box approach would require IBM to come up with some very attractive terms for the i 6.1.1 running as the target in the mirror. Let’s call it i/OS Cluster Edition for fun, to make it distinct from Standard Edition (the basic OS/400), Enterprise Edition (the basic OS/400 plus green-screen processing capability), and Application Server Edition (the OS/400 platform minus a license for the DB2 for i database and not including 5250 capability). This i/OS license fee, as well as processor and memory activation, should be very affordable. And IBM should offer upgrades for the hardware and software to make a half-box into a full box if customers need it.
Everything I just said about the i/OS clusters-in-a-box can be done for AIX and Linux, and there is no reason why, using logical partitions, all three operating systems could not be clustered inside a single machine if that is something customers wanted to do.
That is one kind of clustering I would like to see. The other, as I have said for a long while now, is based on the DB2 Multisystem parallel database clustering that IBM created with the original OS/400 V4 back in the mid-1990s. I think the economics, scalability, and fault tolerance of DB2 Multisystem would make a cluster of smaller machines much more appealing than a giant SMP box for a number of workloads, just as is the case with every other RISC, Itanium, and X64 architecture out there.
Oracle is dead serious about making and selling clusters to replace SMP boxes, as it has been for the past decade. (Although it is perfectly happy to charge lots of money for database licenses on those big SMPs, of course, if your applications and databases are addicted to large shared memory spaces.) Buying Sun Microsystems was Oracle’s way of controlling the entire stack–hardware, systems software, database, middleware, and applications–and I think that when Larry Ellison, Oracle’s co-founder and chief executive officer, says over and over for nine months that he really sees an advantage in building hardware precisely tuned to run Oracle’s software, we need to start believing it. (I surely didn’t trust it at first, and believed that Oracle would say anything to get its hands on Java and Solaris.) When I see the first Sparc chip with integrated Java and system clustering acceleration on the silicon, I will firmly believe it. And if I were Oracle, that is precisely what I would do over the long haul.
In the meantime, if IBM is going to use DB2 for data warehousing and analytics (the Smart Analytics System that I told you about last August) and the similar but different PureScale extensions for DB2 database clusters tuned for online transaction processing, then the i/OS platform should get equivalents based on DB2 Multisystem for either data warehousing or OLTP. The software exists, it has been used in production, and all IBM needs to do is blow the dust off it and update it. For many customers, a cluster of Power 720s, 750s, or 770s is going to make more sense than a single, bigger Power 750, 770, or 790 box. If IBM wants to meet Oracle and beat it, Big Blue will have to fight a two-front war with all of its operating system platforms, not just AIX.
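The shared-nothing idea behind DB2 Multisystem–hash-partition the rows of a table across the nodes of the cluster, then scatter each query to every node and gather the partial results–can be sketched in a few lines of Python. This is a toy model under my own assumptions; the class and method names are invented for illustration and have nothing to do with IBM’s actual interfaces.

```python
from zlib import crc32

class ParallelTable:
    """Toy shared-nothing table: rows hash-partitioned across nodes.

    Each "node" is just a local list standing in for a separate
    Power box with its own memory and disks.
    """

    def __init__(self, num_nodes=4):
        self.nodes = [[] for _ in range(num_nodes)]

    def insert(self, key, row):
        # Hash the partitioning key to pick the one node that owns the row.
        node = crc32(str(key).encode()) % len(self.nodes)
        self.nodes[node].append((key, row))

    def query(self, predicate):
        # Scatter the predicate to every node, gather the matches.
        # In a real cluster each node would scan its partition in parallel.
        results = []
        for node in self.nodes:
            results.extend(row for key, row in node if predicate(row))
        return results
```

The payoff is that each node scans only its own slice of the data, so adding nodes adds both capacity and scan bandwidth–which is exactly why a cluster of smaller boxes can beat one big SMP on warehouse-style workloads.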
It goes without saying–but I will say it anyway–that these parallel clusters of Power Systems running i/OS should be loaded to the gills with high-performance solid state disks and main memory to really boost their performance, and the i/OS operating system and storage should be tuned like crazy for data warehousing or OLTP workloads.
If the 2010s are going to be about anything, it looks like it will be tuning.