PowerVM, IBM i Enhancements Mean Better Power Systems Clouds

October 22, 2012 Timothy Prickett Morgan

We’re still chewing through the October 3 Power Systems announcements from IBM, and this week we will drill down into some of the cloud-related tweaks that Big Blue made to the IBM i operating system with the Technology Refresh 5 update as well as in the PowerVM hypervisor and Systems Director VMControl tools that comprise the server virtualization underpinnings of a Power-based cloud, whether or not you use IBM’s SmartCloud Entry control freak to orchestrate that cloud.

We told you all about the update to SmartCloud Entry V2.4, which finally supports the IBM i operating system, in last week’s issue. The SmartCloud tool, which was developed by IBM itself as far as I know and which is not based on the open source OpenStack or CloudStack cloudy control freaks, is able to hook into the PowerVM hypervisor for Power Systems as well as VMware’s ESXi and Red Hat KVM hypervisors for X86-based machines. It knows how to manipulate operating system images for logical partitions or virtual machines (they come to the same thing, despite the different names) based on IBM i, AIX, or Linux on Power-based servers and for Windows and Linux on X86-based servers. As far as I know (and I follow the cloud orchestration market pretty carefully, with system management vendors and OpenStack cloners coming out of the woodwork), SmartCloud Entry is the only cloud controller for Power Systems and, importantly, is the only one that speaks IBM i. The control node for SmartCloud itself runs on AIX, and it would be nice if IBM ported this to the PASE AIX runtime environment at some point so IBM i shops could insulate themselves from AIX.

The PowerVM hypervisor that gives SmartCloud virty infrastructure to control has been revved to the V2.2.2 level, and it has a number of important changes, Steve Sibley, director of worldwide product management for IBM’s Power Systems division, explained to me in a briefing. (You can get some of the details on the PowerVM enhancements in announcement letter 212-344.) First and foremost, the hypervisor has been tweaked so you can scale an individual logical partition, or LPAR in IBMspeak, down to 5 percent of a core’s processing capacity. That means you can cram up to 20 LPARs on each Power7 or Power7+ core.

With a typical modern Power core yielding somewhere between 6,000 and 7,800 units of IBM i performance on the Commercial Performance Workload (CPW) scale, that means you can allocate virty Power server slices as small as 300 to 400 CPWs, depending on the particular machine and the processors you choose. That is a useful amount of capacity, particularly if you are an IBM i cloud builder trying to cram as many customers onto a box as possible. It also means you can spread the cost of a system over twice as many users as the prior floor of a tenth of a core allowed, provided 300 to 400 CPWs is enough for their particular workloads.
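
To make that arithmetic concrete, here is a minimal Python sketch. The function names are made up for illustration, and the two per-core CPW figures are simply the ends of the range quoted above, not ratings for any particular machine.

```python
# Back-of-the-envelope math for the 5 percent minimum LPAR size.
# Function names are hypothetical; CPW figures are the range quoted in the text.

MIN_ENTITLEMENT_PCT = 5  # smallest LPAR, as a percentage of one core


def lpars_per_core(min_pct: int = MIN_ENTITLEMENT_PCT) -> int:
    """How many minimum-size LPARs fit on a single core."""
    return 100 // min_pct


def smallest_slice_cpw(core_cpw: int, min_pct: int = MIN_ENTITLEMENT_PCT) -> float:
    """CPW capacity of the smallest LPAR carved from a core rated at core_cpw."""
    return core_cpw * min_pct / 100


if __name__ == "__main__":
    print("LPARs per core:", lpars_per_core())
    for core_cpw in (6000, 7800):
        print(f"{core_cpw} CPW core -> {smallest_slice_cpw(core_cpw):.0f} CPW slice")
```

At 5 percent, the low end of the range works out to 300 CPWs and the high end to 390 CPWs, which is where the 300 to 400 figure comes from.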

However, PowerVM V2.2.2 has not extended the maximum number of LPARs that it can support, which stands at 1,000, as it did in the prior releases tied to Power7 machinery. On entry and midrange machines, meaning anything with up to 50 cores in the box, you can max out the logical partition count across those cores. It is only on the largest Power 770, Power 770+, Power 780, Power 780+, and Power 795 machines where the cores scale further and faster than PowerVM.
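
The interplay between the per-core density and the system-wide cap can be sketched in a few lines of Python. The function names and the sample core counts are assumptions chosen for illustration, not IBM configuration data; only the 20-per-core and 1,000-per-system figures come from the announcement.

```python
# Effective LPAR ceiling: per-core density versus the system-wide cap.
# Sample core counts below are illustrative, not a list of real configurations.

LPARS_PER_CORE = 20          # follows from the 5 percent minimum entitlement
MAX_LPARS_PER_SYSTEM = 1000  # PowerVM cap cited in the text


def max_lpars(cores: int) -> int:
    """Largest LPAR count a machine with this many cores can host."""
    return min(cores * LPARS_PER_CORE, MAX_LPARS_PER_SYSTEM)


def cap_is_binding(cores: int) -> bool:
    """True when the 1,000-LPAR cap, not the core count, is the limit."""
    return cores * LPARS_PER_CORE > MAX_LPARS_PER_SYSTEM


if __name__ == "__main__":
    for cores in (8, 32, 50, 64, 128):
        print(cores, max_lpars(cores), cap_is_binding(cores))
```

The crossover sits at 50 cores: up to that point the core count is the limit, and beyond it the 1,000-LPAR cap takes over.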

While X86-based hypervisors, including Microsoft’s new Hyper-V 3.0 tied to Windows Server 2012, do not have any lower scalability limits (you could give a VM a half percent of CPU capacity if you want), a 5 percent lower limit is probably as small as you would want to get. And none of those X86 hypervisors can do something equally important that PowerVM and its predecessors have always been able to do: scale from a tiny slice across the entire system, meaning all of its CPU cores and memory, if need be. The X86 hypervisors all have hard limits for their VMs that are significantly smaller than the scalability limits of the hypervisor itself and the underlying hardware. With PowerVM and its predecessors from a decade ago, if you suddenly need the whole machine for a single partition, you take it and it runs. No questions asked, no limits.

With the IBM i TR4 update back in April, the IBM i operating system working in conjunction with PowerVM finally got Live Partition Mobility, IBM’s term for live migration on its Power Systems. This feature, which has been supported for two years on AIX and Linux, was long overdue on IBM i and is absolutely necessary as far as I am concerned. If you cannot take a running workload on one machine and teleport it in a matter of seconds over to another physical machine, then as far as I am concerned, you do not have a cloud. You may have very nice hosting, but technically speaking you need mobility if you want to be a cloud. (I need to get on it quick, and get off it quick.)

With the PowerVM enhancements that came out as part of the October 3 announcements, the speed at which live migrations can be done as well as the number that can be done has been significantly boosted. On the new Power 770+ and Power 780+ machines based on the Power7+ chips, the live migration of a single LPAR now runs three times faster, and PowerVM can now manage as many as sixteen concurrent live migrations, double that available with prior PowerVM releases running on prior Power7 iron.

At the moment, to do live migration on Power Systems, you need storage area network hardware that the source and target Power machines are attached to. The live migration moves the pointers to the system on the SAN where the LPAR is stored from one machine to the other after suspending the LPAR system memory for a fraction of a second and taking a snapshot of it. It is only this memory state that moves from system to system, and you load it into the new LPAR, point it to the quiesced disk files, and voila! You have an LPAR running the same stack on a different physical machine with effectively no downtime.
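
For readers who think better in code, here is a toy Python model of the sequence just described. It is a conceptual sketch only: the Lpar and Host classes and the live_migrate function are invented for illustration and have nothing to do with the actual PowerVM or HMC interfaces.

```python
# Toy model of SAN-backed live migration as described in the text.
# All class and function names are hypothetical, not PowerVM or HMC APIs.
from dataclasses import dataclass, field


@dataclass
class Lpar:
    name: str
    san_volume: str                              # where the LPAR's disks live on the SAN
    memory: dict = field(default_factory=dict)   # stand-in for system memory state
    running: bool = True


@dataclass
class Host:
    name: str
    san_volumes: set                             # SAN volumes this machine is attached to
    lpars: dict = field(default_factory=dict)


def live_migrate(lpar_name: str, source: Host, target: Host) -> Lpar:
    """Move an LPAR between hosts; only the memory state actually travels."""
    lpar = source.lpars[lpar_name]
    # Both machines must be attached to the storage that holds the LPAR.
    if lpar.san_volume not in target.san_volumes:
        raise ValueError("target host is not attached to the LPAR's SAN storage")
    lpar.running = False                  # suspend for a moment
    snapshot = dict(lpar.memory)          # snapshot the memory state
    del source.lpars[lpar_name]           # source gives up the partition
    # The disks never move: the new LPAR simply points at the same SAN volume.
    moved = Lpar(lpar.name, lpar.san_volume, snapshot, running=True)
    target.lpars[lpar_name] = moved
    return moved


if __name__ == "__main__":
    a = Host("box-a", {"vol1"})
    b = Host("box-b", {"vol1"})
    a.lpars["prod"] = Lpar("prod", "vol1", {"jobs": 3})
    print(live_migrate("prod", a, b))
```

The check at the top of live_migrate is the shared-SAN requirement in miniature: if the target cannot see the volume, there is nothing for the moved memory state to point at.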

I think that, given the preponderance of internal storage among the IBM i base and the cost and complexity of SAN arrays, IBM should use replication software to synchronize files on internal arrays in two physical systems and allow live migrations between two physically distinct machines not sharing storage. VMware is doing this for SMB customers with its ESXi hypervisor. Sibley says that IBM is looking into the replicated LPAR capability that I outlined above and also at other ways for two servers to cross-couple their internal storage to allow for LPAR mobility. But IBM is making no commitments here.

There have also been some changes in system and storage pooling with the latest announcements, and these hook into the elastic capacity on demand (CoD) utility-priced CPU and memory that is available on Power Systems machinery.

PowerVM V2.2.2 now allows up to 16 separate physical systems to share a common pool of preallocated storage. These shared storage pools make it much easier to allocate storage for lots of virtual machines, which, thanks to Live Partition Mobility, can be flitting around the network of systems and, when the workloads are light, corralled off to a few boxes as others are turned off to save power. On the processor and memory side, the Power Systems Pool feature allows up to ten Power 780+ or Power 795 machines to pool their CPU and DRAM and treat it like a giant pool for LPARs to splash around in.

I know what you are thinking. First, why is this Power Systems Pool restricted to the biggest machines in the line? (By the way, in the first quarter of next year, the Power7-based Power 780 will be grandfathered into this system pooling.) And second, if you are letting the hypervisor share storage across 16 machines, why not 16 machines for the CPU and memory pooling? I am sure there are good reasons for this, but IBM didn’t give them in its announcements and I didn’t notice the discrepancy when I was talking to Sibley.

PowerVM is also now able to do linked clones of LPAR images, which is a useful feature. Basically, when you clone an LPAR, you can do one of two things. You can make a carbon copy that is freestanding and that can be altered independently from the original, in which case you end up with two unique LPARs that now need to be independently maintained. Or you can clone an LPAR but keep it linked to the original image; as that gold image changes, all of the linked clones change. (I get this conceptually, but I can’t see how they don’t end up doing exactly the same work. But clearly there is the ability to push different work through the linked clones while keeping the underlying system software bits the same.)
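
One way to picture the difference, at least conceptually, is with plain Python dictionaries standing in for LPAR images. This is an analogy under stated assumptions, not a description of how PowerVM or VMControl implements cloning; the image contents and key names are made up.

```python
# Full clone versus linked clone, modeled with dictionaries.
# Keys and values are hypothetical; this is an analogy, not PowerVM internals.
import copy
from collections import ChainMap

gold_image = {"os_level": "base", "fix_pack": 1}

# Full clone: an independent copy that must be maintained on its own.
full_clone = copy.deepcopy(gold_image)

# Linked clone: its own (initially empty) layer on top of the shared gold image.
linked_clone = ChainMap({}, gold_image)

# Clone-specific work lands in the clone's own layer, not in the gold image.
linked_clone["workload"] = "order-entry"

# Patch the gold image once.
gold_image["fix_pack"] = 2

print(full_clone["fix_pack"])     # the value captured at clone time
print(linked_clone["fix_pack"])   # follows the gold image
print("workload" in gold_image)   # the gold image is untouched by the clone
```

In this picture, the clone’s private layer is what lets different work run through each linked clone while the underlying system software bits stay the same.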

These pooling and cloning functions of PowerVM are available when you get the VMControl V2.4.2 plug-in for Systems Director 6.3.2. Those are both updated releases.

The updated Power Systems hypervisor also includes a new Virtual I/O Server Performance Advisor that helps you figure out how to get better performance out of VIOS. As you know, many disk arrays and some peripherals can only attach to IBM i if they work through the VIOS partition on a Power Systems machine, which virtualizes the drivers that would otherwise have to be written to work natively with IBM i. And people grumble about VIOS all the time. Anything that makes it better is a good thing.

PowerVM V2.2.2 will be available on November 9.

By the way, there is a new Hardware Management Console software release, called V7R760, that is necessary to do those 16 concurrent LPMs on a Power Systems machine. The new software also adds RAID 1 data mirroring for disk drives in the rack-mounted HMCs, which include the older 7042-CR6 unit and the new 7042-CR7 unit, which has faster hardware. The HMC V7R760 software is the last one that will be supported on the even older 7310-C04, 7315-CR2, and 7310-CR2 versions of the HMC. The new HMC and software update will be available on November 19.