The X Factor: Virtual Server Sprawl
June 19, 2006 Timothy Prickett Morgan
It is hard to find a server maker who is not gung-ho about virtualization these days. This might seem a bit perplexing, given that one of the main marketing drivers for virtual machine partitioning or logical partitioning on modern servers is that by carving up a physical machine into virtual, dynamic slices, customers can do server consolidation on a grand scale, and in theory reduce their footprints.
Of course, not everyone is thinking about server virtualization in this manner. At data centers in the financial services sector, for instance, IT departments operate in a business environment where transaction volumes rise every year, data centers are static in size, and the profit that can be wrung out of each transaction is falling faster than volumes are rising. That is something of an oversimplification, but for these shops server virtualization software, which is in the early adopter phase at many financial institutions, is not about consolidating and reducing server footprints; it is about getting the most work possible out of a given physical and thermal footprint in a static data center. These customers want to drive more work within the power envelope of the data centers they already have; they cannot even consider reducing footprints. Most businesses would prefer to have this sort of problem, really. It sure beats the alternative of stagnant workloads, which implies a static business environment where cutting costs is the only option.
Either way, whether server virtualization is used to reduce server footprints, and therefore over the long run cut operating costs, or to drive server utilization and therefore enable more work to be done within an existing data center, there is a potential systems management nightmare lurking around the corner, and this is something that no one is talking about. If server sprawl was a problem during the dot-com era, what happens when a server is no longer a physical thing, but a virtual thing? What limits are there to virtual server sprawl? Just how crazy are system administrators going to be driven by rampant server virtualization? What is managing all of these virtual machines going to cost?
No one knows, because this is such a new technology, even for large businesses. While mainframes have had partitioning for two decades and Unix and proprietary midrange computers have had the capability since the late 1990s, server virtualization has not exactly gone mainstream yet. But it will do so soon, and now is the time to start thinking about the issue of sprawl–before it gets out of hand.
To date, VMware, the subsidiary of disk maker EMC, has sold its “enterprise class” server virtualization programs, GSX Server and ESX Server, to more than 20,000 customers, giving it by far the largest installed base of virtual machines on the market. Simon Crosby, the chief technology officer at rival XenSource, which is commercializing the open source Xen virtual machine hypervisor, thinks that VMware might have as many as 200,000 licenses to its flagship ESX Server and related virtual machine management programs. (VMware may have millions of VMware Workstation installations, plus untold installations of GSX Server, now called VMware Server and given away for free, but neither of these is an enterprise-class hypervisor.) Assuming an average of a few production virtual servers per machine (which is reasonable given the heavy loads of enterprise applications), VMware’s ESX Server is probably controlling as many as 500,000 to 1 million virtual machines.
SWsoft takes a slightly different server virtualization approach from VMware: its Virtuozzo software does not virtualize whole operating systems, but rather runs partitions on top of a common operating system kernel and file system, isolating everything above that layer. SWsoft says that it has 8,000 servers using its software, mostly at ISPs who host virtual Web sites for clients, and that a total of about 400,000 partitions are under management. It is safe to say that there are maybe a few million virtual servers out there in the world, running on a few hundred thousand physical servers, including OS/400, mainframe, and Unix servers, too.
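The back-of-envelope tally behind those figures can be sketched in a few lines of Python. The license count, the VMs-per-host ratio, and the Virtuozzo partition total are the article's estimates (and "a few" VMs per host is interpreted here as 2.5 to 5), not measured data.

```python
# Rough tally of enterprise virtual servers, using the figures cited above.
esx_licenses = 200_000                     # Crosby's estimate of ESX licenses
vms_per_host_low, vms_per_host_high = 2.5, 5  # "a few" production VMs per machine

esx_vms_low = int(esx_licenses * vms_per_host_low)    # low end of the ESX range
esx_vms_high = int(esx_licenses * vms_per_host_high)  # high end of the ESX range

virtuozzo_partitions = 400_000             # SWsoft's own figure

total_low = esx_vms_low + virtuozzo_partitions
total_high = esx_vms_high + virtuozzo_partitions
print(f"ESX VMs: {esx_vms_low:,} to {esx_vms_high:,}")
print(f"ESX plus Virtuozzo: {total_low:,} to {total_high:,}")
```

Add in OS/400, mainframe, and Unix partitions, and the "few million virtual servers" ballpark follows.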
You might think that the memory capacity of a server would be the inherent limiting factor on the number of virtual machine or logical partitions a given machine could support. And if a given machine had only a static set of virtual machines associated with it, this would be a good rule of thumb. But virtual machines need not stay resident in memory: they can be taken offline, archived, and redeployed, so the capacity of any one box does not cap the total number of virtual machines an administrator has to track.
Both VMware’s ESX Server and SWsoft’s Virtuozzo cost money, which is a limiting factor; so does IBM’s Virtualization Engine hypervisor for the System i and System p boxes, in one way or another. (Saying something is bundled doesn’t make it free.) But by the end of this year, the open source Xen hypervisor will be embedded in Linuxes from Red Hat and Novell as well as Solaris Unix from Sun Microsystems, and tens of thousands of units of these operating system platforms will be rolled into production each quarter with an integrated, free hypervisor. Very quickly, the economic barriers to server virtualization (which costs money on expensive midrange boxes, which themselves cost considerably more than X64 alternatives) will go way down. It is then that system administrators will begin exploring how to use virtualization to deploy different application stacks at different times on the same physical machines. It is also then that there will be no physical limits to stop the sprawl.
In the era of the real physical server, the amount of power, cooling, and floor space limited the amount of server sprawl. Which is why so many data centers are busting at the seams these days, even after advances that allow more servers–each with a lot more computing power–to be packed into a given space. But with virtualized machines, no such limits apply. And while processing power, main memory capacity, and I/O bandwidth constrain the number of concurrent virtual machines a given server can support at any given time, nothing caps the total number of virtual machines, online and offline, that administrators have to manage across a collection of servers.
Suppliers of systems management tools have been trying to provide automation tools that allow system administrators to move from handling the patching and support of dozens of physical servers to hundreds of servers. But what happens when each server has five or 10 concurrent virtual machine partitions, plus perhaps one or two times as many offline virtual machines that, while quiesced, still have to be sorted, stored, updated, and used frequently or occasionally? This could be a gargantuan increase in the number of servers that system administrators have to cope with. We could be living in a world with fewer physical servers–maybe only a few tens of millions of units in production–but many tens to hundreds of millions of virtual servers.
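The scaling math above is simple multiplication, and a short sketch makes the point. The fleet size and the ratios are the article's illustrative figures (five or 10 concurrent partitions per server, and one or two times as many offline images), not a forecast.

```python
# Hypothetical sprawl projection: physical fleet x concurrent partitions,
# plus an offline-image multiplier on top of the concurrent count.
def managed_vms(physical_servers, concurrent_per_server, offline_multiplier):
    """Total virtual servers an admin staff must track, online and offline."""
    concurrent = physical_servers * concurrent_per_server
    offline = concurrent * offline_multiplier  # quiesced images still need care
    return concurrent + offline

fleet = 10_000_000  # "a few tens of millions" of physical units in production
low = managed_vms(fleet, 5, 1)    # 5 VMs per box, as many again offline
high = managed_vms(fleet, 10, 2)  # 10 VMs per box, twice as many offline
print(f"{low:,} to {high:,} virtual servers to manage")
```

Even the low end is an order of magnitude more "servers" than the physical fleet that produces them, which is the management problem in a nutshell.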
Which would prove, yet again, that any new and desirable technology always seems to cut both ways.