Mad Dog 21/21: The Fox in IBM’s Storage Henhouse
November 30, 2009 Hesh Wiener
Moshe Yanai became successful by taking enterprise storage business away from IBM. He led the team that created the EMC Symmetrix, which became the leading storage product at IBM’s glass house accounts. EMC and Yanai parted ways in 2001 and after a decent interval Yanai founded a new storage venture, XIV (pronounced Ex Eye Vee). IBM acquired XIV at the start of 2008, naming Yanai an IBM Fellow. Yanai may be able to clobber EMC for IBM, but to succeed he will also have to kill off IBM’s flagship DS8000 array with his XIV boxes.
The reason XIV just might be revolutionary, at least by IBM standards, is that it is a disk array that has some of the attractive characteristics of just about every kind of array in the alphabet soup of today’s storage industry. Basically, an XIV box has a front-end based on X64/Linux servers with software and interfaces that let the machine talk over Fibre Channel, Ethernet, iSCSI (which uses Ethernet), and if there were a need, any other fast hookup the market might want. Behind the gateway machines sits a pile of X64 servers that talk to large SATA disk drives. Today the drives are 1 TB each; tomorrow they will be 2 TB and maybe in 2010 or 2011 they will go to 4 TB. The hardware components are standard and cheap. Each XIV box can have up to 180 drives; this complement of disks yields 79 TB of storage capacity after deductions for internal mirroring and spares. Total cache in each box can be up to 120 GB. (Details galore for the XIV storage cluster can be found in IBM’s XIV Redbook.) But as all the hardware is based on industry standard technology, it is the software that really adds value.
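For readers who like to check the arithmetic, here is a back-of-the-envelope sketch of those capacity figures. It uses only the numbers quoted above (180 drives, 1 TB each, 79 TB usable). The article does not break out how the non-mirroring overhead is divided between spares and anything else, so the sketch lumps it into a single figure; the function and key names are made up for illustration.

```python
def capacity_breakdown(drives, drive_tb, usable_tb):
    """Split raw capacity into mirroring overhead and everything else."""
    raw_tb = drives * drive_tb           # every spindle counted once
    mirrored_tb = raw_tb / 2             # two copies of each piece of data
    other_overhead_tb = mirrored_tb - usable_tb  # spares and any other reserve, lumped
    return {
        "raw_tb": raw_tb,
        "mirrored_tb": mirrored_tb,
        "other_overhead_tb": other_overhead_tb,
        "usable_fraction": usable_tb / raw_tb,
    }


if __name__ == "__main__":
    for key, value in capacity_breakdown(180, 1.0, 79.0).items():
        print(f"{key}: {value:.3f}")
```

On these figures a fully loaded box turns 180 TB of raw disk into 90 TB after mirroring, and a further 11 TB goes to spares and other reserves, leaving a bit under 44 percent of raw capacity usable. Whether the same fraction would hold with 2 TB drives is an assumption the article's figures cannot confirm.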
The XIV software duplicates data (making the machine RAID 1) and scatters it across the full set of drives in the cabinet. This, says IBM, lets the box rebuild a failed drive quickly (20 to 30 minutes) and accurately. Data scattering makes it very unlikely that a whole file or dataset gets clipped if a drive fails, but a two-drive failure, which is a very unlikely event, can cause a whole box to lose viability. There has been a blog rant about this but people who read through the material will most likely end up trusting the disk subsystems more than the bloggers.
The actual machine is not quite this simple but neither is it as elaborate as the DS8000 boxes IBM sells to its high-end server customers, including mainframe shops and Power Systems shops running AIX 6.1, Linux, and i 6.1 (the latter through the Virtual I/O Server partition). IBM says the XIV is a bargain and that it also is less power hungry than alternatives. While IBM’s salesforce makes these claims when comparing the XIV to non-IBM alternatives, the very same comparisons can be made to the DS8000, which remains in the spotlight in part because it is the only large IBM disk array that z/OS will talk to.
There may be a performance case for the DS8000 compared to the XIV (and other high-end arrays), at least for certain kinds of workloads, but in practice there are no independent benchmark tests for big storage subsystems. Industry folklore suggests that a box like the DS8000 should shame the XIV in transaction processing applications, while the XIV would win when data is not structured. Unstructured (or inconsistently structured) data shows up a lot in pop culture sites like Facebook but also is the nature of medical records systems (with computer files, documents, photos, videos, scans, etc.) and insurance claims databases, to name only two of many applications where large disk arrays are used.
Regardless of widely held preconceptions, however, the XIV may be better than DS8000 advocates (such as users who have a big investment in the machines and some IBMers with careers tied to the product) seem to believe. Bank Leumi uses XIVs for everything, including transaction processing, and says it is getting great results. Recently, IBM has been holding dog-and-pony shows for users and analysts that feature a growing number of user organizations that have bet their strategic storage on XIV equipment.
For now, IBM is protecting its DS8000 base by not supporting XIV under z/OS. If you want to use an XIV on a mainframe, you have to hook it up using zLinux. (On Power-based servers, AIX and Linux can talk directly to the XIV arrays, but i 6.1 has to work through VIOS to reach the machine.) Clearly, direct links between Power Systems or mainframe machines and XIV clusters could happen just as soon as IBM is willing to let XIVs cannibalize the DS8000 base. Actually, the decision would not so much be IBM’s but that of high-end server shops, who will flee the DS8000 the minute they figure out that XIV is going to be the future of IBM enterprise storage, a process that is getting underway.
Outside the mainframe base, there’s plenty of support for XIV. But for IBM, which is not the prime vendor to the Web 2.0 world of Google, Amazon, eBay, and their ilk, the strategic battle is the fight to store glass house data. And in that world, IBM has to grow its share by roughly 40 to 45 percent to catch up to EMC.
In the second quarter of this year, which is the most recent quarter for which IDC has issued a public report, EMC had 21.5 percent of the market IDC calls external disk systems. IBM had 14.9 percent of this $4.1 billion (in the quarter) segment. Hewlett-Packard followed with 11.4 percent, Dell had 9.89 percent and NetApp had 8.9 percent. Gartner says more or less the same thing about the market, pegging EMC’s share of what it calls controller based disk storage at 21.8 percent in this year’s first quarter, IBM at 15.7 percent, HP at 10.3 percent, Dell at 9.2 percent, Hitachi (which also sells subsystems through HP and Sun) at 8.7 percent, and NetApp at 8.5 percent.
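The size of the gap IBM has to close can be worked out directly from the share figures quoted above. The sketch below is only illustrative arithmetic: it assumes EMC's share stays put while IBM's grows, and the function names are invented for the purpose.

```python
def growth_needed_pct(own_share, rival_share):
    """Percent by which own_share must grow to equal rival_share."""
    return (rival_share / own_share - 1) * 100


def quarterly_revenue_musd(share_pct, segment_busd):
    """Approximate vendor revenue, in millions of dollars, from share and segment size."""
    return share_pct / 100 * segment_busd * 1000


if __name__ == "__main__":
    # IDC, second quarter 2009: external disk systems, $4.1 billion segment
    print(f"IDC gap:     {growth_needed_pct(14.9, 21.5):.1f} percent")
    # Gartner, first quarter 2009: controller based disk storage
    print(f"Gartner gap: {growth_needed_pct(15.7, 21.8):.1f} percent")
    print(f"IBM, IDC quarter: ${quarterly_revenue_musd(14.9, 4.1):.0f} million")
    print(f"EMC, IDC quarter: ${quarterly_revenue_musd(21.5, 4.1):.0f} million")
```

On IDC's numbers IBM would need to grow its share by about 44 percent to draw level; on Gartner's, by about 39 percent.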
The 2009 market data describes a segment that’s headed for a year-on-year decline of 15 to 20 percent compared to 2008, which itself was not glorious. IBM’s mainframe sales data confirm times are tough in glass house country, and with banks dying like mayflies, the outcome for the industry, for each vendor and for various products is quite hard to predict. Nevertheless, it’s probably safe to say a story about cheaper storage will get users’ ears.
It’s very hard to tell just how cheap an XIV is compared to alternatives, in part because IBM reportedly pumped out so many evaluation machines, meaning freebies, to seed the market. Now EMC bloggers are not about to praise the XIV but they also are not going to put complete balderdash on their Web sites because it’s hard enough for them to get eyeballs under the best of conditions. But IBM fans (and people who like to gripe about EMC) might want to discount Big Blue bloggers at least as much as those on the EMC propaganda squad. One of IBM’s (until recently, anyway) visible and often entertaining blogsters, Tony Pearson, seems to have just disappeared . . . or whatever you call it when a whole lot of Google pointers that formerly worked now take you to 404 country.
It’s possible that these developments in the murky world of storage marketing and persuasion stem from the heated discussions about the actual cost and real greenness of XIV. The dispute, which we found close to impenetrable, stems from the fact that an XIV box is at its most cost-effective and power friendly when it is fully configured. In fact, it may be the case that IBM builds every XIV fully loaded, the way it builds large servers with a full load of engines, and then turns on pieces of the machine based on how much a user pays.
It turns out that a serious prospect for the XIV or an alternative from IBM or one of its rivals can get good answers to hard questions quite directly. First, get IBM and any other vendor in the running to quote a five-year total cost including maintenance, software to provide all the required disk array functionality, etc., including mirror systems and whatever it takes to do mirroring if the mirror is going to be remote. At the same time ask the sales rep for power and heat data for the machinery as it will actually be configured. Anything less is just asking for ugly surprises.
Even when you have done that, you may not have all the answers unless you give vendors’ financing arms (and the handful of independents whose names come up if you search for “used IBM mainframe storage”) a chance to offer secondhand equipment. Just as you should in the case of new gear, you have to press the vendors of used disk arrays to quote total costs including maintenance and functionality software as well as power and heat.
Also, if you think you might be moving some or all of your work from one platform to another during the next few years, you need to get prices that include all the relevant interfacing hardware and software. While it ought to be sufficient to specify a standard such as Fibre Channel or iSCSI (over Ethernet), be specific. In this game, you can’t count on good surprises.
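One way to keep the competing quotes honest is to drop every one of them, new or secondhand, into the same five-year formula. The sketch below is one possible layout, not a standard method: the field names, the cooling factor, the electricity price and every dollar figure in the example are hypothetical placeholders to be replaced with what the vendors actually put in writing.

```python
from dataclasses import dataclass

HOURS_PER_YEAR = 24 * 365


@dataclass
class Quote:
    name: str
    hardware: float              # purchase price, new or secondhand
    software: float              # licences for the required array functions
    maintenance_per_year: float
    mirroring: float             # mirror system plus any remote-link costs
    interfacing: float           # adapters and drivers for each platform
    power_kw: float              # draw as the box will actually be configured
    cooling_factor: float = 1.0  # assumed kW of cooling per kW of load


def five_year_cost(q, price_per_kwh, years=5):
    """Total cost of ownership over the period, energy included."""
    one_time = q.hardware + q.software + q.mirroring + q.interfacing
    energy_kwh = q.power_kw * (1 + q.cooling_factor) * HOURS_PER_YEAR * years
    return one_time + q.maintenance_per_year * years + energy_kwh * price_per_kwh


if __name__ == "__main__":
    # Hypothetical quotes, for illustration only.
    quotes = [
        Quote("Array A (new)", 900_000, 150_000, 60_000, 400_000, 20_000, 8.0),
        Quote("Array B (used)", 350_000, 200_000, 95_000, 450_000, 35_000, 11.0),
    ]
    for q in sorted(quotes, key=lambda q: five_year_cost(q, 0.10)):
        print(f"{q.name}: ${five_year_cost(q, 0.10):,.0f}")
```

Putting maintenance, mirroring, interfacing and energy in the same total as the sticker price is the whole point: a cheap box with expensive maintenance or a heavy power draw shows up for what it is.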
The result might cause a bit of a stir. On just about any measure where an XIV looks good compared to, say, a Symmetrix or Sun Microsystems StorageTek array, it is going to look great compared to a DS8000, if IBM’s flagship disk subsystem is in the running.
It’s not that IBM Fellow Yanai wants the DS8000 to tank. It’s that he probably doesn’t have any choice if he wants his XIV device to emerge a winner.