IBM Gets Clustered Storage and EMC Founder with XIV Buy
January 14, 2008 Timothy Prickett Morgan
The company’s name is pronounced “Ex Eye Vee,” and it stands for the Roman numeral for “fourteen,” in honor of the fourteenth class of the Talpiot technical university, run by the Israeli military, from which XIV Limited’s founders all hail. And now the Israeli startup specializing in clustered disk storage is part of IBM’s System Storage unit, which has its hands on the small company’s substantial expertise in block-level, virtualized, clustered storage arrays.
IBM and XIV did not detail the price IBM paid for the company, but the Israeli business newspaper Globes reported on December 30, several days before the deal was announced on January 2, that IBM was going to pay between $300 million and $350 million for the small storage company. As is often the case with acquisitions among the big IT players, this deal seems to be as much about defense as it is about offense.
XIV was founded in 2002, but really got a big boost when Moshe Yanai, one of the founders of disk array giant EMC, came aboard after leaving EMC with buckets of money. Yanai made that money as EMC took on IBM in the mainframe disk market in the 1980s and 1990s and more or less won, becoming fabulously wealthy in the process and making untold numbers of investors wealthy as well. (During the dot-com boom a decade ago, EMC and Sun Microsystems had market capitalizations in excess of $200 billion, more than General Motors at the time, and that still meant something.) Since coming aboard XIV to lend a technical hand, to become chairman, and to invest some cash (according to the Globes report) to fund research and development for the company’s Nextra clustered storage product line, Yanai and the founders of the company have been gearing up to take on the established disk array players with a slightly different kind of disk subsystem. This is much what EMC did with its Symmetrix arrays, which Yanai designed using big gobs of cache memory and cheap SCSI disks, and which allowed EMC to offer better performance and much lower prices for mainframe disk capacity than IBM, Hitachi, and Fujitsu could deliver at the time. That allowed EMC to gobble up market share and expand into Unix systems storage and then Windows as it became a part of the data center. So far, according to Globes, XIV has raised $3 million in funding from Yanai and other unnamed investors, and has kept a very low profile as a stealth-mode startup getting its product ready for market.
The server market has been utterly transformed by the ability to cluster servers and operating systems so they can be both virtualized and share workloads across nodes, and the same revolution is happening to storage subsystems (which are, after all, just specialized servers with lots of disks attached to them). The storage market has a number of new players in clustered storage, and IBM wants to jump on the bandwagon through an acquisition rather than do its own substantial research and development. The competition in this space is technically clever, so buying XIV is a smart move for Big Blue, which will presumably get the Nextra product line to market. Other clustered storage array providers include 3PAR, EqualLogic (just acquired by Dell for $1.4 billion), Isilon Systems (which did a $1.4 billion initial public offering in February 2007), Network Appliance (the market leader in network-attached storage that is now ganging up boxes for scalability), and PolyServe (acquired last year by Hewlett-Packard just after Isilon went public).
IBM has been building out its storage-related software business recently, with the acquisitions of Softek, FileNet, and NovusCG, but this XIV acquisition looks like IBM is getting ready to slap the Big Blue label on a whole new product line. If DS is short for Data Server, then CS is a good name for a clustered server line, right? And no matter how much IBM wants to spin the XIV acquisition as a means to support so-called Web 2.0 and digital media workloads and not as at least a partial replacement or augmentation of its DS disk array product line (which is based on its Power-based System p servers), this is utterly silly. IBM will sell monolithic disk arrays like the DS family and clustered arrays like the Nextra products and it will ultimately let customers decide what they want to use in their data center for particular applications. Considering the lower cost and high scalability of clustered disk arrays, the writing is on the wall and anyone can read it. That said, there are times when a big wonking disk array is still necessary or desirable, and IBM will make them so long as customers want to pay for them.
As far as operating systems and their servers are concerned, the Nextra product line presents itself as a block-level disk array, just like any other normal disk array out there today. This is distinct from file-level disk arrays, which don’t think in terms of keeping track of sectors and blocks where data is stored on disks and arrays of disks, but which rather have a virtualized interface to the physical hardware that allows the array to think in terms of files, no matter where the 1s and 0s that make up a file are stored. Those bits are often scattered across different blocks that are not adjacent to each other or, in RAID arrays, not even on the same physical disk. Disk arrays engineered for file-level access still have block-level access going on, but applications are insulated from this.
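The difference between the two kinds of access can be shown with a toy model. The sketch below is purely illustrative and is not how Nextra or any real array is built: the class names `BlockDevice` and `FileLayer`, the 4-byte block size, and the free-block list are all assumptions made up for the example. The block device only knows numbered blocks, while the file layer keeps a table that maps a file name to whatever scattered blocks happen to hold its data.

```python
BLOCK_SIZE = 4  # deliberately tiny so the layout is easy to follow (assumption)


class BlockDevice:
    """Block-level view: callers read and write numbered, fixed-size blocks."""

    def __init__(self, num_blocks: int) -> None:
        self.blocks = [bytes(BLOCK_SIZE)] * num_blocks

    def write_block(self, n: int, data: bytes) -> None:
        if len(data) > BLOCK_SIZE:
            raise ValueError("data does not fit in one block")
        self.blocks[n] = data.ljust(BLOCK_SIZE, b"\0")  # pad short final block

    def read_block(self, n: int) -> bytes:
        return self.blocks[n]


class FileLayer:
    """File-level view: callers use names; block bookkeeping is hidden."""

    def __init__(self, device: BlockDevice, free_blocks: list[int]) -> None:
        self.device = device
        self.free = list(free_blocks)  # blocks handed out in this order
        self.table: dict[str, tuple[list[int], int]] = {}  # name -> (blocks, length)

    def write_file(self, name: str, data: bytes) -> None:
        used = []
        for offset in range(0, len(data), BLOCK_SIZE):
            n = self.free.pop(0)
            self.device.write_block(n, data[offset:offset + BLOCK_SIZE])
            used.append(n)
        self.table[name] = (used, len(data))

    def read_file(self, name: str) -> bytes:
        used, length = self.table[name]
        raw = b"".join(self.device.read_block(n) for n in used)
        return raw[:length]  # drop padding from the last block


if __name__ == "__main__":
    dev = BlockDevice(16)
    files = FileLayer(dev, free_blocks=[9, 2, 14, 5])
    files.write_file("a.txt", b"hello world")
    print(files.table["a.txt"])      # which non-adjacent blocks hold the file
    print(files.read_file("a.txt"))  # the application just sees the file
```

An application talking to `FileLayer` never sees block numbers, while one talking to `BlockDevice` sees nothing else, which is the position a server is in when it attaches to a block-level array.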
The Nextra disk architecture takes the logical volumes in the clustered array and divides them into 1 MB stripes that are spread across all of the disks in the array in what XIV calls a pseudo-random distribution. This randomness means that the load on the disks remains balanced and predictable even as disks are added or, heaven forbid, a disk crashes and its data needs to be recreated on a hot spare in the array by RAID X, the data protection algorithm that XIV has created. (The RAID 5 algorithm is very slow because it has to rebuild all the blocks on a disk, sometimes taking as much as six to 25 hours, according to XIV. But RAID X can rebuild a 500 GB disk drive in about 15 minutes because it only rebuilds the blocks where data was written on it, not the empty spaces.) RAID X doesn’t just mirror across disks, but takes the volumes on the disks and breaks them into smaller units where different pieces are mirrored on different disks. When a disk crashes, all of the disks in the array participate a little in the rebuild, instead of a few disks in a group getting hammered with massive I/O requests.
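To see why scattering mirrored stripes spreads the rebuild work, consider the rough sketch below. It is not XIV's RAID X algorithm, whose details are not public in this account; the hash-based placement, the function names `place_stripe` and `rebuild_sources`, and the disk and stripe counts are all assumptions chosen for illustration. Each written stripe gets a primary disk and a mirror disk that always differ, and when one disk fails the code counts how many stripes each surviving disk has to supply.

```python
import hashlib
from collections import Counter


def place_stripe(volume: str, stripe_index: int, num_disks: int) -> tuple[int, int]:
    """Return (primary_disk, mirror_disk) for one stripe; the two never match.

    Placement is a deterministic hash of the volume name and stripe number,
    standing in for a pseudo-random distribution (an assumption, not XIV's method).
    """
    digest = hashlib.sha256(f"{volume}:{stripe_index}".encode()).digest()
    primary = int.from_bytes(digest[:8], "big") % num_disks
    # An offset in 1..num_disks-1 guarantees the mirror lands on another disk.
    offset = 1 + int.from_bytes(digest[8:16], "big") % (num_disks - 1)
    mirror = (primary + offset) % num_disks
    return primary, mirror


def rebuild_sources(volume: str, num_stripes: int, num_disks: int,
                    failed_disk: int) -> Counter:
    """Count, per surviving disk, the stripes it must read to rebuild failed_disk.

    Only stripes that were actually written are enumerated, so empty space
    on the failed disk costs nothing to rebuild.
    """
    load: Counter = Counter()
    for i in range(num_stripes):
        primary, mirror = place_stripe(volume, i, num_disks)
        if primary == failed_disk:
            load[mirror] += 1
        elif mirror == failed_disk:
            load[primary] += 1
    return load


if __name__ == "__main__":
    load = rebuild_sources("vol0", num_stripes=100_000, num_disks=20, failed_disk=3)
    for disk, stripes in sorted(load.items()):
        print(f"disk {disk:2d} supplies {stripes} stripes")
```

With placement of this kind, every surviving disk ends up contributing a small and roughly equal share of the reads, in contrast to a conventional RAID group where only the few disks in the failed drive's group carry the whole rebuild.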
The base disk modules in the Nextra arrays are built on 7200 RPM SATA drives, and have a hierarchy of caches that makes up for the slowness of the disks and boosts performance, much as Symmetrix ganged up large numbers of SCSI disks to beat out larger and faster mainframe disks two decades ago. The data modules in the array are based on X64 processors, and cache memory units and disks plug into PCI-X buses.
It will be very interesting to see the Nextra product come to market and to see what IBM does with it. In theory, any of IBM’s servers–System z, System i, System p, System x, and BladeCenter–can be supported with the Nextra arrays. IBM will almost certainly support z/OS, i5/OS, AIX, Linux, Windows, Solaris, and HP-UX with the products that ultimately come out of the company in the wake of the acquisition. It will also be interesting to see what role Yanai plays at IBM, if he decides to stay with his one-time rival.