Midrange Shops Not As Protected from Disaster As They Think, Vision Finds
November 24, 2008 Alex Woodie
System i and System p shops have an irrational faith in the capabilities of their disaster recovery systems, and could be in for a harsh reality check if their servers go down unexpectedly. This is according to a report issued last week by high availability software provider Vision Solutions, which looks at the DR and HA technologies used by these organizations and the expectations placed upon them.
For its new report, entitled The State of Resilience and Optimization on IBM Power Systems, Vision Solutions and its strategic partner, the Information Availability Institute, surveyed more than 2,000 IT professionals and executives at organizations that use IBM Power Systems (and System i and System p) servers from July 2007 through July 2008. Survey participants were asked about their organizations’ goals concerning recovery time objectives (RTOs) and recovery point objectives (RPOs), as well as what DR, HA, or other resiliency technologies or processes they have in place to achieve these goals. Vision did not provide a sampling error for its study, but the standard margin of error for a simple random sample of 2,000, at a 95 percent confidence level, would be about 2 percentage points.
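For readers who want to check that figure, here is a minimal sketch of the usual margin-of-error calculation. It assumes a simple random sample, a 95 percent confidence level (z = 1.96), and the worst-case proportion of 50 percent. Vision did not say how its sample was drawn, so those assumptions are ours, and the function name is purely illustrative.

```python
import math


def margin_of_error(sample_size, proportion=0.5, z_score=1.96):
    """Approximate margin of error for a survey proportion.

    Assumes a simple random sample (an assumption, not something
    Vision stated). proportion=0.5 gives the widest possible margin;
    z_score=1.96 corresponds to a 95 percent confidence level.
    """
    standard_error = math.sqrt(proportion * (1 - proportion) / sample_size)
    return z_score * standard_error


if __name__ == "__main__":
    moe = margin_of_error(2000)
    print(f"Margin of error for n=2000: {moe * 100:.1f} percentage points")
```

Under those assumptions the result comes to roughly 2.2 percentage points, which squares with the "about 2" estimate. Results for subgroups, such as System i shops alone, would carry a wider margin because each subgroup is smaller than the full sample.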
There were no real surprises in the results concerning RTO, the time it takes to get servers or applications back online after a failure. According to Vision, 45 percent said they expected a recovery within six hours, while 78 percent would not tolerate a recovery taking longer than 24 hours. Interestingly, System i shops were a little more relaxed in their RTO expectations.
The numbers were similarly in line with expectations for RPO, the amount of data (usually expressed in minutes or hours of lost transactions) a business can tolerate losing following an outage. Nearly 60 percent of respondents said they could tolerate no loss at all, or the loss of only a few minutes’ worth of transactions, Vision says, while nearly 20 percent said they could stand to lose a full day’s worth. The results were nearly identical for System i and System p shops.
Next, Vision looked at the types of HA and DR setups (and everything in between) that organizations have in place to avoid or recover from outages (remember, Vision sells HA for AIX as well as i5/OS). Not surprisingly, four out of five shops still rely on tape, to one extent or another, to maintain data integrity. Another time-tested technology in widespread use is RAID disk protection, which is used by nearly three out of four respondents.
Now, this is where it gets interesting. According to Vision, 37 percent of respondents report mirroring data to a second system, while 34 percent are using some form of disk-to-disk replication technology. About 35 percent use tape spruced up with journaling or logging capabilities, while 34 percent report using an automated tape management system. Just 10 percent reported using a hot site service.
When the resiliency pros at Vision parsed all the data, they discovered evidence of a fairly large gap between the organizations’ expectations of recovery and the likelihood that they will actually achieve the expected outcome with the DR and HA technologies and strategies in place.
“It appears that a large percentage of companies have a disconnect between their goals and perception of DR readiness and the actual state of affairs,” Vision concludes in the 10-page executive summary. “A large percentage of shops reported relying solely or predominantly on tape-based methods. Achieving intra-day recovery time with these methods is generally unlikely, and to expect tape to support RPOs of one hour or less is just not realistic.”
Vision even went so far as to call it a case of “irrational exuberance,” in a nod to Alan Greenspan’s take on the dot-com bubble near the turn of the millennium, which could similarly be applied to any number of bubbles since.
So, where did this disconnect come from, and what might result from it? The cause is pretty clear. Most companies have a desire to weather disasters that might come their way, and want to have the strength and ability to quickly get back on their feet after being knocked down. That’s just human nature. But, as Vision has found, many companies simply have not invested sufficient time and resources to adequately meet the stated goal of surviving a disaster. Depending on the industry they’re in, businesses that suffer a major outage could be gone for good.
What can companies do about it? There are really two options. First, they can lower their expectations of recovery and pray that the next hurricane, earthquake, or tornado misses them. Alternatively, they can invest in better technologies and training to make their systems more resilient. Obviously, Vision would prefer the latter, and it will continue beating the RPO and RTO drums until executives wake up and hear the music.
If there’s a silver lining for Vision in all this, it’s that there’s a lot of business yet to be done. “What has become the most encouraging information uncovered in our surveys,” says Vision senior vice president of marketing Edward Vesely, “is the tremendous market opportunity still existing with high availability and disaster recovery.”
Vision will be holding a Webinar on December 2 to discuss the results of the study and what organizations can do about them. To reserve a seat at the Webinar and to register for the paper, go to www.visionsolutions.com/stateofpower.