Doing Disaster Recovery Planning Right For Your IBM Power Iron
October 19, 2022 Jason Hardy
As I was writing this article, a post in my LinkedIn news feed cited Axios.com: “billion-dollar weather disasters now happen once every 18 days.” My perception was that the United States was experiencing significant weather events more frequently, and it turns out that data from the National Weather Service backs that up.
Now add in hardware and software failures, cybersecurity events, and local disasters (fire, water damage, power outages), and it makes sense that disaster recovery is one of the top challenges facing CIOs today. If this is a topic you are contemplating, here are a few questions to consider:
- What am I protecting against? Is it one of the items listed above or is it something else? Knowing the answer to this question will allow you to evaluate solutions and determine which ones actually deliver on your needs.
- What is my recovery time objective (RTO)? In other words, how quickly does my IT infrastructure need to be back up and running? If you are running mission-critical applications on your IBM Power system, your RTO is likely very short. How short is a balancing act between the cost of downtime and the cost of a speedy failover.
- What is my recovery point objective (RPO)? In other words, how much data can I afford to lose? You will have to evaluate how dynamic your data is and the impact of losing it. In a very dynamic environment, lost data may be very costly and difficult to recover, while a less dynamic environment can tolerate a longer gap since the last recovery point without significant harm.
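To make the RTO and RPO questions concrete, here is a minimal Python sketch that screens candidate DR options against the two objectives and puts a rough price on an outage. Everything in it is an assumption for illustration: the `DrOption` fields, the function names, and the sample figures are hypothetical placeholders rather than real pricing or measured recovery times.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DrOption:
    name: str
    recovery_hours: float    # estimated time to restore service (compare to RTO)
    data_loss_hours: float   # worst-case age of recoverable data (compare to RPO)
    annual_cost: float       # yearly cost of keeping this option in place


def meets_objectives(option: DrOption, rto_hours: float, rpo_hours: float) -> bool:
    """True when the option recovers fast enough and loses little enough data."""
    return option.recovery_hours <= rto_hours and option.data_loss_hours <= rpo_hours


def outage_cost(option: DrOption, downtime_cost_per_hour: float) -> float:
    """Rough cost of a single outage if this option is used to recover."""
    return option.recovery_hours * downtime_cost_per_hour


def cheapest_option_meeting(
    options: List[DrOption], rto_hours: float, rpo_hours: float
) -> Optional[DrOption]:
    """Lowest annual cost among options that satisfy both objectives, or None."""
    candidates = [o for o in options if meets_objectives(o, rto_hours, rpo_hours)]
    return min(candidates, key=lambda o: o.annual_cost) if candidates else None


if __name__ == "__main__":
    # Hypothetical figures; substitute your own estimates.
    options = [
        DrOption("reserved cloud capacity", 24.0, 24.0, 20_000.0),
        DrOption("high availability", 1.0, 0.1, 90_000.0),
    ]
    choice = cheapest_option_meeting(options, rto_hours=4.0, rpo_hours=1.0)
    print(choice.name if choice else "no option meets the objectives")
```

Comparing `annual_cost` with `outage_cost` for each option is one simple way to frame the balancing act between the cost of downtime and the cost of a speedy failover.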
Once you have answered those questions, you can begin evaluating options. Here is what we see and hear in the market.
Stay Current On Maintenance And Support
One of the most basic things you can do is keep your IBM Power Systems iron current. This means everything from your maintenance and support agreement to the installation of PTFs. Companies that seek assistance from IBM and are not current are often told to “get current” before they can be given assistance. Having the latest PTFs proactively installed may correct hardware and software problems before they appear, without further intervention.
For companies that literally hang on to hardware until it dies, the extended maintenance and support model may come back to bite them. These agreements can be expensive, and the companies that offer these services depend on spare parts in the market and typically make no guarantees of availability. As an example, we have a customer that was running a Power5 machine and experienced a hardware failure. After waiting nearly a week for replacement hardware, they came to us and recovered in our cloud environment. That’s years of payments for a support agreement that ultimately failed them.
Buying Tip: If you are purchasing new hardware and plan to keep it for an extended period of time, consider negotiating extra years of maintenance and support into the agreement. This will enable you to lock in your rates for a longer timeframe and give you a more predictable monthly expense.
Extend The Life Of That Old Server
Many organizations purchase hardware rather than lease it, and instead of disposing of older hardware, they use it as their disaster recovery infrastructure. This extends the life of that asset and can be a building block for your disaster recovery program. Depending on how this hardware is deployed, it can provide onsite hardware redundancy or be placed at a remote location, possibly co-located at a datacenter, protecting you in the event of a hardware failure or a site disaster.
Buying Tip: It is important to keep the DR server under current maintenance and support so that it is being patched and is up to date. Too many people look to minimize the cost of this DR solution, rack the old server, and forget about it. The problem is that it may be years before the server is needed, and if it has not been updated, it may not be a viable DR option.
Buy A Cloud Insurance Policy
Many cloud providers offer reserved hardware as an insurance policy for DR. With this model, you are purchasing the guaranteed right to recover with a specific amount of resources (processor, memory, and storage) on their infrastructure in the event you experience an outage on your primary infrastructure. The cost of this reservation is significantly less than having warm resources available; as a tradeoff, your RTO and RPO are likely to be longer. An often-overlooked gotcha is application licensing. Be prepared to work with third-party vendors to update your license keys.
Buying Tip: Consider keeping a backup copy of your data, periodically refreshed, at your cloud provider’s location and conducting a DR test during which you will configure networking to confirm everything works as expected. These two steps can dramatically reduce your RTO.
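A periodically refreshed copy is only as useful as its most recent refresh. As a minimal sketch, assuming you can obtain the timestamp of the last successful refresh from your backup tooling, a check like the one below could flag a DR copy that has drifted past your RPO. The function name and parameters are hypothetical, not part of any backup product.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def backup_is_stale(
    last_refresh: datetime, rpo: timedelta, now: Optional[datetime] = None
) -> bool:
    """Return True when the DR copy is older than the RPO allows."""
    # Default to the current UTC time; tests can pass a fixed value instead.
    current = now if now is not None else datetime.now(timezone.utc)
    return current - last_refresh > rpo


if __name__ == "__main__":
    # Hypothetical example: the copy was last refreshed 36 hours ago, RPO is 24 hours.
    last = datetime.now(timezone.utc) - timedelta(hours=36)
    print(backup_is_stale(last, rpo=timedelta(hours=24)))
```

Running a check like this on a schedule is one way to avoid discovering during a DR test that the offsite copy is months old.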
High Availability In The Cloud
At the pinnacle of DR solutions is high availability (HA). Designed to deliver on a short RTO and a low RPO, these solutions get your business back up and running with minimal downtime and data loss. The key to delivering an HA solution is having the hardware and software infrastructure running and the latest data available to the system. To accomplish this, we start by deploying Rocket iCluster on our server and the customer’s server. This application performs real-time data replication between the two servers to keep them in sync. In our datacenter, we also build out an LPAR with minimal resources, just enough to allow the replication to happen between the servers, and we reserve additional resources to be utilized in the event of a disaster.
Buying Tip: Much the same as with reserved hardware, preconfigure and test the solution annually so that you know everything works and know what is necessary in terms of application licensing in the event of a disaster.
At Racksquared Data Centers, we find that every company is unique, as are its DR requirements. As a result, we use the above solutions as the foundation for customizing a solution to meet the specific business needs of each organization. If you would like to learn more about these options, get budgetary pricing, or even perform a disaster recovery test, check out our IBM Power Solutions on the web, see some of our client success stories, or contact us to schedule a call.
Jason Hardy is director and general manager of Racksquared.
This content was sponsored by Racksquared.