How Good Are Your Backups? Wait, Don’t Answer Too Quickly
April 18, 2022 Pete Massiello
It’s probably a good day to review your backup strategy and your backup itself. If you haven’t done a backup in a while, set aside some time either later today or this weekend to get a good backup. Remember, your recovery is only as good as your backup. If you have never tested your recovery, it’s probably a good time to look at that as well.
Over two months ago, Russia invaded Ukraine, and then the United States and its allies put some crippling sanctions on Russia. For every action, there is usually a corresponding reaction. The news is abuzz about cyber-attacks, ransomware, and other security-related issues that may be unleashed. We used to do backups just for recovery in case we lost our machine, which usually involved a hardware malfunction. Now, the world has gone crazy, and backups serve as a great tool to recover your system from any type of cyberattack as well.
Having what is referred to as the “air gap” will help protect you from cyberattacks, and it helps in any recovery where you need to put your system back together. Remember, you also need enough copies of your backup, going back far enough, so not just one week’s worth. Cyberattacks can lie idle, having infected your system for months before they come to life. Do you have a backup from before the cyberattack hit? When you make a backup to physical tape, or what is now more common, Virtual Tape Libraries, you are creating a gap between the data on your machine and the backup copy, which is detached from the data on the machine, hence the phrase air gap.
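To make the retention question concrete, here is a minimal Python sketch. It assumes you keep a list of backup dates and have an estimate of when the infection first landed. The function name `last_clean_backup` and all of the dates are hypothetical, chosen only for illustration.

```python
from datetime import date
from typing import Optional, Sequence


def last_clean_backup(backup_dates: Sequence[date], infected_on: date) -> Optional[date]:
    """Return the newest backup taken strictly before the infection date, or None."""
    clean = [d for d in backup_dates if d < infected_on]
    return max(clean) if clean else None


# Hypothetical retention: only the last seven daily backups are kept.
dailies = [date(2022, 4, day) for day in range(11, 18)]
# Hypothetical estimate: the malware arrived two months before it came to life.
infected_on = date(2022, 2, 15)

print(last_clean_backup(dailies, infected_on))  # None: every retained backup is already infected
print(last_clean_backup(dailies + [date(2022, 1, 1)], infected_on))  # 2022-01-01
```

With only a week of dailies there is nothing clean to restore from. Adding a single older backup (here, a hypothetical one from January) changes the answer.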
Knowledgeable people and good procedures are two important parts of recovery. Where are the backup tapes? Which tapes should we use? What gets saved on what backup? Where are the disaster recovery procedures? What order do we use the tapes in? If there are replicated systems, have you already performed a role swap? Where are the replicated systems? You know, I always laugh when people say to me, “We don’t need backups because we have replication.” Great. If your replication is working correctly and someone clears a physical file or puts ransomware on your machine, the replication software should do the same thing to the target machine. Now what? You have two problems that you can’t fix. You still need backups when you do replication.
In the case of natural disasters like floods, it can be days before the floods start to subside, and even longer before people can get back into their businesses. What will be left? Did they do a backup beforehand? Are the backups underwater with the machine? If this was your company, could you last over two weeks before you were able to be back online? As we watch the aftermath of any hurricane, what are you doing to protect your environment, data, and computers? This might be a good time to look at replicating to our IBM i in the Cloud.
You Never Know When Disaster Is Going To Strike
A while ago, we got a call from a new client who had a disk failure without RAID, and they couldn’t recover their system after the disk was replaced. I asked them why they called us, and they told us that IBM had recommended they contact us. First, you should always have some level of disk protection, which is your first defense against loss of data. My preference for most customers is RAID 5 data protection with hot spare, which gives that added level of protection. There is always mirroring or RAID 6 as well, but for most of our customers, RAID 5 with hot spare provides ample protection. I am not going to get into the specifics of why this customer couldn’t recover, but I thought their experience would make a great article to remind our customers to ensure that their backups can actually be used for a recovery. I want to ask you a few questions. Remember, you only have to answer to yourself right now, but in a disaster, you will be answering to the management of the company, and quite frankly your job is on the line if you aren’t following best practices.
Of course, I am assuming that you are doing backups, but I have seen companies that don’t back up their machines, or don’t back them up often enough. How often is often enough? It depends. Sorry, there is no answer that works for everyone. You need to balance the availability of your system, as your system is usually down during the backup, against how difficult you want to make your recovery. The best backup is a full system backup (GO SAVE 21) every night, but of course, that isn’t always practical. That said, with external storage and FlashCopy, we have customers backing up 11 TB of data every night doing a full system save, with only 1 minute of downtime. Impressive. It’s an easy process to run once it is in place, but a lot of pieces have to be pulled together to implement it. We can help you implement FlashCopy. Some companies back up their system fully once a week, once a month, or quarterly, and then back up their data libraries each night. Some companies back up the entire system quarterly, a few libraries weekly, and other libraries daily. One problem that I always see with selective library backups is that library names change or new libraries get added, but the backup procedure isn’t changed. Also, don’t forget there could be data in the IFS that also needs to be backed up. Everyone’s backup solution is different, and it should be analyzed to determine if it is meeting the needs of the company. Can you recover from this backup strategy?
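The drift between the libraries on the system and the libraries named in a selective backup is easy to check mechanically. The Python sketch below assumes you have already exported both lists as plain names, by whatever means you use. The function `compare_backup_list` and the library names are hypothetical.

```python
from typing import Iterable, Set, Tuple


def compare_backup_list(
    on_system: Iterable[str], in_backup_list: Iterable[str]
) -> Tuple[Set[str], Set[str]]:
    """Return (libraries never saved, backup-list entries that no longer exist)."""
    system = {name.strip().upper() for name in on_system}
    saved = {name.strip().upper() for name in in_backup_list}
    return system - saved, saved - system


# Hypothetical names: ORDERS was renamed to ORDERS2, and WEBDATA was added later.
on_system = ["ORDERS2", "INVOICES", "WEBDATA"]
in_backup_list = ["ORDERS", "INVOICES"]

missing, stale = compare_backup_list(on_system, in_backup_list)
print(sorted(missing))  # ['ORDERS2', 'WEBDATA']
print(sorted(stale))    # ['ORDERS']
```

Anything in the first set is data that no nightly save is protecting. Anything in the second set is a sign the backup procedure has not been reviewed since a rename or removal.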
Once you have a backup, it is imperative that the tape is taken off-site. You don’t want to keep the machine and the backup tape in the same computer room. What would happen if the computer were to catch on fire? Both the backup and the machine could be destroyed. Remember that iTech Solutions has both cloud-based backups for your IBM i, which get the data off-site immediately, and replicated Virtual Tape Libraries that will also get your virtual tape off-site. Our cloud-based backup is encrypted in transit and at rest for your protection.
A Small Problem With Backups
Let’s use an example. Each night at midnight, we do a full backup, which takes one hour. Great. During the day, our company is entering data, running jobs, taking phone orders, shipping, invoicing, etc. If the machine were to fail at 10:00 PM due to a cyberattack or hardware failure, then we would have lost everything entered since that backup, up to 22 hours of data. Oh, that isn’t good. Yet, we have a full backup each night. We have no backup of all the transactions that happened during the day until the next backup at midnight. That is where replication comes in. We have quite a few customers that are replicating to our IBM i in the cloud or to another machine on-prem to ensure that as soon as a transaction is entered on the system, it is immediately sent to the target machine. This means these customers don’t have the problem of losing their transactions since the last backup. We can even perform target-side backups on our cloud so that you have no downtime each day on your system (the source system). This provides higher availability for your customers.
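The exposure between backups is simple arithmetic: the time of failure minus the time of the last good copy. A minimal Python sketch, using separate illustrative times and a hypothetical replication lag of five seconds (the real lag depends on your replication software and network):

```python
from datetime import datetime, timedelta


def exposure_window(last_good_copy: datetime, failure: datetime) -> timedelta:
    """Return how much work is lost if the system fails at `failure`."""
    if failure < last_good_copy:
        raise ValueError("failure time precedes the last good copy")
    return failure - last_good_copy


# Illustrative times: backup taken at midnight, failure at 3:00 PM the same day.
backup_taken = datetime(2022, 4, 18, 0, 0)
failure = datetime(2022, 4, 18, 15, 0)

# Backup only: everything since midnight is gone.
backup_only = exposure_window(backup_taken, failure)
print(backup_only.total_seconds() / 3600)  # 15.0 hours

# With replication (assumed 5-second lag), the last good copy is seconds old.
replicated = exposure_window(failure - timedelta(seconds=5), failure)
print(replicated.total_seconds())  # 5.0 seconds
```

The same function covers both cases. Only the age of the last good copy changes, which is the whole argument for pairing replication with backups.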
The question is, when was the last time you tested your recovery? Don’t put the article down now just because I asked you an uncomfortable question. From my experience, if you haven’t tested your recovery, you have no idea if you actually can recover. I mean really test it out. Don’t just try to restore one file from last night’s backup and think you are a superhero. The only time you are allowed to wear a red cape is after you have successfully taken your backups and tried a recovery. Even if you don’t have a machine, iTech Solutions will bring a machine to your location, or you can bring your tapes to us (or send them to us), and we will help you recover your machine. My staff and I have done hundreds of recoveries, and I can tell you that the customer always learns something from each and every recovery I have been involved with.
Collect your quarterly backups and your daily tapes, and let’s see if you can recover. Let’s see what you might be missing. Let’s determine any issues that you have now so that you can fix them while it’s a test. Do you want to tell the owners of your company that you “thought” you had good backups? If it isn’t backed up, you can’t recover it. If you don’t have the latest copy, you can’t give them the correct data. If you aren’t replicating, then you have to understand you will only be able to recover up to the last backup that you performed.
In summary, backups are your first step in your recovery, but you need to test a restore to make sure you have everything you need to recover your system. Do you know if what you are writing to tape is readable? You have no idea until you try to recover it. Don’t wait until your job is on the line. Contact iTech Solutions today to schedule your recovery test. The job you save might be your own!
Check out Pete’s webinar The Basics of IBM i Backup and Recovery to learn about what you need to restore your system, the steps involved to perform the restore, as well as tips and best practices. Watch here: https://info.itechsol.com/webinar-basics-of-ibm-i-backup-and-recovery
This content was sponsored by iTech Solutions Group.
Pete Massiello is president of iTech Solutions Group.