Guru: Backup to an NAS using RSYNC
February 14, 2022 Dan Devoe
Like many small shops, we use a simple tape drive to back up most of our libraries and the contents of the IFS. Each workday, the tape would be changed and the previous tape taken home by someone in IT in case of a disaster.
But then, COVID hit – there was no one in the office to switch tapes and the same tape was being used to perform nightly backups. What if we discovered a problem with the data from two nights ago? Ordinarily, we would restore one or more libraries or files from the appropriate tape.
I set up a few image catalogs, one for each day of the week. Each morning, I duplicate the contents of last night’s backup tape to the appropriate image catalog.
This solved the problem of restoring data from two nights ago, but what if there was a natural disaster at the office? We still would not have a copy of the data off-site.
We could have copied the contents of the image catalog to a cloud service, but if you’ve ever read the terms of service for a cloud service, you know that there is the question of who actually “owns” the data. Besides, using that much bandwidth throughout the day could cause a bottleneck for the end users.
We have an external Network-Attached Storage (NAS) system, where we store several of our shared documents and images. The contents of this NAS are replicated nightly via RSYNC to a second NAS at our warehouse. What if the nightly backups could be replicated to the external NAS without slowing down the internal network too much?
The external NAS has a second Ethernet port and our IBM i system has multiple Ethernet ports. I took a spare switch and created a separate subnet, which has no visibility to the outside world (it is not connected to a router or firewall), and began experimenting with connecting to the NAS over the new subnet.
What if I could use RSYNC on IBM i to connect to our NAS and back up the image catalogs? Fortunately, RSYNC is available through Open Source Package Management in IBM i Access Client Solutions (ACS).
Once ACS is installed on your local PC, go into Tools > Open Source Package Management.
If you receive an error message when connecting, you may need to start the SSH server on your IBM i, for example with the CL command STRTCPSVR SERVER(*SSHD).
Once you are connected, you can view installed packages, what packages have updates available, and available (not installed) packages.
Check to see if you have RSYNC installed. If you don’t, go to Available packages and install it.
To connect to the remote system without needing to provide a password, you will need to use public-key authentication: create an RSA key pair and then place the public key on the remote system. The document at this link describes how to accomplish this.
I stored the public key under user “root” on the NAS.
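As a rough sketch of that setup from a PASE shell (for example, inside QP2TERM), the functions below generate an RSA key pair with an empty passphrase and append the public key to root's authorized_keys file on the NAS. The address 192.0.2.10, the key path, and the function names make_key and install_key are placeholders chosen for illustration. The authorized_keys location assumes a typical OpenSSH server on the NAS, so check your NAS documentation before relying on it.

```shell
#!/bin/sh
# Sketch: create an RSA key pair and install the public key on the NAS.
# NAS_HOST and KEY_FILE are placeholders; adjust them for your environment.
NAS_HOST="${NAS_HOST:-192.0.2.10}"
KEY_FILE="${KEY_FILE:-$HOME/.ssh/id_rsa}"

# Generate the key pair if it does not exist yet. The empty passphrase (-N "")
# lets unattended jobs use the key, so protect the private key file carefully.
make_key() {
    key_dir=$(dirname "$KEY_FILE")
    mkdir -p "$key_dir" && chmod 700 "$key_dir" || return 1
    [ -f "$KEY_FILE" ] || ssh-keygen -t rsa -b 4096 -N "" -f "$KEY_FILE"
}

# Append the public key to root's authorized_keys on the NAS (assumed path).
# This asks for root's password once; later logins should use the key.
install_key() {
    ssh "root@$NAS_HOST" \
        'mkdir -p ~/.ssh && chmod 700 ~/.ssh &&
         cat >> ~/.ssh/authorized_keys && chmod 600 ~/.ssh/authorized_keys' \
        < "$KEY_FILE.pub"
}

# Typical use: make_key && install_key
```

Run both functions while signed on as the user profile that will later run the backups, since the private key is stored in that profile's home directory.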
To test that the remote system can now be accessed without providing a password, log into the user profile that you used to create the key pair and then enter the following commands:
CALL QP2TERM
ssh root@<ip address>
The first time you connect, you will receive the following message:
The authenticity of host 'aaa.bbb.ccc.ddd (aaa.bbb.ccc.ddd)' can't be established.
ECDSA key fingerprint is SHA256:<some value>
Are you sure you want to continue connecting (yes/no/[fingerprint])?
Depending on the system you’re connecting to, you may get a “tcgetattr: Invalid argument” message. You can ignore this.
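Once the host key has been accepted in the interactive test above, the same check can be scripted. This is a minimal sketch, assuming a placeholder NAS address and a hypothetical function name check_key_login. It relies on the OpenSSH BatchMode option, which makes ssh fail instead of prompting for a password.

```shell
#!/bin/sh
# Sketch: verify that key-based login works without any prompt.
# The default address is a placeholder.
NAS_HOST="${NAS_HOST:-192.0.2.10}"

# BatchMode=yes makes ssh fail rather than ask for a password, so a zero
# exit status means the key was accepted. "true" is a no-op command that
# runs on the NAS.
check_key_login() {
    ssh -o BatchMode=yes -o ConnectTimeout=10 "root@$NAS_HOST" true
}

# Typical use:
# check_key_login && echo "key login OK" || echo "key login failed"
```

A check like this is useful at the top of a scheduled backup job, because a job that stops at a password prompt would otherwise hang unattended.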
RSYNC uses a “delta transfer algorithm.” This means that rather than copying the entire file, it compares the source and the destination and copies only the differences. This generally results in quicker backups than a straight copy. RSYNC has many options, and covering them all goes beyond the scope of this article. Through trial and error, I’ve found the following options to work most reliably. Depending on what I’m backing up, the backup runs from either the job scheduler or directly from an RPG program:
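As an illustration only, an invocation might look like the sketch below. The options shown are common rsync choices and are not necessarily the ones referred to above. The NAS address, the source and destination directories, and the function name backup_catalog are all assumed placeholders.

```shell
#!/bin/sh
# Sketch only: common rsync options, with placeholder host and paths.
# Packages installed through ACS normally live in /QOpenSys/pkgs/bin.
PATH="/QOpenSys/pkgs/bin:$PATH"; export PATH

NAS_HOST="${NAS_HOST:-192.0.2.10}"       # NAS address on the private subnet (placeholder)
SRC_DIR="${SRC_DIR:-/backup/imgclg}"     # IFS directory holding the image catalogs (placeholder)
DEST_DIR="${DEST_DIR:-/volume1/ibmi}"    # target directory on the NAS (placeholder)

# Copy one day's image catalog directory (for example "monday") to the NAS.
#   -a         archive mode: recurse and preserve timestamps and permissions
#   --inplace  update the destination file directly instead of building a temporary copy
#   --stats    print a transfer summary, which is handy in a job log
#   -e ssh     use SSH, and therefore the key pair, as the transport
backup_catalog() {
    day="$1"
    [ -n "$day" ] || { echo "usage: backup_catalog <day>" >&2; return 2; }
    rsync -a --inplace --stats -e ssh \
        "$SRC_DIR/$day/" "root@$NAS_HOST:$DEST_DIR/$day/"
}

# Example: backup_catalog monday
```

The trailing slash on the source path tells rsync to copy the contents of the directory rather than the directory itself, so the files land directly in the matching day's directory on the NAS.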
Getting back to the daily backups, as you can see, there is a recent copy of GEN01 (the actual backup file from the image catalog) and QIMGCLG (the definition of the image catalog) for each workday.
The NAS units that we have also take snapshots, which essentially provide versioning of files. That means that, depending on the number of snapshots retained, we could restore a backup that is several weeks or even months old (depending on NAS configuration, storage, etc.).
I also back up a few IFS folders to the NAS via RSYNC, in case a user accidentally deletes a file. I’m also planning to back up full system saves (GO SAVE 21) in a similar fashion.
You can never have too many backups. Even though we are closer to “normal” than we were several months ago, this method of backing up data gives us extra peace of mind.