Bug Busters Tackles Journaling Issues with HA Software
June 26, 2012 Alex Woodie
Problems can crop up in even the best-run high availability environments, which is why HA products typically include a collection of tools and reports to assist with troubleshooting. Users of the latest release of Bug Busters Software Engineering’s high availability software, RSF-HA version 9.1, will find new tools for tracking down journaling problems that occur during data replication. The release also includes new features for handling spool file replication and a new bandwidth estimator.
Replication errors are a frequent occurrence for IBM i HA products based on logical replication. In a busy production environment, the task of keeping a perfectly replicated copy of one’s data and objects on the backup machine is a near impossibility. It’s not a matter of if errors will crop up, but rather how frequently they occur, and how one reacts to them.
All IBM i HA products, including RSF-HA, have built-in ways of dealing with replication errors, and can even recover automatically from some types of errors. In the new release of RSF-HA, Bug Busters has introduced new logging options that will help users deal with journal apply errors on the target machine.
The new logging features let users capture much more information about the state of the machine when the error occurred. Users can now direct RSF to collect and report on any number of factors, including: the general condition and context of the error; details about any locks on libraries, objects, members, or records that could have contributed to the problem; any changes made to database constraints or triggers that might have caused the error; a list of changes made to specific objects by jobs or outside users; and a job log that shows any system messages sent around the time the error occurred.
New settings in RSF-HA 9.1 allow users to configure the product to entirely avoid common replication errors, such as those due to save operations on the target machine. When RSF-HA encounters locks on the target machine due to saves, the product can now be configured to wait a certain period of time and retry the replication process later. Users can specify the number of retries, and the amount of time between each retry.
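The wait-and-retry behavior described above can be illustrated with a short sketch. This is not RSF-HA code: the function name, the `TargetLockedError` exception, and the parameter names are all hypothetical, chosen only to show how a configurable retry count and delay interact when a save operation holds a lock on the target.

```python
import time
from typing import Callable


class TargetLockedError(Exception):
    """Hypothetical error: the target object is locked, e.g. by a save."""


def apply_with_retry(
    apply_entry: Callable[[], None],
    max_retries: int,
    wait_seconds: float,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Apply one journal entry, retrying when the target is locked.

    Returns the number of retries that were needed. If the target is
    still locked after max_retries retries, the error is re-raised so
    it can be logged as a genuine replication error.
    """
    retries = 0
    while True:
        try:
            apply_entry()
            return retries
        except TargetLockedError:
            if retries >= max_retries:
                raise
            retries += 1
            sleep(wait_seconds)  # wait before trying the apply again
```

With `max_retries=5` and `wait_seconds=30`, an apply that is blocked by a save for about a minute would succeed on a later attempt instead of being reported as an error.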
Bug Busters has also introduced new ways of dealing with replication errors that result from attempts to change the attributes and authorities of objects. RSF-HA users can now instruct the product to treat these types of errors differently than data-oriented problems, such as errors that occur when records are changed, added, or deleted within an object. This new feature results in faster and more efficient error correction, the company says.
Bug Busters also updated its integrity checking routines to enable users to omit checks for objects that the user has already configured the product to skip in the sync attribute screen. The checks can also now deal with slight differences between objects on the primary and target machines without generating an error. Bug Busters has also simplified the integrity check report to make it more readable.
The release also introduces four new reports that shed light on replication journal lag. Journal lag refers to the difference in time between when the primary machine sends a transaction (or journal entry) down the pipe to the target machine, and when the target machine receives it or applies it. RSF-HA can now provide details on the journal lag, expressed both as an amount of time and as a number of journal entries, for the transfer of journal entries across the network and for the apply process.
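To make the two measures of journal lag concrete, here is a minimal sketch that computes lag as both elapsed time and a count of outstanding entries, separately for the network transfer and the apply process. The `EntryTimes` record and `lag_report` function are hypothetical and do not reflect how RSF-HA stores or reports this data; lag is measured from the send time, following the definition above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Optional


@dataclass
class EntryTimes:
    """Hypothetical timestamps recorded for one journal entry."""
    sequence: int
    sent_at: datetime
    received_at: Optional[datetime] = None  # None: still in transit
    applied_at: Optional[datetime] = None   # None: not yet applied


def lag_report(entries: List[EntryTimes], now: datetime) -> Dict[str, object]:
    """Summarize transfer lag and apply lag as time and entry counts."""
    in_transit = [e for e in entries if e.received_at is None]
    awaiting_apply = [
        e for e in entries
        if e.received_at is not None and e.applied_at is None
    ]

    def oldest_age(pending: List[EntryTimes]) -> timedelta:
        # Age of the oldest pending entry, measured from when it was sent.
        if not pending:
            return timedelta(0)
        return now - min(e.sent_at for e in pending)

    return {
        "transfer_lag": oldest_age(in_transit),
        "transfer_backlog": len(in_transit),
        "apply_lag": oldest_age(awaiting_apply),
        "apply_backlog": len(awaiting_apply),
    }
```

Reporting the two stages separately helps distinguish a slow network link (a growing transfer backlog) from a slow or blocked apply job on the target (a growing apply backlog).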
Network bandwidth is critical to efficient replication, and to that end, there’s also a new bandwidth estimator in RSF-HA 9.1. The new commands allow the user to view how much bandwidth is available for replication, and also display the bandwidth requirements. Data is gathered from database files or existing journals, and bandwidth requirements are displayed for each hour from the collected data.
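The per-hour calculation can be approximated with simple arithmetic: total the bytes of journal data generated in each hour and convert to an average bit rate. The sketch below assumes the input is a list of (timestamp, size in bytes) pairs; that input format and the function name are illustrative assumptions, not the estimator's actual interface.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, Iterable, Tuple


def hourly_bandwidth_kbps(
    entries: Iterable[Tuple[datetime, int]],
) -> Dict[datetime, float]:
    """Average bandwidth needed per hour, in kilobits per second.

    entries: (timestamp, size_in_bytes) for each journal entry.
    Returns a dict keyed by the start of each hour, in time order.
    """
    bytes_per_hour: Dict[datetime, int] = defaultdict(int)
    for timestamp, size_bytes in entries:
        hour_start = timestamp.replace(minute=0, second=0, microsecond=0)
        bytes_per_hour[hour_start] += size_bytes

    # bytes -> bits, spread over 3600 seconds, then bits -> kilobits
    return {
        hour: total * 8 / 3600 / 1000
        for hour, total in sorted(bytes_per_hour.items())
    }
```

An hourly average smooths over short bursts, so a link sized exactly to these figures could still fall behind during peaks within the hour; any real sizing exercise would likely add headroom.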
Spool files are a critical but often overlooked aspect of IBM i operations, and should be included in any HA strategy. Bug Busters has made several enhancements to its spool file replication capabilities with this release. The company says its spool file replication processes are now more thorough and integrated with library replication. Users also get new capabilities for selecting or omitting which output queues will be replicated.
This release also makes it easier to start replication. In previous releases of RSF-HA, it was easy to start replication for all libraries and IFS directories where replication was defined. However, replication of system items (such as user profiles, authorization lists, and system values) had to be started individually. Now RSF-HA will start all defined system replication tasks at one time.
RSF-HA 9.1 is available now. For more information, see the company’s website at www.bugbusters.net.