OS/400 Edition
Volume 11, Number 30 -- August 5, 2002

Many of the points made in the following letter from Chris Hird were later refuted in a reader feedback letter (see "Remote Journaling Is Not Just for Database Any More") from Larry Youngren, an OS/400 microcode designer for IBM in Rochester, Minnesota, whose current assignment involves future performance and recovery improvements affecting journaling.


Tech Insight: OS/400 Remote Journaling Issues

by Chris Hird



I read with some interest the article on remote journaling and its position as a high availability tool ["Vendors Differ on the Importance of Remote Journaling"]. As someone who has been around the HA industry for some time, I was very excited when remote journaling was first announced. I remember discussions with Larry Youngren and my good friend Guy Dehond many years ago, when I worked at IBM, about a concept of what was then termed a "Gold Link," which would be a High Speed Optical Link using the concept of twin tail journaling.

This was at a time when IBM Europe used a product called Multiple Systems Software as its prime HA tool. We then saw OptiConnect emerge as a viable high-speed option for the HA vendors to use as a communications link between two systems; at the time, we felt that one major problem was the limited bandwidth of the low-speed connections available to most AS/400 users. If you joined OptiConnect and remote journaling, you would have what I believe was then termed the "Gold Link."

I have spent many years installing MIMIX as my choice of product for HA customers. While some may feel I am biased about the product, I am always looking at what the other products offer. I always suggest to customers that they look at all of the offerings, not from a sales perspective--you always get told something is the best product, regardless of your requirements--but from an overall capabilities standpoint. My view is that they are all, more or less, the same, and it's the quality of the total solution that matters, not the individual product. But each has its own benefits. Your article did touch on some of these, but I feel it should have covered in a little more depth just what remote journaling does and doesn't offer.

First, the main benefit has to be the fact that it is integrated with the operating system. As Youngren said in your story, it's better plumbing. But one drawback is the lack of tools associated with that great plumbing. IBM has a good record of supplying great technology, but it has always lacked the ability to provide simple, easy-to-use interfaces to that technology. It has always relied on the independent software vendors to supply that interface. So, unless you know how it ticks, it can be quite daunting to implement.

Second, synchronous and asynchronous capabilities need to be carefully planned before implementing. But the fact that it can be turned on and off at will, with simple commands, and does not require application changes is a major plus for remote journaling.

Third, not having to rely on the HA products' communications programming quality is another great plus for remote journaling. I have spent many hours trying to debug communications problems between systems and trying to marry up the message output from the HA product and what is happening in the operating system, and this can be hard work.

The downside of remote journaling is that you lose control over when and how the data will be sent to the remote system. Object and data replication cannot be tied together using remote journaling. Remote journaling does not support data area, data queue, or IFS object journaling, all of which are now supported in OS/400 V5R1. These are available only to those vendors who have their own journal scraper and transport mechanism. Perhaps IBM will allow this support in the future? I am not sure why it is restricted, so I cannot comment on whether it would be possible. So if you have a requirement to replicate anything other than data files, your choices are limited to those products that provide another method of replication alongside remote journaling. The vendors that provide their own journal scraper and transport methods generally allow use of remote journaling as well, making them the obvious choice, as you can mix and match the environments.

Moreover, you cannot use the RMVJRNCHG (Remove Journaled Changes) command on remote journals. This is one of the issues most vendors will skirt around, even those with normal journals. The problem is, the products are now very efficient. This means data is replicated between systems in near real time for most implementations. So when a system fails, the products are generally up-to-date with the data replication. The problem occurs when you need to recover the data back to a known point. If you have commitment control, you don't have the same issues, but I have yet to come across many applications that have implemented it. How do you now remove any data changes? What options and tools does the product provide? To be fair, the normal replication method doesn't fare much better, even with the database being journaled on the target system. The job details are lost on the remote journal--it is the product's apply job that creates the entries! So correlating the entries received with what is in the journal is a fairly complex task. No job information is passed between the systems by any of the HA products, so you are left guessing which jobs were active and which had finished.
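To make "recovering the data back to a known point" concrete, here is a minimal sketch of the general idea behind removing journaled changes: walk the journal backward and put each before-image back. It is a toy model in Python, not OS/400 code and not how any HA product is implemented; the names JournalEntry, apply_change, and remove_changes are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class JournalEntry:
    seq: int                # journal sequence number
    key: str                # record identifier (hypothetical)
    before: Optional[str]   # before-image; None if the record did not exist
    after: Optional[str]    # after-image; None if the record was deleted


def apply_change(db: Dict[str, str], journal: List[JournalEntry],
                 key: str, after: Optional[str]) -> None:
    """Change a record and journal both the before- and after-image."""
    journal.append(JournalEntry(len(journal) + 1, key, db.get(key), after))
    if after is None:
        db.pop(key, None)
    else:
        db[key] = after


def remove_changes(db: Dict[str, str], journal: List[JournalEntry],
                   to_seq: int) -> None:
    """Back out every entry with seq > to_seq, newest first."""
    for entry in sorted(journal, key=lambda e: e.seq, reverse=True):
        if entry.seq <= to_seq:
            break
        if entry.before is None:
            db.pop(entry.key, None)      # record was created; remove it
        else:
            db[entry.key] = entry.before  # restore the earlier image


db = {"cust1": "balance=100"}
journal: List[JournalEntry] = []
apply_change(db, journal, "cust1", "balance=80")  # seq 1
apply_change(db, journal, "cust2", "balance=50")  # seq 2
apply_change(db, journal, "cust1", None)          # seq 3 (delete)
remove_changes(db, journal, to_seq=1)             # recover to just after seq 1
```

The sketch only works because every entry carries a before-image and a trustworthy sequence number. Picking the right to_seq is the hard part described above: without job information, it is difficult to know which entries belong to work that had finished.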

Object replication is a complex subject, too. The integration of object and data replication can cause many HA implementations to fail when a recovery is required. They are separate processes with no method of communication between them to determine whether they are in sync. To replicate an object, most products use some method of saving and restoring to replicate the object to the remote system. This can be a time-intensive operation, so marrying a database change and an object change is nearly impossible. Many of the HA products also use a last-change method for replication. This means that an object may change many times, but it is only saved and restored once, for the last change. Again, "last change" here means the last change seen at the most recent read of the object journal, so it can be even more complex to understand the correlation between a database change and an object restore on the target system. The save itself can be of an object state that relates to a journal entry that hasn't been read by the journal scraper!
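The last-change method described above can be sketched as follows. This is a simplified Python model under stated assumptions, not any vendor's algorithm: object changes arrive as (sequence number, object name) pairs read from a journal, and the hypothetical function plan_saves collapses them to one save per object.

```python
from typing import Dict, List, Tuple


def plan_saves(object_events: List[Tuple[int, str]]) -> List[Tuple[int, str]]:
    """Collapse many changes per object into one save at its latest change.

    object_events holds (sequence number, object name) pairs as read
    from the object journal in one pass.
    """
    last_change: Dict[str, int] = {}
    for seq, name in object_events:
        # Keep only the highest sequence number seen for each object.
        last_change[name] = max(seq, last_change.get(name, seq))
    # One save per object, ordered by the sequence of its last change.
    return sorted((seq, name) for name, seq in last_change.items())


# Object names and sequence numbers are made up for illustration.
events = [(10, "PGMA"), (12, "DTAARA1"), (15, "PGMA"), (20, "PGMA")]
saves = plan_saves(events)
```

PGMA changed three times but is scheduled for a single save. The states it had at sequence 10 and 15 never reach the target, so a database change made between those points has no matching object state to be correlated with.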

User profiles are another issue, and are generally mirrored between systems, especially passwords. You can always run programs to create profiles and set passwords on the target system, so this is not a major drawback to those products that don't support object replication. Products that rely solely on remote journaling don't have these issues, because they can't support them, but some customers require this object level support regardless of the issues.

I am sure there are other issues I haven't covered, but I feel these are the main issues facing HA customers examining remote journaling and other transport methods. I believe both methods have a place in the market, a position held by many of the HA vendors themselves. But if a product only supports remote journaling, you are going to be restricted in the level of recovery you have. You may need some of the additional functionality provided by those products that support both methods. The pricing of the products that don't support both is not, in my opinion, low enough to warrant limiting your future capabilities.

IBM has released remote journaling support because of a market need. The HA suppliers must now do the math to determine if they wish to include the support in their data replication products. This could be a difficult decision, as the use of this technology could affect their bottom line, since they rely on data area and data queue replication as a major influencer for object replication. So if this is covered in their data replication products, people will find it more difficult to justify the high cost of the object replication product.

Chris Hird is president of Shield Advanced Solutions, which provides tools and utilities aimed mainly at supporting HA environments. He first worked with high availability at IBM Havant, in the United Kingdom, in 1989, and was responsible for the technical interface with developers of HA products and for setting up a support structure in the UK to support the IBM customers. Hird left IBM in 1993 to set up Shield Software Services, which was an IBM business partner and a MIMIX reseller, and in 1997 he moved to Canada and launched Shield Advanced Solutions. You can contact Hird at chrish@shield.on.ca.




Editor
Timothy Prickett Morgan

Managing Editor
Shannon Pastore

Contributing Editors:
Dan Burger
Joe Hertvik
Kevin Vandever
Shannon O'Donnell
Victor Rozek
Hesh Wiener
Alex Woodie




Last Updated: 9/24/02
Copyright © 1996-2008 Guild Companies, Inc. All Rights Reserved.