Volume 8, Number 29 -- August 13, 2008

Admin Alert: Common Mistakes When Failing Over to a CBU

Published: August 13, 2008

by Joe Hertvik

My shop ran its first high availability failover test in 2007. We've since run many more tests where we configured two different System i 550 Capacity BackUp (CBU) units to impersonate other systems. We've also made many mistakes in the process. Today, I'm focusing on common CBU failover problems and how to avoid them. Hopefully, this information will help make your failover process run smoothly.

Meet the CBU

In the i5/OS world, a CBU system is a specially configured Power i, System i, or iSeries machine that communicates with your production machine to continuously replicate production data and applications using high availability software. In case of disaster, the CBU can be switched over to "impersonate" the production box, servicing your users, devices, and companion servers with very little delay. When the main production machine comes back up, the CBU relinquishes its role and production is switched back to the regular system.

Because it's delicate work to configure a real-time replacement machine that your entire organization may have to run on, most CBUs are exhaustively configured and tested to ensure that stupid mistakes don't wreck your processing if the CBU ever has to pinch hit for the production server.

Enter Stupid Mistakes

There are three critical areas that either allow or prevent a CBU's successful impersonation of a production i5/OS server:

  • Setting up your replication software to provide up-to-date copies of data and applications that reside on your production box.
  • Auditing the replication process to ensure that your CBU objects remain in sync with the objects on your production machine.
  • Creating and refining the run book that details the steps needed for your CBU to impersonate the production machine during a failover situation. Failover tests are needed to validate and improve the process that the CBU will use to impersonate production.

Given this broad outline, here are some simple mistakes that can blow your CBU implementation right out of the water. Avoid these issues and you'll be sitting pretty. Trip over them, and you'll have a good-sized problem on your hands.

Replication and Auditing--Eyes On the Prize

When setting up replication, make sure that you're replicating all the necessary objects your system relies upon, not just the data and programming libraries. These objects include:

  • All object libraries and fix libraries that run specific or critical pieces of your application software.
  • IFS folders and directories containing additional configuration and program files for your software. They may also contain stream file data in the AS/400 Integrated File System (AS/400 IFS) that your software needs to run.
  • DLO folders and objects for other PC-like files that your environment may need.
  • User profiles.
  • i5/OS device descriptions for specially configured printers and other peripherals, complete with IP addresses, identifying parameters, and other custom configurations.

The first common mistake is neglecting to audit your object replication. If your objects get out of sync, you will lose necessary data or program objects when you fail over. So take full advantage of your replication software's auditing functions, and check your audit reports every day. There is no other way to ensure that your objects remain in sync. When auditing, watch for two things: objects that are out of sync, and production system libraries that are not present on the CBU partition.
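As a rough illustration of what an audit comparison checks, the Python sketch below compares two object inventories. It is not how any particular replication product works. The `ObjectInfo` record, its `fingerprint` field (for example, a record count or checksum exported from each system), and the sample names in the usage lines are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectInfo:
    """One row of a hypothetical object inventory export."""
    library: str
    name: str
    obj_type: str
    fingerprint: str  # assumed comparable value, e.g. record count or checksum


def audit_inventories(production, cbu):
    """Compare the CBU inventory against production.

    Returns two sorted lists of (library, name, type) keys:
    objects missing from the CBU, and objects whose fingerprints differ.
    """
    prod = {(o.library, o.name, o.obj_type): o.fingerprint for o in production}
    back = {(o.library, o.name, o.obj_type): o.fingerprint for o in cbu}
    missing = sorted(key for key in prod if key not in back)
    out_of_sync = sorted(
        key for key in prod if key in back and prod[key] != back[key]
    )
    return missing, out_of_sync


# Hypothetical sample data
production = [
    ObjectInfo("APPDATA", "ORDERS", "*FILE", "1000"),
    ObjectInfo("APPDATA", "CUST", "*FILE", "250"),
    ObjectInfo("NEWLIB", "PGM1", "*PGM", "a1"),
]
cbu = [
    ObjectInfo("APPDATA", "ORDERS", "*FILE", "998"),
    ObjectInfo("APPDATA", "CUST", "*FILE", "250"),
]
missing, out_of_sync = audit_inventories(production, cbu)
```

In this sample, `NEWLIB/PGM1` is reported as missing and `APPDATA/ORDERS` as out of sync, which are the two conditions the daily audit report should surface.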

New libraries are especially critical to replicate, as applications programmers frequently add new functionality that must be ported over and set up on the CBU. On a regular basis, you may want to cross-reference the list of libraries and IFS directories that you are replicating to the CBU with the list of all the libraries and IFS directories on your production box. This comparison will help you catch any new libraries or directories that need to be added to your replication list.
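The cross-reference described above amounts to a set difference. Here is a minimal Python sketch, assuming you can export both lists as plain names; the function name `find_unreplicated` and the sample library and directory names are hypothetical. It matches names exactly, so it will not recognize a subdirectory that is covered because its parent directory is replicated.

```python
def find_unreplicated(production, replicated, ignore_case=True):
    """Return production names that do not appear in the replication list.

    ignore_case=True suits library names; pass False for IFS paths in
    file systems where case matters.
    """
    def norm(name):
        return name.upper() if ignore_case else name

    covered = {norm(name) for name in replicated}
    return sorted({name for name in production if norm(name) not in covered})


# Hypothetical sample lists
new_libraries = find_unreplicated(
    ["APPDATA", "APPPGM", "NEWLIB"],
    ["appdata", "APPPGM"],
)
new_directories = find_unreplicated(
    ["/apps/orders", "/apps/reports"],
    ["/apps/orders"],
    ignore_case=False,
)
```

Anything the function returns is a candidate to add to the replication list, or to exclude deliberately and document as excluded.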

Over-Replicating

While auditing helps ensure that you have a complete replicated copy of your database and application software, it is possible to replicate too many objects, particularly for third-party software packages. Be careful to exclude the following items from your replication scheme, or they will cause a problem when you fail over.

  • License keys for third-party software. i5/OS license keys are usually specific to the serial number of the machine that they are running on. If you've installed a second copy of a package on the CBU, complete with its own license key, you don't want to overwrite it and disable the software by replicating a different license key from the production partition. Check with vendors to determine which program files and objects (including the key file) are machine specific and should not be replicated.
  • System libraries and IFS folders that have nothing to do with your application programs or data. In particular, be very careful about replicating objects in any library whose name begins with the letter 'Q'. If you need to replicate objects in the general purpose (QGPL) or IBM user libraries (QUSRSYS), be very specific as to what you include or exclude from your replication, as you may inadvertently damage your CBU system. On the AS/400 IFS, watch out for overwriting files in the /QIBM folder.
  • Third-party vendor packages that shouldn't be replicated because you are already running a different instance on your CBU. The best example here is Help/Systems' Robot/SCHEDULE software, which is used to sequence and run automated jobs on an i5/OS system. When failed over, the CBU will need to run the SCHEDULE instance from your production box (complete with all your production entries). However, when the CBU isn't failed over, it may need to run its own instance with schedule entries for backup jobs and other CBU housekeeping. In cases where you run two different instances of a package on your production and CBU partitions, you need to contact your software provider and work out instructions for reinstalling the production instance on your CBU during failover.
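The exclusions in the list above can be thought of as a small set of rules applied to every replication candidate. The Python sketch below is one way to express them, not the configuration format of any replication product. The rule tables and every name in them (`VENDORLIB`, `LICKEY`, `APPDTAARA`) are hypothetical; real entries would come from your vendors and your own environment.

```python
# Hypothetical rule tables; real entries come from vendors and your own shop.
EXCLUDED_OBJECTS = {("VENDORLIB", "LICKEY")}  # machine-specific key objects
ALLOWED_Q_OBJECTS = {("QGPL", "APPDTAARA")}   # Q-library objects you do need
EXCLUDED_IFS_ROOTS = ["/QIBM"]                # IFS trees never to overwrite


def should_replicate_object(library, name):
    """Decide whether a library object belongs in the replication scheme."""
    key = (library.upper(), name.upper())
    if key in EXCLUDED_OBJECTS:
        return False
    if key[0].startswith("Q"):
        # Libraries beginning with Q are excluded unless explicitly allowed.
        return key in ALLOWED_Q_OBJECTS
    return True


def should_replicate_path(path):
    """Decide whether an IFS path belongs in the replication scheme."""
    upper = path.upper()
    return not any(
        upper == root or upper.startswith(root + "/")
        for root in EXCLUDED_IFS_ROOTS
    )
```

Treating Q libraries as excluded by default, with a short allow list, mirrors the advice to be very specific about what you replicate from QGPL and QUSRSYS.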

Run Book Issues

As you put together the run book (your bible for failing over), you may wind up working with a consultant who will provide you with a basic run book for failing over. This book will contain all the information you need to fail over to the CBU, have your CBU impersonate your production machine, and then fail back to the production box.

However, you're going to need more than the basic run book to fully fail over to the CBU. Once the initial run book is finished, you're going to have to add your own instructions for setting up the CBU exactly the way it must look to successfully impersonate the production box. Some of the additional items you may need to add to the run book include:

  • Changing communication parameters so that remote devices and partitions can talk to your CBU. This is particularly important if you're developing on a separate partition and promoting those changes to production through a change management program, such as Aldon's Lifecycle Manager. You may also encounter this situation with some i5/OS fax serving products that require you to change an Ethernet address inside their configurations to match the Ethernet card you're running on the CBU.
  • Replacing the i5/OS System Distribution Directory (SDD) on your target machine with the SDD from your production machine. The SDD plays an important role in distributing output from many i5/OS products, and some products will not work for your users without having appropriate SDD entries in place.
  • Changing the server name for i5/OS NetServer (formerly AS/400 NetServer) on the CBU to match the NetServer name of the production box. Many applications use NetServer to access i5/OS data.
  • Reinstalling/reconfiguring third-party software instances of popular programs that run on both the CBU and on your production box. As noted above, if you're running a program that has custom parameters for both your production and your CBU partitions, you'll have to include run book instructions for reconfiguring it on your CBU. The Help/Systems' Robot/SCHEDULE example cited above falls into this category.
  • Retrieving any necessary temporary software keys from third-party vendors. For some packages, you don't need to purchase a second production license or a disaster recovery/high availability license to run the product on your CBU during failover. Instead, the package may run for a few days on a grace period, and you can call the vendor if you need to run the product for longer than the grace period. You should include specific instructions for how to handle this type of licensing in the run book.
  • Obtaining temporary keys for certain pieces of IBM software. Because of CBU licensing, IBM may not provide keys for running certain i5/OS products on the CBU. If you want to use these products in a failover situation, you may need to call IBM's Key Center to get a temporary key to run your product. Common IBM products falling into this category include Query (5722QU1) and DB2 Query Manager and SQL Dev Kit (5722ST1).
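The licensing items at the end of this list come down to date arithmetic, which makes them easy to track once a failover starts. The Python sketch below is only an illustration: the function name `expiring_products`, the product names, and the grace-period lengths are all hypothetical, and the real grace periods must come from each vendor.

```python
from datetime import date, timedelta


def expiring_products(grace_days, failover_start, today, warn_days=2):
    """List products whose grace period has ended or ends within warn_days.

    grace_days maps a product name to its grace period in days, counted
    from the date the failover started. Returns (product, expiry_date)
    pairs sorted by expiry date.
    """
    cutoff = today + timedelta(days=warn_days)
    due = []
    for product, days in grace_days.items():
        expires = failover_start + timedelta(days=days)
        if expires <= cutoff:
            due.append((product, expires))
    return sorted(due, key=lambda item: item[1])


# Hypothetical products and grace periods
grace_days = {"SCHEDULER": 5, "FAXPKG": 14}
due = expiring_products(
    grace_days,
    failover_start=date(2008, 8, 13),
    today=date(2008, 8, 16),
)
```

Any product the function returns is one whose vendor the run book should tell you to call now, before the grace period runs out.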

There's a lot more to failing over than just changing your CBU's identity to impersonate the production box. Hopefully these tips will help you understand and avoid some key mistakes that can prevent your failover process from running successfully.


RELATED STORIES

How to Recreate/Restore a System Distribution Directory

How System i Boxes Impersonate Each Other, Part 1

How System i Boxes Impersonate Each Other, Part 2

I Lost My License Key





Senior Technical Editor: Ted Holt
Technical Editor: Joe Hertvik
Contributing Technical Editors: Edwin Earley, Brian Kelly, Michael Sansoterra
Publisher and Advertising Director: Jenny Thomas
Advertising Sales Representative: Kim Reed
Contact the Editors: To contact anyone on the IT Jungle Team
Go to our contacts page and send us a message.


Copyright © 1996-2008 Guild Companies, Inc. All Rights Reserved.
Guild Companies, Inc., 50 Park Terrace East, Suite 8F, New York, NY 10034
