• The Four Hundred
  • Admin Alert: The Road to Live CBU Fail Over, Part 1

    September 2, 2009 Joe Hertvik

    One of the companies I work with performed its first live Capacity BackUp (CBU) switch test last month, where they switched over and used their CBU system as their live production system for several days. In the next few issues, I’ll use their experience in prepping for a live switch as a possible guide for others trying to ensure that their CBU can substitute for a live system.

    CBU 101: Understanding the CBU

    A CBU is an i, System i, or iSeries machine that is an exact duplicate of a live production system. CBUs generally contain the same amount of memory, disk, and CPU activations as their source counterparts. With the help of replication software from companies such as Vision Solutions, IBM’s DataMirror division, or Bug Busters Software Engineering, production information (including databases, programs, and system objects) is automatically replicated from the source machine to the target CBU. In the event of an emergency where the production system is not available, you can keep your business moving by quickly switching processing over to the target box. See the “Related Stories” section for more articles describing i5/OS high availability and CBUs.

    While the CBU is an i5/OS machine in waiting, many companies consider it a big step to actually switch over and run the target machine as a production system substitute for any appreciable amount of time. Our example company took the following steps to reach this goal.

    Certifying the CBU Switch-Over Process

    A live CBU switch-over doesn’t happen overnight. It takes a great deal of planning and testing to gain confidence that if you switch live processing to the CBU, you are not putting business processing at risk. To allay this fear, the staff developed the idea of certifying the CBU for use as a production machine.

    CBU certification evolved because switching live production processing to a duplicate machine was a scary thought to both management and IT staff. Imagine what might happen if you were processing orders and a key data library was out of sync, such that thousands of orders were filled, delivered, and invoiced to customers with incorrect pricing? Or what would happen if you switched over and your key application wouldn’t work, holding up your production and shipping line for days? Company executives and the IT staff were looking for a comfort level that the business would continue to function efficiently if they lost the production machine.

    The certification process encompassed a series of switch tests, each with accompanying documentation, that exercised the critical processing features the company relied on every day. To this end, CBU deployment was subdivided into the following certification steps.

    • Initial CBU configuration and infrastructure certification–Determine that the CBU itself is set up correctly to impersonate the production machine. This step tests the basic mechanics of switching over to the CBU.
    • Application certification–Determine whether all the critical custom-written and homegrown applications can function on the CBU. This includes obtaining software licenses and license keys, and testing the applications to see whether they work as intended on the machine.
    • User certification–Determine whether the user community can perform its essential business processing on the CBU.
    • Process certification–Determine whether critical automated processing can run on the machine.
    • Audit certification–Confirm with an outside authority that the company’s CBU configuration was correct and that no key pieces were missing.
    • Extended switch-over certification–Determine whether the company can actually switch processing over to the CBU and run its business there.
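
    As a rough illustration of how these steps gate one another, the following Python sketch tracks sign-offs in order. The step names come from the list above; the `CertificationTracker` class, its methods, and the rule that a sign-off requires written notes are assumptions made for illustration, not a description of the company's actual tooling.

```python
from dataclasses import dataclass, field

# Step names follow the article's list; everything else is hypothetical.
CERTIFICATION_STEPS = [
    "Initial CBU configuration and infrastructure",
    "Application",
    "User",
    "Process",
    "Audit",
    "Extended switch-over",
]


@dataclass
class CertificationTracker:
    # Maps each certified step to the documentation recorded at sign-off.
    signed_off: dict = field(default_factory=dict)

    def next_step(self):
        """Return the first step not yet certified, or None when all are done."""
        for step in CERTIFICATION_STEPS:
            if step not in self.signed_off:
                return step
        return None

    def sign_off(self, step, notes):
        """Record a sign-off, refusing steps taken out of order or undocumented."""
        expected = self.next_step()
        if step != expected:
            raise ValueError(f"cannot certify {step!r} before {expected!r}")
        if not notes.strip():
            raise ValueError("documentation is required for sign-off")
        self.signed_off[step] = notes
```

    Calling `sign_off("Audit", ...)` on a fresh tracker raises an error, because the infrastructure step has not been certified yet.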

    Each completed step led to the next, and cumulatively the steps would give the company both confidence and documentation that the CBU would perform correctly in a crisis. The group felt that by certifying CBU fitness for duty this way, they could reap the following benefits.

    • Certification by step would slowly build confidence that the CBU would work as intended. IT, management, and users could watch the progress as the CBU was readied for usage.
    • Segmentation would create ownership and comfort that each group’s particular needs were being addressed. The system administrators would ensure the infrastructure worked correctly. The applications people would tend to application configuration. The users would directly test that their needs were being met.
    • Documentation after each step would create a reporting system for CBU progress. It would produce accountability and motivation for each group to ensure that they tested thoroughly before they gave the go-ahead to move on.
    • Certification would provide flexibility to reconfigure and retest. The company could identify problems and ensure that each step was perfected as much as possible before moving on to the next step. It also provided structure for how to deploy the CBU.

    It’s also worth noting that this framework didn’t appear overnight. It was the result of two or three earlier switch tests where the company worked with the CBU and determined that this was the best course of action to follow. In particular, most of the initial CBU configuration and infrastructure certification and the entire application certification were completed before the company determined that the other steps were needed. Once all the steps were identified, the rest of the certifications proceeded as presented here.

    Initial CBU Configuration and Infrastructure Certification

    After the CBU was purchased, the company hired an outside consultant to perform the initial configuration. The company chose Vision Solutions’ MIMIX HA software as its high availability solution. The consultant worked with the company to install the software, determine what information (data, programs, and system objects) should be replicated to the CBU, set up the replication configuration, and start replicating information from one machine to the other. He helped the company create its initial “run book,” which is the set of instructions the company follows to switch processing from the production machine to the target machine and back again. The consultant also helped set up HA audits that would alert staff by email when libraries or objects were out of sync between the machines and when libraries were added to the production box that were not available on the CBU.
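
    To make the audit idea concrete, here is a minimal Python sketch of the comparison such an audit performs. It is not MIMIX code and does not use any MIMIX interface. The `audit_replication` function, the checksum-per-object inventories, and the library and object names are all hypothetical; a real audit would gather the inventories from both machines and email the findings.

```python
def audit_replication(source, target):
    """Compare {library: {object_name: checksum}} inventories from both machines.

    Returns a list of human-readable findings; an empty list means in sync.
    """
    findings = []
    for library, objects in sorted(source.items()):
        if library not in target:
            findings.append(f"library {library} is missing on the CBU")
            continue
        for name, checksum in sorted(objects.items()):
            if name not in target[library]:
                findings.append(f"{library}/{name} is missing on the CBU")
            elif target[library][name] != checksum:
                findings.append(f"{library}/{name} is out of sync")
    return findings


# Hypothetical inventories: NEWLIB was added on production only,
# and ORDERS/CUSTMAST differs between the two machines.
production = {
    "ORDERS": {"PRICES": "a1", "CUSTMAST": "b2"},
    "NEWLIB": {"FILE1": "c3"},
}
cbu = {"ORDERS": {"PRICES": "a1", "CUSTMAST": "zz"}}

findings = audit_replication(production, cbu)
for line in findings:
    print(line)
```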

    When dealing with high availability scenarios, one of the hardest situations is performing the first switch-over test. This test does nothing more than run the procedures for switching processing from the production machine to the CBU and back again. When switching over in this test, the CBU performed little information processing. Rather, this exercise tested the mechanics of switching over and switching back again to see if it was possible to perform the switch using their existing run book.

    The first test also helped the company understand if their replication scheme was valid. When processing was temporarily switched over to the CBU, the company shut off all normal information processing functions (interactive jobs, Website updating, remote updates, batch jobs, etc.). The testers had to remember that a CBU switch-over is a fundamentally different animal than a traditional disaster recovery test. In a switch-over, the CBU is functioning as the production machine and any processing that occurs will be replicated back to the source production system at the end of the test (i.e., all CBU testing uses live data).

    To check that changed CBU data would be replicated correctly back to the production machine at the end of the test, the testers changed data in only a few insignificant files on the CBU. When the testers switched processing back to the production machine, they checked the test files on the source box to ensure that the changes made on the target machine during the switch test had been replicated back.
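
    The switch-back check can be sketched in Python as follows. This is only an illustration under stated assumptions: the `verify_switch_back` function, the use of per-file checksums, and the file names are hypothetical, and the article does not say how the testers actually compared the files.

```python
def verify_switch_back(before, cbu_after, production_after, changed_on_cbu):
    """Return a list of problems found after switching back to production.

    before           -- checksums on production before the switch
    cbu_after        -- checksums on the CBU at the end of the test
    production_after -- checksums on production after switching back
    changed_on_cbu   -- names of the test files deliberately changed on the CBU
    """
    problems = []
    for name in sorted(before):
        if name in changed_on_cbu:
            # A deliberate change must have flowed back to production.
            if production_after.get(name) != cbu_after.get(name):
                problems.append(f"{name}: CBU change did not reach production")
        elif production_after.get(name) != before[name]:
            # Everything else should look exactly as it did before the test.
            problems.append(f"{name}: changed unexpectedly")
    return problems


# Hypothetical example: TESTFILE1 was changed on the CBU during the test.
before = {"TESTFILE1": "v1", "PRICES": "p1"}
cbu_after = {"TESTFILE1": "v2", "PRICES": "p1"}
production_after = {"TESTFILE1": "v2", "PRICES": "p1"}

problems = verify_switch_back(before, cbu_after, production_after, {"TESTFILE1"})
```

    An empty `problems` list means the deliberate changes came back and nothing else moved.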

    The goal of the first test was to create and test the basic structure of a switch-over, including basic data update and replication. The testers wanted to be comfortable enough with the exercise that they could perform this switch again and again as needed for later tests. The initial test answered a few simple questions:

    • Can a switch-over be performed?
    • Is data replicated correctly from the production system to the target system and back again?
    • What steps should be taken to make subsequent switch-over tests more successful?

    The first test was the building block on which all of the other CBU testing would rest. Until the CBU infrastructure was correct, the company couldn’t move on to the more complicated CBU functions.

    For the example company, it was necessary to run two tests to make sure the basic infrastructure of the CBU was correct. That is, the CBU needed to totally impersonate the production machine so that the outside world (including network equipment, DNS servers, communications partners, printers, etc.) couldn’t tell the difference between the two machines. After two tests, the testers felt that they could move on to the next certification step.

    Between Tests: Tweaking the Run Book

    During each switch test, the testers took detailed notes in the run book as to what went right with each step, what went wrong during the test, and what they did to fix it. After each test, those notes formed the basis of the next run book. The previous run book was archived for reference and a new run book was created.

    The new run book contained all the fixes, shortcuts, and expansions needed to make the next test more successful. It became mandatory to update the run book during the first few days after the test completed, while all the events were still fresh in the testers’ minds. If the run book sat for a few weeks before being updated, the testers could misunderstand some of their own notes and accidentally omit important changes that were needed for the next test.
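
    One way to picture this revision cycle is the small Python sketch below. The `Step` record and `next_run_book` function are hypothetical; the article does not describe the run book's format, so treat this as one possible model in which each step carries the testers' notes and, where needed, a revised instruction.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Step:
    instruction: str
    notes: str = ""             # what went right or wrong during the test
    fix: Optional[str] = None   # revised instruction agreed on after the test


def next_run_book(previous):
    """Build the next run book: apply each recorded fix and clear the notes.

    The previous list is left untouched so it can be archived for reference.
    """
    return [Step(instruction=step.fix or step.instruction) for step in previous]


# Hypothetical example of one revision cycle.
test_1 = [
    Step("End user subsystems on production", notes="worked as written"),
    Step("Start replication to CBU", notes="wrong order",
         fix="Verify audits are clean, then start replication to CBU"),
]
test_2 = next_run_book(test_1)
```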

    More To Come

    As I mentioned, this company approached CBU certification as a series of steps. Next week, I’ll look at what was required for the next certification steps and how they led up to the ultimate goal of a live switch-over.

    RELATED STORIES

    Beyond Replication in an i5/OS High-Availability Environment

    Common Mistakes When Failing Over to a CBU

    Five Benefits of a High Availability System

    How System i Boxes Impersonate Each Other, Part 1

    How System i Boxes Impersonate Each Other, Part 2

    The System i High Availability Roadmap




Volume 9, Number 27 -- September 2, 2009