    Admin Alert: Keep Your Data Synced Up During an HA Switch Over

    July 14, 2010 Joe Hertvik

    When performing high availability (HA) switch exercises where production processing is temporarily switched from a live system to an HA system and back again, there is one essential procedure that must be followed or you risk screwing up your databases and losing data. Here’s the issue and how to avoid it.

    Rule #1: Don’t Do This!!!

    In my humble opinion, this is the cardinal rule for an HA switch over:

    Don’t update your production data without replication running.

    It’s a good, simple rule to follow, but it’s also an easy rule to mess up. Whenever you switch processing from a production system to a Power i Capacity Backup (CBU) system, there is a simple sequence you must follow or else you will lose any last-minute production data. The sequence is this:

    • Shut down all processing that affects your production databases. This includes interactive users; batch jobs; Website interfaces; Systems Network Architecture Distribution Services (SNADS) jobs; SQL jobs using ODBC, OLE DB, or JDBC; and any other jobs where data is updated on your production system.
    • Allow your replication software to catch up so that all recent changes have been transmitted from the production system to the CBU and the CBU database is up to date with production.
    • Switch production processing over to the CBU.
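
The middle step, waiting for replication to catch up, can be sketched as a polling loop. This is a minimal illustration in Python, not part of any HA product: `pending_entries` is a hypothetical callable that you would have to implement against whatever backlog figure your replication software reports, and the timeout and polling values are arbitrary assumptions.

```python
import time
from typing import Callable


def wait_for_catch_up(
    pending_entries: Callable[[], int],
    timeout_s: float = 600.0,
    poll_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Return True once the replication backlog reaches zero, False on timeout.

    pending_entries is a hypothetical hook: it should return the number of
    changes not yet applied on the CBU, however your HA software exposes it.
    """
    deadline = clock() + timeout_s
    while True:
        # Check the backlog first so an already-quiet system returns at once.
        if pending_entries() == 0:
            return True
        # Give up when the deadline has passed; the switch should not proceed.
        if clock() >= deadline:
            return False
        sleep(poll_s)
```

A run book script would only move on to the switch when this returns True; a False result means the switch over should be held while someone investigates.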

    This may sound straightforward, but it is a deceptively easy process to violate. During a recent test, all it took to take our CBU database out of sync was to accidentally leave some Web servers up for 10 minutes after the replication software was shut down. Several orders came into the system, and these orders were not replicated to the CBU before we attempted the switch over. Because of this, we had to stop and reconcile the databases between the production and the CBU machines, which took several hours and led us to cancel our planned switch over.

    To prevent this from happening in any CBU role switches, here are three tips you can use to modify your CBU run book and configuration to ensure that all production data is replicated to the CBU before you switch processing.

    Tip #1: Separate Replication and Production IP Traffic

    You can avoid this issue by using separate IP addresses for production traffic and for CBU replication. In a simple HA scenario, you would only have the following two IP addresses active on your production system.

    • IP address one (xxx.xxx.xxx.001) services production traffic on your system. All interactive users, Web servers, clients, and other partner machines communicate over the .001 interface.
    • IP address two (xxx.xxx.xxx.002) services all replication tasks that transmit information between the production machine and the CBU. The .002 interface is dedicated to production-to-CBU traffic only.

    By segmenting traffic this way, you can shut down all outside IP production traffic simply by taking down the .001 IP interface with the following End TCP/IP Interface (ENDTCPIFC) command.

    ENDTCPIFC INTNETADR(XXX.XXX.XXX.001)
    

    Ending your production TCP/IP interface separately from your CBU interface ensures that no outside clients or servers are updating production data as you start the switch over process. The separate .002 interface also allows you to finish replicating all production transactions to the CBU after the system goes quiet.

    Tip #2: Checking CBU System Integrity

    Many software packages contain integrity reports where you can list out the number of records in key system databases or create an aggregate dollar amount of all production orders or your inventory value. One way to partially double-check that production and CBU databases are in sync is to run and double-check your integrity reports on both systems after the production system is quieted. If you find that these totals are out of sync, it is a red flag that something is wrong with replication and you can delay the switch until you find the problem.
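
The comparison itself can be automated once both reports have been reduced to named totals. The following Python sketch assumes you have already extracted the figures from each system's integrity report into dictionaries; the function name and the metric names are hypothetical, and how you collect the totals depends on your software package.

```python
from decimal import Decimal
from typing import Dict, List, Optional, Tuple

Mismatch = Tuple[str, Optional[Decimal], Optional[Decimal]]


def find_mismatches(
    production: Dict[str, Decimal], cbu: Dict[str, Decimal]
) -> List[Mismatch]:
    """List every total that differs between the two systems.

    A total that is missing on one side is reported with None for that side.
    """
    mismatches: List[Mismatch] = []
    for name in sorted(set(production) | set(cbu)):
        prod_value = production.get(name)
        cbu_value = cbu.get(name)
        if prod_value != cbu_value:
            mismatches.append((name, prod_value, cbu_value))
    return mismatches


if __name__ == "__main__":
    # Illustrative figures only; real totals come from your integrity reports.
    prod_totals = {"ORDER_COUNT": Decimal("1042"), "ORDER_VALUE": Decimal("98211.50")}
    cbu_totals = {"ORDER_COUNT": Decimal("1039"), "ORDER_VALUE": Decimal("98211.50")}
    for name, prod_value, cbu_value in find_mismatches(prod_totals, cbu_totals):
        print(f"{name}: production={prod_value} cbu={cbu_value}")
```

An empty result is only partial evidence that the systems agree, since matching totals can hide offsetting differences, but any entry in the list is a reason to delay the switch.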

    Tip #3: A Blueprint for Shutting Down Replication

    Ensuring file synchronization during a switch over is a run book process issue. The solution is making sure that you have working procedures in place for shutting off the production system correctly and for ensuring that production and CBU data are synchronized before you switch roles. Here’s a rough blueprint that you can use in a run book for shutting down a production system during switch over.

    1. Start by ending QINTER interactive processing. Issue the following End Subsystem (ENDSBS) command to take down your interactive subsystem.

    ENDSBS SBS(QINTER) DELAY(120)
    

    This will give each user two minutes to finish their work before the system ends all interactive processing in QINTER. If you have multiple interactive subsystems running on your partition, run this command for each subsystem.

    2. If you have SNA traffic on your system, end the QSNADS subsystem.

    ENDSBS SBS(QSNADS) DELAY(120)
    

    This shuts down all SNA processing, again allowing two minutes for any active jobs to finish processing.

    3. Shut down any i/OS Web servers that are running in the QHTTPSVR subsystem, using this command:

    ENDSBS SBS(QHTTPSVR) DELAY(120)
    

    4. Shut down any TCP/IP servers that may be exchanging data with the outside world. Make sure that you specify each server you want to shut down instead of using the default value of all servers (*ALL) on the End TCP/IP Server (ENDTCPSVR) command. To end the TCP/IP server for i/OS FTP, for example, issue the following command:

    ENDTCPSVR SERVER(*FTP)
    

    End other TCP/IP servers, as needed.

    5. iSeries, System i, and Power i systems use QZDASOINIT prestart jobs to process SQL requests from clients using ODBC, JDBC, OLE DB, or other connectivity techniques. These jobs generally run in the QUSRWRK subsystem, but they can sometimes run in the QSERVER subsystem. Use this End Prestart Jobs (ENDPJ) command to end these jobs in QUSRWRK; if they run in QSERVER on your system, change the SBS parameter accordingly.

    ENDPJ SBS(QUSRWRK) PGM(QSYS/QZDASOINIT) OPTION(*CNTRLD) DELAY(120)
    

    6. After steps 1 through 5 are completed, you can end the production IP interface (.001) as described above (if you have separate production and CBU IP interfaces).

    7. End batch processing in the QBATCH subsystem by running this command:

    ENDSBS SBS(QBATCH) OPTION(*CNTRLD) DELAY(*NOLIMIT)
    

    By ending this subsystem in a controlled manner (*CNTRLD) with an unlimited controlled delay time of *NOLIMIT, i/OS allows all currently running QBATCH jobs to complete before ending the subsystem.

    Repeat this command for any other subsystems that are running batch jobs.

    8. After all of your batch jobs are finished running, check your HA software to ensure that all pending replication entries have been transferred from the production system to the CBU.

    9. End all subsystems on the production machine to ensure that no more processing is occurring.

    ENDSBS SBS(*ALL)
    

    10. If you have set up integrity reports, run the reports on both systems and validate that the systems are in sync. If they are not in sync, investigate and correct.

    11. Proceed with your switch over process.
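
Steps 1 through 7 of the blueprint can be captured as data so the run book is generated the same way every time. The Python sketch below only builds the command strings in order; it does not run them, and the function name, parameters, and defaults are assumptions for illustration. It quotes the IP address on ENDTCPIFC, as IBM's documented examples do. Steps 8 through 11 are left out on purpose because they depend on checks in your HA software and on your integrity reports.

```python
from typing import List, Optional, Sequence


def build_shutdown_commands(
    interactive_sbs: Sequence[str] = ("QINTER",),
    batch_sbs: Sequence[str] = ("QBATCH",),
    tcp_servers: Sequence[str] = ("*FTP",),
    production_ip: Optional[str] = None,
    has_sna: bool = True,
    delay_s: int = 120,
) -> List[str]:
    """Return the CL commands for blueprint steps 1 through 7, in order."""
    commands: List[str] = []
    # Step 1: end each interactive subsystem with a controlled delay.
    for sbs in interactive_sbs:
        commands.append(f"ENDSBS SBS({sbs}) DELAY({delay_s})")
    # Step 2: end SNADS processing, if the system carries SNA traffic.
    if has_sna:
        commands.append(f"ENDSBS SBS(QSNADS) DELAY({delay_s})")
    # Step 3: end the Web servers running in QHTTPSVR.
    commands.append(f"ENDSBS SBS(QHTTPSVR) DELAY({delay_s})")
    # Step 4: end each named TCP/IP server; *ALL is deliberately never used.
    for server in tcp_servers:
        commands.append(f"ENDTCPSVR SERVER({server})")
    # Step 5: end the QZDASOINIT prestart jobs that serve SQL clients.
    commands.append(
        f"ENDPJ SBS(QUSRWRK) PGM(QSYS/QZDASOINIT) OPTION(*CNTRLD) DELAY({delay_s})"
    )
    # Step 6: end the production interface, if one is kept separate.
    if production_ip:
        commands.append(f"ENDTCPIFC INTNETADR('{production_ip}')")
    # Step 7: let running batch jobs finish before each subsystem ends.
    for sbs in batch_sbs:
        commands.append(f"ENDSBS SBS({sbs}) OPTION(*CNTRLD) DELAY(*NOLIMIT)")
    return commands


if __name__ == "__main__":
    # 192.0.2.1 is a documentation-only address used here as a placeholder.
    for command in build_shutdown_commands(production_ip="192.0.2.1"):
        print(command)
```

Keeping the subsystem and server names as parameters makes it easy to reuse the same list for the switch back, where the CBU is the machine being quieted.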

    Don’t Forget Synchronization on Switch Back

    These three tips apply equally when your CBU switch over exercise is complete and you are ready to switch processing back from the CBU to the production machine, so don’t forget to add these items to your run book switch back procedures.

    It’s always the little things that get you, but if you implement some of the techniques here, you shouldn’t get caught with out-of-sync data during a planned HA switch over.



Volume 10, Number 21 -- July 14, 2010
Copyright © 2025 IT Jungle