    Admin Alert: Elements Of An IBM i Incident Management Plan, Part 2

    April 16, 2014 Joe Hertvik

    Last issue, I started outlining how to set up an IBM i incident management plan, going through four of the seven elements that are crucial for IBM i monitoring and response. This issue, let’s finish up and discuss the final elements an IBM i incident management template should provide.

    The Elements Of IBM i Incident Management, Revisited

    As presented last time, here are the critical elements every IBM i incident management plan should include.

    1. What type of monitoring are you doing: Manual, automatic, or hybrid?
    2. What are you monitoring for?
    3. Call trees: Who should be alerted when a problem occurs?
    4. Call tree protocol: How do you contact responders?
    5. Redundancy: What happens if your response protocol breaks down?
    6. Who handles damage control and keeping users/management informed?
    7. Recovery: What happens after the problem is over?

    Last issue, I covered items 1 through 4. Today, let’s look at what you can do to add redundancy, damage control, and recovery planning to the list (items 5 through 7).

    Part 5: Redundancy: What happens if your response protocol breaks down?

    It’s important to plan for what happens if your notification system breaks down.

    Let’s say your notification protocol calls for having your IBM i server send out an email message and a text message to your responders when a problem occurs. It uses the company’s email server to deliver those messages. (Check out this article for how to deliver text messages via email.) But suppose the email system is down, or the TCP/IP network hosting your IBM i is unavailable. How do your responders receive alert messages in those cases?

    One way to answer this question is with a two-pronged approach that takes advantage of email and an old-fashioned modem. With this approach, every IBM i alert is sent out through two different transmission methods.

    • The first alert is sent out through the company email system as both an email and a text message.
    • The second alert is sent out as a text message through an analog modem and a phone line.

    This setup takes advantage of TAP (Telocator Alphanumeric Protocol) paging terminal phone numbers. Many telecommunications companies still supply their own dial-up number for sending out text messages. This means you can send all your alerts out through standard email and also through an analog phone line to your cell phone provider’s TAP number. Doing this helps ensure that your automated IBM i text alerts still go out, even if your email service is down.
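
The two-pronged idea can be sketched in a few lines of Python. This is only an illustration, not the setup described in the earlier article: the `send_alert` and `delivered` helpers, the two channel functions, and the sample alert text are all hypothetical, and the modem channel simply records the message where a real one would dial the carrier's TAP number. The point it shows is that each alert is attempted on every channel, and a failure on one channel never stops the other.

```python
from typing import Callable, Dict, List

# A channel is any callable that takes the alert text and raises on failure.
Channel = Callable[[str], None]


def send_alert(message: str, channels: Dict[str, Channel]) -> Dict[str, bool]:
    """Attempt every channel; one failing channel never blocks the others."""
    results: Dict[str, bool] = {}
    for name, send in channels.items():
        try:
            send(message)
            results[name] = True
        except Exception:
            results[name] = False
    return results


def delivered(results: Dict[str, bool]) -> bool:
    """The alert counts as delivered if at least one channel succeeded."""
    return any(results.values())


# Hypothetical stand-ins for the real transports (assumptions, not real APIs).
def email_channel(message: str) -> None:
    # Simulate the outage scenario: the mail server cannot be reached.
    raise ConnectionError("mail server unreachable")


modem_log: List[str] = []


def modem_tap_channel(message: str) -> None:
    # A real version would dial the carrier's TAP number through the modem.
    modem_log.append(message)


results = send_alert(
    "Sample alert: job waiting on a message in QBATCH",
    {"email": email_channel, "modem_tap": modem_tap_channel},
)
print(results, "delivered:", delivered(results))
```

Recording a per-channel result, instead of stopping at the first success, also gives you a record of which transport failed, which is useful when you review the incident later.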

    See this article I previously wrote for more information on setting up an IBM i modem to use TAP in conjunction with email messages for IBM i system monitoring.

    Part 6: Who handles damage control and keeping users/management informed?

    When you’re in the middle of handling an IT emergency, it’s easy to forget there are people who may be unable to work because the system is down. In addition, other parts of the system may not be working because of the IBM i problem you’re working on.

    So any good IBM i incident management plan should also specify people who play the following roles:

    User liaison–Keeps your users informed about what’s happening and how soon a fix will be implemented. Ideally, this should be an IT manager or someone else who isn’t involved with solving the actual issue. The help desk manager is also a good candidate for user liaison.

    The user liaison’s job is to get the latest information on progress for an incident fix and to notify affected users how the fix is going. The user liaison’s other job is to keep the pressure off the responders, so they have time to troubleshoot and fix the issue. Depending on how widespread the issue is, the user liaison may need to notify the following groups when a problem occurs.

    • Management–Depending on proximity and company preferences, notification can be accomplished through an email, but you may also have to make a personal phone call or visit.
    • Users–Can generally be notified by email. If the issue only affects one department or a small set of users, you may also want to discuss by phone call or personal visit.
    • Business partners–Call or email.
    • Customers–May need to be contacted either by the IT department or the business owner of the customer relationship.

    It can be helpful to use a form email that can be updated as problem resolution proceeds. Any email notification you send out should include time of notification; a short description of the problem; the expected fix; the expected time the fix will be implemented; and the expected time you’ll send out the next notification email. An hour is a reasonable amount of time between updated notifications, and it’s important to keep users updated on a regular basis for an extended issue.
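
As one possible shape for such a form email, here is a minimal Python sketch. The `StatusUpdate` class, its field names, and the sample incident text are made up for illustration. It fills in the five items listed above and computes the next notification time from a one-hour interval.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class StatusUpdate:
    """One notification email for an ongoing incident (hypothetical layout)."""
    sent_at: datetime
    problem: str
    expected_fix: str
    expected_fix_time: datetime
    update_interval: timedelta = timedelta(hours=1)  # time between updates

    def render(self) -> str:
        """Return the email body, including when the next update is due."""
        next_update = self.sent_at + self.update_interval
        fmt = "%Y-%m-%d %H:%M"
        return "\n".join([
            f"Time of notification: {self.sent_at.strftime(fmt)}",
            f"Problem: {self.problem}",
            f"Expected fix: {self.expected_fix}",
            f"Expected fix time: {self.expected_fix_time.strftime(fmt)}",
            f"Next update by: {next_update.strftime(fmt)}",
        ])


# Sample values only; replace them with the details of the real incident.
update = StatusUpdate(
    sent_at=datetime(2014, 4, 16, 9, 0),
    problem="Order entry is unavailable on the production partition.",
    expected_fix="Restart the affected subsystem after clearing the failed job.",
    expected_fix_time=datetime(2014, 4, 16, 10, 30),
)
text = update.render()
print(text)
```

Because the next-update time is derived from the send time, the liaison only has to change the problem and fix details between updates.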

    Damage control–A production failure may also affect other IT processing or production functions. Aside from your responders, you may need someone to gather a team to devise workarounds for the affected systems. Again, this should be someone besides the people working on the problem, though you might employ the user liaison to perform this function.

    Part 7: Recovery: What happens after the problem is over?

    Once the problem is resolved, you need to perform the following functions:

    • Final notification to users that the problem is fixed–This notification should include any special instructions the users need to follow or items they need to be aware of.
    • Clean-up–Determine who needs to perform follow-up work to correct any additional issues that occurred because of the original problem. If the issue happened during off-hours, you may need to call in a crew to effect cleanup. You may also use the damage control crew from Part 6 for this function.
    • Setting things straight–Reverse any temporary changes that were put in during the fix period, such as holding reactive jobs, limiting customer or employee access to affected functions, etc.
    • Lessons learned–Analyze the root cause of the problem and determine whether additional items need to be changed to prevent the issue from occurring again. Both the responders and IT management should participate in this phase. New projects may need to be created and approved because of this phase.
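
One way to make the "setting things straight" step harder to forget is to log each temporary change, along with how to undo it, at the moment it is made. The Python sketch below assumes exactly that. The `TemporaryChangeLog` class and the `state` dictionary are hypothetical stand-ins for real system settings, not part of any IBM i tooling.

```python
from typing import Callable, Dict, List, Tuple


class TemporaryChangeLog:
    """Records temporary changes so they can be reversed after the incident."""

    def __init__(self) -> None:
        self._changes: List[Tuple[str, Callable[[], None]]] = []

    def record(self, description: str, undo: Callable[[], None]) -> None:
        """Remember a change and the action that reverses it."""
        self._changes.append((description, undo))

    def reverse_all(self) -> List[str]:
        """Undo changes newest-first; return the descriptions that were reversed."""
        reversed_items: List[str] = []
        while self._changes:
            description, undo = self._changes[-1]
            undo()               # if this raises, the entry stays in the log
            self._changes.pop()
            reversed_items.append(description)
        return reversed_items


# Hypothetical state standing in for real system settings.
state: Dict[str, bool] = {"batch_queue_held": False, "web_orders_enabled": True}
log = TemporaryChangeLog()

# During the incident: make each temporary change and log how to undo it.
state["batch_queue_held"] = True
log.record("Held nightly batch job queue",
           lambda: state.update(batch_queue_held=False))

state["web_orders_enabled"] = False
log.record("Disabled customer web ordering",
           lambda: state.update(web_orders_enabled=True))

# After the fix: reverse everything, most recent change first.
done = log.reverse_all()
print(done, state)
```

Reversing in newest-first order is a design choice rather than a requirement. It mirrors the order in which the changes were stacked up, which is usually the safest way to unwind them.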

    This completes my template on setting up an IBM i incident management plan. If you have any comments or other items to add to the plan, email me at joe@joehertvik.com and I may use them in a future column.

    Joe Hertvik is an IBM i subject matter expert (SME) and the owner of Hertvik Business Services, a service company that provides written marketing content and presentation services for the computer industry, including white papers, case studies, and other marketing material. Email Joe for a free quote for any upcoming projects. He also runs a data center for two companies outside Chicago, featuring multiple IBM i ERP systems. Joe is a contributing editor for IT Jungle and has written the Admin Alert column since 2002. Check out his blog where he features practical information for tech users at joehertvik.com.

    RELATED STORIES

    Admin Alert: Elements Of An IBM i Incident Management Plan, Part 1

    Admin Alert: Adding Redundancy to Power i SMS Monitoring

    Configuring Messaging Software for Overnight Monitoring




Volume 14, Number 9 -- April 16, 2014
Copyright © 2021 IT Jungle

loading Cancel
Post was not sent - check your email addresses!
Email check failed, please try again
Sorry, your blog cannot share posts by email.