Admin Alert: The i5 Battery Checking Process
Published: May 9, 2007
by Joe Hertvik
What you can't see can slow down i5 disk performance, particularly when you're dealing with the lithium ion batteries that i5 I/O adapters use to power hard disk drive caching. Fortunately, the System i5 will warn you about aging batteries, but if you ignore its warnings, failed batteries can degrade disk drive performance. The good news is that it's easy for IBM service to replace old batteries, but there are a few tricks to determining when replacement is necessary.
Why Regular Battery Changes Are Important
Lithium ion battery packs provide power for disk caching on your i5 disk arrays. When one of these batteries dies, it may not cause a disk drive failure, but it will disable caching for the disk array it controls, and that will significantly slow your hard drive performance. According to IBM, the lithium ion batteries are hot swappable, which means that they can be replaced while your system is up and running. However, depending on your machine type, one of the batteries may control a disk array inside the system CPU, in which case the system must be taken down to replace that battery.
Determining When Your Cache Batteries Need Changing
As with the metal nickel batteries that came with older iSeries and AS/400 machines, it may not be apparent when and how these batteries should be changed. Fortunately, i5/OS provides two different mechanisms that warn you when the cache batteries are approaching the end of their life cycle and need changing.
The first warning is issued automatically when a battery reaches its designated warning condition, which is about 90 days before IBM generally estimates it will fail. The i5/OS operating system calculates when a battery is reaching the end of its life cycle, and it issues the following two messages to the System Operator Message Queue (QSYSOPR) when the battery reaches that state.
CPPEA13 - *Attention* Contact your hardware service provider
CPP8988 - A critical system hardware problem has occurred. Critical Message Handler has been run.
If you see these messages in QSYSOPR, you can enter option '5=Display' in front of the CPPEA13 message and then select F14=Work with problem to enter the Work with Problems screen for this message. Note that you can also reach this function from the green screen by entering the Work with Problems command (WRKPRB) on a command line. Once inside WRKPRB, enter option 5=Display details in front of the new Problem ID that was created with this warning. This will bring you to the Display Problem Details screen. On this screen, note the System Reference Code value, which you will need when you call IBM support to replace the battery.
In order to ensure that this is a cache battery problem, select F5=Display possible causes on the Display Problem Details screen. The Select Possible Cause Information screen that appears will provide an option to view a Problem analysis list (which you can reach by selecting option 1). If this incident is a battery problem, the screen will display the Cache battery pack message as a possible cause. If you see this cause, call IBM immediately to order new batteries and to schedule battery replacement.
Besides visually spotting the CPPEA13 message in QSYSOPR, you can also set up an email or paging alert to contact you if the system issues the CPPEA13 message, provided your system is running a system monitoring and notification package, such as Bytware's MessengerPlus.
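If you don't have a monitoring package, the same idea can be sketched in a few lines of Python. This is only an illustration, not how MessengerPlus or any other product works: it assumes you have already copied QSYSOPR message text into a list of plain-text lines (how you export it is up to you), and the function name and sample lines are hypothetical. The only values taken from this article are the two message IDs.

```python
import re

# Message IDs named in this article. The layout of each text line is an
# assumption; adjust the pattern if your exported messages look different.
WATCHED_IDS = ("CPPEA13", "CPP8988")
_PATTERN = re.compile(r"\b(" + "|".join(WATCHED_IDS) + r")\b")


def find_battery_warnings(lines):
    """Return (line_number, message_id, text) for each watched message."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = _PATTERN.search(line)
        if match:
            hits.append((number, match.group(1), line.strip()))
    return hits


# Hypothetical sample input: one unrelated line plus the two messages above.
sample = [
    "An unrelated informational message",
    "CPPEA13 - *Attention* Contact your hardware service provider",
    "CPP8988 - A critical system hardware problem has occurred.",
]

for number, message_id, text in find_battery_warnings(sample):
    print(f"line {number}: {message_id} found: {text}")
```

A script like this could feed whatever email or paging mechanism you already use. Note that CPPEA13 can signal other hardware problems as well, so you should still confirm the Cache battery pack cause in WRKPRB as described above.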
Even if you don't see the CPPEA13 message in QSYSOPR, you can check battery status directly if you're running i5/OS V5R3 or above. In V5R3, IBM added a new option to the System Service Tools function (SST) that allows you to display and work with any system resources that contain cache battery packs. You start System Service Tools by running the Start System Service Tools command (STRSST) from a command line.
After you sign in to SST, you can check the status of all cache batteries on your machine by selecting option 1 (Start a Service Tool) followed by option 7 (Hardware Service Manager), and option 9 (Work with resources containing cache battery packs). The Work with resources containing cache battery packs screen displays all the resources that contain a battery pack. If you select option 5 (Display battery information) for any of the battery packs, you will see a screen that looks something like this.
Resource name . . . . . . . . . . . : DC01
Serial number . . . . . . . . . . . : xx-xxxxxxx
Type-model . . . . . . . . . . . . : 2780-001
Frame ID . . . . . . . . . . . . . : 3C01
Card position . . . . . . . . . . . : C02
Battery type . . . . . . . . . . . : Lithium Ion (LiIon)
Battery state . . . . . . . . . . . : Warning condition
Power-on time (days) . . . . . . . : 806
Adjusted power-on time (days) . . . : 945
Estimated time to warning (days) . : 0
Estimated time to error (days) . . : 75
The critical items to note on this screen are the Estimated time to warning (days) and the Estimated time to error (days) values. When the Estimated time to warning (days) field falls to zero, the system issues the CPPEA13 message mentioned above. As the Estimated time to error (days) field approaches zero, the chance of your cache battery pack failing grows, and you should replace the batteries as soon as possible to prevent disk drive degradation. So even if the system hasn't yet issued a CPPEA13 message, you can check these values on each of your partitions to determine how close a cache battery pack is to exceeding its regular life cycle.
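If you paste the text of that screen into a file, the two fields can be pulled out and interpreted mechanically. The following Python sketch assumes the screen text has the "label, dot leaders, colon, value" layout shown above. The function names and the status wording are my own illustration and are not anything that i5/OS provides.

```python
SCREEN_TEXT = """\
Resource name . . . . . . . . . . . : DC01
Serial number . . . . . . . . . . . : xx-xxxxxxx
Type-model . . . . . . . . . . . . : 2780-001
Frame ID . . . . . . . . . . . . . : 3C01
Card position . . . . . . . . . . . : C02
Battery type . . . . . . . . . . . : Lithium Ion (LiIon)
Battery state . . . . . . . . . . . : Warning condition
Power-on time (days) . . . . . . . : 806
Adjusted power-on time (days) . . . : 945
Estimated time to warning (days) . : 0
Estimated time to error (days) . . : 75
"""


def parse_battery_screen(text):
    """Turn 'Label . . . : value' lines into a dict keyed by label."""
    fields = {}
    for line in text.splitlines():
        label, separator, value = line.partition(":")
        if not separator:
            continue  # skip lines that carry no 'label : value' pair
        # Drop the dot leaders that pad each label, then collapse spaces.
        label = " ".join(label.replace(".", " ").split())
        fields[label] = value.strip()
    return fields


def battery_status(fields):
    """Classify a battery using the zero thresholds described in the text."""
    days_to_warning = int(fields["Estimated time to warning (days)"])
    days_to_error = int(fields["Estimated time to error (days)"])
    if days_to_error <= 0:
        return "past estimated error date: replace immediately"
    if days_to_warning <= 0:
        return "warning: call IBM and schedule replacement"
    return "ok"


fields = parse_battery_screen(SCREEN_TEXT)
print(fields["Resource name"], "->", battery_status(fields))
```

For the sample screen, the time to warning is 0 and the time to error is 75, so the sketch reports the warning status. Both numbers are estimates, so treat the result as a prompt to call IBM rather than as a precise countdown.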
Battery Checking Guidelines
If you haven't received an error message and your i5 machine is more than a year old, it is worth your while to check your cache battery packs once every few months so that you are aware of any upcoming battery replacements. Here are my guidelines for checking your batteries.
- Check all the batteries on all of your partitions. If you find one battery that's ready to be replaced, chances are good that other batteries in that same partition or in other partitions will also need to be replaced. It's worthwhile to change as many batteries as possible at the same time, so you don't have to deal with multiple visits by IBM service.
- Call IBM as soon as possible if you detect a failing battery. Note that both the time to warning value and the time to error value are marked as estimates. This means that these numbers are not exact predictive values; battery failure could be much closer than you think, and you should get IBM in as soon as possible to help survey the situation and schedule a cache battery pack change. Again, if one of the lithium ion batteries fails, it will not crash your machine. However, it will affect disk drive performance and cause a slowdown in returning data to your applications.
Once you detect a failing battery and call, IBM should send out a service tech (if you're on maintenance) to survey the situation and order the batteries. Once you receive the batteries, call IBM back to schedule your install. Again, don't wait too long to change the batteries as the batteries could theoretically fail any time after the warning message period expires. Remember the warning periods represent estimated life cycles for your cache battery packs. In actual usage, the battery packs may last longer or shorter than IBM's estimate.
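To follow the first guideline and replace as many batteries as possible in one visit, it helps to collect the readings from every partition in one place. Here is a minimal Python sketch of that bookkeeping. The partition names, the second resource name, and all of the numbers except the DC01 reading shown earlier are hypothetical, and you would have to key in or import the values yourself from each partition's SST screens.

```python
from typing import NamedTuple


class BatteryReading(NamedTuple):
    partition: str        # hypothetical partition name
    resource: str         # resource name from the SST battery screen
    days_to_warning: int  # Estimated time to warning (days)
    days_to_error: int    # Estimated time to error (days)


def replacement_candidates(readings):
    """Return batteries already in warning, most urgent first."""
    due = [r for r in readings if r.days_to_warning <= 0]
    return sorted(due, key=lambda r: r.days_to_error)


# Hypothetical readings gathered from two partitions.
readings = [
    BatteryReading("PROD", "DC01", 0, 75),
    BatteryReading("PROD", "DC02", 140, 230),
    BatteryReading("TEST", "DC01", 0, 40),
]

for reading in replacement_candidates(readings):
    print(f"{reading.partition} {reading.resource}: "
          f"about {reading.days_to_error} days to estimated error")
```

Sorting by the estimated time to error puts the most urgent battery at the top of the list you hand to IBM service.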
Note that this advice is valid for i5 machines running i5/OS V5R3 or above. For older iSeries and AS/400 machines running V5R2 and below (including machines running OS/400 V4R5 and earlier), check the instructions that I've previously documented in an article called Checking Your iSeries Batteries.
About Our Testing Environment
All configurations described in this article were tested on an i5 550 box running i5/OS V5R3. Most of the commands used here are also available in earlier versions of the i5/OS and OS/400 operating systems, so the configurations should be usable in prior releases. However, you may notice some variations in pre-V5R3 copies of these commands. These differences may be due to command improvements that have occurred from release to release. In particular, the Work with resources containing cache battery packs screen described here is not available in pre-V5R3 environments.
Checking Your iSeries Batteries