tfh
Volume 14, Number 7 -- February 14, 2005

IBM Issues PTFs to Patch RAID Controllers


by Timothy Prickett Morgan


I told you last October that I had heard some feature 2757 PCI-X RAID 5 disk controllers were on the fritz and that IBM had issued some PTFs to do diagnostics on the boards, to find failing components before the feature 2757 card failed and took down customers' machines. While that problem has apparently been solved, there are still issues with the feature 2757 and feature 2780 RAID controller cards.

On Friday, February 4, IBM sent a technical support bulletin to its business partners alerting them that customers running mirrored disk arrays on feature 2757 or feature 2780 RAID controllers had better get in gear and install some PTFs to patch the microcode on those controllers. Electronic components within the feature 2757 and feature 2780 cards are not failing in the field this time around (as was the case last fall with the feature 2757 controllers, according to my sources at IBM's Rochester iSeries labs), but the cards do have microcode issues that can, under very specific circumstances, cause system crashes.

The text of the IBM tech support bulletin was brief and, as usual, did not provide much in the way of an explanation of the problem:

All users with a mirrored system AND with 2757 or 2780 IOA controllers

It has been discovered that, given certain conditions, systems that you have setup with Mirroring AND have 2757 or 2780 IOA controllers could result in system down time if a disk failure occurs. We have fixes available for both V5R3 (MF34472) and V5R2 (MF34589). They have been designated as HIPER PTFs and should be downloaded and applied to your systems along with scheduling a system IPL at your earliest possible date to remove the threat of exposure from this issue on your systems. Also, please remember to apply these PTFs and IPL on ALL of your V5R2 and V5R3 partitions on LPAR systems.
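For shops juggling several partitions, the bulletin boils down to a simple lookup: one HIPER PTF per release, applied on every V5R2 and V5R3 partition. Here is a minimal Python sketch of that lookup. The PTF numbers come straight from the bulletin, but the partition names and the `ptfs_needed` helper are hypothetical, made up purely for illustration. The sketch does not check whether a partition is actually mirrored or has a 2757 or 2780 card; that is still up to you.

```python
# HIPER PTFs named in IBM's February 4 bulletin, keyed by OS release.
HIPER_PTFS = {
    "V5R3": "MF34472",
    "V5R2": "MF34589",
}


def ptfs_needed(partitions):
    """Return {partition name: PTF ID} for partitions on an affected release.

    `partitions` maps a partition name to its release string. Partitions on
    releases the bulletin does not mention are left out of the result.
    """
    return {
        name: HIPER_PTFS[release]
        for name, release in partitions.items()
        if release in HIPER_PTFS
    }


# Hypothetical LPAR layout, for illustration only.
example_partitions = {"PROD": "V5R3", "TEST": "V5R2", "OLD": "V5R1"}
plan = ptfs_needed(example_partitions)

for name, ptf in plan.items():
    print(f"{name}: apply {ptf}, then schedule an IPL")
```

Each partition that shows up in the result still needs its own IPL after the PTF is applied, as the bulletin says.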

To better explain what is going on now with these two RAID controllers, let's talk about what they are, what went wrong with the feature 2757 card, and what still seems to be the problem with the feature 2780s.

The feature 2757 RAID 5 controller was announced in January 2003. It has 235 MB of write cache memory, and with its data compression turned on this cache memory turns into what is effectively (or, depending on whether or not yours crashed, maybe ineffectively) a 757 MB cache. This Ultra3 SCSI controller also supports a RAID 5 set with a minimum of three drives, but that RAID set can be expanded to 18 drives. Considering how disk drives have gotten increasingly capacious over the years, being able to make a RAID set with only three drives is a lot better than the 10-drive limit on feature 2778 and feature 4778 Ultra2 SCSI RAID 5 controllers.

The feature 2757 controller supports up to four SCSI buses, which run at 160 MB/sec, compared with 80 MB/sec on the older Ultra2 cards. The maximum PCI burst rate on the feature 2757 controller is 532 MB/sec, four times that of the prior card. The compressed write cache, at 757 MB, is more than seven times as large as the 104 MB effective cache on the feature 2778 and feature 4778 cards. The new controller also supports SCSI bus tagged command queuing, which yields faster response times under heavy loads, and it has new hardware-assisted array parity checking and cache memory scrubbing algorithms that are five times faster than those on prior cards. The net effect of using the new controller plus 15K RPM disk drives (also new in January 2003) and the PCI-X slots or expansion towers was that customers could see their disk subsystem performance improve by a factor of three.

Feature 2757 PCI-X RAID 5 controllers plug into second-generation iSeries and i5 models in their internal PCI-X slots or in PCI-X slots in I/O towers. Older iSeries machines must attach these new cards to their servers through I/O towers, since older iSeries machines did not support PCI-X slots.
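If you want to check the multiples quoted above, the arithmetic is easy to sketch in Python. Every input below is a number from this article. The implied burst rate of the prior card is not something IBM published here; it is simply derived from the "four times" claim and should be read as an assumption.

```python
# Figures quoted in the article (all in MB or MB/sec).
WRITE_CACHE_MB = 235            # physical write cache on the feature 2757
COMPRESSED_CACHE_MB = 757       # effective write cache with compression on
PRIOR_EFFECTIVE_CACHE_MB = 104  # effective cache on features 2778 and 4778
SCSI_BUS_MB_S = 160             # per-bus speed on the feature 2757
PRIOR_SCSI_BUS_MB_S = 80        # per-bus speed on the Ultra2 cards
PCI_BURST_MB_S = 532            # maximum PCI burst rate on the feature 2757
PCI_BURST_MULTIPLE = 4          # "four times that of the prior card"

# Effective compression ratio of the write cache.
compression_ratio = COMPRESSED_CACHE_MB / WRITE_CACHE_MB

# How much bigger the effective cache is than on the prior cards.
cache_gain = COMPRESSED_CACHE_MB / PRIOR_EFFECTIVE_CACHE_MB

# SCSI bus speedup per bus.
bus_gain = SCSI_BUS_MB_S / PRIOR_SCSI_BUS_MB_S

# Derived, not quoted: the prior card's burst rate implied by the 4x claim.
implied_prior_burst_mb_s = PCI_BURST_MB_S / PCI_BURST_MULTIPLE

print(f"compression ratio:  {compression_ratio:.2f}x")
print(f"cache gain:         {cache_gain:.2f}x")
print(f"SCSI bus gain:      {bus_gain:.1f}x")
print(f"implied prior PCI burst: {implied_prior_burst_mb_s:.0f} MB/sec")
```

The compression ratio works out to a little over three to one, and the cache gain to a bit more than seven, which squares with the "more than seven times" figure.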

The feature 2780 card is a variant of the feature 2757 card; it was launched as part of the second wave of eServer i5 announcements back in July 2004. This RAID 5 controller is the same as the existing feature 2757 controller, but it adds 1 GB of read cache on top of the write cache. Feature 2780 was designed to replace a special RAM disk offering from a few years back that was used to boost the batch performance of iSeries machines. The feature 2780 controller was initially available only on the i5 520 and 570 servers, and then it was rolled out to the remaining i5 550 and 595 servers in the fall. At about the same time last year, after numerous customer requests, IBM made the feature 2780 card work on first-generation iSeries 270, 820, 830, and 840 machines, as well as on second-generation iSeries 800, 810, 825, 870, and 890 boxes.


According to my source, the original problem with the feature 2757 card was a faulty electronic component, which in some cases could leave the battery-backed write cache on the controller unable to flush its contents to disk so it could accept new data. This could cause OS/400 objects to fail, and thanks to the single-level storage architecture of OS/400, failing objects can cause a big crash; sometimes they can even cause a crash so bad that you have to completely reload the system. In fact, according to my source, some customers lost objects, and some had to rebuild their systems from tape. Once the problem was discovered, IBM went through its sales and configuration database, figured out who had the faulty 2757 cards, and got them replaced. Just for good measure, it released two PTF diagnostic tools to check for bad 2757s. (That's PTF MF33849 for OS/400 V5R2 and MF33850 for i5/OS V5R3.)

This kind of failure is one of the reasons why IBM has been recommending that OS/400 shops that absolutely cannot deal with this kind of outage should mirror their disk subsystems at the bus level. This way, if one RAID 5 set blows, the other one is there, and single-level store is not corrupted in any way.

But there was a catch, and IBM has only just now figured it out. As IBM's techies looked over the microcode for the feature 2757 and 2780 controllers, they saw that there were still conditions under which even mirrored disk arrays using these sophisticated RAID 5 controllers could cause a system crash if a disk drive or the caches failed in a RAID group. What exactly those conditions are, IBM isn't saying. But the February 4 patches are all about fixing whatever the problem is.


Editor: Timothy Prickett Morgan
Managing Editor: Shannon Pastore
Contributing Editors: Dan Burger, Joe Hertvik, Shannon O'Donnell,
Victor Rozek, Kevin Vandever, Hesh Wiener, Alex Woodie
Publisher and Advertising Director: Jenny Thomas
Advertising Sales Representative: Kim Reed



Copyright © 1996-2008 Guild Companies, Inc. All Rights Reserved.
Guild Companies, Inc. (formerly Midrange Server), 50 Park Terrace East, Suite 8F, New York, NY 10034