Data in iSeries Shops At Risk from Gremlins
by Timothy Prickett Morgan
What a simple world it would be if there were only one database format and a unified set of database field types that spanned all applications. While the OS/400 platform got the first part right by integrating the database with the operating system, nothing can protect a company from having different field formats for different applications. While this situation has caused tremendous headaches, according to a survey from data extraction and cleansing specialist Coglin Mill bad data is causing bad business decisions.
It would be hard to find an iSeries shop that didn't have multiple databases with different field formats either within the same DB2/400 database management system or across DB2/400 and various Windows, Unix, or mainframe systems. According to Pete Wangen, sales and marketing director at Coglin Mill, most companies have at least two different database formats, and often have more. The applications that use these databases have different and conflicting date formats (some ERP systems do it one way, while homegrown applications do it another way, for instance), cryptic product codes, and other abbreviations, and have various other kinds of gremlins in their data that are fine when it comes to processing a transaction, but can wreak havoc when the data is consolidated into a data mart or a data warehouse for the purpose of generating reports to run the business or to report financial results on a quarterly or annual basis.
Most companies today rely on programmers to run queries against their databases, each with its own format, and then suck all of the data into a monster staging table, where all the different formats for data in incompatible fields are massaged into the same format. This is a time-consuming and manually intensive process, and each time an application or data field changes, the errors cascade through. There are similar problems integrating two companies during a merger. In a lot of bigger iSeries shops, says Wangen, one or two programmers spend most of their time doing this work just to keep reports coming out of the systems so business managers can do this job. "This is just how they have always done it, and they don't think about it," says Wangen. "But this process takes too long and requires far too much manual manipulation of data."
Coglin Mill just finished a survey of iSeries shops with annual revenues ranging between $250 million and $3 billion, the kinds of companies that have complex data issues and generally have or should have a data mart or a data warehouse, and therefore the need of a product like Coglin Mill's own Rodin tool for data extraction, transformation, and loading (ETL). There are others, of course. Outside of the iSeries market, Informatica and Ascential Software are the two biggest players in the ETL space, but neither has a native OS/400 tool, as Coglin Mill does, and neither has done an interesting survey to try to figure out how bad the situation is out there when it comes to bad data in the OS/400 market. (To be fair, there are a number of other data transformation tools in the OS/400 market, such as DataMirror's Transformation Server, SPSS's ShowCase Warehouse Builder, IBM's Warehouse Manager, Vision Solutions' Symbiator, Lakeview Technology's OmniReplicator.)
The Coglin Mill survey was performed in conjunction with IBM. The two companies sent surveys to 2,500 public and private companies in North America and received 238 responses, which is a pretty good response rate. According to the respondents, the most challenging data integration or data warehousing issue they faced is data quality management, which is IT speak for making sure that data stored in systems and generated in reports isn't polluted with garbage. Some 39 percent of survey respondents said this was their biggest challenge. Another 17 percent said that consolidating data into a consistent format was their biggest issue, while another 13 percent said that the tools they did use (where companies did automate such extractions and massaging) were too hard to use.
The impact of these data issues, according to the survey, is that companies are making bad decisions, spending too much time and money on manually massaging data, and responding too slowly to changing market conditions or competitive situations. Specifically, 44 percent of respondents said that the result of these convoluted ways of getting data into reports was resulting in poor decisions being made. Another 23 percent said that it lowered productivity (presumably for managers as well as programmers), and 17 percent said it meant they reacted too slowly to business conditions. Obviously, Coglin Mill wants to push Rodin into such companies as a solution for extracting information from incompatible systems with different formats and then uploading that information to data warehouses, for running reports and financial statements. If the survey responses are any judge, the company has a good shot at this, since 56 percent of respondents said that they would want to host a data integration tool or a data warehouse on an OS/400 server. Another 31 percent said they would consider a Windows platform, while 5 percent would consider Unix.
While the number of public companies using AS/400 and iSeries servers is not as large as the number of small and midsized private companies using the box, Coglin Mill thinks that the need to get compliant with the Sarbanes-Oxley Act is going to help vendors of ETL tools push sales into OS/400 shops. Under Sarbanes-Oxley, CEOs and CFOs of public companies have to personally certify their financial results and supply financial reports in a more timely fashion than was allowed by the U.S. government before the law was enacted. Not only does data have to be consolidated correctly, but there also has to be an audit trail that proves the information has been moved around and merged correctly. According to a new whitepaper (in PDF format) written by Alan Jordan, chief technology officer at Coglin Mill, the Rodin tool can address some of the Sarbanes-Oxley compliance issues that public companies using OS/400 servers are facing. And the good news for OS/400 enthusiasts: Companies that choose Rodin to do this will be driving a new application onto the iSeries for a change.