Data Quality Tool from AMB Now Supports i and z/OS Platforms
June 3, 2008 Alex Woodie
Feeling good about the quality of your data? Not losing any sleep over those promises made to the CIO? Then stop reading here. If, on the other hand, you hold reservations about what might be lurking in your databases, then it’s time to admit you might have a data quality problem. One software vendor that can help identify and fix common data quality problems, AMB Dataminers, recently introduced support for i and z/OS servers in its flagship product, AMB-Predictive Data Management (AMB-PDM).
People are reluctant to talk about data quality issues, and are even more reluctant to do something about them, according to Steven Meister, president of AMB Dataminers, which is based in Chicago. “People are scared of it,” he says. “They don’t want to open up a can of worms.”
But that unopened can has gotten huge, and is growing with every new disk and stick of memory you add to your infrastructure. Meister cites a study by the Data Warehousing Institute that says bad data costs the world $600 billion in lost productivity every year. Those numbers should be taken with a grain of salt, considering the global IT budget is estimated to be about $1 trillion. But even if the problem is only half of that, that’s a lot of worms.
While you may have processes in place to validate data before it hits your system of record, there are probably cracks that allow little errors–such as misspellings of names, extra spaces added to fields, the use of special characters, or not abiding by formatting conventions–to creep in through side entrances. When these little goofs add up, you may be able to believe only 80 percent of your data. That costs your company real money when mailers go out, the call center ramps up, or sales come in way under expectations.
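The kinds of slips described above can be caught with simple per-field checks. The following is a minimal sketch, not AMB-PDM's method: the field names (`name`, `phone`), the allowed-character rule, and the NNN-NNN-NNNN phone convention are all assumptions chosen for illustration.

```python
import re

# Hypothetical formatting convention: phone numbers written as NNN-NNN-NNNN.
PHONE_PATTERN = re.compile(r"\d{3}-\d{3}-\d{4}")
# Assumed rule: names may contain letters, spaces, periods, apostrophes, hyphens.
NAME_ALLOWED = re.compile(r"[A-Za-z .'\-]+")


def profile_record(record):
    """Return a list of issue descriptions found in one record (a dict)."""
    issues = []
    name = record.get("name", "")
    phone = record.get("phone", "")

    # Extra spaces: leading/trailing whitespace or doubled spaces inside.
    if name != name.strip() or "  " in name:
        issues.append("name: extra whitespace")

    # Special characters: anything outside the assumed allowed set.
    if name.strip() and not NAME_ALLOWED.fullmatch(name.strip()):
        issues.append("name: special characters")

    # Formatting convention: phone must match the assumed pattern.
    if not PHONE_PATTERN.fullmatch(phone.strip()):
        issues.append("phone: does not follow NNN-NNN-NNNN")

    return issues


if __name__ == "__main__":
    rows = [
        {"name": "Jane Doe", "phone": "312-555-0100"},
        {"name": " Ann  Lee", "phone": "3125550100"},
        {"name": "J@ne Doe", "phone": "312-555-0100"},
    ]
    for row in rows:
        print(row, profile_record(row))
```

Counting how many records return a non-empty list gives a rough measure of how much of a table can be trusted. Misspelled names are not caught by checks like these; they call for comparing values against each other.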
Meister’s team at AMB Dataminers developed AMB-PDM to help companies fix these problems, and to do so without breaking the bank. The software, which is available in Java and .NET flavors, enables users to analyze their data quality while it sits in its original database repository, including Oracle, SQL Server, Teradata, flat files, DB2 UDB, and, now, DB2/400. The suite also includes a front-end PC-based data analysis tool, and a third component for connecting to Microsoft’s extract, transform, and load (ETL) tool, SQL Server Integration Services (SSIS).
AMB added support for i (formerly i5/OS) and z/OS earlier this year at the request of customers. The company has more than 30 customers using its software, including some big names like Amway, Cooper Tires, and Zurich Insurance. Some of these customers (not necessarily the ones listed) wanted to use AMB-PDM to analyze their DB2/400 and z/OS databases without having to move all the data to another box, as many data profiling and data quality tools require, Meister says.
“Even IBM themselves, their data profiling cleaning tools can’t talk directly [to the AS/400 or mainframe]. They’re an in-memory process,” Meister says. “The mainframe-AS/400 world has been forced to move their data. In a lot of cases, that’s just extra effort. We added support for these platforms because we felt that the market’s been dropped by software vendors.”
AMB-PDM’s specialty is data profiling, Meister says. “When it comes to data profiling, outlier discovery, matching capabilities and matching data anomalies, there’s nobody close to us,” he says. “Finding values outside of the standard deviation range, matching tables to another by single and multiple columns by using fuzzy probabilistic matching–that’s our strong suit.”
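The two techniques Meister names, flagging values outside a standard deviation range and fuzzy matching of tables on one or more columns, can be sketched in a few lines. This is an illustration under stated assumptions, not AMB-PDM's algorithm: the thresholds (2 standard deviations, 0.85 similarity) are arbitrary, the function names are hypothetical, and a plain string-similarity ratio from Python's `difflib` stands in for true probabilistic matching.

```python
import statistics
from difflib import SequenceMatcher


def find_outliers(values, max_deviations=2.0):
    """Return values lying more than max_deviations standard deviations from the mean."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)  # population standard deviation
    if stdev == 0:
        return []  # all values identical, nothing to flag
    return [v for v in values if abs(v - mean) > max_deviations * stdev]


def similarity(a, b):
    """Similarity between two strings in [0, 1], ignoring case and outer spaces."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def fuzzy_match(left_rows, right_rows, columns, threshold=0.85):
    """Pair rows from two tables whose average similarity over columns meets threshold.

    Rows are dicts. Returns (left_index, right_index, score) tuples.
    Compares every pair, so it only suits small tables.
    """
    matches = []
    for i, left in enumerate(left_rows):
        for j, right in enumerate(right_rows):
            score = sum(similarity(left[c], right[c]) for c in columns) / len(columns)
            if score >= threshold:
                matches.append((i, j, round(score, 3)))
    return matches


if __name__ == "__main__":
    print(find_outliers([100, 102, 98, 101, 99, 100, 500]))

    crm = [{"name": "Jon Smith", "city": "Chicago"}]
    billing = [
        {"name": "John Smith", "city": "Chicago"},
        {"name": "Mary Jones", "city": "Boston"},
    ]
    print(fuzzy_match(crm, billing, ["name", "city"]))
```

Matching on several columns at once lowers the chance that two different people with similar names are treated as the same record, at the cost of missing matches when one column is badly damaged.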
The software also enables users to fix the bad data–the data quality piece–although the IBM DataStages and Trilliums of the world probably offer richer data quality capabilities, Meister concedes. Informatica and SAS are also names he mentions.
Meister concedes nothing, however, when it comes to price and the huge sums organizations pay to house “armies” of consultants for weeks at a time to fix their data. “Ten years ago, you would have had to bring in an army of 10 to 30 people from Arthur Andersen or one of the big accounting firms for a year. They’d go through all your files and figure out what’s wrong,” he says. “With this software, I can do in three hours what would take 30 top-level consultants six months to figure out.”
In contrast to the multi-month, multi-man, multi-million-dollar approach, Meister stresses a simple, low-cost approach. The AMB-PDM suite ranges in cost from $5,000 to $50,000, considerably less than competing products. The company also offers a free proof-of-concept to anybody interested in seeing the software in action.
Stripping out the bad data isn’t always about saving money. In some cases, it’s a safety issue. “If you’re a hospital, and you have bad information, you’re going to mistreat a patient. You’re going to give a patient the wrong drugs–somebody else’s drugs–because you spelled the names wrong and they came up the same,” Meister says.
Whether you’re implementing a data warehouse, starting a master data management (MDM) project, or installing a new ERP or CRM system, you need to take pains to ensure the cleanliness and reliability of the data. Without certainty in the quality of your data, you can’t trust new systems you bring online.
“If you have all this information, and now you know you have these issues, you need to fix them, or at least be aware of them, so when you build your ETL process, when you build your data warehouse or install an ERP or CRM system, you know you need to fix them,” Meister says. “Otherwise, all you’re doing is moving garbage in, garbage out, and you paid millions of dollars for a new system that basically gives you bad results quicker.”
For information on how to receive your free AMB-PDM proof-of-concept, contact the vendor at www.predictivedatamanagement.com.