Coglin Mill’s ETL Makes Data Warehousing on IBM i More Attractive
October 25, 2011 Alex Woodie
When it comes to data warehousing, there are still some in the IBM i community who think that workload belongs on Windows or Unix boxes. But with its integrated database, huge scalability, and fair DASD costs, there are no good reasons today why the IBM i server can’t perform this function. And with the latest release of its extract, transform, and load (ETL) software, the folks at Coglin Mill are giving customers an important tool for enabling the IBM i server to be the workhorse of their data warehouses.
There’s no better way to get Alan Jordan’s goad than to suggest that the IBM i server just might not be cut out for data warehousing workloads. “There’s been some misconception and misinformation spread over the last eight to 10 years about where is the best place to put your data warehouse,” says Jordan, a vice president at Coglin Mill, which develops the RODIN ETL tool out of offices in Australia and Rochester, Minnesota.
“Some of the vendors have suggested that the IBM i not a good place to put it, and say you’re better off putting this on a SQL Server box,” he continues. “There was, of course, the high cost of disk. But most of that is old stuff. There’s no reason why you shouldn’t put it on IBM i today. The price of disk has come down, and people are recognizing that it’s more expensive in the long run to manage a server farm.”
The overall market for business intelligence has been growing strongly for the last few years, and even midsize IBM i shops today are looking to do something more with all that data they’ve been collecting, such as creating dashboards for the Web and mobile devices. But before you do business intelligence, you need a data warehouse, and that means business has picked up for Coglin Mill, which makes the most full featured, IBM i native ETL tool in the IBM i marketplace. “We’re seeing good opportunities, and being able to show that IBM i is a great place to be doing business intelligence,” Jordan says.
RODIN Release 7
In the next few weeks, Coglin Mill expects to ship RODIN Release 7, the first major revision of the product in two years.
Release 7 brings enhancements in several areas, including support for text files, new date field transformations, a new graphical editor that simplifies complex expression-based transformations, enhancements to its change data capture (CDC) function, and the capability to push data warehouse data out to other database management systems.
The new support for processing text fields will improve RODIN’s capability to input and output flat files and delimited text files to or from the DB2/400-based data warehouse. This is important for organizations that want to bring outside data feeds from an industry group, or perhaps from business partners that store this information as a spreadsheet (which must be transformed into a delimited text file first) into their data warehouse. It’s also helpful for pulling data in from protected databases, where the data must be manually transferred via FTP because a direct database connection is not allowed.
One could get this information into RODIN before, but it took a bit of work and didn’t offer a lot of “bells and whistles.” “What we delivered is a new editor that identifies the columns in text files or a delimited file based on delimiters, and then treats them as if they’re a real table, so you don’t have to parse them in substrings and that sort of stuff,” Jordan explains. “We’re seeing more people that have to deal with other types of data. It’s not just IBM i data. Now 60 to 70 percent of our customers are pulling data from outside sources. They want to be able to join it together, and create a single report on that data, so the data warehouse is critical for their implementation.”
The updated software also gets improved support for converting all types of date and time values into real date and time formats. “When you’re doing analysis based on dates and you want to know when something happened, you must have your dates as true date fields, because when your end tools are all SQL-based, you can do any sort of date manipulation,” Jordan says. “In RODIN Release 7, we now basically support any date format, whether it has words in it, three or four character months, or a.m. or p.m. in the time field. You tell RODIN ‘This is the format for the date,’ and it automatically recognizes it, validates it, and converts it to a true date time field in your data warehouse tables.”
A new expression-based editor gives experienced developers more powerful tools for creating complex conditional rules in RODIN Release 7. Jordan says this new “expert editor” will save the developer’s time when he wants to insert a calculation or a built-in function into RODIN’s transformation processes, compared to using the existing, or “intermediate editor.”
The new feature will be useful if, for example, a developer wants to replace the abbreviations “rd” and “dr” in his source data with the full words “road and “drive” in his data warehouse. “A scan and replace function is ideal for that, but it’s two different functions involved,” he says. “So, using the new expression-based editor, you can do that in one step, whereas in the current rules editor, it’s multiple steps and more difficult to do.”
RODIN has supported the capability to pull data from external databases since Release 4. With Release 7, Coglin Mill has delivered the capability to push data from the DB2/400 data warehouse out to Oracle, SQL Server, and other relational database management systems. This feature supports the idea of using DB2/400 as the central data warehouse, and using other databases to house divisional data marts for regionalized processing.
Finally, the CDC function in RODIN, which was introduced in Release 6, has been enhanced with Release 7 to allow for real time ETL and capturing of data across multiple tables. CDC is just a small component of RODIN, Jordan says, but it has been gaining steam since it was introduced two years ago.
RODIN Release 7 is slated to become generally available toward the end of the month or early November. Pricing starts at about $5,400 for annual subscriptions, or $15,000 for a perpetual license. For more information, see the vendor’s website at www.coglinmill.com.