Options Abound for IBM i Data Replication
October 14, 2014 Alex Woodie
As an IBM i system administrator, it’s your job to ensure that data in DB2 for i is available where it’s needed. No server is an island these days, not even IBM i-based Power Systems servers, which means replicating data to external databases is a requirement. But what’s the best way to power data integration? Should you build it yourself or buy off the shelf? We’ll try to provide you with some answers.
At many organizations, the IBM i server might house the core ERP application and be the system of record, but there are plenty of other databases and programs that need data from the central record keeper. For example, Oracle-based business intelligence systems or SQL Server-based reporting systems are commonly fed from production IBM i systems.
Keeping the systems in sync and up to date is not always easy, and the difficulty level goes up as the latency you can tolerate goes down. It doesn’t take a tech genius to use FTP to initiate batch dumps from a source database to a target on an ad hoc basis. Managed file transfer (MFT) tools have taken some of the risk out of using barebones FTP.
If you’re looking to maintain a separate warehouse on a semi-regular basis, there are also extract, transform, and load (ETL) tools that will pull data from DB2 for i, translate it into the correct format, and move it into a target system, such as a data warehouse. Coglin Mill, developer of the RPG-based RODIN tool, probably has the highest functioning ETL tool on the IBM i market. Informatica also has some native capabilities, and Talend has been known to play in the IBM i arena. If you like to roll your own, you could develop a DB2 for i SQL stored procedure that allows an external database to pull data on demand via ODBC or JDBC.
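As a rough illustration of the extract, transform, and load steps, here is a minimal Python sketch of a batch run that picks up only the rows changed since the previous run. It is not how any of the products named above work. The table names (orders, fact_orders), the columns, and the helper functions are all hypothetical, and SQLite's in-memory database stands in for both DB2 for i and the warehouse so that the example runs anywhere. In practice the two connections would come from an ODBC or JDBC driver.

```python
import sqlite3


def make_demo_databases():
    """Build two in-memory SQLite databases as stand-ins (an assumption)
    for the DB2 for i source and the warehouse target."""
    source = sqlite3.connect(":memory:")
    source.execute(
        "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT, "
        "amount_cents INTEGER, updated_at TEXT)"
    )
    source.executemany(
        "INSERT INTO orders VALUES (?, ?, ?, ?)",
        [
            (1, "initech ", 1000, "2014-09-30"),
            (2, " acme", 2050, "2014-10-02"),
            (3, "globex", 500, "2014-10-03"),
        ],
    )
    target = sqlite3.connect(":memory:")
    target.execute(
        "CREATE TABLE fact_orders (order_id INTEGER PRIMARY KEY, customer TEXT, "
        "amount REAL, updated_at TEXT)"
    )
    return source, target


def extract(source, since):
    """Extract: pull only the rows changed after the previous run."""
    cursor = source.execute(
        "SELECT order_id, customer, amount_cents, updated_at "
        "FROM orders WHERE updated_at > ? ORDER BY updated_at",
        (since,),
    )
    return cursor.fetchall()


def transform(rows):
    """Transform: tidy customer names and convert cents to currency units."""
    return [
        (order_id, customer.strip().upper(), amount_cents / 100.0, updated_at)
        for order_id, customer, amount_cents, updated_at in rows
    ]


def load(target, rows):
    """Load: upsert into the warehouse table, so reruns do not duplicate rows."""
    target.executemany(
        "INSERT OR REPLACE INTO fact_orders "
        "(order_id, customer, amount, updated_at) VALUES (?, ?, ?, ?)",
        rows,
    )
    target.commit()
    return len(rows)


def run_batch(source, target, since):
    """Run one extract-transform-load pass and return the number of rows loaded."""
    return load(target, transform(extract(source, since)))
```

Calling run_batch(source, target, "2014-10-01") on the demo databases would move only the two orders updated after October 1. A nightly job would store the newest updated_at value it saw and pass it in as the next run's starting point.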
The complexity level and the risks go up a notch if you’re looking to do real-time integration. If you’re handy with DB2 for i and feel comfortable using triggers and journaling, you could develop your own system that monitors the journal receiver for changes to files. If you want something with more bells and whistles, not to mention professional technical support to call when something goes wrong, then you’re probably in the market for change data capture (CDC) software.
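For those weighing the build-it-yourself route, the heart of any change-capture scheme is a loop that replays captured changes against the target in order and remembers how far it got. The Python sketch below shows only that idea. The ChangeEntry layout and the apply_changes function are hypothetical and do not reflect the actual IBM i journal entry format or any vendor's product. A plain dictionary stands in for the target table.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ChangeEntry:
    """A simplified, hypothetical change record (not the IBM i journal format)."""
    sequence: int                 # monotonically increasing position in the log
    operation: str                # "insert", "update", or "delete"
    key: int                      # primary key of the affected row
    row: Optional[dict] = None    # new row image; unused for deletes


def apply_changes(target, entries, last_applied=0):
    """Replay entries against target in sequence order.

    Entries at or below last_applied are skipped, so replaying the same
    batch after a restart does not apply a change twice. Returns the
    sequence number of the last entry applied.
    """
    for entry in sorted(entries, key=lambda e: e.sequence):
        if entry.sequence <= last_applied:
            continue  # already replicated on an earlier pass
        if entry.operation in ("insert", "update"):
            target[entry.key] = dict(entry.row or {})
        elif entry.operation == "delete":
            target.pop(entry.key, None)
        else:
            raise ValueError(f"unknown operation: {entry.operation!r}")
        last_applied = entry.sequence
    return last_applied
```

The returned position would need to be persisted somewhere durable between runs. Handling that bookkeeping, along with conflict detection and data type mapping across dissimilar databases, is a large part of what commercial CDC tools take off your hands.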
IBM offers CDC capabilities with its InfoSphere product, which is based on what was formerly DataMirror Transformation Server. Because high availability and data replication are so closely related, HA software vendors, such as Vision Solutions, also support database replication among multiple dissimilar databases, such as DB2 for i, SQL Server, Oracle, MySQL, Sybase, and others.
There are also dedicated CDC products from Oracle, HiT Software, and Attunity. Oracle added DB2 for i support to its GoldenGate offering just a couple of years ago. Pricing for GoldenGate is not cheap, however.
You will probably get a better deal on CDC software from HiT or Attunity than from Oracle or IBM. HiT Software (which is now owned by BackOffice Associates) has a good thing going with DBMoto, a Windows-based CDC engine that can move data in real time among more than 20 different databases. These include relational databases like DB2 and Oracle, as well as the databases powering massively parallel column-oriented data warehouse offerings, such as those from Actian and Amazon Web Services’ cloud-based Redshift warehouse.
Attunity can also keep IBM i-resident data moving in real time among a variety of different systems with Replicate, its log-based CDC offering. In addition to providing a trickle-feed of updates from source databases, the product can replicate entire database schemas and supports a “snapshot” extract-and-load option that gives customers the capability to replicate entire databases to their target systems. Like HiT, Attunity has been moving to support “big data” analytics platforms recently, such as Teradata, IBM Netezza, and EMC Greenplum MPP databases.
Coglin Mill, by the way, also offers CDC as part of the “real-time ETL” functionality in RODIN. Coglin Mill was recently bought by HelpSystems, which now has several business intelligence tools in its stable, including ShowCase and SEQUEL Software. If you’re looking to do data warehousing and BI on the IBM i platform, this combination of tools offers some advantages.
Data is the lifeblood of business these days, but moving it still requires time and effort. There’s no one-size-fits-all answer for data integration and replication. You may pick one solution over another depending on your speed, latency, and budget requirements. Carefully analyzing all of those factors is the only way to make the right decision.