• The Four Hundred
  • IBM Mulls Using DataMigrator as Cloud Warehouse Pipeline

    June 29, 2022 Alex Woodie

    IBMers are playing around with an update to a little-known ETL tool called DataMigrator for i that would make it easier for IBM i shops to move transactional data to cloud-based data warehousing platforms. The update would use Apache Kafka for the final connection to the cloud.

    DataMigrator ETL Extension, as the product is formally known, is an extract, transform, and load (ETL) tool created by IBM back in 2015. The product was developed to work with Db2 Web Query, the Java-based business intelligence tool that it OEMed from Information Builders (now owned by TIBCO).

    DataMigrator for i (as most users call it) was originally envisioned to create, populate, and maintain database tables on Db2 for i, usually from other Db2 for i instances, but also from external data sources. The software can perform bulk loads as well as real-time updates of data via change data capture (CDC) functionality based on IBM’s remote journaling technology.

    While DataMigrator is typically used to load data onto Db2 for i so that it can be queried with Db2 Web Query, IBM is also looking at using DataMigrator to extract Db2 for i data and load it into a cloud data warehouse, such as those run by Snowflake, AWS, and Microsoft Azure, says Doug Mack, a Db2 for i consultant with IBM Systems Lab Services.

    DataMigrator for i provides a native ETL tool for loading data warehouses on IBM i. (Graphic courtesy IBM.)

    “We’ve been building some proof of concepts around using Apache Kafka to act as the streaming hub, if you will, to get data into any of these cloud-based services,” Mack told IT Jungle at the recent POWERUp conference in New Orleans, Louisiana.

    “Kafka provides these connections to just about everything,” he continues. “So we view the architecture to be, for IBM i data . . . more of a push mode, where I’m using the ETL component to grab changed data – it could be near real time – stream it over to Kafka, and then Kafka pushes it up to the target.”
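
    As a rough illustration of the push mode Mack describes, the sketch below turns captured changes into keyed JSON messages and hands them to a producer. It is not IBM's code: the change-record fields, the topic name, and the `push_changes` and `fake_send` helpers are all hypothetical, and an in-memory list stands in for a real Kafka producer.

```python
import json
from typing import Callable, Iterable, List, Tuple

# Assumed shape of one captured change: operation, table, primary key, row image.
# These field names are illustrative only, not a DataMigrator or journal format.
def to_message(change: dict) -> Tuple[bytes, bytes]:
    """Turn one captured change into a (key, value) pair of bytes."""
    key = f"{change['table']}:{change['pk']}".encode("utf-8")
    value = json.dumps(change, sort_keys=True).encode("utf-8")
    return key, value

def push_changes(
    changes: Iterable[dict],
    send: Callable[[str, bytes, bytes], None],
    topic: str,
) -> int:
    """Push each change to the topic through the supplied send function."""
    count = 0
    for change in changes:
        key, value = to_message(change)
        send(topic, key, value)
        count += 1
    return count

# In-memory stand-in for a Kafka producer (hypothetical; no broker involved).
outbox: List[Tuple[str, bytes, bytes]] = []

def fake_send(topic: str, key: bytes, value: bytes) -> None:
    outbox.append((topic, key, value))

sample_changes = [
    {"op": "INSERT", "table": "ORDERS", "pk": 1001, "row": {"STATUS": "NEW"}},
    {"op": "UPDATE", "table": "ORDERS", "pk": 1001, "row": {"STATUS": "SHIPPED"}},
]

sent = push_changes(sample_changes, fake_send, topic="ibmi.orders.changes")
```

    Keying each message by table and primary key is one common choice, since it keeps all changes to a given row together and in order. Whether the IBM proof of concept does this is not stated.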

    IBM i already supports Apache Kafka (support was added nearly two years ago). Kafka, which was written in the JVM-compatible language Scala, is a distributed publish and subscribe (pub/sub) framework that was originally developed at LinkedIn to handle the large number of messages the social media site handles on a daily basis. Kafka essentially acts as a real-time message queue with built-in delivery guarantees. The open source Kafka community has built a number of connectors that enable developers to plug in various databases, file systems, and applications as either sources or sinks. Confluent, the commercial entity behind Kafka, offers Kafka as a service in the cloud.
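
    The pub/sub idea can be shown with a toy model: a topic is an append-only log, and each subscriber keeps its own read position, so several sinks can consume the same stream independently. The `Topic` class and subscriber names below are hypothetical and exist only for illustration. Real Kafka adds partitions, replication, durable storage, and consumer groups, none of which are modeled here.

```python
from typing import Dict, List

class Topic:
    """Toy append-only log; each subscriber tracks its own read offset."""

    def __init__(self) -> None:
        self.log: List[str] = []
        self.offsets: Dict[str, int] = {}

    def publish(self, message: str) -> int:
        """Append a message and return its offset in the log."""
        self.log.append(message)
        return len(self.log) - 1

    def poll(self, subscriber: str) -> List[str]:
        """Return messages this subscriber has not yet seen, then advance its offset."""
        start = self.offsets.get(subscriber, 0)
        batch = self.log[start:]
        self.offsets[subscriber] = len(self.log)
        return batch

topic = Topic()
topic.publish("order 1001 created")
topic.publish("order 1001 shipped")

# The warehouse sink reads what is available so far.
warehouse_first = topic.poll("warehouse-sink")

topic.publish("order 1002 created")

# A second poll returns only the new message for the warehouse sink,
# while a sink that has never polled still receives the full history.
warehouse_second = topic.poll("warehouse-sink")
audit_all = topic.poll("audit-sink")
```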

    IBM Systems Lab Services is writing code that would help to glue DataMigrator and Kafka together, Mack says. “We have to build programs to do that, whether it’s data queues or REST Web service,” he says. “There’s code that has to be tied to DataMigrator to interface into Kafka.”

    While the IBM i server can function as a data warehouse – Mack certainly expressed his opinion on that at POWERUp – most IBM i shops prefer to use it strictly as a transactional machine. The reality is that many IBM i shops are adopting cloud data warehouses, such as Snowflake, Amazon Redshift, and Azure SQL Warehouse. Accommodating this reality makes good business sense.

    However, there’s a bit of an impedance mismatch between the way the cloud data warehouse folks think about data and the way IBM i folks want to run their systems. Bringing them together in a controlled manner can help both sides get what they want, according to Mack.

    “The analytics team that’s sitting out here is saying, ‘We want to bring IBM i data into whatever it is we’re doing.’ They’re kind of looking at kind of a pull approach. ‘Let’s just get an ETL tool on the cloud side and pull data,’” Mack says. “But the IBM i people say ‘We don’t want you running these big extraction queries against our production database.’ They’re scared to death that somebody over there that doesn’t understand indexing and EVIs and things like that, or pulling data out of a remote journal.

    “That’s why they reach out to us,” Mack says. “There’s got to be a better way.”

    The work is still not quite production ready. “We’re a little bit in our infancy,” he says. Interested customers would most likely need to sign up for a Lab Services engagement to implement the solution, he says.

    For more information on the product, check out https://www.ibm.com/support/pages/datamigrator-etl-extension.

    RELATED STORIES

    Additions To The Db2 Web Query Family

    Apache Kafka And Zookeeper Now Supported On IBM i

    Db2 Web Query Lives On (Just Not V2.1)

    IBM Unveils ETL Solution for DB2 Web Query


    Tags: Amazon Redshift, Apache Kafka, Azure SQL Warehouse, DataMigrator, DataMigrator for i, DB2 for i, Db2 Web Query, ETL, IBM i, Java, Snowflake


TFH Volume: 32 Issue: 46


Copyright © 2023 IT Jungle