Hadoop and IBM i: Not As Far Apart As One Might Think

    May 20, 2015 Alex Woodie

    The worlds of IBM i and Apache Hadoop appear to be diametrically opposed. One is a proprietary, RISC-based platform used primarily to run transactional systems. The other is an open source, X86-based platform used primarily for big data analytics. But as far apart as the two platforms seem to be, at least one IBM i software vendor, mrc, is aiming to find some common ground between them.

    Hadoop is a distributed data storage and processing framework that engineers and researchers at Yahoo and Google are credited with helping to create. While indexing the Internet was Hadoop’s first use case, it’s since been adapted and adopted to do all sorts of other stuff, such as executing machine learning algorithms, running SQL- and NoSQL-style data warehouses, and even hosting graph databases. (You can read all about Hadoop at www.datanami.com, a big data analytics publication that I write for and manage.)

    Chicago-based mrc used the recent COMMON conference to highlight the work it’s doing around Hadoop. Mrc, you will remember, is the company behind m-Power, the template-based Web application development tool that generates enterprise Java code that can run on any Java-supported platform and database, including (but not limited to) the IBM i and DB2 running on IBM Power Systems servers.

    Mrc is supporting Hadoop in two distinct ways, one for analytical workloads and one for transactional workloads. On the analytical side, the company is supporting m-Power-generated apps running against Hive and Impala, two SQL-based database engines that run atop Hadoop. This seems like a natural fit for mrc, considering that the majority of the applications its customers generate are business intelligence and reporting applications.
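
    To make the analytical path concrete, here is a minimal sketch of a Java program querying Hive over JDBC, the same basic pattern a generated reporting app could use. The hostname, port, table, and user are hypothetical, and the Apache Hive JDBC driver is assumed to be on the classpath.

        import java.sql.Connection;
        import java.sql.DriverManager;
        import java.sql.ResultSet;
        import java.sql.Statement;

        public class HiveReportSketch {
            public static void main(String[] args) throws Exception {
                // HiveServer2 typically listens on port 10000
                String url = "jdbc:hive2://hadoop-edge-node:10000/default";
                try (Connection conn = DriverManager.getConnection(url, "reporting_user", "");
                     Statement stmt = conn.createStatement();
                     ResultSet rs = stmt.executeQuery(
                             "SELECT region, SUM(amount) AS total " +
                             "FROM sales GROUP BY region ORDER BY total DESC")) {
                    while (rs.next()) {
                        System.out.printf("%-10s %,.2f%n",
                                rs.getString("region"), rs.getDouble("total"));
                    }
                }
            }
        }

    In principle the same query could be pointed at Impala by swapping in the Impala JDBC driver and connection URL; the application code itself would not need to change.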

    Impala and Hive are standard elements of the emerging Hadoop stack (which is mostly written in Java), and essentially provide the same kind of analytical database capability you can get from the likes of Teradata, Hewlett-Packard Vertica, and IBM Netezza. The primary difference with Impala and Hive is that they operate against the Hadoop Distributed File System (HDFS), which can store hundreds of petabytes of data striped across tens of thousands of X86 servers. Traditional data warehouses, by comparison, typically max out at a few hundred terabytes, and they cost a lot more, too.

    For transactional workloads, mrc has partnered with a Silicon Valley startup called Splice Machine. Founded by Monte Zweben, a veteran of the first dot-com boom (and bust), Splice has built a traditional row-oriented relational database that runs atop Hadoop. The companies have tested the combination, and any Java-based app that was initially generated to run on an IBM i-based Power Systems server can also run against the Splice RDBMS running on Hadoop.
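
    As a rough illustration of that portability claim, consider how thin the switch can be at the JDBC layer. The class below is only a sketch; the hostnames, credentials, and the exact Splice Machine connection URL and port are assumptions to be checked against the vendor's documentation.

        import java.sql.Connection;
        import java.sql.DriverManager;

        public class ConnectionSwitchSketch {
            // IBM Toolbox for Java (JTOpen) driver for DB2 for i
            static Connection openDb2ForI() throws Exception {
                return DriverManager.getConnection(
                        "jdbc:as400://MYIBMI/MYLIB", "appuser", "secret");
            }

            // Splice Machine exposes a Derby-style network listener
            // (commonly on port 1527); URL details are an assumption here
            static Connection openSpliceMachine() throws Exception {
                return DriverManager.getConnection(
                        "jdbc:splice://hadoop-edge-node:1527/splicedb;user=appuser;password=secret");
            }
        }

    Everything above the connection, such as the generated SQL and the reporting logic, stays the same, which is the appeal of running a standard relational engine on top of Hadoop.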

    This gives customers more options for getting the most bang for their buck out of programming investments, says Zweben. “Our partnership with mrc gives businesses a solution that can speed real-time application deployment on Hadoop with the staff and tools they currently have, while also offering affordable scale out on commodity hardware for future growth,” he says in a press release.

    Not every IBM i shop is asking for Hadoop capabilities, but there have been some inquiries, says mrc’s marketing director Steve Hansen.

    “Our message to the IBM i crowd is businesses have a lot more data than they realize,” he tells IT Jungle. “There’s a lot more data out there, with sensor data and software log files. Every piece of hardware these days produces data. IBM i shops need to start storing that, and Hadoop is the easiest way to start storing that data.”

    What mrc is doing shouldn’t be viewed as a threat to the IBM i way of life, but rather as a way to augment what the platform already does for you, Hansen says. “We’re not telling people it’s time to replace IBM i. We’re saying the data is getting bigger. There’s unstructured and social data, and businesses just aren’t doing much with it yet. I think it’s overwhelming.”

    Hansen, who also oversees mrc’s website, used Hadoop to store website server log files that were going to waste, and used m-Power to build some simple dashboards against that data, which told him which areas of the site were being used and which ones weren’t. It’s those types of simple applications that customers can start with as they begin to explore how Hadoop can benefit them.
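
    A starter project along those lines does not need much code. The sketch below, with hypothetical paths and hostnames, copies a web server log file into HDFS and then defines a Hive external table over the directory so a dashboard can query it; it assumes the hadoop-client libraries and the Hive JDBC driver are available.

        import java.net.URI;
        import java.sql.Connection;
        import java.sql.DriverManager;
        import java.sql.Statement;

        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.fs.FileSystem;
        import org.apache.hadoop.fs.Path;

        public class WebLogLoaderSketch {
            public static void main(String[] args) throws Exception {
                // 1. Land the raw log file in HDFS; no schema is needed up front.
                Configuration conf = new Configuration();
                try (FileSystem fs = FileSystem.get(
                        URI.create("hdfs://hadoop-namenode:8020"), conf)) {
                    fs.copyFromLocalFile(new Path("/var/log/httpd/access_log"),
                                         new Path("/data/weblogs/access_log"));
                }

                // 2. Point a Hive external table at the directory so reports can query it.
                String url = "jdbc:hive2://hadoop-edge-node:10000/default";
                try (Connection conn = DriverManager.getConnection(url, "reporting_user", "");
                     Statement stmt = conn.createStatement()) {
                    stmt.execute("CREATE EXTERNAL TABLE IF NOT EXISTS weblogs (line STRING) "
                            + "LOCATION '/data/weblogs'");
                }
            }
        }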

    “Right now we’re trying to build awareness to what Hadoop is and how people who are using IBM i can take this data that they’re not taking advantage of and put it into Hadoop,” he says. “I don’t see it as a replacement for their IBM i. It’s more something that can enhance what they’re currently doing and tracking all this data they’re not tracking.”

    Don’t think for a second that the smart folks at IBM, in Armonk and Rochester and Somers and Austin, aren’t watching this trend closely and looking for a way to sell the IBM i customer base on this thing called Hadoop. Of course, that’s part of the problem: Apache Hadoop is free. IBM sells its own distribution of open source Hadoop, called IBM InfoSphere BigInsights, but IBM i shops don’t have to pay a penny to get started with Apache Hadoop.

    RELATED STORIES

    IBM Power Systems Can Do Big Data Analytics, Too

    What Does ‘Big Data’ Mean for IBM i?

    Big Data Gets Easier to Handle With IBM i TR7

    Big Data, OpenPower Are Big Levers For Power Systems

    Gartner Says Big Data Getting Bigger, Skills Lag

    Analytic Skills Is The Top Big Data Priority, Lavastorm Says

