• The Four Hundred
  • What Does ‘Big Data’ Mean for IBM i?

    November 12, 2013 Alex Woodie

    When IBM i 7.1 Technology Refresh 7 (TR7) ships on Friday, it will contain several updates to the DB2/400 database designed to help it handle big data, including an expansion of SQL indexes, easier movement to SSDs, and tools to track the growth of tables over time. But what exactly does big data mean on the IBM i? We set out to find some answers.

    The traditional definition for big data has to do with the three “Vs,” which refer to the volume, velocity, and variety of data types. IBM sometimes likes to add a fourth “V” to the mix to represent veracity, or the lack thereof.

    Data volumes are no doubt increasing, but that’s been true since the first 8-bit processors started to make their way into businesses. What’s different today is that data volumes are starting to get really, really big. The numbers are actually mind-boggling. According to IBM, every day the world generates 2.5 quintillion bytes (the equivalent of 2.5 exabytes, or 2,500,000,000,000,000,000 bytes) of data. Data volumes are roughly doubling every year.
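    The unit arithmetic behind those figures can be checked with a few lines of Python. This is only a sketch: it uses decimal (SI) units, and the function names are illustrative. Applying the "doubling every year" rate to the daily figure is an assumption made here for illustration, not something the IBM estimate states.

```python
# Unit arithmetic for the daily data volume quoted above (decimal SI units).
BYTES_PER_DAY = 2_500_000_000_000_000_000  # 2.5 quintillion bytes, per IBM
EXABYTE = 10**18                           # 1 EB = 10^18 bytes


def to_exabytes(n_bytes: int) -> float:
    """Convert a byte count to exabytes."""
    return n_bytes / EXABYTE


def projected_volume(start_bytes: int, years: int) -> int:
    """Volume after `years` of doubling once per year.

    Assumption: the rough doubling rate is applied to whatever starting
    figure is passed in; the article does not say which figure doubles.
    """
    return start_bytes * 2**years


if __name__ == "__main__":
    print(to_exabytes(BYTES_PER_DAY))  # 2.5
    print(to_exabytes(projected_volume(BYTES_PER_DAY, 3)))  # 20.0
```

    Three doublings multiply the starting figure by eight, which is why steady annual doubling gets out of hand so quickly.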

    The variety of data is getting wider and more disparate, too. Structured data, such as transactions logged into DB2/400 or other relational database systems, is growing, but at a slower rate than less-structured data types, such as HTML Web pages, pictures taken with smartphones, social media posts, and PDFs. IBM estimates that, by 2020, more than 40 percent of all data will be machine-generated data coming from Web servers, RFID and GPS sensors, financial transactions, medical devices, HVAC systems, and other machines that will make up the so-called “Internet of Things.”

    As people try to capture all these increasing data volumes and data types, the velocity becomes apparent. These pieces of data–such as Web clickstreams, call record details, and transaction information–are valuable, but that value can diminish as the data ages. Hence, it’s important to act on data as it arrives, or soon thereafter.

    New data processing paradigms have emerged to help people store and process these big new data sets. The most popular is Apache Hadoop, an open source framework that enables users to turn ordinary x86 Linux servers into huge distributed clusters that can apply supercomputing-like capabilities against petabytes’ worth of unstructured data.

    Then there are new NoSQL and NewSQL databases, such as MongoDB, that can easily handle semi-structured data and also scale out horizontally in a fault-tolerant manner more easily than their relational cousins. Hadoop and the NoSQL/NewSQL databases are changing the economics of data storage and processing, and have become the building blocks of a new paradigm of big data-driven applications.

    Big Data on IBM i

    So where does the IBM i server fit into this new big data landscape? As you might imagine, you’re not going to run Hadoop or NewSQL on IBM i; those products run primarily on Linux. The proprietary nature of IBM i means it’s shielded somewhat from the big data goings-on of the wider IT universe. The fact that the IBM i server is primarily used by brick-and-mortar companies, as opposed to companies that make their money from the Web, also helps to keep the platform grounded in a firmer reality.

    But on the other hand, there’s no doubt that IBM i is being impacted by the explosion of information. While the general IT world goes ga-ga for anything with “Hadoop” in the name, and NewSQL database companies continue to sprout up like mushrooms after a spring rain, organizations are counting on their IBM i servers to quietly deal with steadily increasing data volumes, if not necessarily varieties or velocities.

    The biggest big data issue facing IBM i shops is growth of structured data stored in the DB2/400 relational database, according to IBM i experts with IBM and SEQUEL Software who talked with IT Jungle for this story.

    It used to be fairly rare for IBM i customers to have super massive databases, but now it’s become quite common, according to Mike Stegeman with SEQUEL Software, a Help/Systems subsidiary. “It seemed at one time to be gradual growth, but then all of a sudden it exploded,” he said.

    One SEQUEL Software customer had a requirement to access a single database file that had a billion records in it. The file supported a critical transactional system that, due to the structure of the file and the database tables, could not be purged, he said.

    “With the IBM i, a lot of it is the transactional data,” Stegeman said. “We’re getting these customers who have these extremely large files on the i, and maybe some other databases that we can access. That’s kind of what their pain points are, and they want to have a tool that’s easy to use and can access the information without breaking the bank.”

    Another common big data-related pain point has to do with partitioned tables. There’s a limit to the number of records that can be stored in a table, which leads some IBM i shops to use table partitioning. However, some business intelligence tools can’t treat a partitioned table as a single table and must run a separate query against each partition, according to SEQUEL Software, which touts its capability to run single queries against partitioned tables as a competitive advantage.
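    The difference between querying each partition separately and running one query across all of them can be sketched in Python. This is an illustration under stated assumptions, not how DB2 on IBM i or SEQUEL's tools work: it uses SQLite, which has no native table partitioning, so the partitions are emulated as separate tables joined by a UNION ALL view. The table names, the view name, and the sample rows are all hypothetical.

```python
import sqlite3

YEARS = (2011, 2012, 2013)  # hypothetical per-year "partitions"

# Hypothetical sample rows: (order_id, amount) for each year.
SAMPLE_ROWS = {
    2011: [(1, 100.0), (2, 50.0)],
    2012: [(3, 25.0)],
    2013: [(4, 75.0), (5, 10.0)],
}


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with one table per emulated partition."""
    conn = sqlite3.connect(":memory:")
    for year in YEARS:
        conn.execute(
            f"CREATE TABLE orders_{year} (order_id INTEGER, amount REAL)"
        )
        conn.executemany(
            f"INSERT INTO orders_{year} VALUES (?, ?)", SAMPLE_ROWS[year]
        )
    # A view that presents all partitions as one logical table.
    union_sql = " UNION ALL ".join(
        f"SELECT order_id, amount FROM orders_{year}" for year in YEARS
    )
    conn.execute(f"CREATE VIEW orders_all AS {union_sql}")
    return conn


def total_with_separate_queries(conn: sqlite3.Connection) -> float:
    """One query per partition, combined in application code."""
    return sum(
        conn.execute(f"SELECT SUM(amount) FROM orders_{year}").fetchone()[0]
        for year in YEARS
    )


def total_with_single_query(conn: sqlite3.Connection) -> float:
    """One query against the logical table that spans every partition."""
    return conn.execute("SELECT SUM(amount) FROM orders_all").fetchone()[0]


if __name__ == "__main__":
    db = build_demo_db()
    print(total_with_separate_queries(db), total_with_single_query(db))
```

    Both functions return the same total. What differs is where the work of stitching the partitions together happens: in the application, or in the database layer.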

    The IBM i server excels as a database machine, and since that database is relational in nature, people aren’t going to try to squeeze all the different data types into it. There is some growth in storage of Binary Large Objects (BLOBs) and Character Large Objects (CLOBs) on IBM i, but it appears to be minimal outside of specific industries (such as healthcare, with its requirement to store diagnostic images) and ERP systems (such as SAP’s Business Suite running on IBM i, which is apparently an unusual case). However, many customers are starting to store large numbers of PDFs in the Integrated File System (IFS).

    Big Data Causes on IBM i

    In the wider big data world, the big data phenomenon is being driven by the desire (and the new capability) to detect and exploit business opportunities in much shorter timeframes. Companies like Facebook, Google, and Twitter use big data technologies to serve ads based on all sorts of things they know about their users, while Netflix and Amazon use them to make product recommendations based on their collected intelligence.

    Things are a little different on IBM i. In the IBM i world, the big growth of mostly relational data appears to be driven by two things: regulation and forecasting.

    Purging unused data from DB2/400 used to be a standard part of good housekeeping on the platform. But today less than 20 percent of IBM i shops purge their data on a regular basis, according to informal polls taken by Help/Systems’ vice president of technical services, Tom Huntington.

    “You have these various regulations and people aren’t able to purge their data,” he says. “[Through PowerTech] we see more and more people who are struggling with how to keep audit data around it.”

    The combination of the declining cost of storage and the availability of new data warehousing technology like Hadoop is affecting IBM i shops and what data they decide to keep, Stegeman says.

    “You don’t need a whole floor on a gigantic skyscraper just to hold your hard drives to handle all your data,” he says. “They’re keeping their history around longer, either for auditing purposes or to find out how the company is doing overall. Or they may say, ‘Hey we’re not using it now, but maybe we will two or five years down the road.’”

    RELATED STORIES

    Big Data Gets Easier to Handle With IBM i TR7

    Big Data, OpenPower Are Big Levers For Power Systems

    Gartner Says Big Data Getting Bigger, Skills Lag

    Analytic Skills Is The Top Big Data Priority, Lavastorm Says

    IBM Completes One Big Data Analytic Acquisition, Announces Another

    Power Systems Marketing VP Sees Big Data Bulls Eye




Volume 13, Number 33 -- November 12, 2013
Copyright © 2025 IT Jungle