• The Four Hundred
  • IBM i Gets An Influx Of Machine Learning Tooling

    July 25, 2018 Alex Woodie

    Thanks to the new RPM and Yum open source software delivery methods unleashed by IBM earlier this year, IBM i shops can now run the latest in Python-based machine learning tools, including NumPy, SciPy, Pandas, and Scikit-Learn.

    Interest in using Python to build machine learning models has exploded in recent years, and we’re now at the point where Python is considered to be the predominant language for doing data science, followed closely by R, which is taught more in academic settings.

    While the capability to train or run a machine learning model isn’t generally the number one requirement for organizations when they select the IBM i server as their business computing platform, having popular Python libraries like NumPy and Scikit-Learn run on it sure doesn’t hurt, says Jesse Grozinski, IBM’s business architect for open source software on IBM i.

    “We have right now at least a dozen different companies we have talked to who are actually interested in application development using these packages,” Grozinski told IT Jungle in a recent interview. “I can’t disclose completely what they’re doing, but they see that machine learning could possibly help them and they’re willing to do their own application development on these packages to see if it will help them.”

    Here are the Python-based tools that IBM i shops can now use:

    NumPy is a Python-based library for creating large, multi-dimensional arrays and matrices, and also includes a collection of high-level math functions to operate on these arrays. It was created by Travis Oliphant and released as open source through a BSD license in 2005.
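
    As a rough illustration of what that means in practice, here is a minimal sketch of a two-dimensional NumPy array and a few of the math functions that operate on it. The sales figures and variable names are hypothetical and exist only for this example.

```python
import numpy as np

# Hypothetical monthly sales: rows are regions, columns are months.
sales = np.array([[120.0, 135.0, 150.0],
                  [80.0, 95.0, 110.0]])

# High-level math functions operate on whole arrays at once.
total_per_region = sales.sum(axis=1)   # sum across the months in each row
mean_per_month = sales.mean(axis=0)    # average across the regions in each column

# Month-over-month growth rate, computed without an explicit loop.
growth = np.diff(sales, axis=1) / sales[:, :-1]

print(total_per_region)
print(mean_per_month)
print(growth.round(3))
```

    The point of the array type is that operations like these run over the whole matrix in a single call rather than element by element in a Python loop.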

    Oliphant, who did his Ph.D. work in biomedical imaging at the Mayo Clinic College of Medicine and Science in Rochester, Minnesota, also had a hand in SciPy, another Python library, which has modules for optimization, linear algebra, integration, interpolation, special functions, FFT, and signal and image processing. SciPy was initially released in 2001 and is also open source with a BSD license.
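
    To give a flavor of three of those modules, the sketch below touches optimization, integration, and linear algebra. The functions being minimized, integrated, and solved are toy problems chosen for this example, not anything discussed by IBM.

```python
import numpy as np
from scipy import integrate, linalg, optimize

# Optimization: find the minimum of a simple one-variable function.
# The true minimum is at x = 2, where the function value is 1.
result = optimize.minimize_scalar(lambda x: (x - 2.0) ** 2 + 1.0)

# Integration: integrate sin(x) from 0 to pi (the exact answer is 2).
area, abs_error = integrate.quad(np.sin, 0.0, np.pi)

# Linear algebra: solve the system A @ x = b.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])
solution = linalg.solve(A, b)

print(result.x, area, solution)
```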

    Scikit-Learn is a Python library of machine learning algorithms for classification, regression, and clustering. Specifically, it includes support vector machines, random forests, gradient boosting, k-means, and DBSCAN algorithms. It was originally created by David Cournapeau in 2007 and is designed to complement NumPy and SciPy. It also has a BSD license.
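
    Two of the algorithms named above, k-means and random forests, can be sketched in a few lines. The data here is synthetic (two well-separated clusters of points generated for this example), and the parameter values are arbitrary choices rather than recommendations.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier

# Synthetic data: two well-separated groups of 2-D points.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=[0.0, 0.0], scale=0.3, size=(50, 2))
group_b = rng.normal(loc=[5.0, 5.0], scale=0.3, size=(50, 2))
points = np.vstack([group_a, group_b])
labels = np.array([0] * 50 + [1] * 50)

# Clustering (unsupervised): k-means groups the points without seeing the labels.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)

# Classification (supervised): a random forest learns from the labeled points.
forest = RandomForestClassifier(n_estimators=50, random_state=0).fit(points, labels)
prediction = forest.predict([[4.8, 5.1], [0.2, -0.1]])

print(kmeans.labels_)
print(prediction)
```

    Both models take NumPy arrays as input, which is the sense in which Scikit-Learn complements NumPy and SciPy.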

    Pandas, meanwhile, is a Python-based software library that’s designed to help data scientists with data manipulation and analysis. It includes data structures and operations for manipulating numerical tables and time series, reshaping data sets, and merging and joining data. It was initially created by Wes McKinney and released in 2008 with a BSD license.
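
    The merging, reshaping, and time-series operations mentioned above might look something like the following sketch. The order and customer tables, and all of the column names, are hypothetical and were made up for this example.

```python
import pandas as pd

# Hypothetical order and customer tables.
orders = pd.DataFrame({
    "customer_id": [1, 2, 1, 3],
    "order_date": pd.to_datetime(
        ["2018-07-01", "2018-07-01", "2018-07-02", "2018-07-03"]),
    "amount": [250.0, 100.0, 75.0, 300.0],
})
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "region": ["East", "West", "East"],
})

# Merging and joining: attach each order's region via the shared key.
joined = orders.merge(customers, on="customer_id", how="left")

# Reshaping: pivot to one row per date and one column per region.
by_region = joined.pivot_table(index="order_date", columns="region",
                               values="amount", aggfunc="sum", fill_value=0)

# Time series: total order amount per calendar day.
daily_total = joined.set_index("order_date")["amount"].resample("D").sum()

print(by_region)
print(daily_total)
```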

    Grozinski admits that a lot of it is experimental at this point. “A dozen companies say machine learning might help us and eight of them might say ‘That didn’t really do what we wanted to do,’ but there could be two or three who say ‘This is great,’” he said. “So we definitely have that interest.”

    While Python was initially created as a high-level interpreted language for general-purpose computing, interest in the language has exploded thanks to the big data and data science phenomena. Even though the IBM i is not currently considered a platform for machine learning, it just made sense for IBM to bring the most popular Python-based data science packages to the IBM i.

    “When we first released Python in 2015 in 5733-OPS, one of the disheartening things with the development team is we said, ‘Hey everybody, here’s Python.’ And the first things people wanted to do with it was machine learning stuff, and it didn’t work,” Grozinski said. “None of these packages work. NumPy didn’t work. Pandas didn’t work. SciPy didn’t work. It wasn’t even close to working. That was kind of disheartening for us. But now that we’re doing things in a way that’s aligned with the rest of the open source world, it’s just plain working.”

    That experience is similar to the one that many Python-using data scientists went through in the early days of the big data boom. Getting all the various Python packages required to do productive data science work with the language was difficult. There were various package dependencies involving languages, operating systems, and middleware, and it proved extremely frustrating for scientists who just wanted to do science.

    That problem was largely solved by Anaconda, which Oliphant founded in 2012 with Peter Wang. The Austin, Texas, company bundled all the required Python software and delivered it as a single open source distribution.

    Anaconda’s software is a standard part of almost every data scientist’s toolkit at this point, and is downloaded at the rate of 2.5 million copies per month. Anaconda also has a partnership with IBM to run its data science software on System z mainframes and to include it in IBM’s PowerAI software.

    Could the full Anaconda package soon be available to IBM i shops via the Yum installer? “All I can say is stay tuned,” Grozinski said. “We’re having interesting conversations.”

    RELATED STORIES

    RPM and Yum Are a Big Deal for IBM i. Here’s Why

    Should Spark In-Memory Run Natively On IBM i?

    Unwinding Python’s Data Science Potential On IBM i

    Tags: IBM i, NumPy, Power z, PowerAI, Python, RPG, SciPy, Yum

    One thought on “IBM i Gets An Influx Of Machine Learning Tooling”

    • Arun says:
      February 6, 2019 at 12:57 pm

      Hi,
      Great article. We use Python on IBM i currently but can’t get the cool packages on our system since they have dependencies and need to be recompiled.
      Any updates on a way to load Pandas onto an IBM i machine? Appreciate your insights.

      Thanks,
      Arun

TFH Volume: 28 Issue: 49

Copyright © 2025 IT Jungle