Volume 18, Number 29 -- August 10, 2009

A Peek Inside IBM's Smart Analytics System

Published: August 10, 2009

by Timothy Prickett Morgan

While The Four Hundred was on holiday at the end of July, IBM hosted a shindig at one of its adjunct T.J. Watson Research Center campuses in Hawthorne, New York, which is just a 30-minute drive from my house in Upstate Manhattan. And even though I was busy with a bunch of non-work work--you know what I am talking about here, people--I decided to put on some clean clothes and go listen to some of the top brass at Big Blue talk some about this business analytics and optimization (BAO) opportunity that IBM is chasing and the systems it is deploying to chase it.

The fact that IBM decided to shell out $1.2 billion to buy statistical and predictive analytics software vendor SPSS on the very morning I was visiting was actually a coincidence. But just the same, the SPSS code is going to play a big part in this BAO opportunity and the systems that the company is going to build to do predictive analysis on real-time business data. I was amused no end that no one, and I mean no one, at the event had any clue that IBM might even be interested in SPSS, much less that it was in the process of buying it, which only goes to show either how much we need predictive analytics or how much this stuff is actually snake oil.

But seriously, I don't think the BAO market that IBM has defined as distinct from the business automation segment (meaning enterprise resource planning, supply chain management, customer relationship management, and that mission-critical, back-end stuff) is some kind of mirage, and IBM definitely needed some heavy-duty predictive analytics to accomplish its goals for building BAO boxes, as I have come to call them since first hearing about IBM's plans back in May at its annual IT and Wall Street analyst day.

As we go to press on Friday (August 7), it looks like IBM might have to fight to keep SPSS now that it is in play. The rumors are already swirling around that Oracle, Hewlett-Packard, SAP, and Microsoft might offer SPSS shareholders a better deal, setting off a bidding war. The kind of capabilities that IBM wants to put into the Smart Analytics System is something all the big players want--and need--to sell. And that, by the way, puts privately held SAS Institute, which had $2.26 billion in sales in 2008, into play as well. Whether SAS wants to be acquired is another matter--the company has resisted that temptation for more than three decades.

So what is in this future-predicting machine that IBM wants to sell you? A lot of familiar technology, all highly integrated and optimized for the particular job at hand. More importantly, it is sold under a single product number with six different configurations, which makes buying it easier than buying piecemeal parts, and it comes with a single means of supporting the entire stack, including twice-yearly tunings by Big Blue's engineers to ensure that the data warehouse and analytics code are running at their most efficient level.

Let's start with the hardware and software. The underlying iron in the Smart Analytics System is familiar enough: a Power 550 server with half its complement of processor cores--two dual-core Power6+ chips running at 5 GHz--and 32 GB of main memory. Each server node in the shared-nothing database cluster has two dual-port Gigabit Ethernet adapters, with two ports being used for managing the server nodes or extracting or loading data into the data warehouse and two being used to cluster the boxes so they can talk to each other. Each server also has two dual-port 4 Gb/sec Fibre Channel adapters to link out to a DS5300 disk array, which is cross-coupled to four Power 550 nodes and which is in turn back-ended by eight EXP5000 disk drawers for active data and another EXP5000 that has hot spare drives. You build up the Smart Analytics System by cookie-cutting multiple Power 550, DS5300, and EXP5000 boxes together. The servers are cross-linked to multiple DS5300 arrays through redundant SAN40B switches, and the servers are linked to each other and to the outside world through EX4200-48T Ethernet switches from Juniper Networks. Each server has a bunch of 146 GB 15K RPM disks, and the DS5300s and EXP5000s use the same drives, such that each Power 550 server has 32 disk drives of its own to play with for data warehousing and others for supporting operating systems and applications. The disks are protected with RAID 5 algorithms and also have hot spares, and the data warehouse has a 4 TB user space to play with.
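For those who like to check the math on a building block, here is a rough Python sketch that tallies the ports and raw disk capacity described above. The class and variable names are my own inventions for illustration, the link speeds are nominal line rates rather than measured throughput, and the capacity figures are raw, before RAID 5 parity and hot spares take their cut.

```python
from dataclasses import dataclass

DRIVE_GB = 146          # drive size quoted in the article
DRIVES_PER_NODE = 32    # data warehouse drives per Power 550
NODES_PER_DS5300 = 4    # each DS5300 is cross-coupled to four nodes


@dataclass(frozen=True)
class NodeLinks:
    """One class of adapter in a server node (hypothetical helper)."""
    adapters: int
    ports_per_adapter: int
    gbit_per_port: int  # nominal line rate, Gb/sec

    @property
    def ports(self) -> int:
        return self.adapters * self.ports_per_adapter

    @property
    def nominal_gbit(self) -> int:
        return self.ports * self.gbit_per_port


ethernet = NodeLinks(adapters=2, ports_per_adapter=2, gbit_per_port=1)
fibre_channel = NodeLinks(adapters=2, ports_per_adapter=2, gbit_per_port=4)

# Raw capacity only: RAID 5 parity and hot spares are not subtracted.
raw_gb_per_node = DRIVES_PER_NODE * DRIVE_GB
data_drives_per_array = NODES_PER_DS5300 * DRIVES_PER_NODE
raw_gb_per_array = data_drives_per_array * DRIVE_GB

print(f"Ethernet ports per node:      {ethernet.ports}")
print(f"Fibre Channel ports per node: {fibre_channel.ports} "
      f"({fibre_channel.nominal_gbit} Gb/sec nominal)")
print(f"Raw data capacity per node:   {raw_gb_per_node} GB")
print(f"Data drives behind a DS5300:  {data_drives_per_array}")
print(f"Raw data capacity per DS5300: {raw_gb_per_array} GB")
```

The four Gigabit Ethernet ports split evenly, two for management and data loading and two for the cluster interconnect, which is why the port count matters more than the adapter count.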

Now, I know what you are thinking. Because this BAO box is a Power 550, that means it could use i 6.1 and its integrated DB2 for i database as the foundation of the data warehouse. It could also use Linux and the DB2 variant for that operating system, too. But, alas, the machine is based on AIX 6.1 at Technology Level 2 and Service Pack 3 and DB2 9.5 at the Fix Pack 4 level. IBM is using its General Parallel File System (GPFS) V3.2.1, the 64-bit implementation of its parallel file system, to support the data warehouse underlying the Smart Analytics System. On the data warehouse, each Power 550 supports four logical data partitions, or LDPs, with each having one processor core, 8 GB of memory, and eight disk drives (1.17 TB of disk capacity) associated with it. (The LDPs are database partitions, not logical partitions carved up with the PowerVM server virtualization hypervisor that comes with Power Systems iron.) The basic system also has IBM's Tivoli System Automation V3.1.0.3 software, and then adds on IBM's InfoSphere Warehouse 9.5.1 and Cognos 8 BI Server, BI Samples, and Go Dashboard 8.4 FP2 software.
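The LDP carve-up is simple division, and a few lines of Python show how the per-partition numbers fall out of the node specs. This is only a sketch: the function name is hypothetical, the split is assumed to be perfectly even, and the terabyte figure is raw capacity in decimal units.

```python
NODE_CORES = 4          # two dual-core Power6+ chips
NODE_MEMORY_GB = 32
NODE_DATA_DRIVES = 32
DRIVE_GB = 146
LDPS_PER_NODE = 4


def per_partition(total: int, partitions: int = LDPS_PER_NODE) -> int:
    """Split a node resource evenly across partitions; reject uneven splits."""
    share, remainder = divmod(total, partitions)
    if remainder:
        raise ValueError(f"{total} does not divide evenly by {partitions}")
    return share


ldp = {
    "cores": per_partition(NODE_CORES),
    "memory_gb": per_partition(NODE_MEMORY_GB),
    "drives": per_partition(NODE_DATA_DRIVES),
}
# Raw capacity in decimal terabytes (1 TB = 1,000 GB), before RAID 5 overhead.
ldp["raw_tb"] = round(ldp["drives"] * DRIVE_GB / 1000, 2)

print(ldp)
```

Eight 146 GB drives come to 1,168 GB, which rounds to the 1.17 TB quoted for each partition.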

In the base configuration of the Smart Analytics System, customers have two Power 550s supporting data for the warehouse (one active and one a high availability backup) plus a management node running the Tivoli software and another being used as an administrative node for the software. I am no expert, but this seems to be an incomplete configuration, even if it does take up two racks of space with the servers, storage, and switches. There are six different configurations of the Smart Analytics System, which range up to a hefty box with 19 racks of iron, including 53 data nodes and a slew of standby gear, that gives the analytics applications a 200 TB user space. (This is the XXL size, and the entry configuration is known as the XS size.) Each successive size above the S "T-shirt" sized BAO box basically doubles the size of the user space, from 12 TB to 25 TB (M), to 50 TB (L), to 100 TB (XL), to 200 TB (XXL). These configurations support between 100 and 5,000 named users, but only 50 concurrent users except for the largest XXL setup, which is rated at 100 concurrent users.
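To make the T-shirt ladder concrete, here is a small Python sketch that encodes the user space figures quoted above and picks the smallest size that fits a given data volume. The helper names are hypothetical, not anything IBM ships, and the XS size is left out of the capacity table because its user space is not spelled out alongside the others.

```python
# User space in TB for the sizes quoted above, smallest first.
USER_SPACE_TB = {"S": 12, "M": 25, "L": 50, "XL": 100, "XXL": 200}


def concurrent_users(size: str) -> int:
    """Rated concurrent users: 100 for XXL, 50 for every other size."""
    return 100 if size == "XXL" else 50


def smallest_fit(required_tb: float) -> str:
    """Return the smallest listed size whose user space covers required_tb."""
    for size, capacity_tb in USER_SPACE_TB.items():  # dicts keep insertion order
        if required_tb <= capacity_tb:
            return size
    raise ValueError(f"{required_tb} TB exceeds the largest listed size")


# Step-to-step growth: roughly 2x each time, with S to M slightly over.
sizes = list(USER_SPACE_TB)
growth = {
    bigger: USER_SPACE_TB[bigger] / USER_SPACE_TB[smaller]
    for smaller, bigger in zip(sizes, sizes[1:])
}

print(smallest_fit(40), concurrent_users(smallest_fit(40)))
print(growth)
```

The growth table shows why "basically doubles" is the right hedge: the step from 12 TB to 25 TB is a bit more than two times, while every step after that is an exact doubling.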

In terms of performance, the BAO boxes are showing scale as well as better performance compared to prior setups running the InfoSphere Warehouse and Cognos software. How's this for scale: IBM is working with Northrop Grumman on a version of the BAO box for some spook agency of the U.S. government that has 200 TB of active data and 20 PB (that's petabytes) of archived data and that is capable of handling 20,000 queries per day. Most companies don't need that kind of scale for their data warehouses and analytics, of course. But they definitely want a setup that comes in the door ready to suck data out of their production systems and start serving up some answers to complex questions. On the Cognos Mixed Marketing benchmark test, the Smart Analytics System, with all of its tunings and optimizations as well as database compression and storage tweaks, was able to handle three times as much work and do so with 50 percent less floor space than a "best practice" Power Systems cluster running the same Cognos tools.
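The benchmark claim and the spook-agency numbers are easier to weigh with a little arithmetic, sketched here in Python. I am assuming that "three times as much work" means three times the throughput, that petabytes and terabytes are decimal units, and that the queries are spread evenly across a 24-hour day, which real workloads never are.

```python
# Benchmark claim: 3x the work in 50 percent less floor space.
work_ratio = 3.0
floor_space_ratio = 1 - 0.50
work_per_floor_space = work_ratio / floor_space_ratio

# Government configuration: 200 TB active, 20 PB archived, 20,000 queries/day.
active_tb = 200
archive_tb = 20 * 1000                  # 1 PB = 1,000 TB (decimal)
archive_to_active = archive_tb / active_tb

seconds_per_day = 24 * 60 * 60
avg_queries_per_second = 20_000 / seconds_per_day

print(f"Work per unit of floor space: {work_per_floor_space:.0f}x")
print(f"Archive-to-active data ratio: {archive_to_active:.0f} to 1")
print(f"Average query rate:           {avg_queries_per_second:.2f} per second")
```

Put that way, the floor space claim compounds with the throughput claim, and the query rate works out to roughly one every four seconds on average, which says these are big, complex queries rather than quick lookups.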

IBM is not talking about price, but I get the feeling that all of the dough that customers might have paid consultants to design, configure, and integrate the servers, storage, and software in a data warehouse with analytics extensions is not going to be passed down to customers as a discount when they buy a BAO box. No, my friends, this Smart Analytics System is going to be an AS/400 in terms of how it is sold (even if it doesn't support i 6.1 and its version of DB2). And that means, I am guessing, that IBM will charge more than the sum of the parts because of the lower cost of running the Smart Analytics System and the optimizations it has put in there to make it sit up and bark.

Still, it would be nice to see a blade version of this running i 6.1 and DB2 for i as the data warehouse, aimed at midrange i shops who really don't want to use AIX unless they have to. It could yet happen, if you all start yammering in IBM's ear and stamping your feet a little.


RELATED STORIES

IBM Gets Hybrid with Servers, Talks Up BAO Boxes

IBM turns back on server history: To give and to hybrid (The Register)

Beep, Beep: Roadrunner Linux Super Breaks the Petaflops Barrier

IBM Aims for Server Expansion in 2008, Including System i Reincarnation

Brazilian Game Site Chooses Hybrid Mainframe-Cell Platform

IBM's Plan for an Adjacent, Custom Systems Market





Editor: Timothy Prickett Morgan
Contributing Editors: Dan Burger, Joe Hertvik, Brian Kelly, Shannon O'Donnell,
Mary Lou Roberts, Victor Rozek, Kevin Vandever, Hesh Wiener, Alex Woodie
Publisher and Advertising Director: Jenny Thomas
Advertising Sales Representative: Kim Reed
Contact the Editors: To contact anyone on the IT Jungle Team
Go to our contacts page and send us a message.



 







 

Copyright © 1996-2009 Guild Companies, Inc. All Rights Reserved.
Guild Companies, Inc., 50 Park Terrace East, Suite 8F, New York, NY 10034
