In The Shadow Of Database Hype
November 18, 2013 Dan Burger
It always chaps a few hides in the IBM midrange community when a technology gets hyped by another company as shockingly innovative when it has been part of IBM midrange systems for many years. In-memory database processing is one of those provocative topics. It’s a great idea for cranking up database performance, but how about giving credit where credit is due? This is about setting the record straight.
Six weeks ago, Mike Cain, a member of IBM’s senior technical staff for DB2 on i, wrote about this topic in his blog with the no-nonsense name DB2 for i.
Prior to that, Ron Schmerbauch, technical leader of the SAP on IBM i team, wrote an article under the headline DB2 for i and SAP Live Up to the Hype as a contributor to the “Architecture Matters” blog that appears in IBM Systems Magazine.
Both IBMers made note of the marketing hype that surrounds in-memory databases. Cain focused on the characteristics of an operating system that was built to leverage memory from the beginning, and on how other databases have only recently gained that capability by adding memory to a designated server.
It’s great that other technology companies are catching up, but marketing it as if it never existed before is like taking credit for inventing the light bulb because yours is frosted rather than clear.
Cain’s blog explains the concepts of reducing latency and increasing data processing throughput by taking advantage of memory available in the system, while crediting IBM with building this capability into the System/38. That was 35 years ago, a good example of what’s old becoming new again when you alter it a bit and put a new name on it.
Schmerbauch pointed out that IBM i with the DB2 for i database puts up impressive in-memory performance numbers that compare with SAP HANA in this benchmark. But that’s not the whole story, as he also explains. The tests, as benchmarks have a habit of being, are not an apples-to-apples comparison. IBM i ran both DB2 for i and the SAP application server within a single partition, and the workload was specifically architected to showcase SAP HANA by putting a premium on complex ad hoc queries with random selection criteria. And, just to confound comparisons a little more, the benchmark was set up so the number of records accessed by each database was not equal.
If you’d like to get deeper into the hardware matchups used in this benchmark, see One Power 750 Matches Two Xeon Servers On SAP BW Test in the August 26 issue of The Four Hundred.
Both Cain and Schmerbauch make note of the native single level storage on IBM i as an architectural feature that favors IBM i in a comparison with SAP databases. In short, the in-memory support in DB2 for i is designed so that users select which objects are placed in memory.
Cain talks about technologies that put IBM i in the same league as HANA by minimizing latency: asynchronous I/O that allows data to be read from disk and pulled into memory at the correct time and at the correct rate; I/O parallelism that can not only read ahead, but fill the available memory with data via independent parallel tasks or threads; and symmetric multiprocessing (SMP) to fill memory space with data.
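The read-ahead idea Cain describes can be illustrated with a toy sketch (plain Python, not IBM i code; the "disk," page count, and latency figure are invented): independent parallel tasks pull pages from slow storage into an in-memory cache ahead of the consumer, so later accesses never wait on the disk.

```python
import concurrent.futures
import time

# Toy "disk": reading a page is slow compared to a memory access.
DISK = {page_id: f"data-{page_id}" for page_id in range(8)}

def read_page(page_id):
    time.sleep(0.01)  # simulate disk latency
    return DISK[page_id]

# Parallel read-ahead: independent worker tasks fill the in-memory cache,
# loosely analogous to filling a memory pool via parallel I/O tasks.
with concurrent.futures.ThreadPoolExecutor(max_workers=4) as pool:
    cache = dict(zip(DISK, pool.map(read_page, DISK)))

# The consumer now hits memory, not the "disk".
assert cache[3] == "data-3"
```

The point of the sketch is only the shape of the technique: the fill happens concurrently and ahead of demand, rather than one synchronous read at a time.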
Schmerbauch notes SAP databases are at a disadvantage in the IBM i comparison because they tend to grow in size by about 20 percent to 50 percent annually, therefore requiring more frequent hardware replacement.
Kent Milligan, a DB2 consultant at IBM, provided his take on the differences between the in-memory capabilities of DB2 for i and SAP HANA during a phone call last week.
“The main difference between SAP HANA and how we approached it with the IBM i 7.1 support is that the in-memory enablement has to occur at the individual object level, whether it’s a table or an index,” Milligan says. “HANA is like a container that holds all the database objects–all the tables and indexes–so that everything in the container is in memory enabled. In DB2 for i there is a selection process to determine the key objects.”
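The contrast Milligan draws can be sketched in a toy form (an illustration only, not IBM or SAP code; the object names are invented): in the container model, membership alone makes an object memory-enabled, while in the per-object model each table or index carries its own attribute.

```python
# HANA-style: the container holds all database objects, and everything
# inside it is in-memory enabled by virtue of being in the container.
hana_container = {"orders", "customers", "orders_ix"}

def hana_in_memory(obj):
    return obj in hana_container  # membership is the whole test

# DB2 for i-style: each table or index has its own in-memory attribute,
# set through a selection process that picks out the key objects.
db2_attributes = {"orders": True, "customers": False, "orders_ix": True}

def db2_in_memory(obj):
    return db2_attributes.get(obj, False)  # decided per object

# The same object can be memory-enabled under one model and not the other.
assert hana_in_memory("customers") and not db2_in_memory("customers")
```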
As Milligan pointed out, it was only with the IBM i 7.1 release that DB2 gained the ability to keep objects in memory. Prior to i 7.1, it was single level storage that gave IBM midrange users “in memory-like capabilities.”
Then or now, the system needs healthy (not memory constrained) memory pools to gain a performance advantage from placing DB2 objects in memory. The decision to do so is driven by how much memory is available, whether that memory is constrained, the size of the database, and the performance objectives. Some companies, Milligan noted, see page faulting when referencing DB2 objects. With adequate memory, the most frequently referenced objects can be put in memory to eliminate that page faulting. In the days when memory was relatively expensive, the cost may have kept this option off the table. There’s also the consideration that if you put too many objects in memory, the page faulting problem resurfaces. Changing the in-memory attribute requires proper planning: weigh the amount of memory available against the size of the most commonly referenced DB2 objects. Make sure the memory can accommodate the objects, and watch the performance curve accelerate.
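Milligan's planning rule, pin the most frequently referenced objects but only while the pool can hold them, amounts to a simple sizing check. A hypothetical sketch (object names, sizes, and reference counts are all invented for illustration):

```python
pool_size_mb = 4096  # memory available for pinned objects (invented figure)

# (object, size in MB, references per day) -- invented figures.
objects = [
    ("ORDERS", 1200, 900_000),
    ("ORDERS_IX", 300, 900_000),
    ("CUSTOMERS", 800, 250_000),
    ("HISTORY", 6000, 1_000),  # big and cold: pinning it invites page faulting
]

# Pin hottest-first, but never let pinned objects exceed the pool,
# or the page faulting problem comes back to the surface.
pinned, used_mb = [], 0
for name, size_mb, _refs in sorted(objects, key=lambda o: o[2], reverse=True):
    if used_mb + size_mb <= pool_size_mb:
        pinned.append(name)
        used_mb += size_mb

print(pinned)  # the hot objects that fit; HISTORY stays on disk
```

The greedy pass is only one way to make the trade-off concrete; the substance is the comparison between pool size and the footprint of the hot objects.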
“All in-memory solutions are trying to improve performance by using database techniques, whether it is having an object in memory or using column-stored databases to allow data to more easily remain in memory,” Milligan says. “Some databases are using the columnar store capabilities to take advantage of memory by keeping the database objects in memory. We are not changing the underlying storage mechanism for the DB2 tables to do this. We keep the traditional structure and allow it to exploit in memory processing. If you use columnar store technologies to keep the objects in memory, you may be sacrificing some transactional performance, if you are running mixed workloads in that environment.”
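The row-versus-columnar trade-off Milligan mentions can be shown with a toy data layout (plain Python lists and dicts, not DB2 or HANA internals; the table contents are invented): a column store scans one attribute compactly, while a row store makes single-record changes cheap, which is what mixed transactional workloads reward.

```python
# Row store: whole records kept together.
rows = [
    {"id": 1, "qty": 2, "price": 9.5},
    {"id": 2, "qty": 1, "price": 3.0},
    {"id": 3, "qty": 5, "price": 1.2},
]

# Column store: one contiguous list per column -- compact for the
# analytic scans that in-memory BI engines are built around.
cols = {k: [r[k] for r in rows] for k in rows[0]}
revenue = sum(q * p for q, p in zip(cols["qty"], cols["price"]))

# A transactional insert touches one place in the row store...
rows.append({"id": 4, "qty": 1, "price": 2.0})

# ...but must touch every column list in the column store, one reason
# columnar layouts can cost some transactional performance.
for k, v in {"id": 4, "qty": 1, "price": 2.0}.items():
    cols[k].append(v)
```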
The capability to do hefty database work alongside serious transactional work, all in one system, often goes unrealized by companies that are unable to adequately judge the value of IBM i on Power Systems. It’s only when it becomes apparent that this is a comparison between a single-purpose system and a multi-purpose system that you see the entire picture.
The SAP benchmark that compared DB2 for i against HANA, Milligan says, is BI focused. It compares something created specifically for in-memory database technology with DB2 for i, which has integrated in-memory technologies into its core engine where they make sense for a particular workload. And as we have pointed out repeatedly, the comparison is not apples-to-apples.
“What it shows is that there are good attributes to in-memory databases,” he says. It also shows there is a lot of hype that overshadows some of the traditional query optimization technologies that can also deliver really good performance.
“You don’t have to be a full-fledged in-memory database to exploit large memory pools that are available to customers and are able to deliver high performing SQL.”
Don’t forget the SQL part. The database enhancements in IBM i 7.1 only apply to SQL interfaces. It’s been a long time since DB2 for i database upgrades applied to anything but SQL.