Big Data Gets Easier to Handle With IBM i TR7
October 14, 2013 Alex Woodie
IBM i shops are struggling to adapt to big data, just like organizations that use other types of servers to store their data. Whether it is email messages, social media posts, or a sequential list of invoices dating back to 1978, you will find it squirreled away inside of some DB2 for i database, somewhere. Keeping up with the growing volume of data sets should be easier as a result of the enhancements IBM will deliver next month with IBM i 7.1 Technology Refresh 7 (TR7).
The IBMers in Rochester refer to the really big databases using the acronym VLDB, which (you may have already guessed) stands for very large database. There are just a handful of VLDBs in the wild that are pushing the limits of DB2 for i today. When a VLDB starts hitting its head against the limits, bad things can happen that cause downtime and require expensive application modifications, as Mike Cain recently explained in his very informative DB2 for i blog.
As time goes on, IBM expects the number of VLDBs to increase, hence the need to periodically boost DB2 for i’s capabilities to keep ahead of the curve. This should provide the headroom necessary to keep IBM’s biggest customers happy, and provide some room for smaller operations to grow into the VLDB realm, according to IBM i product manager Alison Butterill.
“We have many customers that store volumes of data and are required by law to store data for many years and, as a result, their databases grow,” Butterill says. “Some of what we’re doing with this support is an answer to those requirements of today, but also anticipating where more of our clients will be tomorrow.”
Not all of the growth comes from regulatory requirements, however. In addition to big volumes, IBM i customers are experimenting with different varieties of data. “What we’re seeing is they’re storing lots of social data, documents, archives of data,” Butterill says. “It’s a variety of different formats. Some of it is in the IFS. But some of it is being stored in native database structures as well,” in BLOBS and CLOBS.
SQL Index Expansion
IBM expanded the natural limits of DB2 for i with TR7 when it increased the size an individual SQL index can be, from 1 TB to 1.7 TB. While few IBM i shops today have SQL indexes that are close to that size, there are a handful that do. The expansion will help them immediately, but it will also other organizations whose indexes are starting to encroach on that magical 1 TB number.
Expanding the upper size limit of SQL indexes is an important element of a VLDB strategy, says Scott Forstie, a DB2 for i expert in IBM’s Rochester, Minnesota, lab. “The indexes are there to help applications realize efficient access to data. However the existence of an index isn’t free because it has to be maintained,” he says via email. “Therefore, we regularly advise our clients to have both an index strategy and a VLDB [aka big data] strategy.”
Bigger SQL indexes, by the way, doesn’t necessarily mean bigger DB2 for i databases. In fact, the maximum amount of data that can be crammed into DB2 for i hasn’t changed with TR7. According to Forstie, that figure–which can be calculated by multiplying the maximum size of a data partition (also 1.7 TB) by the maximum number of data partitions in a single partitioned table (256)–shows that the maximum storage capacity of DB2 for i is 435.2 TB. That may not be super big by Hadoop standards, but it’s pretty darn big by the measure of traditional relational database management systems.
Table Growth Tracking
Another TR7 enhancement that should keep IBM i apps from hitting their heads on the natural limits of DB2 i comes in the form of automatic tracking and trending for growth rates of DB2 tables and indexes.
As mentioned earlier, when an application runs out of room to store stuff in DB2 for i, bad things can happen that cost businesses money, in the form of downtime and application changes. The expansion of the data store is good, but the new automatic tracking and trending capabilities will warn IBM i shops about their current trajectories long before they get anywhere close to that big, scary headwall of no return.
IBM’s customers explained to Big Blue how painful hitting that database headwall can be, and the vendor’s response was to put warning lights in place. “This new ability to observe runaway conditions or a steady march toward a hard limit provides our clients with the ability to study, learn, and intercede, instead of react,” Forstie says. “Even the best prepared customer cannot tolerate such disruptions.”
DDS to SQL Migration
IBM hasn’t made any DDS enhancements for years, save for the Rational Open Access-RPG Edition technology, which didn’t enhance DDS so much as it gave customers new options for what they could do with existing DDS code. With that said, SQL has been the future of DB2 for i for so long now, it’s practically ancient history now. Despite the longevity of SQL on the platform, the adoption rate of SQL data structures has lagged behind what IBM would like.
Now IBM is doing something about that slow SQL adoption curve with a new feature that makes it easier to move from DDS to SQL DDL. Instead of just creating an SQL view from a DDS-created file, users now have the option to create a SQL DDL index as well as a view.
The hope is the new feature will push IBM i shops and ISVs in the direction of migrating their databases from using DDS structures to SQL. “We’ve had a big push on asking our clients to modernize their applications and their databases for quite some time,” Butterill says. “We believe that many of them created a database a long time ago, created the definition, and have never touched it since.”
The legacy DDS data structures won’t fly in an always-on world that demands agility and speed–they need SQL, says Time Rowe, a business architect at IBM who deals with application modernization issues. “As customers move into other aspect of business–whether it’s mobile or Web–they need to be able to think about accessing the database in different ways, and moving to an SQL manner is a lot more flexible way to help them deal with that,” he tells IT Jungle.
System Management Catalog
TR7 brings something brand new to DB2 for i called a systems management catalog. It is, in effect, a way to store various pieces of information about an IBM i instance, including system values and settings; user profile information; user storage consumption information; authorization lists; configured privileges; and function usage IDs and settings.
These are not new types of data, of course. But by exposing them through DB2 for i, it gives customers the capability to query and mine this data through SQL, which can be useful. IBM’s Forstie explains the significance.
“We call the new breed of catalogs IBM i services,” he says. “The fun aspect is that we’re externalizing detail for parts of the IBM i operating system that would normally surface through IBM i commands and APIs. Those traditional interfaces are tried, true, and will continue to be an integral part of IBM i. However, being able to access the same information using an industry standard interface (SQL) opens the door to solving problems with less effort.”
There’s a catch, though. None of this services data is actually stored in DB2 for i. The catalogs are really SQL views that reference User Defined Table Functions (UDTFs), Forstie says. “That’s a fancy way to say that the user queries the catalog like it’s a traditional table and we determine the values of the rows and columns at the point of being queried,” he says.
Movement to SSDs
IBM i shops who are adopting solid state drives (SSDs) will appreciate another database enhancement in TR7 that gives them a new “live movement” capability to move DB2 tables and indexes to SSDs.
The change has to do with how the media preference change is made, Forstie explains. “Prior to our enhancement, an exclusive no-read lock was required to change the media preference. After this change, an exclusive allow-read lock is utilized for the media preference change.”
That may seem like a small difference, but it’s huge, Forstie says. “It allows customers to adjust their use of SSDs while remaining in production mode with the applications hot. It’s possible with TR7, and easy for the administrator.”
Free Format RPG
You’ve probably already heard about the new Free Format RPG specs that IBM is delivering in the RPG compilers with TR7. We’re covering the database aspects of the TR7 enhancements, but the RPG enhancements do overlap with the database when it comes to the SQL pre-compiler.
According to Rowe and Butterill, RPG written in the new free format method will work just fine in the SQL pre-compiler. “The free form aspects were incorporated into that, so you don’t need to do anything special with your SQL that’s being embedded into RPG,” Rowe says. “The pre-compiler will recognize that SQL as part of a free format RPG structure, so you don’t need to get rid of the keywords and all those other things that’s part of free format RPG.”
The database enhancements (as well as the RPG enhancements) in TR7 will ship as part of the database group PTF that will ship in November, Rowe says. For more information see announcement letter 213-423 and the IBM i Technology Updates portion of the developerWorks website.