• The Four Hundred
  • Subscribe
  • Media Kit
  • Contributors
  • About Us
  • Contact
Menu
  • The Four Hundred
  • Subscribe
  • Media Kit
  • Contributors
  • About Us
  • Contact
  • Treasury of new DB2 6.1 (V6R1) Features, Part 4: Index and Data Type Enhancements

    March 25, 2009 Michael Sansoterra

    So far in this series, we’ve covered how IBM has made life easier for DB2 for i (formerly called DB2/400 or DB2 UDB for iSeries) developers by:

    • Part 1: Cramming greater query functionality and operations into a single statement
    • Part 2: Providing an easy way to summarize data with super groups and grouping sets
    • Part 3: Giving developers options to implement audit tracking and other features useful in a client/server environment by implementing several new client special registers

    This tip will cover yet another valuable subset of features: new indexing and data type features.

    Building an Index Using Expressions

    The CREATE INDEX statement has been enhanced to allow column expressions. Previously, indexes containing limited expressions could be built using DDS based logical files, but now SQL has been granted this power. To do this in SQL, simply substitute a column expression where a column name would normally go in the CREATE INDEX statement:

    CREATE INDEX APPDATA/ORDERLINE_I1 ON
    APPDATE/ORDERLINE (SALESREP, LINEAMT * QTY DESC)
    

    In the above example, an index is built on the ORDERLINE table using the SALESREP column followed by the calculation of the extended dollar amount in descending order. The following “Top Sales” query will allow the query optimizer to choose the newly created ORDERLINE_I1 index:

      Select SalesRep,Order,Item,LineAmt*Qty As ExtAmount
        From OrderLine
    Order By SalesRep, ExtAmount Desc
    

    As expected, the expressions and the sort sequence in the ORDER BY need to match the column names and expressions in the index. The first time I tried this I noticed that the optimizer was performing a table scan instead of using the intended index, but my original test table only had about 500 rows. Doing a similar test on a table with over 100K rows yielded the expected result with the query optimizer picking the expression index.

    Here’s an example of how an index expression can benefit a legacy application. Sometimes legacy packages store dates in a less than optimal format. Say you have a PO header table that contains the following columns:

    POCC

    PO Date Century

    Numeric(2,0)

    POYMD

    PO Date (YYMMDD)

    Numeric(6,0)

    It’s always a pain to do a “cross century” search on dates stored like this. However, we can build an index that will combine these two date piece fields into one large numeric field:

    Create Index DataLib/PurchaseHeader_I1 ON     
    DataLib/PurchaseHeader (POCC*1000000+POYMD,PO)
    

    Or, we can even go a step further (assuming there’s valid date data in every row) and create an expression that will cast the numeric values to a DATE data type:

    Create Index DataLib/PurchaseHeader_I2
    On DataLib/PurchaseHeader 
    (Date(Digits(POCC)||Left(Digits(POYMD),2)||'-'||
     Substr(Digits(POYMD),3,2)||'-'|| 
     Substr(Digits(POYMD),5,2)),PO)
    

    Before building an index like the one in this example, you’ll want to ensure that the values in all of the rows are valid. Remember, just because the expression is in your index doesn’t mean you can use it in your Select’s column list. For instance, if you want to see the result of the above date expression in a query result, you’ll need to re-code the expression in your Select statement.

    Unfortunately at this time, deterministic User-Defined Functions are not supported in index expressions. Understandably, special registers, sub-selects, aggregate expressions, and other non-deterministic functions are not supported either.

    Indexes that Contain Column Names

    SQL indexes can now have a list of column names and a record format specified. Previously, SQL indexes did not contain column information and hence could not be directly read by Query/400, COBOL or RPG programs. Ted Holt already did a write up on this topic, so I won’t cover it in detail again.

    For quick reference, here is a sample index that uses a key expression and a column list:

    Create Index AdvWorks/TranHistII On AdvWorks/TransactionHistory 
    (ProductId, Case When Quantity=0 Then 0 
                     Else ActualCost/Quantity End Desc) 
    RcdFmt TransHistR 
    Add ProductId, ActualCost, Quantity, 
    ReferenceOrderId, TransactionId, ReferenceOrderLineId, 
    TransactionDate, TransactionType, ModifiedDate 
    

    This SQL index will be built with a column expression in the index, a record format of TransHistR, and will contain all of the column names listed in the Add clause. Instead of specifying individual column names on the ADD clause, you can also specify the “ALL COLUMNS” option as “ADD ALL COLUMNS.”

    Having column information in a specific index should save the query optimizer time by allowing SQL to pull data out of the index without having to reference the data in the underlying base table (a.k.a. physical file). This will also allow RPG and COBOL programs to use an SQL index like a traditional logical file without using embedded SQL. These index enhancements have virtually eradicated the remaining benefits of using DDS. Although, since IBM is moving us away from DDS, I would still like an “i” specific extension to DB2 SQL to allow edit codes or edit words to be specified on a numeric column.

    The big remaining limitations here are that column expressions are not allowed in the record format’s column list (that would be very nice), and they suffer the inability to reference multiple tables. To get rid of these limitations IBM’s next logical step is to give developers the ability to create Indexed Views.

    Binary Constants and Comparisons with Character Data

    Now moving on to our data type related discussions, a new binary string constant has been added to SQL that is very similar to the hex form of the character string constant. A binary string constant starts with the letters BX:

    BX'0D0A'   /* ASCII Carriage Return & Line Feed Characters */
    

    A binary string constant is automatically interpreted by DB2 as the BINARY data type and hence will not be translated to character data under any circumstances (except an explicit cast). If you’re wondering “why not just use the existing hex constant?” the answer is that the hex constant can, in certain circumstances, be interpreted as character data and hence can be translated to a different character set. For instance, a result set containing a hex constant could be translated by, say, the ODBC driver whereas a binary constant will not undergo such changes.

    A related change is that character data tagged with the FOR BIT DATA option specified is now compatible with the binary data type. This means that these two data types can be compared, concatenated, etc., without first explicitly CASTing the data. Prior to V6R1 specifying FOR BIT DATA on a character column would still be treated as character data (even though the intention of this encoding is to store binary data). For example, the following concatenation will fail in V5R4 but succeed in V6R1:

    Select Binary(X'F1')||Cast(X'F1' As Char(1) For Bit Data)
           As Test_Concat 
      From SysIBM/SysDummy1
    

    Even so, there is a still a significant difference between the two types as binary columns are padded with binary zeros (BX’00’) and character based columns (including those with FOR BIT DATA specified) are padded with spaces. While the two are now somewhat compatible, all new applications should specify the BINARY data type for storing binary data unless there is an overriding legacy requirement to keep the column as character.

    New DECFLOAT Data Type

    A new data type called DECFLOAT has been introduced this release. This data type based on the IEEE 754 standard (actually it’s IEEE 754R) combines the benefits of the compact yet large value range capacity of floating point numbers, but also includes the precision of the decimal data type that is critical to business applications. (Something traditional floating point numbers are not guaranteed to have!)

    DECFLOAT can be defined as having either 16 or 34 digits of precision. DECFLOAT(16) requires eight bytes of storage and can hold an exponent range of 10383 to 10384. DECFLOAT(34) requires 16 bytes of storage and can hold 106143 through 106144. If you need a small storage footprint, big numbers and accuracy then DECFLOAT is a great option.

    Further, in addition to numbers, this data type is capable of holding any of the following special values:

    • Infinity–A value that represents an infinitely large (and can be positive or negative).
    • Quiet NaN–This value represents an undefined result (i.e., “not a number”) that does not generate a warning when processed.
    • Signaling NaN–A value that represents an undefined result that causes a warning message to be generated.

    When inserting these values into a DECFLOAT column, use one of the following special constants: INFINITY, NAN, and SNAN (quotes are not required).

    New companion built-in functions for use with the DECFLOAT type are:

    • QUANTIZE–This function can be used to control the precision of a DECFLOAT number, since floating point numbers can contain slight inaccuracies QUANTIZE can help by controlling the number of digits to be reported.
    • TOTALORDER–This function reports the result of a comparison operation between two values and takes into consideration the special values when doing a comparison. (For instance, infinity is considered less than a NaN value!)
    • DECFLOAT_SORTKEY–This function returns a special binary value that can be used to sort DECFLOAT values, which will correctly take into account the special values and rules when sorting according to the IEEE 754 standard. (For instance, in this standard the value 2.10000 is less than 2.1. In fact, 2.1000 < 2.100 < 2.10 < 2.1. DECFLOAT_SORTKEY can be used to sort values consistent with the standard.)
    • DECFLOAT–This function is used to CAST a value to the DECFLOAT data type.

    For the record, the ILE/C compiler is the only high level language that currently supports DECFLOAT. For other high level languages, the DECFLOAT data will need to be cast to another data type (DEC or DOUBLE, for example) before it is stored in host variables.

    New Unicode Data Types

    Three new data types have been introduced: NCHAR, NVARCHAR, and NCLOB. These data types are double-byte characters tagged with the Unicode (UTF16) CCSID of 1200. In other words, defining the following column:

    CustomerName NVARCHAR(35) 

    is equivalent to defining a column as:

    CustomerName VARGRAPHIC(35) CCSID 1200

    Similarly, NCHAR is the equivalent of GRAPHIC and NCLOB is the equivalent of DBCLOB (with CCSID 1200 specified).

    While DB2 has been able to work with UTF16 since V5R3 this is a nice enhancement because:

    • The syntax is understandable (non DB2 developers don’t usually know about CCSIDs or the GRAPHIC data type)
    • You don’t have to remember the CCSID number for UTF-16
    • These data types provide nice compatibility with SQL Server and to a lesser extent with Oracle (which has NCHAR and NVARCHAR2 data types)

    Summary

    The index enhancements that enable indexes to have expressions, record format names, and specified column names eliminate many of the few advantages DDS had. Additionally, the new implementation of binary constants and binary comparisons with character “FOR BIT DATA” have eliminated most of the remaining issues related to binary data handling within DB2 for i. Finally, the new DECFLOAT and Unicode data types give developers more flexibility in numeric data type offerings as well as simplify defining columns and parameters requiring double byte Unicode data.

    Michael Sansoterra is a programmer/analyst for i3 Business Solutions, an IT services firm based in Grand Rapids, Michigan. Send your questions or comments for Mike to Ted Holt via the IT Jungle Contact page.

    RELATED STORIES

    Treasury of New DB2 6.1 Features, Part 1: Query Enhancements

    Treasury of New DB2 6.1 Features, Part 2: Grouping Sets and Super Groups

    Treasury of New DB2 6.1 (V6R1) Features, Part 3: Client Special Registers



                         Post this story to del.icio.us
                   Post this story to Digg
        Post this story to Slashdot

    Share this:

    • Reddit
    • Facebook
    • LinkedIn
    • Twitter
    • Email

    Tags:

    Sponsored by
    WorksRight Software

    Do you need area code information?
    Do you need ZIP Code information?
    Do you need ZIP+4 information?
    Do you need city name information?
    Do you need county information?
    Do you need a nearest dealer locator system?

    We can HELP! We have affordable AS/400 software and data to do all of the above. Whether you need a simple city name retrieval system or a sophisticated CASS postal coding system, we have it for you!

    The ZIP/CITY system is based on 5-digit ZIP Codes. You can retrieve city names, state names, county names, area codes, time zones, latitude, longitude, and more just by knowing the ZIP Code. We supply information on all the latest area code changes. A nearest dealer locator function is also included. ZIP/CITY includes software, data, monthly updates, and unlimited support. The cost is $495 per year.

    PER/ZIP4 is a sophisticated CASS certified postal coding system for assigning ZIP Codes, ZIP+4, carrier route, and delivery point codes. PER/ZIP4 also provides county names and FIPS codes. PER/ZIP4 can be used interactively, in batch, and with callable programs. PER/ZIP4 includes software, data, monthly updates, and unlimited support. The cost is $3,900 for the first year, and $1,950 for renewal.

    Just call us and we’ll arrange for 30 days FREE use of either ZIP/CITY or PER/ZIP4.

    WorksRight Software, Inc.
    Phone: 601-856-8337
    Fax: 601-856-9432
    Email: software@worksright.com
    Website: www.worksright.com

    Share this:

    • Reddit
    • Facebook
    • LinkedIn
    • Twitter
    • Email

    Sponsored Links

    Halcyon Software:  Register now for our Multi-Platform Virtualization Webinar, March 31, 10 a.m.
    System i Developer:  RPG & DB2 Summit in Orlando, April 15-17 for 3 days of serious training
    COMMON:  Join us at the 2009 annual meeting and expo, April 26-30, Reno, Nevada

    IT Jungle Store Top Book Picks

    Easy Steps to Internet Programming for AS/400, iSeries, and System i: List Price, $49.95
    The iSeries Express Web Implementer's Guide: List Price, $49.95
    Getting Started with PHP for i5/OS: List Price, $59.95
    The System i RPG & RPG IV Tutorial and Lab Exercises: List Price, $59.95
    The System i Pocket RPG & RPG IV Guide: List Price, $69.95
    The iSeries Pocket Database Guide: List Price, $59.00
    The iSeries Pocket SQL Guide: List Price, $59.00
    The iSeries Pocket Query Guide: List Price, $49.00
    The iSeries Pocket WebFacing Primer: List Price, $39.00
    Migrating to WebSphere Express for iSeries: List Price, $49.00
    Getting Started With WebSphere Development Studio Client for iSeries: List Price, $89.00
    Getting Started with WebSphere Express for iSeries: List Price, $49.00
    Can the AS/400 Survive IBM?: List Price, $49.00
    Chip Wars: List Price, $29.95

    LANSA Introduces eLearning Option of Education and Training Kronos Gives iSeries HR App an HTML Overhaul

    Leave a Reply Cancel reply

Volume 9, Number 11 -- March 25, 2009
THIS ISSUE SPONSORED BY:

WorksRight Software
ProData Computer Services
Halcyon Software

Table of Contents

  • Treasury of new DB2 6.1 (V6R1) Features, Part 4: Index and Data Type Enhancements
  • Looking for Commitment, Part 3
  • Catching Robot/SCHEDULE Job Failures As They Happen

Content archive

  • The Four Hundred
  • Four Hundred Stuff
  • Four Hundred Guru

Recent Posts

  • Big Blue Raises IBM i License Transfer Fees, Other Prices
  • Keep The IBM i Youth Movement Going With More Training, Better Tools
  • Remain Begins Migrating DevOps Tools To VS Code
  • IBM Readies LTO-10 Tape Drives And Libraries
  • IBM i PTF Guide, Volume 27, Number 23
  • SEU’s Fate, An IBM i V8, And The Odds Of A Power13
  • Tandberg Bankruptcy Leaves A Hole In IBM Power Storage
  • RPG Code Generation And The Agentic Future Of IBM i
  • A Bunch Of IBM i-Power Systems Things To Be Aware Of
  • IBM i PTF Guide, Volume 27, Numbers 21 And 22

Subscribe

To get news from IT Jungle sent to your inbox every week, subscribe to our newsletter.

Pages

  • About Us
  • Contact
  • Contributors
  • Four Hundred Monitor
  • IBM i PTF Guide
  • Media Kit
  • Subscribe

Search

Copyright © 2025 IT Jungle