Treasury of new DB2 6.1 (V6R1) Features, Part 5: New Functions and Change Stamp Columns

April 1, 2009 Michael Sansoterra

Today’s tip addresses the new built-in functions available in DB2 for i (a.k.a. DB2 for i5/OS, SQL/400, and DB2 UDB for iSeries), change stamp columns and row change expressions.

ASCII and CHR Functions

The ASCII and CHR functions are great additions to DB2 for i. The ASCII function will take the left most character of a string expression and return the ASCII character code equivalent. Take the following example:

Select ASCII(' ') From SysIBM/SysDummy1

This command will return a value of 32 (ASCII code for space) instead of the expected EBCDIC value of 64.

The CHR (character) function behaves the opposite of ASCII. It accepts a code (number between 0 and 255) and returns the ASCII character. For example:

Select CHR(48) From SysIBM/SysDummy1

This code returns the character for ASCII code 48: ‘0’. The expression CHR(65) as shown here:

Select * From (Values(CHR(65))) ASCII (CHRCODE)

This command returns an ‘A’.

While these function will be helpful in helping EBCDIC developers better handle ASCII conversion issues, I think the greater use for them will be in cross-platform compatibility as many developers on other platforms use these functions to perform various tricks and tasks.

For instance, if you find yourself needing to write an SQL function to convert binary data to Base64 or to encode or decode an HTTP URL, then there’s no sense in re-inventing the wheel because there are already many examples available on the Internet. Many of these custom-written functions use ASCII or CHR type functions. As I discussed in a prior FHG tip, when searching the Internet for “already invented code,” don’t limit yourself to looking for DB2 code. As the tip demonstrates , it is often relatively easy to locate and convert many SQL Server T-SQL (or other database) functions to DB2.

New Encryption Options

V5R3 introduced the first encryption function into the DB2 for i5/OS SQL dialect, the ENCRYPT_RC2 function. V5R4 followed suit with ENCRYPT_TDES and now V6R1 gives us ENCRYPT_AES (Advanced Encryption Standard). Simply put, ENCRYPT_AES allows DB2 to encrypt data using the industry standard (and popular) AES encryption algorithm. It’s good to know that more options are becoming available in this area as it makes encrypted data increasingly easier to work with.

For example, if I have a .NET or SQL Server application that encrypts a credit card number using AES, I can securely pass that value for storage directly to DB2 for decryption at a later time. (Assuming the key is known from the .NET system!) Likewise, if a business partner mandates data exchanges are to be encrypted using AES, DB2 can help with this request.

For more information on the encryption functionality in DB2, see the “Encryption Functions” notes on the V5R3 SQL enhancements article. This section discusses how to encrypt, decrypt, set a password, and even set and retrieve a password hint.

MONTHS_BETWEEN Function

This new function accepts two dates and returns a fractional approximation of the number of months between the two dates. The result type of this function is DEC(31,15).

MONTHS_BETWEEN ( expression1 , expression2 )

In this case expression1 can be considered the “ending date” and expression2 the “starting date. ” If expression1 is greater than expression2, then a negative value is returned.

The number of months between St. Patrick’s Day and Thanksgiving of 2007 can be calculated with this expression:

MONTHS_BETWEEN('2007-11-22', '2007-03-17')

The result is 8.1613 months. Remember, the advantage of MONTHS_BETWEEN as opposed to the related TIMESTAMP_DIFF function is that MONTHS_BETWEEN gives a fractional result.

RID Function

This function returns the relative record number of a row for a given table identifier.

The new RID function appears to behave the same as the relative record number (RRN) function with the exception that it returns its result as a BIGINT data type. A BIGINT requires 8 bytes of storage. For the record, RRN returns a DEC(15,0) data type, which also requires 8 bytes of storage but can’t store the range of numbers that BIGINT can. Honestly I haven’t approached a table size approaching 10 trillion rows, but maybe someday! For compatibility with the DB2 family and for safety’s sake (I used to laugh at querying a 1 million row table), it is probably best to start using RID instead of RRN.

Also keep in mind it is a little dangerous working with relative record numbers because they can change when a physical file is reorganized. Personally, I don’t use relative record number functions except when performing some form of trickery. When I do use it, the queries are usually for testing or utility purposes–not for an application. If you need a counter in your query to place row numbers in your result set, consider using the ROW_NUMBER OLAP function.

Timestamp Functions: Timestamp_Format and To_Date

The Timestamp_Format accepts a string representation of a timestamp and a formatting code, and returns an actual timestamp. This new function will be a blessing to anyone who has had problems importing date/time data from an external application. For instance, say you have imported a spreadsheet that contains timestamps that look like this: ’08/17/2008 10:30′.

Timestamp_Format will aid converting this specific string format to an actual timestamp data type by telling DB2 how the string representation is formatted. The formatting code is built using any of the following format units:

Format Unit

Description

Day

HH24

Hour

Minute

Month

NNNNNN

Microsecond

Last 2 digits of the
adjusted year (00-99)

Seconds

Last 2 digits of the
year

YYYY

Year

To convert the above string example ’08/17/2008 10:30′ to a timestamp, we need to pass the string representation value and a formatting code of ‘MM/DD/YYYY HH24:MM’. Here is an example:

Select Timestamp_Format('08/17/2008 10:30',
                        'MM/DD/YYYY HH24:MI')
  From SysIBM/SysDummy1

This code returns a timestamp value of: 2008-08-17-10.30.00.000000. Now that you have an actual timestamp, it can participate in date/time arithmetic or be inserted into a timestamp column, etc. Note that the hour value is expected to be from 0 to 23 as AM/PM indicators are not yet accepted by the Timestamp_Format.

All of the rows of a character-based timestamp column should have the exact same format. If this is not the case, the function will throw an error. For example, if 99 percent of the dates are formatted as MM/DD/YYYY HH24:MI but a few are formatted as MM-DD-YYYY HH24:MI then the function will choke when it hits the row with the alternative format since the separator characters don’t match.

TO_DATE is an alternative function name that performs the exact same feature as Timestamp_Format.

VARCHAR_FORMAT

Introduced in V5R4, VARCHAR_FORMAT is another helper function that allows the conversion of a timestamp expression into a user-defined string representation. This is similar to the date formatting features available in many languages such as .NET, VBA, and Java. V6R1 has enhanced the formatting capability of the function by accepting any of the following format units:

Format

Unit

Century
(00-99). If the last two digits of the four digit year are zero, the result
is the first two digits of the year. Otherwise, the result is the first two
digits of the year plus one

Day
of week (1-7)

Day
of month (01-31)

DDD

Day
of year (001-366)

FF[n]

Fractional
seconds (000000-999999). The number n is used to specify the number of digits
to include in the value returned. Valid values for n are 1-6

HH24

Hour
of the day (00-24)

ISO
day of week (1-7), where 1 is Monday and 7 is Sunday

ISO
week of year (1-53). The week starts on Monday and includes 7 days. Week 1 is
the first week of the year to contain a Thursday, which is equivalent to the
first week of the year to contain January 4

IYYY

ISO
year (0000-9999). The year based on the ISO week that is returned

Julian
date (0000000-9999999)

Minute
(00-59)

Month
(01-12)

NNNNNN

Microseconds
(000000-999999)

Quarter
(1-4)

Seconds
(00-59)

SSSSS

Number
of seconds elapsed since previous midnight (0-86400)

Week
of the month (1-5)

YYYY

Year
(0000-9999)

As you can see, it is “easy pie” to transform an ugly DB2 timestamp into a variety of character formats. If you have a timestamp column named Ship_Stamp and you want to display it in the MM/DD/YYYY HH:MM format, you can specify the following expression:

VARCHAR_FORMAT(Ship_Stamp, 'MM/DD/YYYY HH24:MI') As Ship_DateTime

Even more valuable is the ability to easily create a Julian date format. Traditionally this used to require creation of a user-defined function or a complex expression. Ideally, IBM will enhance the TIMESTAMP_FORMAT function to interpret a Julian date value. This would allow these two IBM functions to convert an actual timestamp to Julian and vice-versa.

One other format unit I’d like to see added to the list is CYY where C is a century indicator (0=20th, 1=21st, etc.) and YY represents a two-digit year. This would allow developers to convert from the oft used numeric date formats in ERP applications such as MAPICS (CYYMMDD) and JD Edwards (CYYDDD).

This function is useful when exporting data to an external application that requires timestamps to be formatted in a certain way. It seems like I’m always exporting data to various PC applications and each one requires the timestamps or dates to be formatted in a special way. (Of course, no applications I know about want a default DB2 for i CHAR(26) timestamp format!) Another use for this function is for various reporting tasks where the date/time data needs to be formatted in a certain way.

The best part about it this function is that its format expression can be passed as a parameter so that an application can dynamically change a timestamp format based on user locality or preference. To see what I mean, paste this sample SQL into your favorite SQL utility and watch the CURRENT_TIMESTAMP register format two different ways:

Select CurTime,Format,VarChar_Format(CurTime,Format) As SampleOutput
  From (Values('HH24:MI:SS YYYY-MM-DD',Current_Timestamp),
              ('MM/DD/YYYY HH24:MI',Current_Timestamp))
       Demo(Format,CurTime)

This SQL will return the current time formatted with the time first followed by an ISO date representation. The second time will be formatted as a USA date plus the hours and minutes.

One last thing: Keep in mind that a DATE column can be used by this function as well, but it must be cast to a timestamp data type first.

ROUND_TIMESTAMP and TRUNC_TIMESTAMP

ROUND_TIMESTAMP can be used to round a timestamp expression to the nearest unit, where the unit represents a portion of the timestamp (year, day, month, etc.). Similarly, TRUNC_TIMESTAMP accepts a timestamp expression and a unit (month, day, hour, etc.) and truncates the timestamp to that unit.

A format unit is again used to tell DB2 what unit of time it should round or truncate. Filched right from IBM’s Web site, here is a table of format units that you can use with this function:

Format

model

Rounding
or truncating unit

ROUND_TIMESTAMP

example

TRUNC_TIMESTAMP
example

SCC

One
greater than the first two digits of a four digit your. (Rounds up on the
50th year of the century)

Input
value: 1897-12-04-12.22.22.000000

Result:
1900-01-01-00.00.00.000000

Input
value: 1897-12-04-12.22.22.000000

Result:
1800-01-01-00.00.00.000000

YYYY

SYYYY

YEAR

SYEAR

YYY

Year
(Rounds up on July 1)

Input
value: 1897-12-04-12.22.22.000000

Result:
1898-01-01-00.00.00.000000

Input
value: 1897-12-04-12.22.22.000000

Result:
1897-01-01-00.00.00.000000

IYYY

IYY

ISO
year (Rounds up on July 1)

Input
value: 1897-12-04-12.22.22.000000

Result:
1898-01-01-00.00.00.000000

Input
value: 1897-12-04-12.22.22.000000

Result:
1897-01-01-00.00.00.000000

Quarter
(Rounds up on the sixteenth day of the second month of the quarter)

Input
value: 1999-06-04-12.12.30.000000

Result:
1999-07-01-00.00.00.000000

Input
value: 1999-06-04-12.12.30.000000

Result:
1999-04-01-00.00.00.000000

MONTH

MON

Month
(Rounds up on the sixteenth day of the month)

Input
value: 1999-06-18-12.12.30.000000

Result:
1999-07-01-00.00.00.000000

Input
value: 1999-06-18-12.12.30.000000

Result:
1999-06-01-00.00.00.000000

Same
day of the week as the first day of the year (Rounds up on the 12th hour of
the 4th day of the week, with respect to the first day of the year)

Input
value: 2000-05-05-12.12.30.000000

Result:
2000-05-06-00.00.00.000000

Input
value: 2000-05-05-12.12.30.000000

Result:
2000-04-29-00.00.00.000000

Same
day of the week as the first day of the ISO year (Rounds up on the 12th hour
of the 4th day of the week, with respect to the first day of the ISO year)

Input
value: 2000-05-05-12.12.30.000000

Result:
2000-05-08-00.00.00.000000

Input
value: 2000-05-05-12.12.30.000000

Result:
2000-05-01-00.00.00.000000

Same
day of the week as the first day of the month (Rounds up on the 12th hour of
the 4th day of the week, with respect to the first day of the month)

Input
value: 2000-05-17-12.12.30.000000

Result:
2000-05-15-00.00.00.000000

Input
value: 2000-05-17-12.12.30.000000

Result:
2000-05-15-00.00.00.000000

DDD

Day
(Rounds up on the 12th hour of the day)

Input
value: 2000-05-17-12.59.59.000000

Result:
2000-05-18-00.00.00.000000

Input
value: 2000-05-17-12.59.59.000000

Result:
2000-05-17-00.00.00.000000

DAY

Starting
day of the week (Rounds up with respect to the 12th hour of the 4th day of
the week. The first day of the week is always Sunday)

Input
value: 2000-05-17-12.59.59.000000

Result:
2000-05-21-00.00.00.000000

Input
value: 2000-05-17-12.59.59.000000

Result:
2000-05-14-00.00.00.000000

HH12

HH24

Hour
(Rounds up at 30 minutes)

Input
value: 2000-05-17-23.59.59.000000

Result:
2000-05-18-00.00.00.000000

Input
value: 2000-05-17-23.59.59.000000

Result:
2000-05-17-23.00.00.000000

Minute
(Rounds up at 30 seconds)

Input
value: 2000-05-17-23.58.45.000000

Result:
2000-05-17-23.59.00.000000

Input
value: 2000-05-17-23.58.45.000000

Result:
2000-05-17-23.58.00.000000

Second
(Rounds up at 500000 microseconds)

Input
value: 2000-05-17-23.58.45.500000

Result:
2000-05-17-23.58.46.000000

Input
value: 2000-05-17-23.58.45.500000

Result:
2000-05-17-23.58.45.000000

For example, to round a timestamp to the nearest hour, simply specify:

ROUND_TIMESTAMP(Current_Timestamp, 'HH')

The CURRENT_TIMESTAMP register yields a timestamp value with precision down to the nearest millisecond. Often in these cases it is convenient to just drop the millisecond values by rounding or truncating to the nearest second as follows:

ROUND_TIMESTAMP(Current_Timestamp, 'SS')
TRUNC_TIMESTAMP(Current_Timestamp, 'SS')

As for “real life” usefulness, I think this function will be indispensible to aid developers in finding the start of the current quarter or the next Sunday. For example, to find the beginning of the current quarter use the following expression:

ROUND_TIMESTAMP(Current_Timestamp, 'Q')

Row Change Timestamp Column and Row Change Expressions

Another suite of timestamp related enhancements to DB2 for i are based on the concept of a “row change timestamp. ” A “row change timestamp” is nothing more than a timestamp column that is automatically updated whenever a table row is changed.

Consider the following table definition that has common auditing fields to track when a row is inserted and changed:

Create Table AppData/DataTable 
(RowId Integer As Identity Primary Key, 
 Data1 VarChar(1000) Not Null, 
 AddStamp Timestamp  Not Null Default Current_Timestamp,
 ChgStamp Timestamp  Not Null 
          FOR EACH ROW ON UPDATE AS ROW CHANGE TIMESTAMP)

Column ChgStamp is defined with a timestamp data type and with a special “row change timestamp” attribute that signals DB2 to automatically populate the column with the current timestamp whenever a row is updated. This will save developers a little time in that they will no longer have to explicitly code an update to record the timestamp when a row is changed. When a new row is inserted the “row change timestamp” column will record the time the row was added. Keep in mind that there is only one column per table allowed to have the “row change timestamp” clause specified.

Having a row change timestamp defined on a table will help with applications that use optimistic locking. Before updating a row, the application can simply compare its internal “row change timestamp” column against the one in the table to determine if the row has been changed by someone else.

A new expression called a “row change expression” allows developers to easily query for “changed” rows on a table that has a “row change timestamp. ” Here is an example based on the prior table definition:

Select *
  From AppData/DataTable DT
 Where Row Change Timestamp For DT>=Current_Timestamp-21 Days

If the column defined with the row change timestamp is called ChgStamp, the prior query is essentially equivalent to the following:

Select *
  From AppData/DataTable DT
 Where DT.ChgStamp>=Current_Timestamp-21 Days

The benefit of the row change expression is that it allows developers to write queries (or querying tools) that can query for modified rows without having to know the “row change timestamp’s” column name.

The row change expression can also be used in a SELECT column expression:

Select RowId,Row Change Timestamp For DT As "Last Change"
  From AppData/DataTable DT

The “Last Change” column can actually be used on any table that has a “row change timestamp” defined. The syntax is somewhat ugly but it is generic so it can be used consistently. In the above example, “For DT” is used to specify which table to reference in the event that the query has multiple tables having a row change timestamp.

One related item is something called a “row change token. ” Following is an example of a row change expression using a token rather than a timestamp.

Select CusNum,Row Change Token For Cust As "Change Token"
  From QIWS/QCUSTCDT Cust

The token is a BIGINT data type that returns a relative “update value” that is supposed to somehow indicate a sequence of when a row was changed relative to other rows. This token feature is supposed to be used for tables with or without a row change stamp column. When a row change timestamp is present, the token is derived from the row change timestamp column.

For legacy tables that do not have a row change timestamp column, the token is supposed to be less accurate but still offer a relative picture of when a row was changed. In the small amount of tinkering I did using an old legacy table, I did not see much usefulness for this token feature on a legacy table.

In summary, the “row change timestamp” offers a nice “automatic” auditing feature. Used in conjunction with the new CLIENT special registers, coding mundane auditing functions just became easier. Further, the new timestamp formatting functions facilitate the import and export of DB2 timestamps. Finally, the ASCII, CHR and encryption functions make coding utility tasks easier than ever.

Michael Sansoterra is a programmer/analyst for i3 Business Solutions, an IT services firm based in Grand Rapids, Michigan. Send your questions or comments for Mike to Ted Holt via the IT Jungle Contact page.