Treasury of new DB2 6.1 (V6R1) Features, Part 5: New Functions and Change Stamp Columns
April 1, 2009 Michael Sansoterra
Today’s tip addresses the new built-in functions available in DB2 for i (a.k.a. DB2 for i5/OS, SQL/400, and DB2 UDB for iSeries), change stamp columns and row change expressions.
ASCII and CHR Functions
The ASCII and CHR functions are great additions to DB2 for i. The ASCII function will take the left most character of a string expression and return the ASCII character code equivalent. Take the following example:
Select ASCII(' ') From SysIBM/SysDummy1
This command will return a value of 32 (ASCII code for space) instead of the expected EBCDIC value of 64.
The CHR (character) function behaves the opposite of ASCII. It accepts a code (number between 0 and 255) and returns the ASCII character. For example:
Select CHR(48) From SysIBM/SysDummy1
This code returns the character for ASCII code 48: ‘0’. The expression CHR(65) as shown here:
Select * From (Values(CHR(65))) ASCII (CHRCODE)
This command returns an ‘A’.
While these function will be helpful in helping EBCDIC developers better handle ASCII conversion issues, I think the greater use for them will be in cross-platform compatibility as many developers on other platforms use these functions to perform various tricks and tasks.
For instance, if you find yourself needing to write an SQL function to convert binary data to Base64 or to encode or decode an HTTP URL, then there’s no sense in re-inventing the wheel because there are already many examples available on the Internet. Many of these custom-written functions use ASCII or CHR type functions. As I discussed in a prior FHG tip, when searching the Internet for “already invented code,” don’t limit yourself to looking for DB2 code. As the tip demonstrates , it is often relatively easy to locate and convert many SQL Server T-SQL (or other database) functions to DB2.
New Encryption Options
V5R3 introduced the first encryption function into the DB2 for i5/OS SQL dialect, the ENCRYPT_RC2 function. V5R4 followed suit with ENCRYPT_TDES and now V6R1 gives us ENCRYPT_AES (Advanced Encryption Standard). Simply put, ENCRYPT_AES allows DB2 to encrypt data using the industry standard (and popular) AES encryption algorithm. It’s good to know that more options are becoming available in this area as it makes encrypted data increasingly easier to work with.
For example, if I have a .NET or SQL Server application that encrypts a credit card number using AES, I can securely pass that value for storage directly to DB2 for decryption at a later time. (Assuming the key is known from the .NET system!) Likewise, if a business partner mandates data exchanges are to be encrypted using AES, DB2 can help with this request.
For more information on the encryption functionality in DB2, see the “Encryption Functions” notes on the V5R3 SQL enhancements article. This section discusses how to encrypt, decrypt, set a password, and even set and retrieve a password hint.
This new function accepts two dates and returns a fractional approximation of the number of months between the two dates. The result type of this function is DEC(31,15).
MONTHS_BETWEEN ( expression1 , expression2 )
In this case expression1 can be considered the “ending date” and expression2 the “starting date. ” If expression1 is greater than expression2, then a negative value is returned.
The number of months between St. Patrick’s Day and Thanksgiving of 2007 can be calculated with this expression:
The result is 8.1613 months. Remember, the advantage of MONTHS_BETWEEN as opposed to the related TIMESTAMP_DIFF function is that MONTHS_BETWEEN gives a fractional result.
This function returns the relative record number of a row for a given table identifier.
The new RID function appears to behave the same as the relative record number (RRN) function with the exception that it returns its result as a BIGINT data type. A BIGINT requires 8 bytes of storage. For the record, RRN returns a DEC(15,0) data type, which also requires 8 bytes of storage but can’t store the range of numbers that BIGINT can. Honestly I haven’t approached a table size approaching 10 trillion rows, but maybe someday! For compatibility with the DB2 family and for safety’s sake (I used to laugh at querying a 1 million row table), it is probably best to start using RID instead of RRN.
Also keep in mind it is a little dangerous working with relative record numbers because they can change when a physical file is reorganized. Personally, I don’t use relative record number functions except when performing some form of trickery. When I do use it, the queries are usually for testing or utility purposes–not for an application. If you need a counter in your query to place row numbers in your result set, consider using the ROW_NUMBER OLAP function.
Timestamp Functions: Timestamp_Format and To_Date
The Timestamp_Format accepts a string representation of a timestamp and a formatting code, and returns an actual timestamp. This new function will be a blessing to anyone who has had problems importing date/time data from an external application. For instance, say you have imported a spreadsheet that contains timestamps that look like this: ’08/17/2008 10:30′.
Timestamp_Format will aid converting this specific string format to an actual timestamp data type by telling DB2 how the string representation is formatted. The formatting code is built using any of the following format units:
To convert the above string example ’08/17/2008 10:30′ to a timestamp, we need to pass the string representation value and a formatting code of ‘MM/DD/YYYY HH24:MM’. Here is an example:
Select Timestamp_Format('08/17/2008 10:30', 'MM/DD/YYYY HH24:MI') From SysIBM/SysDummy1
This code returns a timestamp value of: 2008-08-17-10.30.00.000000. Now that you have an actual timestamp, it can participate in date/time arithmetic or be inserted into a timestamp column, etc. Note that the hour value is expected to be from 0 to 23 as AM/PM indicators are not yet accepted by the Timestamp_Format.
All of the rows of a character-based timestamp column should have the exact same format. If this is not the case, the function will throw an error. For example, if 99 percent of the dates are formatted as MM/DD/YYYY HH24:MI but a few are formatted as MM-DD-YYYY HH24:MI then the function will choke when it hits the row with the alternative format since the separator characters don’t match.
TO_DATE is an alternative function name that performs the exact same feature as Timestamp_Format.
Introduced in V5R4, VARCHAR_FORMAT is another helper function that allows the conversion of a timestamp expression into a user-defined string representation. This is similar to the date formatting features available in many languages such as .NET, VBA, and Java. V6R1 has enhanced the formatting capability of the function by accepting any of the following format units:
As you can see, it is “easy pie” to transform an ugly DB2 timestamp into a variety of character formats. If you have a timestamp column named Ship_Stamp and you want to display it in the MM/DD/YYYY HH:MM format, you can specify the following expression:
VARCHAR_FORMAT(Ship_Stamp, 'MM/DD/YYYY HH24:MI') As Ship_DateTime
Even more valuable is the ability to easily create a Julian date format. Traditionally this used to require creation of a user-defined function or a complex expression. Ideally, IBM will enhance the TIMESTAMP_FORMAT function to interpret a Julian date value. This would allow these two IBM functions to convert an actual timestamp to Julian and vice-versa.
One other format unit I’d like to see added to the list is CYY where C is a century indicator (0=20th, 1=21st, etc.) and YY represents a two-digit year. This would allow developers to convert from the oft used numeric date formats in ERP applications such as MAPICS (CYYMMDD) and JD Edwards (CYYDDD).
This function is useful when exporting data to an external application that requires timestamps to be formatted in a certain way. It seems like I’m always exporting data to various PC applications and each one requires the timestamps or dates to be formatted in a special way. (Of course, no applications I know about want a default DB2 for i CHAR(26) timestamp format!) Another use for this function is for various reporting tasks where the date/time data needs to be formatted in a certain way.
The best part about it this function is that its format expression can be passed as a parameter so that an application can dynamically change a timestamp format based on user locality or preference. To see what I mean, paste this sample SQL into your favorite SQL utility and watch the CURRENT_TIMESTAMP register format two different ways:
Select CurTime,Format,VarChar_Format(CurTime,Format) As SampleOutput From (Values('HH24:MI:SS YYYY-MM-DD',Current_Timestamp), ('MM/DD/YYYY HH24:MI',Current_Timestamp)) Demo(Format,CurTime)
This SQL will return the current time formatted with the time first followed by an ISO date representation. The second time will be formatted as a USA date plus the hours and minutes.
One last thing: Keep in mind that a DATE column can be used by this function as well, but it must be cast to a timestamp data type first.
ROUND_TIMESTAMP and TRUNC_TIMESTAMP
ROUND_TIMESTAMP can be used to round a timestamp expression to the nearest unit, where the unit represents a portion of the timestamp (year, day, month, etc.). Similarly, TRUNC_TIMESTAMP accepts a timestamp expression and a unit (month, day, hour, etc.) and truncates the timestamp to that unit.
A format unit is again used to tell DB2 what unit of time it should round or truncate. Filched right from IBM’s Web site, here is a table of format units that you can use with this function:
For example, to round a timestamp to the nearest hour, simply specify:
The CURRENT_TIMESTAMP register yields a timestamp value with precision down to the nearest millisecond. Often in these cases it is convenient to just drop the millisecond values by rounding or truncating to the nearest second as follows:
ROUND_TIMESTAMP(Current_Timestamp, 'SS') TRUNC_TIMESTAMP(Current_Timestamp, 'SS')
As for “real life” usefulness, I think this function will be indispensible to aid developers in finding the start of the current quarter or the next Sunday. For example, to find the beginning of the current quarter use the following expression:
Row Change Timestamp Column and Row Change Expressions
Another suite of timestamp related enhancements to DB2 for i are based on the concept of a “row change timestamp. ” A “row change timestamp” is nothing more than a timestamp column that is automatically updated whenever a table row is changed.
Consider the following table definition that has common auditing fields to track when a row is inserted and changed:
Create Table AppData/DataTable (RowId Integer As Identity Primary Key, Data1 VarChar(1000) Not Null, AddStamp Timestamp Not Null Default Current_Timestamp, ChgStamp Timestamp Not Null FOR EACH ROW ON UPDATE AS ROW CHANGE TIMESTAMP)
Column ChgStamp is defined with a timestamp data type and with a special “row change timestamp” attribute that signals DB2 to automatically populate the column with the current timestamp whenever a row is updated. This will save developers a little time in that they will no longer have to explicitly code an update to record the timestamp when a row is changed. When a new row is inserted the “row change timestamp” column will record the time the row was added. Keep in mind that there is only one column per table allowed to have the “row change timestamp” clause specified.
Having a row change timestamp defined on a table will help with applications that use optimistic locking. Before updating a row, the application can simply compare its internal “row change timestamp” column against the one in the table to determine if the row has been changed by someone else.
A new expression called a “row change expression” allows developers to easily query for “changed” rows on a table that has a “row change timestamp. ” Here is an example based on the prior table definition:
Select * From AppData/DataTable DT Where Row Change Timestamp For DT>=Current_Timestamp-21 Days
If the column defined with the row change timestamp is called ChgStamp, the prior query is essentially equivalent to the following:
Select * From AppData/DataTable DT Where DT.ChgStamp>=Current_Timestamp-21 Days
The benefit of the row change expression is that it allows developers to write queries (or querying tools) that can query for modified rows without having to know the “row change timestamp’s” column name.
The row change expression can also be used in a SELECT column expression:
Select RowId,Row Change Timestamp For DT As "Last Change" From AppData/DataTable DT
The “Last Change” column can actually be used on any table that has a “row change timestamp” defined. The syntax is somewhat ugly but it is generic so it can be used consistently. In the above example, “For DT” is used to specify which table to reference in the event that the query has multiple tables having a row change timestamp.
One related item is something called a “row change token. ” Following is an example of a row change expression using a token rather than a timestamp.
Select CusNum,Row Change Token For Cust As "Change Token" From QIWS/QCUSTCDT Cust
The token is a BIGINT data type that returns a relative “update value” that is supposed to somehow indicate a sequence of when a row was changed relative to other rows. This token feature is supposed to be used for tables with or without a row change stamp column. When a row change timestamp is present, the token is derived from the row change timestamp column.
For legacy tables that do not have a row change timestamp column, the token is supposed to be less accurate but still offer a relative picture of when a row was changed. In the small amount of tinkering I did using an old legacy table, I did not see much usefulness for this token feature on a legacy table.
In summary, the “row change timestamp” offers a nice “automatic” auditing feature. Used in conjunction with the new CLIENT special registers, coding mundane auditing functions just became easier. Further, the new timestamp formatting functions facilitate the import and export of DB2 timestamps. Finally, the ASCII, CHR and encryption functions make coding utility tasks easier than ever.
Michael Sansoterra is a programmer/analyst for i3 Business Solutions, an IT services firm based in Grand Rapids, Michigan. Send your questions or comments for Mike to Ted Holt via the IT Jungle Contact page.