SQL Goodies in DB2 for i5/OS V5R4, Part 2
May 3, 2006 Michael Sansoterra
A few weeks ago, I walked you through some of the enhancements that IBM has made in the SQL programming language and its DB2/400 database with the i5/OS V5R4 release, which came out in January. As I explained in the first part of this story, I have been watching SQL enhancements like a hawk since OS/400 V4R2 was announced many years ago. I am always eager to put the improvements IBM has given us into practice. The SQL in V5R4 has lots of new stuff.
New SQL Descriptor Statements
DESCRIBE INPUT is a new statement for “bit heads” that will allow the dynamic analysis and manipulation of parameters in a dynamically prepared SQL statement (e.g., INSERT INTO TABLEA VALUES(?,?,?) ) If you’ve ever had the need to execute parameterized SQL statements without knowing the parameter data types until run time, then this statement is for you. Also, prior to V5R4 SQL routines were not able to take advantage of the various DESCRIBE operations because SQL routines had no access to a descriptor area.
An SQL Descriptor is simply an area in memory dedicated to handling the results of PREPARE or DESCRIBE statements. Several new statements are available for working with SQL Descriptors: ALLOCATE DESCRIPTOR, GET DESCRIPTOR, SET DESCRIPTOR and DEALLOCATE DESCRIPTOR.
For SQL routines, the results of DESCRIBE INPUT (i.e. information about the parameters) are placed in an SQL Descriptor Area. For embedded SQL programs the output of DESCRIBE INPUT can be the new SQL Descriptor Area or the “old style” Descriptor Area, which in RPG is implemented as a pre-compiler generated data structure. It is confusing that the old style SQLDA and the new SQL Descriptor Area are both referred to by the same name. See Appendix D of the SQL Reference for a description of the old style data structure for a host program of a given language. (For a simple example of an embedded SQL RPG that uses PREPARE and the old style descriptor area, see Pivot Data with SQL/400.) For now we will only consider the new SQL Descriptor Area.
ALLOCATE DESCRIPTOR sets aside memory to be used for an SQL statement containing a maximum of n parameters (or columns) and gives it a name. The DESCRIBE INPUT statement, for example, would put the information about the dynamic parameters of a statement into the named descriptor. GET DESCRIPTOR would then be used to iterate through the collection of parameters to discern things like data types. SET DESCRIPTOR can be used to place appropriate data into the parameters using host variables.
Of course if you want to run a parameterized SQL statement then you’ll need to assign values to the parameters. When you set the value for a given parameter using SET DESCRIPTOR also set the parameter attributes (data type, CCSID, length, scale, precision) to match the host variable in your program, otherwise an error will occur. If you think that needing to know the parameter data type and attributes has just ruined the whole purpose of setting parameter values at run time (which can be difficult to do in languages like RPG), don’t forget that DB2 in V5R3 and later can cast VARCHAR to just about any data type automatically. This means if you have to set the values for three numeric parameters you can load them with data from VARCHAR host variables.
If you’re not familiar with implicit casting, try the following test on V5R3 or higher:
Create Table QTemp/Test (F1 Dec(13,5)) Insert Into QTemp/Test Values('243.24356')
The insert will not fail even though a VARCHAR constant is being inserted into a decimal field because DB2 changes the type to decimal. Of course the VARCHAR data has to contain the correct representation of the target data type.
Likewise, you can try the following numeric column comparison against a VARCHAR literal:
Select * From SysTables Where ColCount>'128'
This is pretty powerful stuff when doing dynamic parameter processing. In V5R2 and below, you’ll get an “operands not compatible” error message for this statement. I scoffed when I first saw this concept in V5R3 as I remember thinking this automatic casting would only allow people to write bad queries!
Finally, when the program is done with the descriptor, issue the DEALLOCATE DESCRIPTOR statement to clean up the allocated memory. See the V5R4 SQL Reference for detailed explanations of the GET_DESCRIPTOR and SET_DESCRIPTOR statements.
As mentioned already, DESCRIBE INPUT will be useful in applications that require ad-hoc querying or updating of the database. The new SQL descriptor functions allow all of the DESCRIBE (DESCRIBE, DESCRIBE TABLE, DESCRIBE INPUT) statements to now be used within SQL based procedures and functions. Previously, use of the descriptor area was limited to embedded SQL applications.
Incidentally, related statements such as PREPARE and DESCRIBE can be used with the new SQL Descriptor Area to describe the columns in a SELECT statement and OUT/INOUT parameters in a CALL statement.
Generate_Unique(): This function returns a 13 byte character field that is guaranteed to be unique. The text of this function is tagged with CCSID 65535; and don’t bother trying to translate it because you won’t get anything legible. However, when used with the HEX function, the result looks like the following:
Unlike GUIDs (globally unique identifiers) which are random in nature, the Generate_Unique() function does follow a sequence based on time. The CURRENT_TIMESTAMP function is only evaluated once during the execution of an SQL statement. However, GENERATE_UNIQUE is evaluated multiple times during a SQL statement. This can be beneficial for doing multiple row inserts when you want each row tagged by date/time.
Further, after inserting rows in a table with a column designed for GENERATE_UNIQUE(), you can use the timestamp function to retrieve the date/time when the row was inserted. Consider the following simple example:
Create Table TimeData (TimePunch Char(13) For Bit Data, EmpNo Int Not Null, Hours Dec(5,2) Not Null); Insert Into TimeData Values(Generate_Unique(),1,6.0),(Generate_Unique(),2,2.0), (Generate_Unique(),3,8.0),(Generate_Unique(),4,8.0); Select TimeStamp(TimePunch) As TimePunch, EmpNo, Hours From TimeData; TIMEPUNCH EMPNO HOURS 2006-02-27-23.46.11.045920 1 6.00 2006-02-27-23.46.11.050980 2 2.00 2006-02-27-23.46.11.054431 3 8.00 2006-02-27-23.46.11.057966 4 8.00
Generate_Unique() allows characteristics of an ascending unique value and timestamp to be combined into a single column and should prove beneficial when creating SQL based applications. This new function can be used to store date/time information in a more compact format than the long 26 byte timestamp data type.
Add_Months(expression, numeric-expression): This function, as its name implies, will add the given number of months to a date, timestamp or character representation of either data type. It is similar to the normal SQL date arithmetic with the exception of how it handles the end of the month as shown below:
Select Add_Months('2001-02-28',4), Date('2001-01-28') + 4 Months From SysIBM/SysDummy1
As you can see, the end of the month is accounted for differently as the Add_Months function recognizes that Feb 28, 2001 is the end of a month and hence accounts for that fact when returning its result.
Last_Day(expression): This handy function accepts a date, timestamp or character representation of a date or timestamp and will return a date data type for the last day of the month–no more ugly expressions to do this!
Select Last_Day('2000-02-01') As EndOfMonth From SysDummy1
Next_Day(expression, string-expression): This function accepts a date, timestamp or character representation of a date or timestamp and will return a timestamp for the next day name after the date specified.
For example, to find the first Friday after Jan 7, 2006, simply enter:
Select Next_Day('2006-01-07','Friday') As NextDay From SysDummy1
By implication you could just as easily find the preceding Friday by subtracting seven days from the result.
The second parameter for the day name can either be the full day name or the three character abbreviation (MON, TUE, WED, etc.) For languages other than English, see the message descriptions for CPX9034 and CPX9039 in message file QCPFMSG to retrieve your day and abbreviation names.
Encrypt_TDes is similar to Encrypt_RC2 (released in V5R3) with the exception that it uses Triple DES encryption. Impressive, huh? I’ll refer interested readers to V5R3 SQL Enhancements, a prior story I did, which discusses Encrypt_RC2 and the related password and password hint functions.
Varchar_Format(expression, format string): This function accepts a timestamp (or character representation) and a format string and returns a character representation of the timestamp. I was excited when I first saw this function because I figured the format string would allow developers to create timestamp representations of their choosing as can be done in other languages. But on closer inspection, I found that there is only one format string that is currently accepted: ‘YYYY-MM-DD HH24:MI:SS’. When you run the function, you’ll only be allowed to do something like this:
Select VarChar_Format(Current_TimeStamp,'YYYY-MM-DD HH24:MI:SS') From SysDummy1
RAISE_ERROR(sqlstate,diagnostic string): This is a new function designed to purposely create an error condition with a user defined SQL State and message text (the function will always return SQL code-438.) I’m guessing this was provided as an easier alternative to raise an error than using the SIGNAL statement.
The best use for the function I could think of would be to conditionally crash a Select if an application error condition occurred while the statement was running (if, say, it came across invalid data.) However, I found that I could not use RAISE_ERROR in a CASE statement. I also found that I couldn’t use it in a COALESCE statement. So, it is a mystery to me as to why this was implemented as a function! Anyway, if someone thinks of a good use for this, please let me know. I’ll just throw out the manual’s example since none of mine worked:
CREATE TRIGGER EMPISRT1 AFTER INSERT ON EMPLOYEE REFERENCING NEW AS N FOR EACH ROW MODE DB2SQL BEGIN ATOMIC IF N.BONUS > 20000 THEN VALUES(RAISE_ERROR('ZZZZZ','Incorrect bonus')); END IF; END
Incidentally, the function returns a NULL with an undefined data type.
STDDEV_SAMP and VARIANCE_SAMP are new aggregate functions like SUM and AVG, that is, they operate on a set of rows. The STDDEV_SAMP function returns the sample standard deviation (/n-1) of a set of numbers and the VAR_SAMP function returns the sample variance (/n-1) of a set of numbers. There you have it, right from the manual. Unfortunately, never having taken statistics I don’t have a clue as to why one would use these. However, for those fortunate souls who have a need for these new functions, live it up!
New Special Registers
Register Session_User appears to be synonymous with the User register and simply returns the run time authorization id (i.e. current user) of the current user connected to the database.
Register System_User returns the authorization id of the user connected to the server (in other words, the job user.) For example, the following SELECT:
Select System_User, Session_User From SysIBM.SysDummy1
from an ODBC job run by user MIKE returns:
because ODBC host server jobs QZDASOINIT run under QUSER by default.
As already covered, CURRENT DEBUG MODE and CURRENT DEGREE are also new special registers.
There are a few more SQL enhancements to be discussed. In the future, I still need to cover recursive queries, new OLAP ranking functions, and Materialized Table Queries.
Michael Sansoterra is a programmer/analyst for i3 Business Solutions, an IT services firm based in Grand Rapids, Michigan. Click here to contact Michael Sansoterra by e-mail.