The NULL Nemesis
by Michael Sansoterra
After writing "The Valuable NULL," which enumerated the benefits and snares of using NULLs in an SQL environment, I received an interesting response from a reader. He stated that NULLs were more trouble than they're worth because they usually require extra logic to process correctly.
Though it's true that SQL programmers have more to think about when dealing with NULL-capable fields, the benefits of NULLs are worth the extra effort, provided one understands how NULLs behave in expression and conditional operations. (Before I understood these features, NULLs were a great source of grief!) But the reader's statement about NULLs requiring special logic and being an all around pain is certainly true when using NULL-capable fields with high-level languages like C, RPG, or COBOL.
The reason for this difficulty in dealing with NULLs is that high-level languages do not recognize NULL values (most older HLLs were developed before databases allowed for NULLs). Here's an illustration of the problem. Say a database table contains a field called CONTACTNAM, defined as CHAR(30), which allows NULL values. A HLL program reads this database table and encounters a record where the field CONTACTNAM is NULL. Since variables in an HLL program don't know what a NULL is, what value is placed in the variable? In this case, the answer is a blank. In general, when a database field contains a NULL, the HLL program will contain the field's "low value." For example, NULL character fields will contain a blank, NULL numerics will contain a zero, and NULL dates will contain the lowest possible date value for the specified date format (for example, 0001-01-01 for a date defined with the *ISO date format).
Technically, these low-value assignments are incorrect, because, as described my last article, a NULL represents a missing or unknown value. If a numeric field is NULL, which is different from a zero, how can the HLL program know the difference? To allow HLLs to recognize NULL values, a special indicator is used to designate that the variable is NULL. These special "indicators" take the form of variables or specialized built-in functions. It is the responsibility of the HLL programmer to go to the extra effort of checking these NULL indicators to distinguish if the value in the variable is actually the default low value or a NULL. This article focuses on using these indicators in a high-level language to correctly work with NULLs.
Defining NULL-Capable Fields
Here's 'a quick review of how fields in database files are defined to hold a NULL. Shown below are the equivalent SQL and DDS statements needed to create a simple "orders" file. The only field allowed to contain a NULL in this file is ShipDate:
SQL CREATE TABLE Orders ( OrderID INTEGER NOT NULL, CustomerID CHAR(5) NOT NULL, OrderDate DATE NOT NULL, ShipDate DATE, Unique (OrderID)) DDS A UNIQUE A R ORDERREC A ORDERID 9B 0 A CUSTOMERID 5A A ORDERDATE L DATFMT(*ISO) A SHIPDATE L DATFMT(*ISO) A ALWNULL A K ORDERID
In SQL, fields are assumed to be NULL-capable (meaning they are allowed to contain a NULL) unless the NOT NULL keywords are specified. However, fields defined in DDS are assumed not to be NULL-capable unless the ALWNULL keyword is specified.
When defining files, only allow fields to contain NULLs where it makes sense to have NULLs. Don't unnecessarily create NULL-capable fields, because each time a field is NULL-capable extra logic may be required. In general, if the program has the ability to determine the value for a particular field, then the field should not be NULL-capable. For example, a field such as order date should not be NULL-capable, because the program will always know the date the order was taken. However, a ship date field can be "NULL-capable", because the program may not know the actual ship date at the time the order is taken and the program shouldn't shove an arbitrary date value in the field to represent this "unknown."
Now that you know how NULLs are defined, take a look at how they're accessed, with languages such as RPG.
RPG/400 and NULLs
RPG/400 doesn't support processing NULLs in a database file. If a file contains NULLs, specifying the ALWNULL(*YES) compiler option on the Create RPG Program (CRTRPGPGM) command allows the program to access the file as input only, with the caveat that all NULL-capable fields contain the "default" value when a NULL is encountered. This means the RPG/400 program will have no way of distinguishing a NULL from a blank, for example. Only use the ALWNULL(*YES) compiler option if the program will process a field's default value the same as it would if it were a NULL. To be able to work with "nullable" database fields in an RPG/400 program, you must use embedded SQL with indicator variables (more on this later).
RPG IV and NULLs
Fortunately, RPG IV has NULL support for database fields that are NULL-capable. The %NULLIND built-in function is used to test if a field is NULL, or to set a field to NULL or not NULL. The %NULLIND BIF only works with NULL-capable database fields; you can't define a stand-alone field on the D-spec and use %NULLIND to set it to NULL. Program-described files can not contain NULL-capable fields. As with RPG/400, to process files with NULL-capable fields, the ALWNULL keyword must be specified at compile time or as an H-spec keyword. RPG IV's version of ALWNULL accepts one of the following values:
To illustrate, assume we'll be working with the "orders" table defined above, and it contains a NULL-capable field called ShipDate. We need to write an order-inquiry program that does something special if ShipDate contains a NULL (that is, if the order hasn't been shipped). Consider the following code excerpt that demonstrates how %NullInd is used to test whether the field ShipDate is NULL:
H ALWNULL(*UsrCtl) FOrders IF E K Disk . . . C/Free ExFmt GetOrderScreen; Chain OrderID Orders; If %Found; If %NullInd(ShipDate)=*On; // Order hasn't been shipped EndIf; Else; // Order Not Found EndIf; ...
As seen in the above example, %NULLIND returns an indicator (Boolean) value of *ON or *OFF to indicate the presence of a NULL in a field. Additionally, when used on the left side of an expression, %NULLIND can be used to manipulate whether a field contains a NULL:
// Set ShipDate to NULL Eval %NullInd(ShipDate)=*On; // Set ShipDate to a value Eval %NullInd(ShipDate)=*Off; Eval ShipDate=d'2003-08-01';
For clarification, don't attempt to use RPG's *NULL special value to set database fields to NULL, because *NULL is reserved for working with pointers. Also note that placing a value in a field with %NULLIND set to *ON is pointless:
// Set ShipDate to NULL Eval %NullInd(ShipDate)=*On; // If this file were // opened for update, // this statement has // no effect. Eval ShipDate=d'2003-08-01'; Update OrdersRec;
In the last example, the second EVAL statement (ShipDate=d'2003-08-01') is useless, as far as the database manager is concerned. If the file is updated when %NULLIND is on, any date value placed in ShipDate is ignored. However, as far as normal RPG processing is concerned, if ShipDate is moved to another field, the date value has meaning, because %NULLIND is only applicable, in the final outcome, to the database manager. Because of this behavior, it is important to implement special logic to check the %NULLIND before assigning or reading values from NULL-capable fields, because RPG itself doesn't pay any attention to %NullInd when doing its normal MOVE, MOVEL, and EVAL operations. The ILE RPG Reference guide has a section entitled "Database Null Value Support," which details a plethora of implications related to this behavior.
Keyed operations (like SETLL, READE, and CHAIN) involving one or more NULL-capable key fields are simple, as long as your key list is defined correctly. In this next example, the orders file has an index called OrdersShip built over it by ShipDate and CustomerID:
CREATE INDEX OrdersShip ON Orders (ShipDate, CustomerID)
An RPG program needs to be written to report all orders that have not shipped (that is, the ShipDate contains a NULL). This next section of code demonstrates how to search for NULL values in a key field:
H ALWNULL(*USRCTL) FOrdersShipIF E K DISK DCount S 7 0 DShipDateNull S 1N C Key_ShipDate KList C KFld ShipDateNull ShipDate C/Free // Read all records where // the shipdate is NULL // (When the NULL indicator is on // the actual content of the ShipDate // field is ignored.) ShipDateNull=*On; SetLL Key_ShipDate OrdersShip; ReadE Key_ShipDate OrdersShip; Dow Not %EOF; Count=Count+1; // Process Records with a shipdate of NULL ReadE Key_ShipDate OrdersShip; EndDo; *InLR=*On; Return; /End-Free
The only thing special to note here is the key field (KFLD) definition. Notice that Factor 2, which is normally left blank for the KFLD op code, has an indicator type variable called ShipDateNull specified. This indicator variable controls whether the RPG program should look for NULL values. In this case, since we're searching for records with NULL ship dates, the ShipDateNull indicator variable is turned *ON. To position to a non-NULL value, turn the indicator *Off and set the ShipDate field appropriately. If Factor 2 is omitted from the KFLD definition on a NULL-capable field, the program will have no way to look specifically for a NULL value. OS/400 V5R2 users will be disappointed to learn that the new %KDS built-in function doesn't support NULL positions using an indicator, so for now the traditional KFLD op code must still be used.
When doing keyed positions and reads, remember that, on the iSeries, NULLs are at the bottom of the sort order. If the NULL indicator in Factor 2 is *On, the database manager will ignore any values specified in the ShipDate field and skip right to the NULLs in the file, if any are present. If there are no NULLs to read, the End Of File (EOF) indicator will be turned on, because NULLs come last.
High-Level Languages, Embedded SQL, and NULLs
iSeries shops that have the "DB2 Query Manager and SQL Development Kit" installed (product 5722ST1) can use embedded SQL to access NULL-capable fields from inside an HLL program. Embedded SQL is too large a topic to be covered here,. but if you need help getting up to speed with embedded SQL review "Empower Users with Embedded SQL" or "Dynamic Selection with Embedded SQL."
Here is an RPG-embedded SQL program snippet that retrieves the ShipDate field from the orders file for a given OrderID:
D RPGShipDate S D D RPGOrderID S 10I 0 C Eval RPGOrderID=11074 C/EXEC SQL C+ SELECT ShipDate C+ INTO :RPGShipDate C+ FROM Orders C+ WHERE OrderID=:RPGOrderID C/END-EXEC C Select C When SQLCOD=*Zero … OK C When SQLCOD=100 … Data not found C Other … Error/Warning given C EndSl
In case you're unfamiliar with embedded SQL, here's a quick review.
The concept is simple enough. The ShipDate for the given OrderID is placed in program variable RPGShipDate. But what if ShipDate contains a NULL in the file? The way the program is coded now, when the statement retrieves a NULL the SQLCOD will contain error code -305, which has the primary error description of "indicator variable required."
This message indicates that SQL can't return the NULL value into a HLL program variable without the assistance of an "indicator" variable. An indicator variable is simply an extra variable, defined with an SQL data type of SMALLINT, containing either a zero or a negative 1, to indicate if the given variable is NOT NULL or NULL, respectively. Here's the revised code using an indicator variable that will behave correctly if a NULL is read:
D ShipDateNull S 5I 0 D NULL S 5I 0 Inz(-1) C Eval OrderID=11074 C/EXEC SQL C+ SELECT ShipDate C+ INTO :RPGShipDate:ShipDateNull C+ FROM Orders C+ WHERE OrderID=:OrderID C/END-EXEC C Select C When SQLCOD=*Zero C If ShipDateNull=NULL … ShipDate is NULL – Process accordingly C Else … ShipDate is NOT NULL – Process accordingly C EndIf
When using indicator variables in embedded SQL, the ShipDateNull RPG variable is appended immediately after the variable name. As with the %NULLIND built-in function, NULL indicator variables demand that an extra step be taken to check if a variable is NULL, rather than a blank or a zero. When using embedded SQL, every potential field that can have a NULL should have an accompanying NULL indicator variable. If knowing the state of the NULL is unimportant to the HLL program, the COALESCE or IFNULL SQL functions can be used, to negate the need for an indicator variable by guaranteeing that the column will never be NULL.
It has already been noted that NULL indicator variables are supposed to be defined as the equivalent of the SQL SMALLINT data type (a two-byte binary field). In RPG IV, this can be coded by defining a field as "5I 0" on the D-specs. To determine how to define the SMALLINT data type in another language, see the DB2 Universal Database for iSeries SQL Programming with Host Languages guide (PDF format). This guide covers many facets of embedded SQL for each particular high-level language and includes charts on how to define a given HLL data type to match a corresponding SQL data type.
The Nagging Nulls
Using NULLs with high-level languages can be a bit of a chore. But as more applications use NULL-capable fields, remembering some simple rules about NULL indicator variables will allow high-level-language programs and NULLs to coexist peacefully.
Michael Sansoterra is a programmer/analyst for SilverLake Resources, an IT services firm based in Grand Rapids, Michigan. E-mail: firstname.lastname@example.org
Contact the Editors
|Copyright © 1996-2008 Guild Companies, Inc. All Rights Reserved.|