Case-Sensitive SQL Identifiers

December 1, 2010 Hey, Ted

Like most people who use SQL, I have never bothered worrying about which case a field name is written in. Case never seemed to matter. However, while performing a data fix, I discovered an instance where it does. (Fortunately, I was using test data!)

I had a number of files from which records were to be deleted. The key values of the records to be deleted were held in a spreadsheet. There were three fields, one of them called VERSION, in the files I was deleting from.

I tried to create a table to store the contents of the spreadsheet, and named the fields in the same way.

create table getridof
(key1 char(4), key2 char(12), version dec (3,0))

I received a message saying that VERSION was a reserved word and needed to be delimited, so I put quotes around the name, without even thinking about the fact I had “version” in lower case.

I performed my DELETEs inside an SQL RPG program.

DELETE FROM FILE1
 WHERE KEY1 || KEY 2 || DIGITS("VERSION") IN 
       (SELECT KEY1 || KEY2 || DIGITS("VERSION")
          FROM GETRIDOF)

That should work fine, shouldn’t it? No, it doesn’t, because the delimited field on my GETRIDOF file was not called “VERSION”, but “version”, so it actually used the value of VERSION in FILE1! So therefore, any matches between just the first two fields resulted in a delete, instead of matching all three keys. If, however, I had used “version” all would have worked fine!

It now seems obvious, but I thought it was worth reporting to you as something to warn other readers about.

–Mike

You have run into the two types of SQL identifiers, Mike. In IBM literature, they are known as ordinary and delimited identifiers. In other literature, you may see them referred to as unquoted and quoted identifiers. You may read about identifiers in the IBM i Infocenter.

An ordinary identifier consists of a letter followed by letters or digits. These are the identifiers most of us typically use. Delimited identifiers begin and end with double quotation marks and allow a more liberal character set. You may use both types of identifiers in column, table, index and view names.

Here’s an example. The table name is delimited. Notice the apostrophe and space in the table name. The table has three ordinary columns and three delimited columns.

create table qtemp/"Ted's Data"
  (alphafield   char(2),
   numericfield dec(3,0),
   account      char(1),
   "Account"    char(1),
   "account"    char(1),
   "/7$%"       char(1))

insert into qtemp/"Ted's Data"
 values ('A', 1, 'B', 'C', 'D', 'E')

select * from "Ted's Data"
where "/7$%" = 'E'

The system creates a physical file name “Ted_0001” (with the quotation marks) in library QTEMP, with the following fields.

ALPHAFIELD
NUMER00001
ACCOUNT
ACCOU00001
ACCOU00002
Q7$__00001

You may use quoted versions of unquoted columns, but they must be in uppercase. The following are identical, and retrieve the value of the INIT column from the QCUSTCDT physical file.

select INIT   from qiws/qcustcdt
select Init   from qiws/qcustcdt
select init   from qiws/qcustcdt
select "INIT" from qiws/qcustcdt

You may also use quoted names in native interfaces, such as Create Physical File (CRTPF), but I don’t recommend it.

One last point. I was not able to recreate Mike’s error.

create table qtemp/mydata
(key1 char(4), key2 char(12), version dec(3,0))

select * from mydata
where version = 77

select * from mydata
where "VERSION" = 77

The V5R3 and V5R4 systems I tested with allowed the word version without quotation marks in the CREATE TABLE statement. Running Display File Field Descriptions (DSPFFD) showed the field named VERSION, without the quotations. However, the field list shown when I prompted a SELECT within embedded SQL showed a quoted “VERSION”.

Thanks for bringing this to our attention, Mike.

–Ted

                     Post this story to del.icio.us
               Post this story to Digg
    Post this story to Slashdot