NULL and NOT IN
November 16, 2011 Ted Holt
No matter who you are, there’s always something you can learn. In Much Ado about Nothing: Interesting Facts about Null, I presented a good bit of information about null values in database tables. Imagine my surprise and delight to stumble upon something I did not know about nulls.
It started innocently enough. I was surfing the Web and happened upon a link to an article entitled Ten Common SQL Programming Mistakes. I couldn’t click the link fast enough.
I found a very well-written article by Plamen Ratchev. I won’t repeat the whole thing here, but Ratchev wrote about a problem of which I was not aware. This problem occurs when a list that follows NOT IN contains a null value.
To illustrate, I’ll use Ratchev’s illustration, modified according to his suggestion.
Assume two database tables.
create table colors (color_code char(3), color_name char(10)) insert into colors values ('BLK', 'Black'), ('BLU', 'Blue'), ('GRN', 'Green'), ('RED', 'Red') create table products (sku dec (3,0), description char(20), color char(3)) insert into products values (1, 'Ball', 'RED'), (2, 'Bike', 'BLU'), (3, 'Tent', null)
Suppose you are asked for a list of the colors that are not used in any products. That would be black and green, right? But how do we derive that list using SQL? I would have done this:
SELECT C.color_code FROM Colors AS C WHERE C.color_code NOT IN (SELECT P.color FROM Products AS P)
And I would have been wrong, because the query would have returned an empty set. Here’s why.
The system carries out the second SELECT first, then uses the resulting list of colors in the first SELECT, as if I had written this:
SELECT C.color_code FROM Colors AS C WHERE C.color_code NOT IN ('RED', 'BLU', NULL)
NOT IN is a shorthand way of saying the following:
SELECT C.color_code FROM Colors AS C WHERE C.color_code <> 'RED' AND C.color_code <> 'BLU' AND C.color_code <> NULL)
It is the last comparison that provides the problem. You cannot use the not-equal operator to test for NULL. The comparison is undefined. Therefore the WHERE fails every time.
I was easily able to fix the query.
SELECT C.color_code FROM Colors AS C WHERE C.color_code NOT IN (SELECT P.color FROM Products AS P WHERE P.color is not null)
I’ve never worked in a shop that used null values in database tables, so I’ve never run into this problem. I expect many of you are in the same boat.
My deepest and most sincere thanks to Plamen Ratchev for expanding my knowledge of SQL.