ON vs. WHERE
November 14, 2007 Hey, Ted
I need your help with an SQL query. I have two database files: a header file and a detail file. There may be many detail records for each header record, but I only want to see certain detail records. Here’s the rub. I want to retrieve all header records, even the headers for which I don’t select any corresponding detail. It seems to me this should be an easy query to write, but my efforts so far produce only the header records for the selected details.
There’s more to this query than meets the eye, Wayne. It is simple, once you understand the difference between WHERE and ON.
Here’s an invoice header table we can play with:
SELECT H.* FROM INVHDR AS H Invoice Company Customer Date 47566 1 44 2004-05-03 47567 2 5 2004-05-03 47568 1 10001 2004-05-03 47569 7 777 2004-05-03 47570 7 777 2004-05-04 47571 2 5 2004-05-04
Here’s the related details table:
SELECT D.* FROM INVDTL AS D Invoice Line Item Price Quantity 47566 1 AB1441 25.00 3 47566 2 JJ9999 20.00 4 47567 1 DN0120 .35 800 47569 1 DC2984 12.50 2 47570 1 MI8830 .10 10 47570 2 AB1441 24.00 100 47571 1 AJ7644 15.00 1
Here’s the query as Wayne was running it:
SELECT H.INVOICE, H.COMPANY, H.CUSTNBR, H.INVDATE, D.LINE, D.ITEM, D.QTY FROM INVHDR AS H LEFT JOIN INVDTL AS D ON H.INVOICE = D.INVOICE WHERE D.ITEM = 'AB1441' Invoice Company Customer Date Line Item Quantity 47566 1 44 2004-05-03 1 AB1441 3 47570 7 777 2004-05-04 2 AB1441 100
I’ve selected the details for item AB1441 only. The problem is that I only have header information for the orders that include those items. Wayne wants header information for all orders.
The solution is to move the selection expression from the WHERE clause into the ON clause, like this:
SELECT H.INVOICE, H.COMPANY, H.CUSTNBR, H.INVDATE, D.LINE, D.ITEM, D.QTY FROM INVHDR AS H LEFT JOIN INVDTL AS D ON H.INVOICE = D.INVOICE AND D.ITEM = 'AB1441' Invoice Company Customer Date Line Item Quantity 47566 1 44 2004-05-03 1 AB1441 3 47567 2 5 2004-05-03 - - - 47568 1 10001 2004-05-03 - - - 47569 7 777 2004-05-03 - - - 47570 7 777 2004-05-04 2 AB1441 100 47571 2 5 2004-05-04 - - -
Why the difference? The WHERE clause runs after the join has taken place. That is, the database engine joins the tables to produce a result set, then applies the WHERE clause to select the joined records that contain item AB1441. This has the effect of turning an outer join into an inner join.
The ON clause, on the other hand, runs during the join. It causes the database engine to select only the detail records that include item AB1441, then joins the selected detail records to the header records using a left outer join. The result is that we still get the left outer join we wanted, yet we only see the detail records we selected. Nifty, huh?
Moving the detail record selection from WHERE to ON is not the only way to produce the desired results. You can explicitly select only the detail records you want by placing a SELECT clause inside the join, like this:
SELECT H.INVOICE, H.COMPANY, H.CUSTNBR, H.INVDATE, D.LINE, D.ITEM, D.QTY FROM INVHDR AS H LEFT JOIN (SELECT * FROM INVDTL WHERE ITEM = 'AB1441') AS D ON H.INVOICE = D.INVOICE