Newsletters Subscriptions Media Kit About Us Contact Search Home

mgo
OS/400 Edition
Volume 3, Number 30 -- April 30, 2003

Time for a Common Table Expression, Part 2


Hey, Howard:

Surely what you are proposing in "Time for a Common Table Expression!" is a subselect.

A common table would be:

WITH MyFile_ABS AS 
(SELECT  TDATE, LOCN, PRODUCT, ABS(RETAIL), AS RETAILA,
         CASE WHEN RETAIL < 0 THEN -QTY ELSE QTY END) AS QTYA
       FROM MyFile)
     SELECT    TDATE, LOCN, PRODUCT, RETAILA, SUM(QTYA)
     FROM      MyFile_ABS
     GROUP BY  TDATE, LOCN, PRODUCT, RETAILA
     ORDER BY  TDATE, LOCN, PRODUCT, RETAILA DESC

The common table is effectively a temporary view that can be used anywhere and as many times as necessary in the statement. A subselect generates a result set that applies only to the position in the statement where the subselect is specified. I love common tables!

--Andy


Mea culpa! You're right, Andy. The solution I proposed is not a common table expression; it is a nested-table expression. Technically, it is also a subselect, but the proper name is a nested-table expression. Thanks for pointing that out. This also gives me an opportunity to explain the difference between common tables, nested tables, and subselects.

First I'll attack the definition of a subselect, since the term is commonly misused. A subselect is an SQL SELECT statement that can contain the following elements:

SELECT-clause FROM-clause 
WHERE <search-expression>
  GROUP BY <search-expression> 
HAVING <search-expression>

The select-clause and the from-clause are required elements. The other elements are optional. Notice that subselects cannot contain an order-by clause, nor can they contain optimization or locking information.

Now let's take a look at the permissible values for the from-clause. The from-clause is defined by the keyword FROM and is followed by a table-reference. The allowable values for a table-reference are the following:

  single table
  nested table expression
  joined table

The single table is just that, a reference to a single table or view. Here is an example of a single table select statement:

SELECT * FROM MYFILE

A joined table is when you use a join expression in the following form:

<table=reference> <join clause> <table-reference> ON <join-condition> 

Here is an example of a joined table:

SELECT * FROM MYFILE INNER JOIN OTHERFILE ON (X=Y)

The statement can also be written like this:

SELECT * FROM MYFILE, OTHERFILE
WHERE X=Y

This is an example of how tables used to be joined together, before the join clause was added to the SQL language. However, SQL actually sees the above statement as the joining of all the selected records from MYFILE as a Cartesian product to all of the selected records from OTHERFILE. While sometimes it can be helpful to try a statement with this older style, I always recommend that you attempt the statement using the join clause first and only trying the older syntax if the optimizer fails to give good performance.

A nested table expression is when you specify a subselect with a correlation clause in the following form:

(subselect) AS <correlation-name>

Here is my example of a nested table expression from the last article. The nested expression is rendered in red:

SELECT TDATE, LOCN, PRODUCT, RETAILA, SUM(QTYA)
FROM 
  ( SELECT TDATE, LOCN, PRODUCT,
    ABS(RETAIL) as RETAILA,
    (CASE WHEN RETAIL<0 then -QTY else QTY END) as QTYA
  FROM MyFile
) AS X 
GROUP BY TDATE, LOCN, PRODUCT, RETAILA
ORDER BY TDATE, LOCN, PRODUCT, RETAILA DESC

A common table expression is when you use the following syntax:

WITH <correlation-name> AS <subselect>

Here is Andy's statement, which uses the common table expression, this time with the common table expression in red:

WITH MyFile_ABS AS 
(SELECT  TDATE, LOCN, PRODUCT, ABS(RETAIL), AS RETAILA,
             CASE WHEN RETAIL < 0 THEN -QTY ELSE QTY END) AS QTYA
       FROM MyFile)
SELECT TDATE, LOCN, PRODUCT, RETAILA, SUM(QTYA)
 FROM MyFile_ABS
 GROUP BY  TDATE, LOCN, PRODUCT, RETAILA
 ORDER BY  TDATE, LOCN, PRODUCT, RETAILA DESC

You see, a common table expression is when you use the WITH clause at the beginning of a query to specify one or more subselects with correlation names; whereas a nested table expression occurs in the FROM clause of a query.

So what is the difference between the two statements? In reality there is not much of a difference. The first statement and the second statement give the same functional results. (Remember, there is almost always a way to ask for the same results with different syntax in SQL). In the case of the first statement, the portion of the statement in red is run and creates a result set named X, which is referenced by the outer statement. In the case of the second statement, the portion in red creates a result set called MyFILE_ABS, which is referenced by the select statement. Both are doing the same thing: They are creating a virtual table that can be referenced in your query. But they have different names because syntactically they are different entities.

Let's take a look at the definition of a when a nested table expression can be used, from the iSeries SQL reference:

  • In place of a view to avoid creating the view (when general use of the view is not required).
  • When the desired result table is based on host variables.

Now let's look at the definition of when a common table expression can be used, from the iSeries SQL reference:

  • In place of a view to avoid creating the view (when general use of the view is not required and positioned updates or deletes are not used).
  • When the desired result table is based on host variables.
  • When the same result table needs to be shared in a fullselect.

Note that they are quite similar in their use. One advantage of a common table expression is that, in other versions of DB2, you can use it to make recursive queries. However, this is not yet supported on the iSeries. Another advantage of a common table expression is that you can use the result set in more than one query below the definition of the common table. Otherwise, they are quite similar. However, that does not excuse me from using the improper terminology in my article.

Next, the question becomes, when should you use common table expressions versus nested table expressions? I tend to believe that common table expressions are sometimes more readable than nested table expressions. Also, if you need to use the temporary result set more than once in a select statement, the common table expression is very useful. However, common table expressions are supported in DB2 but not in some other database servers. The nested table expression is supported in many SQL dialects (like Microsoft Access and SQL Server). Therefore, I tend to write my SQL using nested table, rather than common table, expressions for compatibility purposes. I wish that other databases would support the WITH construct, because I believe it makes the query more readable and helps when you want to use a temporary result set in multiple unions. Bottom line, use whichever syntax you are more comfortable with, but always remember the alternate syntax is available. This is especially helpful when the optimizer is not finding the appropriate access path for your query. Sometimes changing syntax from one form to another can cause the query optimizer to suddenly "see the light" and give you optimum performance.

Andy, thanks for pointing out my syntactic foible. It is important that we always give things their proper names. I tend to refer to common table and nested table expressions as common table expressions, because they are so similar. But in reality, they are quite different entities.


Howard F. Arner, Jr., is a writer and consultant living in Jacksonville, Florida, and is the author of iSeries and AS/400 SQL at Work, a learn-by-example guide to SQL on the AS/400 platform. You can purchase Howard's book from www.sqlthing.com/books. Or visit www.sqlthing.com for more information about SQL and iSeries performance tuning.


Sponsored By
WORKSRIGHT SOFTWARE

600 Billion

That's how much a recent independent study estimated U.S. businesses spend on dirty data. How much of that 600 billion is spent by your company? Cleanse you dirty ZIP Codes and mailing addresses with our software and save big bucks.

WorksRight Software, Inc.
Phone: 601-856-8337
E-mail: software@worksright.com
Web site: www.worksright.com


THIS ISSUE
SPONSORED BY:

WorksRight Software
Damon Technologies


BACK ISSUES

TABLE OF
CONTENTS

Time for a Common Table Expression, Part 2

Printing Qshell Output, Part 2

Reader Feedback and Insights: There's No Java Conspiracy


Editors
Howard Arner
Joe Hertvik
Ted Holt
David Morris
Shannon O'Donnell

Managing Editor
Shannon Pastore

Publisher and
Advertising Director:

Jenny Thomas

Advertising Sales Representative
Kim Reed

Contact the Editors
Do you have a gripe, inside dope or an opinion?
Email the editors:
editors@itjungle.com

Copyright © 1996-2008 Guild Companies, Inc. All Rights Reserved.