Using Lateral Correlation To Define Expressions In DB2 For i
August 2, 2016 Michael Sansoterra
The SQL implementation in DB2 for i is second to none. However, one irritating thing common to various SQL dialects is the need to repeat expressions in query. As SQL matured over the years, techniques such as nested table and common table expressions became available to, among other things, reduce repetitive expressions. This tip illustrates the use of the LATERAL correlation as another way to avoid repetition.
Say you’re writing a report for a grocery wholesaler, where markup on food items is small and discounts are even smaller. Looking for orders that were not priced correctly, you’re tasked with writing a query that reports sales order detail lines based on two criteria:
The report should be sorted with the biggest discounts first.
In the old days, your query might have looked like this:
SELECT SOD.*, OrderQty*UnitPrice AS ExtPrice, OrderQty*UnitPrice*(1-UnitPriceDiscount) AS DiscountPrice, p.Name, p.ListPrice FROM SalesOrderDetail SOD JOIN Product p ON p.ProductId=sod.ProductId AND p.ListPrice<OrderQty*UnitPrice*(1-UnitPriceDiscount) WHERE OrderQty*UnitPrice-OrderQty*UnitPrice*(1-UnitPriceDiscount) >10.00 ORDER BY OrderQty*UnitPrice-OrderQty*UnitPrice*(1-UnitPriceDiscount) DESC
Notice that ExtPrice expression (in blue) is repeated three times in the query and the DiscountPrice expression appears four times. A query like this can be difficult to maintain.
No matter, a common table expression (CTE), can alleviate the need for all that repetition so that the expressions are defined once and referenced by name everywhere else within the query:
WITH OrderData AS ( SELECT SOD.*, OrderQty*UnitPrice AS ExtPrice, OrderQty*UnitPrice*(1-UnitPriceDiscount) AS DiscountPrice FROM SalesOrderDetail SOD ) SELECT od.*, p.Name, p.ListPrice FROM OrderData od JOIN Product p ON p.ProductId=od.ProductId AND p.ListPrice<od.DiscountPrice WHERE ExtPrice-DiscountPrice>10.00 ORDER BY ExtPrice-DiscountPrice DESC
Likewise, a nested table expression is probably a little more concise for this simple query and only requires the expressions to be written once:
SELECT SOD.*, p.Name, p.ListPrice FROM ( SELECT SalesOrderDetail.*, OrderQty*UnitPrice AS ExtPrice, OrderQty*UnitPrice*(1-UnitPriceDiscount) AS DiscountPrice FROM SalesOrderDetail) SOD JOIN Product p ON p.ProductId=sod.ProductId AND p.ListPrice<sod.DiscountPrice WHERE ExtPrice-DiscountPrice>10.00 ORDER BY ExtPrice-DiscountPrice DESC
A lesser known coding alternative, is to use LATERAL correlation as a way to define expressions in such a way that they can be referenced by name throughout the rest of the query:
SELECT SOD.*, calcs.ExtPrice, calcs.DiscountPrice FROM SalesOrderDetail sod CROSS JOIN LATERAL ( VALUES(sod.OrderQty*sod.UnitPrice, sod.OrderQty*sod.UnitPrice*(1-sod.UnitPriceDiscount) )) calcs (ExtPrice,DiscountPrice) JOIN Product p ON p.ProductId=sod.ProductId AND p.ListPrice<calcs.DiscountPrice WHERE ExtPrice-DiscountPrice>10.00 ORDER BY ExtPrice-DiscountPrice DESC
Recall that the significant behavior with LATERAL correlation is that it allows references to columns from tables defined earlier in the query. In the above example, a VALUES clause is used to construct one row consisting of the two expressions: ExtPrice and DiscountPrice. LATERAL allows the VALUES clause to reference columns from the SalesOrderDetail table. This row is CROSS JOINed to the SalesOrderDetail table (hence an extension of the SalesOrderDetail row). Thereafter the expression aliases can be referenced in the remainder of the query.
Which technique to use is up to you. In this instance, the performance of the CTE and LATERAL SQL statements are equivalent. For this simple demonstration, where the expressions come from values within a single table, my preference is to use the nested table expression.
However, in more complex scenarios such as where expressions are based on values from multiple tables, I’m starting to like the LATERAL technique because it allows me to continue writing the query in the normal “top down” manner without having to be interrupted to recraft the statement as a CTE or nested table expression. I like CTEs but when they get large it’s a pain reading the statement as your eyes are forced to bounce back and forth between the CTE definition(s) and the final portion of the query.
LATERAL correlation may not be the most widely used technique by DB2 devs, but it’s worthwhile to know as it may come in handy.