fhg
Volume 8, Volume 11 -- March 19, 2008

Grouping a Union

Published: March 19, 2008

Hey, Ted:

We have several sales history files--one for each year. When I need to combine data for more than one year, I have to use an SQL UNION. I am trying to group "unioned" data and can't seem to get it right. Is it possible to group a dataset that is built by a union?

--Tom


This appears to be an easy problem to resolve, but appearances can be deceiving. If you're not careful, you can get the wrong results when you UNION two or more tables, especially when you're summarizing data.

Divide the query into two steps: one to union the data, the other to group it. Assuming you have two files, one for the current year and one for the previous year, here's one possibility.

WITH Sales AS 
(SELECT Customer, Item, QtyShipped, UnitPrice
  FROM CurrYear
UNION ALL
 SELECT Customer, Item, QtyShipped, UnitPrice
  FROM PrevYear
)
SELECT Customer, Item, SUM(QtyShipped), 
       SUM(QtyShipped * UnitPrice)
  FROM Sales
 GROUP by Customer, Item
 ORDER BY Customer, Item;

The WITH defines a common table expression, which creates a temporary table of the combined data. This temporary table is called SALES. Notice I used UNION ALL, not UNION, so duplicate rows would not be omitted. The last SELECT summarizes the combined data on customer and item number.

Here's another query that gives the same results. It summarizes each file, then summarizes the two summaries. I doubt this method is any better than the first one.

WITH SalesSummary AS 
(SELECT Customer, Item, SUM(QtyShipped) AS qty,
       SUM(QtyShipped * UnitPrice) AS sale  
  FROM CurrYear
 GROUP by Customer, Item
UNION ALL
 SELECT Customer, Item, SUM(QtyShipped) AS qty,
       SUM(QtyShipped * UnitPrice) AS sale  
  FROM PrevYear
 GROUP by Customer, Item)
SELECT Customer, Item, SUM(qty), SUM(sale) FROM SalesSummary
GROUP by Customer, Item
ORDER BY Customer, Item;

I googled and found a similar question at http://joecelkothesqlapprentice.blogspot.com/. I based the following SQL query on Joe Celko's answer.

SELECT sale.Customer, sale.Item, 
       SUM(sale.QtyShipped), 
       SUM(sale.QtyShipped * sale.UnitPrice)
  FROM
   (SELECT Customer, Item, QtyShipped, UnitPrice
      FROM CurrYear
     UNION ALL
    SELECT Customer, Item, QtyShipped, UnitPrice
      FROM PrevYear) AS sale
 GROUP by Customer, Item
 ORDER BY Customer, Item;

This query differs from the first example in that the union has been embedded in the FROM clause.

BTW, in an ideal world, all sales history would be stored in one big file and transactions would be extracted by date. I find it more practical to keep separate files of history. It appears to me that many other people think the same.

--Ted


RELATED STORIES

Weird SQL Union Results

Time for a Common Table Expression, Part 2



Copyright © 1996-2008 Guild Companies, Inc. All Rights Reserved.
Guild Companies, Inc., 50 Park Terrace East, Suite 8F, New York, NY 10034

Privacy Statement