• The Four Hundred
  • Subscribe
  • Media Kit
  • Contributors
  • About Us
  • Contact
Menu
  • The Four Hundred
  • Subscribe
  • Media Kit
  • Contributors
  • About Us
  • Contact
  • Running Totals in an SQL Query, Take 2

    July 13, 2011 Skip Marchesani

    My article, Running Totals in an SQL Query, described how to write an SQL SELECT statement that generated a running total for each row in the result set and provided an example of the SELECT statement syntax to do so.

    While the SELECT statement was syntactically correct, it had an omission and did not work correctly in a certain data situation. Several readers figured this out and were good enough to send an email note to make me aware of the situation. I very much appreciate the emails.

    I want readers to know that the article was a test to see how many were paying attention to detail, and all of you that sent emails alerting me to the issue passed the test.

    I wish. And if you believe that it was a test, I have a bridge for sale if you’re interested.

    Let’s review my running totals scenario so I can explain the omission and show how it can be easily fixed. I used an example based on a department master table called Dept_Master, which had the following format.

    Column Name	Description
    Dpt_Nbr		Department Number
    Dpt_Name		Department Name
    Emp_Count		Department Employee Count
    

    And, for the purposes of the discussion, the Dept_Master table contained the following rows listed in arrival sequence for the table.

    Dpt_Nbr		Dpt_Name		Emp_Count
    901		Accounting	03
    911		Sales		11
    977		Manufact		27
    990		Maint		07
    

    The example that I used was to generate a running total based on the employee count for each department, and see the result set in descending sequence based on the employee count as shown below.

    Dpt_Nbr	Dpt_Name		Emp_Count		Running_Total
    977	Manufact		27		27
    911	Sales		11		38
    990	Maint		07		45
    901	Accounting	03		48
    

    The SELECT statement to do the above used a scalar subselect in the column list of the SELECT statement to calculate or derive the running total for each row as shown below.

    SELECT dpt_nbr,  dpt_name,  emp_count,
               		    (SELECT SUM(emp_count)
                        		FROM  dept_master  b
                        		WHERE  b.emp_count  >=  a.emp_count)
                   	        AS  running_total
       	    FROM  dept_master  a
       	    ORDER BY  emp_count  DESC
    

    This SELECT statement is syntactically correct, but it breaks when the column on which the running total is based–in this case Emp_Count–contains identical values in two or more rows. This means that if the Dept_Master table in the example had two rows with the same Emp_Count, the SELECT statement shown above will not work correctly.

    To see what happens in this situation, let’s change the row for department number 990 to have an employee count of 11, which is the same as department number 911. The rows in the Dept_Master table would then look as follows:

    Dpt_Nbr	Dpt_Name		Emp_Count
    901	Accounting	03
    911	Sales		11
    977	Manufact  	27
    990	Maint		11
    

    If we execute the SELECT statement against this set of data, it returns the following result set.

    Dpt_Nbr	pt_Name		Emp_Count		Running_Total
    977	Manufact		27		27
    911	Sales		11		49
    990	Maint		11		49
    901	Accounting	03		2
    

    If we examine the result set, the running total for departments 911 and 990 are both 49, which is not correct. One of them should be 38 and the other 49. The reason that the SELECT statement does not work correctly is that the argument for the WHERE clause for the scalar subselect (WHERE b.emp_count >= a.emp_count) cannot handle duplicate employee counts in the rows for each department they way it is currently structured.

    The solution is simple and was provided by reader Anita C, who emailed to make me aware of the situation. My thanks to Anita! To resolve the issue, the argument for the WHERE clause in the scalar subselect needs to be structured so that it is unique for each row in Dept_Master. The way to do this is to combine Emp_Count with Dpt_Nbr by concatenating them together as shown in the following WHERE clause. Dpt_Nbr is a unique column in Dept_Master.

    WHERE  digits(b.emp_count) || b.dpt_nbr  >=  digits(a.emp_count) 
    || a.dpt_nbr) 
    

    In addition to the change in the WHERE clause, the ORDER BY clause must have the same argument as the WHERE clause as shown below. Without the revised ORDER BY clause, the running total will not be in the correct sequence in the result set.

    ORDER BY  digits(emp_count) || dpt_nbr
    

    In this specific case the DIGITS function is used with Emp_Count (in both the WHERE and ORDER BY clauses) to preserve the leading zero in the employee count of 03 for department number 901. If DIGITS is not used, DB2 will do an implicit cast of Emp_Count from numeric to character, the leading zero on the employee count of 03 will be dropped, the remaining three will be left justified, and the SELECT statement will not return the correct result set.

    The SELECT statement with the revised syntax looks like this:

    SELECT dpt_nbr,  dpt_name,  emp_count,
               (SELECT SUM(emp_count)
                       FROM  dept_master  b
                       WHERE  DIGITS(b.emp_count) || b.dpt_nbr >=  
    				   DIGITS(a.emp_count) || a.dpt_nbr)
                   AS  running_total
       FROM  dept_master  a 
       ORDER BY  DIGITS(emp_count) || dpt_nbr DESC
    

    And, the following would be the correct result set returned by the SELECT statement:

    Dpt_Nbr	Dpt_Name		Emp_Count		Running_Total
    977	Manufact		27		27
    990	Maint		11		38
    911	Sales		11		49
    901	Accounting	03		52
    

    The SELECT statement shown above, returns the result set in descending sequence of Emp_Count then Dpt_Nbr. To see the result set in ascending sequence simply change the greater than or equal to argument (>=) in the WHERE clause for the scalar subselect to be to a less than or equal argument (<=), and remove the DESC keyword at the end of the ORDER BY clause.

    There are a couple of things to understand about running totals to get them to work correctly in an SQL Query.

    Results will be unpredictable and not correct if:

    1. The column name(s) used in the WHERE clause for the scalar subselect are not the same as those used in the ORDER by clause for the outer or primary SELECT statement. If the names are different, the data contained in the columns must be based on the same thing. In other words, don’t mix apples and oranges in the WHERE clause for the scalar subselect and the ORDER BY clause for the outer or primary SELECT statement. For example, you can’t use emp_count in the WHERE clause and dpt_name in the ORDER BY clause.

    2. The argument of the WHERE clause must match the sequence in the ORDER BY clause. This means a WHERE clause argument of <= (less than or equal) is used in conjunction with an ascending sequence in the ORDER BY clause (ascending is the default sequence for the ORDER BY clause), and a WHERE clause of >= (greater than or equal) is used in conjunction with a descending sequence in the ORDER BY clause (ORDER BY xxxx DESC).

    This technique works very nicely and is easy to use when the target of the running total is an existing column in a table or view, as opposed to a derived column. If you should choose to try and use it in conjunction with a derived column, the complexity of the syntax for the scalar subselect increases with and is proportional to the complexity of the expression that is used for the column derivation. Speaking from experience, when using a derived column, the syntax for the scalar subselect becomes very complex very quickly.

    Understanding these considerations is the key to putting running totals to work for you.

    While reviewing this article, Ted Holt realized that the OLAP (On Line Analytical Processing) SQL functions that were announced in V5R4 and V6R1 could be also be used to generate running totals. The SQL OLAP functions RANK, DENSE_RANK, and ROLLUP were announced in V5R4; and CUBE, GROUPING, and GROUPING SETS were announced in V6R1.

    My thanks to Ted for providing this solution.

    WITH RankedData AS
         (SELECT dpt_nbr,  dpt_name,  emp_count,
                     row_number() OVER (order by emp_count) AS Rank
              FROM dept_master)
    SELECT dpt_nbr, dpt_name, emp_count,
              SELECT SUM(emp_count)
                   FROM  RankedData AS b
                   WHERE  b.rank >= a.rank) AS Running_Total
         FROM RankedData AS a
    ORDER BY rank DESC
    

    Skip Marchesani retired from IBM after 30 years and decided to pursue a career as a consultant in the IBM i marketplace. He spent much of his IBM career working with the Rochester Development Lab on projects for S/38 and AS/400, and was involved with the development of the AS/400. Skip was part of the team that taught early AS/400 education to customers and IBM lab sites worldwide. Now recognized as an industry expert on DB2 for i (a.k.a. DB2/400), and author of the book “DB2/400: The New AS/400 Database,” Skip specializes in providing customized education for any area of the IBM i and AS/400, does database design and design reviews, and general IBM i, iSeries, and AS/400 consulting. He has been a speaker for user groups, technical conferences, and IBM i, iSeries, and AS/400 audiences around the world, and has received COMMON’s Distinguished Service Award. Skip Recently moved to Vermont and is now president of the Vermont Mid Range User Group (VTMUG). Send your questions or comments for Skip to Ted Holt via the IT Jungle Contact page.

    RELATED STORY

    Running Totals in an SQL Query



                         Post this story to del.icio.us
                   Post this story to Digg
        Post this story to Slashdot

    Share this:

    • Reddit
    • Facebook
    • LinkedIn
    • Twitter
    • Email

    Tags:

    Sponsored by
    Maxava

    Migrate IBM i with Confidence

    Tired of costly and risky migrations? Maxava Migrate Live minimizes disruption with seamless transitions. Upgrading to Power10 or cloud hosted system, Maxava has you covered!

    Learn More

    Share this:

    • Reddit
    • Facebook
    • LinkedIn
    • Twitter
    • Email

    Sponsored Links

    Cosyn:  Add audit trails to your master files in minutes with Audit Trail/400
    PowerTech:  FREE White Paper: PCI Compliance for Power Systems running IBM i
    Bytware:  Viruses? You'll Never Know Unless You Scan. FREE Webinar. July 20

    IT Jungle Store Top Book Picks

    BACK IN STOCK: Easy Steps to Internet Programming for System i: List Price, $49.95

    The iSeries Express Web Implementer's Guide: List Price, $49.95
    The iSeries Pocket Database Guide: List Price, $59
    The iSeries Pocket SQL Guide: List Price, $59
    The iSeries Pocket WebFacing Primer: List Price, $39
    Migrating to WebSphere Express for iSeries: List Price, $49
    Getting Started with WebSphere Express for iSeries: List Price, $49
    The All-Everything Operating System: List Price, $35
    The Best Joomla! Tutorial Ever!: List Price, $19.95

    PostgreSQL, Progress DB Now Supported on mrc’s m-Power Still Wanted: A Power-i System of Systems

    Leave a Reply Cancel reply

Volume 11, Number 20 -- July 13, 2011
THIS ISSUE SPONSORED BY:

SEQUEL Software
ProData Computer Services
CNX

Table of Contents

  • Running Totals in an SQL Query, Take 2
  • The DB2 DBA: Identifying Indexes with Shared Access Paths
  • How Do I Email a PC5250 Screen Shot?

Content archive

  • The Four Hundred
  • Four Hundred Stuff
  • Four Hundred Guru

Recent Posts

  • Meet The Next Gen Of IBMers Helping To Build IBM i
  • Looks Like IBM Is Building A Linux-Like PASE For IBM i After All
  • Will Independent IBM i Clouds Survive PowerVS?
  • Now, IBM Is Jacking Up Hardware Maintenance Prices
  • IBM i PTF Guide, Volume 27, Number 24
  • Big Blue Raises IBM i License Transfer Fees, Other Prices
  • Keep The IBM i Youth Movement Going With More Training, Better Tools
  • Remain Begins Migrating DevOps Tools To VS Code
  • IBM Readies LTO-10 Tape Drives And Libraries
  • IBM i PTF Guide, Volume 27, Number 23

Subscribe

To get news from IT Jungle sent to your inbox every week, subscribe to our newsletter.

Pages

  • About Us
  • Contact
  • Contributors
  • Four Hundred Monitor
  • IBM i PTF Guide
  • Media Kit
  • Subscribe

Search

Copyright © 2025 IT Jungle