• The Four Hundred
  • Subscribe
  • Media Kit
  • Contributors
  • About Us
  • Contact
Menu
  • The Four Hundred
  • Subscribe
  • Media Kit
  • Contributors
  • About Us
  • Contact
  • New OLAP Aggregate Functions In DB2 for i, Part 1

    July 12, 2016 Michael Sansoterra

    In my last tips about the new OLAP features in DB2 for i 7.3, I discussed the OLAP Aggregation Specification and the new LAG and LEAD OLAP functions. In this article and the next one, I continue the discussion of new OLAP features by highlighting four new OLAP aggregate functions: FIRST_VALUE, LAST_VALUE, nTH_VALUE, and RATIO_TO_REPORT. The value of these functions is that they can reference data from other rows in a query result set relative to the current row.

    The new OLAP aggregate functions differ from normal aggregate functions (such as MIN, MAX, SUM, and AVG) in that they can only be used in conjunction with an OLAP aggregation specification (OAS); whereas the normal aggregates can be used with or without an OAS (e.g., they can be used in a GROUP BY summary query). Recall that the aggregation spec consists of one or more components, including a window partition clause, window order clause, and a window aggregation group clause. Since the DB2 for i terminology can be confusing, I’ll get right to an example. If you’re unfamiliar with OLAP aggregation spec, please review the Related Stories listed at the end of this tip.

    FIRST_VALUE And LAST_VALUE

    As their names imply, these functions will extract the first or last value of a column or expression within an OLAP window. Say you have a help desk application for tracking support calls. The two primary tables in the app are SupportIncident (header) and SupportCall (detail):

    CREATE OR REPLACE TABLE SupportIncident (
    Support_Id        INT          NOT NULL 
                      GENERATED BY DEFAULT AS IDENTITY,
    Product_Group     VARCHAR(10)  NOT NULL,
    Name              NVARCHAR(80) NOT NULL,
    TimeZone          VARCHAR(10)  NOT NULL,
    EMail             VARCHAR(96)  NOT NULL,
    Phone             VARCHAR(20)  NOT NULL,
    First_Contact     TIMESTAMP    NOT NULL 
                      DEFAULT CURRENT_TIMESTAMP,
    CONSTRAINT pk_SupportIncident PRIMARY KEY (Support_Id)
    );
    
    CREATE OR REPLACE TABLE SupportCall (
    Support_Id        INT          NOT NULL,
    Support_Call_Id   INT          NOT NULL AS IDENTITY,
    Support_Timestamp TIMESTAMP(0) NOT NULL,
    Support_By_Phone  CHAR(1)      NOT NULL DEFAULT 'Y',
    Duration_Sec      INT          NOT NULL,
    Remarks           CLOB(10K),
    Rep_Id            INT          NOT NULL,
    CONSTRAINT pk_SupportCall PRIMARY KEY (Support_Id,Support_Call_Id),
    CONSTRAINT fk_Support_Id FOREIGN KEY (Support_Id) 
        REFERENCES SupportIncident (Support_Id)
    );
    

    Management has requested that support calls be listed as detail lines on a “support incident” report. The query to feed this report is shown here:

    SELECT c.*
      FROM SupportIncident i
      JOIN SupportCall c ON i.Support_Id=c.Support_Id
    

    Further, because some incidents may require several calls (possibly fielded by multiple support center representatives) from a customer before their issue is resolved, when reviewing the detail it’s often helpful to know the customer support representative(s) who opened and closed the incident.

    Therefore, each detail line should also include the support representative who opened the incident and the representative who closed the incident (or who last talked to the customer if the incident is still open). The support representative ID is contained in column REP_ID. To fulfill this requirement, simply use the new FIRST_VALUE and LAST_VALUE OLAP aggregate functions:

    SELECT c.*,
           First_Value(Rep_Id) OVER(
           PARTITION BY c.Support_Id 
           ORDER BY Support_Timestamp) AS First_Rep_Id,
           Last_Value(Rep_Id)  OVER(
           PARTITION BY c.Support_Id 
           ORDER BY Support_Timestamp 
           ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) AS Last_Rep_Id
      FROM SupportIncident i
      JOIN SupportCall c ON i.Support_Id=c.Support_Id
    

    The abridged results look something like this (with some columns omitted):

    ID

    Call_ID

    Duration

    Remarks

    Rep_ID

    First_Rep_ID

    Last_Rep_ID

    1

    1

    310

    Problems
    faxing – Told him to reboot

    5

    5

    16

    1

    2

    210

    Thought his DVD tray was a
    coffee tray. it broke, doh!

    7

    5

    16

    1

    3

    572

    A mind
    numbing call

    8

    5

    16

    1

    4

    188

    Transferred him to
    Bangladesh call center

    8

    5

    16

    1

    5

    442

    This guy
    won’t give up

    14

    5

    16

    1

    6

    642

    Told him he can’t fax via
    his monitor

    14

    5

    16

    1

    7

    924

    Offered
    his money back — problem solved!

    16

    5

    16

    The source values for the First_Rep_Id and Last_Rep_Id columns are highlighted in red.

    FIRST_VALUE will get the first value in a windowed set of rows and likewise LAST_VALUE will retrieve the last value. The OLAP aggregation spec controls the scope (PARTITION BY) and order (ORDER BY) of the window of rows that are to be considered by the OLAP function. Recall that this windowing concept allows these OLAP functions to be used in “detail queries” that don’t require a GROUP BY clause.

    Looking closely at the use of these new functions:

           First_Value(Rep_Id) OVER(
           PARTITION BY c.Support_Id 
           ORDER BY Support_Timestamp) AS First_Rep_Id,
           Last_Value(Rep_Id)  OVER(
           PARTITION BY c.Support_Id 
           ORDER BY Support_Timestamp 
           ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) AS Last_Rep_Id
    

    It’s clear that the OLAP window consists of rows bounded by the Support_Id (PARTITION BY Support_Id) and the rows within the window are sorted by the support call’s timestamp. Further, because these functions are OLAP aggregation functions, the data they retrieve is from a row that is relative to the current row in relation to the current OLAP window.

    By default, this class of functions will operate on the rows from the start of the window through the current row, as though the following “window aggregation group clause” was specified:

    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    

    This default works fine for the FIRST_VALUE function because the first row of the window is always available for consideration for any given partition. However, the last row in the window is not available by this default. Therefore, a different “window aggregation group clause” is required to instruct LAST_VALUE that it needs to consider all rows from the current row through the end of the window as follows:

    ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
    

    Without this window aggregation group clause, the LAST_VALUE function returns the value from the current row. If this seems confusing, I don’t blame you and suggest you experiment with LAST_VALUE with and without this window aggregation group clause and watch how the behavior changes.

    nTH_VALUE

    The nTH_VALUE function is similar to FIRST_VALUE and LAST_VALUE with the exception that the requested value is extracted from row n, where n is the row number within the window relative to the start or the end of the window.

    Going back to the support query, say management concludes that knowing the rep ID for the third call in a long series of calls is pivotal to understanding how to improve customer service and shorten the number of calls. So, management requests for the rep ID of the third call in an incident to be included on the report as its own column. In this case, the nTH_VALUE function can be used to retrieve the third customer service rep ID in the chain of calls:

    SELECT c.*,
           First_Value(Rep_Id)
           OVER(PARTITION BY c.Support_Id 
           ORDER BY Support_Timestamp) 
           AS First_Rep_Id,
    
           Last_Value(Rep_Id)   
           OVER(PARTITION BY c.Support_Id 
           ORDER BY Support_Timestamp 
           ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) 
           AS Last_Rep_Id,
    
           nTH_Value(Rep_Id,3) FROM FIRST 
           OVER(PARTITION BY c.Support_Id 
           ORDER BY Support_Timestamp 
           ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) 
           AS "3rd_Rep_Id"
    
      FROM SupportIncident i
      JOIN SupportCall c ON i.Support_Id=c.Support_Id
    

    As shown in the example, nTH_VALUE requires a second parameter which is the row number within the OLAP window. A literal value of three is given, which means take the requested value (REP_ID) from the third row in the window. The default FROM FIRST indicates the function will get the third from the start of the window. The alternative option is FROM LAST, which would cause the function to get the third row from the end of the window. If there are less than three rows in the window, the function returns NULL.

    Based on the result set shown above, for support incident 1, the “3rd_Rep_Id” column will contain the value “8”, which is the Rep_Id in row 3.

    As with LAST_VALUE, nTH_VALUE also requires a non-default window aggregation group clause to make this function evaluate the third row over the entire window:

    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    

    Without this clause (keeping in mind that the default window is the first row through the current row), the nTH_VALUE function would not return a value until row three or later became the current row. In other words, the first two rows would contain a NULL value for the “3rd_Rep_Id” column.

    FIRST_VALUE, LAST_VALUE and nTH_VALUE can have an additional option of RESPECT NULLS (default) or IGNORE NULLS specified as the last argument. When IGNORE NULLS is specified, the requested OLAP function will only consider non-null values within the window:

    Last_Value(REMARKS) IGNORE NULLS
    OVER(PARTITION BY c.Support_Id 
    ORDER BY Support_Timestamp 
    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) AS Last_Remark
    

    In this case, LAST_VALUE will return the last non-null REMARKS column for the given support ID. For nTH_VALUE, using IGNORE NULLS will potentially cause the function’s behavior to change from retrieving data from an absolute position in the window to a relative row position that is based on whether NULLs are present.

    There’s more to tell. Look for the RATIO_TO_REPORT function, coming in my next article.

    Michael Sansoterra is a DBA for Broadway Systems in Grand Rapids, Michigan. Send your questions or comments for Mike via the IT Jungle Contact page.

    RELATED STORIES

    OLAP Aggregation Specification In DB2 For i 7.3

    LAG And LEAD Functions In DB2 for i 7.3

    Share this:

    • Reddit
    • Facebook
    • LinkedIn
    • Twitter
    • Email

    Tags:

    Sponsored by
    WorksRight Software

    Do you need area code information?
    Do you need ZIP Code information?
    Do you need ZIP+4 information?
    Do you need city name information?
    Do you need county information?
    Do you need a nearest dealer locator system?

    We can HELP! We have affordable AS/400 software and data to do all of the above. Whether you need a simple city name retrieval system or a sophisticated CASS postal coding system, we have it for you!

    The ZIP/CITY system is based on 5-digit ZIP Codes. You can retrieve city names, state names, county names, area codes, time zones, latitude, longitude, and more just by knowing the ZIP Code. We supply information on all the latest area code changes. A nearest dealer locator function is also included. ZIP/CITY includes software, data, monthly updates, and unlimited support. The cost is $495 per year.

    PER/ZIP4 is a sophisticated CASS certified postal coding system for assigning ZIP Codes, ZIP+4, carrier route, and delivery point codes. PER/ZIP4 also provides county names and FIPS codes. PER/ZIP4 can be used interactively, in batch, and with callable programs. PER/ZIP4 includes software, data, monthly updates, and unlimited support. The cost is $3,900 for the first year, and $1,950 for renewal.

    Just call us and we’ll arrange for 30 days FREE use of either ZIP/CITY or PER/ZIP4.

    WorksRight Software, Inc.
    Phone: 601-856-8337
    Fax: 601-856-9432
    Email: software@worksright.com
    Website: www.worksright.com

    Share this:

    • Reddit
    • Facebook
    • LinkedIn
    • Twitter
    • Email

    Sponsored Links

    System i Developer:  RPG & DB2 Summit - October 4-6 2016 in Chicago. Register now!
    COMMON:  Open Source and Systems Management at the COMMON Forum. August 24-25 in Chicago.
    OCEAN:  3 days of inspiration! 2016 IBM i Technical Conference, July 21-23, Costa Mesa, California

    Maxava Opens Up Bidding For iFoundation Grants A Second Look At SQL Descriptors

    Leave a Reply Cancel reply

Volume 16, Number 15 -- July 12, 2016
THIS ISSUE SPONSORED BY:

WorksRight Software
Valence Framework for IBM i
System i Developer

Table of Contents

  • A Generic Character Editing Routine
  • Formatting Dates With SQL, Take 3
  • New OLAP Aggregate Functions In DB2 for i, Part 1

Content archive

  • The Four Hundred
  • Four Hundred Stuff
  • Four Hundred Guru

Recent Posts

  • Meet The Next Gen Of IBMers Helping To Build IBM i
  • Looks Like IBM Is Building A Linux-Like PASE For IBM i After All
  • Will Independent IBM i Clouds Survive PowerVS?
  • Now, IBM Is Jacking Up Hardware Maintenance Prices
  • IBM i PTF Guide, Volume 27, Number 24
  • Big Blue Raises IBM i License Transfer Fees, Other Prices
  • Keep The IBM i Youth Movement Going With More Training, Better Tools
  • Remain Begins Migrating DevOps Tools To VS Code
  • IBM Readies LTO-10 Tape Drives And Libraries
  • IBM i PTF Guide, Volume 27, Number 23

Subscribe

To get news from IT Jungle sent to your inbox every week, subscribe to our newsletter.

Pages

  • About Us
  • Contact
  • Contributors
  • Four Hundred Monitor
  • IBM i PTF Guide
  • Media Kit
  • Subscribe

Search

Copyright © 2025 IT Jungle