• The Four Hundred
  • Subscribe
  • Media Kit
  • Contributors
  • About Us
  • Contact
Menu
  • The Four Hundred
  • Subscribe
  • Media Kit
  • Contributors
  • About Us
  • Contact
  • Guru: A Simple, Effective Way To Compare Data Using SQL

    April 20, 2020 Ted Holt

    One thing I have learned over the years is that no matter how much technology changes, people always ask the same questions like: “Why does the computer say we have 50 widgets when we only have 45?” and “Can you tell me who changed the due date of order number 12345?” and “How do sales of widgets this year compare to the same time last year?” It is the last of these that I wish to address today.

    Comparing data is a very common activity. Everybody compares data: sales people, production people, purchasing people, and of course management people. This sort of thing is the reason people set up data warehouses. Fortunately, there is a good formula for answering comparison questions using SQL.

    To compare data, you need five things:

    • two or more data sets
    • a full outer join
    • the COALESCE function
    • a selection of columns
    • an ORDER BY clause

    Let’s look at each one in turn. But first, some illustrative data:

    create table saleshist
      (key        dec(3)    primary key,
       custid     dec(3),
       datesold   date,
       item       char(5),
       quantity   dec(3),
       price      dec(5,2));
       
    insert into saleshist
    (key, custid, datesold, item, quantity, price)
     values
    (  1, 123, '2019-03-01', 'A15', 1, 3.00),
    (  2, 456, '2019-03-01', 'A15', 2, 1.50),
    (  3, 456, '2019-03-05', 'A15', 3, 1.50),
    (  4, 456, '2019-03-05', 'X22', 1, 1.00),
    (  5, 456, '2020-02-08', 'A15', 2, 2.00),
    (  6, 456, '2020-03-08', 'A15', 2, 2.00),
    (  7, 333, '2020-03-03', 'A15', 4, 1.50),
    (  8, 567, '2019-03-10', 'A15', 2, 1.50),
    (  9, 567, '2019-04-01', 'A15', 1, 1.50),
    ( 10, 567, '2020-03-01', 'D44', 2, 2.00);
    

    Suppose you’ve been asked to compare sales of item A15 by customer for March of this year and March of last year. Where do you begin? You begin with two data sets. They may come from two separate tables or sets of tables. They may come from two views. They may be two SELECTS over the same table. They come from wherever the data is stored.

    In my example, the data for both years is stored in the sales history table, SALESHIST, so I use two common table expressions, SALES2019 and SALES2020.

    with sales2019 as
       (select custid, sum(quantity) as qty,
               sum(quantity * price) as sales
          from saleshist
         where item = 'A15'
           and datesold between '2019-03-01' and '2019-03-31'
         group by custid),
    sales2020 as     
       (select custid, sum(quantity) as qty,
               sum(quantity * price) as sales
          from saleshist
         where item = 'A15'
           and datesold between '2020-03-01' and '2020-03-31'
         group by custid)
    . . . more to come ! ! ! . . .
    

    Comparing data means that the two data sets need to be joined. A customer might have bought item A15 in only one year or in both years. This means that we must use a full outer join to get all the data.

    select . . . 
      from sales2019 as s19
      full outer join sales2020 as s20
        on s19.custid = s20.custid     
    

    That’s fabulous, but it doesn’t give the user any information. Let’s begin by listing all the summary columns. In this example, the only summary column is customer ID. But which customer ID do we get—the one from SALES2019 or the one from SALES2020? The only logical answer, of course, is “Yes!” Some customers bought item A15 both years. Others bought in 2019 or 2020, but not both years. The COALESCE function gives us the customer number no matter what.

    select coalesce(s20.custid, s19.custid) as custid,
           . . . more columns . . .
      from sales2019 as s19
      full outer join sales2020 as s20
        on s19.custid = s20.custid
    

    COALESCE returns the first non-null value in the list. We get the customer number whether it’s found in one data set or both data sets. Be sure to use COALESCE over all of the summary columns.

    So far, so good, but the user needs sales figures — how many each customer bought and how much money they paid us.

    select coalesce(s20.custid, s19.custid) as custid,
           s20.qty as "Qty 2020", s19.qty as "Qty 2019",
           s20.sales as "Sales 2020", s19.sales as "Sales 2019"
      from sales2019 as s19
      full outer join sales2020 as s20
        on s19.custid = s20.custid
    

    I’ve grouped two the two quantity fields and the two sales figures, but how you group the columns depends on the task. Most likely you’ll give the output of such queries to people in Excel, and they’ll rearrange them as they want them.

    The last thing to do is to present the result set in a logical sequence.

      order by 1
    

    Use the ORDER BY clause to sort the data any way you need to. The sequence will often be the summary columns, but that’s not necessarily so. In this example, ORDER BY 1 means to sort the result set on the first column, which is the customer ID.

    Putting it all together, we have this query:

    with sales2019 as
       (select custid, sum(quantity) as qty,
               sum(quantity * price) as sales
          from saleshist
         where item = 'A15'
           and datesold between '2019-03-01' and '2019-03-31'
         group by custid),
    sales2020 as     
       (select custid, sum(quantity) as qty,
               sum(quantity * price) as sales
          from saleshist
         where item = 'A15'
           and datesold between '2020-03-01' and '2020-03-31'
         group by custid)
    select coalesce(s20.custid, s19.custid) as custid,
           s20.qty as "Qty 2020", s19.qty as "Qty 2019",
           s20.sales as "Sales 2020", s19.sales as "Sales 2019"
      from sales2019 as s19
      full outer join sales2020 as s20
        on s19.custid = s20.custid     
      order by 1
    

    And this result set:

    CUSTID Qty 2020 Qty 2019 Sales 2020 Sales 2019
    123 – 1 – 3.00
    333 4 – 6.00 –
    456 2 5 4.00 7.50
    567 – 2 – 3.00

    It’s a very simple formula, as simple as a pencil, which also has five components. Don’t let the simplicity fool you — it’s a powerful tool to have in your toolbox.

    Share this:

    • Reddit
    • Facebook
    • LinkedIn
    • Twitter
    • Email

    Tags: Tags: 400guru, FHG, Four Hundred Guru, IBM i, SQL

    Sponsored by
    iTech Solutions

    Choose Your Own IBM i OS Upgrade Adventure

    Choice 1:

    • Plan for 3 months
    • Check hardware & software compatibility
    • Check Lan Console, MQ, Domino, SMB, Ciphers, WebSphere, Java
    • Test, test, and test again
    • Prepare for potential downtime
    • Hope the OS Upgrade goes smoothly
     

    Choice 2:

    • Strategically plan alongside a team of IBM i experts
    • Work with experienced system admins to ensure hardware & software compatibility
    • Receive full analysis of Lan Console, MQ, Domino, SMB, Ciphers, WebSphere, Java
    • Know which PTFs are required for the upgrade
    • Relax, and let iTech handle the rest

    Not every OS Upgrade has to be an adventure. We make the process easy for you.

    Having completed thousands of upgrades, we have the experience, know-how, and expertise to get the job done seamlessly. We know what can go wrong, what to plan for, and can act quickly if problems arise.

    No matter where you are in your journey, we’re here to help. Take a look at the video below to ensure you’re on the right path when it comes to your next IBM i OS Upgrade.

    [Video] What You Need to Know to Successfully Upgrade to IBM i 7.4 and 7.5

    An IBM i OS upgrade isn’t complete until all the boxes are checked. Is your list up to date?

    In this video, Pete Massiello covers what’s new on IBM i 7.5, planning tips, pre-requisites, and post-installation requirements for a successful OS Upgrade.

    [ Watch Now ]

    Share this:

    • Reddit
    • Facebook
    • LinkedIn
    • Twitter
    • Email

    COVID-19 Response: Manta Cuts Training Prices, Offers Freebies Database Enhancements Galore In Technology Refresh

    One thought on “Guru: A Simple, Effective Way To Compare Data Using SQL”

    • slashsplat says:
      April 28, 2020 at 2:35 pm

      Great, simple idea, well presented. Thanks.

      Reply

    Leave a Reply Cancel reply

TFH Volume: 30 Issue: 26

This Issue Sponsored By

  • Fresche Solutions
  • ProData Computer Services
  • Chordia Consulting
  • WorksRight Software
  • Raz-Lee Security

Table of Contents

  • IBM Grants Amnesty On Software Maintenance After License Charges
  • Database Enhancements Galore In Technology Refresh
  • Guru: A Simple, Effective Way To Compare Data Using SQL
  • COVID-19 Response: Manta Cuts Training Prices, Offers Freebies
  • IBM i PTF Guide, Volume 22, Number 16

Content archive

  • The Four Hundred
  • Four Hundred Stuff
  • Four Hundred Guru

Recent Posts

  • IBM i 7.3 TR12: The Non-TR Tech Refresh
  • IBM i Integration Elevates Operational Query and Analytics
  • Simplified IBM i Stack Bundling Ahead Of Subscription Pricing
  • More Price Hikes From IBM, Now For High End Storage
  • Big Blue Readies Power10 And IBM i 7.5 Training for Partners
  • IBM Delivers More Out-of-the-Box Security with IBM i 7.5
  • Groundhog Day For Malware
  • IBM i Community Reacts to IBM i 7.5
  • Four Hundred Monitor, May 11
  • IBM i PTF Guide, Volume 24, Number 19

Subscribe

To get news from IT Jungle sent to your inbox every week, subscribe to our newsletter.

Pages

  • About Us
  • Contact
  • Contributors
  • Four Hundred Monitor
  • IBM i PTF Guide
  • Media Kit
  • Subscribe

Search

Copyright © 2022 IT Jungle

loading Cancel
Post was not sent - check your email addresses!
Email check failed, please try again
Sorry, your blog cannot share posts by email.