• The Four Hundred
  • Subscribe
  • Media Kit
  • Contributors
  • About Us
  • Contact
Menu
  • The Four Hundred
  • Subscribe
  • Media Kit
  • Contributors
  • About Us
  • Contact
  • Distinctly Speeding Up DISTINCT

    May 26, 2010 Ted Holt

    The DISTINCT keyword is a great example of the power of the SQL SELECT statement. Add this powerful word to any SELECT command and voilà! Duplicate rows magically disappear. Comments from readers have alerted me to the fact that some people find use of this feature confusing. Here are a brief discussion of DISTINCT and a performance tip.

    Given a table (physical file) of sales order information and a table of customer information, how do I go about finding the account numbers of customers who have orders in the database? We could try this:

    select s.companyno, s.customerno
      from salesordh as s
     order by 1, 2
    

    And we would get something like this:

    COMPANYNO  CUSTOMERNO
         1       34567
         1       34567
         1       45678
         1       56789
         1       77777
         2       12345
         2       23456
         2       56789
    

    Marvelous, but do I really need to have each customer listed once per order? In this short example, only one customer—34567–has more than one order. In a production database, containing hundreds or thousands of orders, there could be many duplicates. We can eliminate duplicates by adding the word DISTINCT:

    select distinct s.companyno, s.customerno
      from salesordh as s
     order by 1, 2
    

    And we would get this instead:

    COMPANYNO  CUSTOMERNO
         1       34567
         1       45678
         1       56789
         1       77777
         2       12345
         2       23456
         2       56789
    

    Let’s take it a step farther. Suppose I also want the customer’s name. For that, I must go to the customer master file. Here’s the same query, but I’ve added the customer master file.

    select distinct s.companyno, s.customerno, c.cusnam
      from salesordh as s
      join customers as c
        on c.companyno = s.companyno
       and c.customerno = s.customerno
     order by 1, 2
    

    This works. I get this:

    COMPANYNO  CUSTOMERNO  CUSNAM
         1       34567     Polly Unsaturated
         1       45678     Molly Coddle
         1       56789     R. U. Furreal
         1       77777     Herman Nootix
         2       12345     Cal E. Phornya
         2       23456     Billy Rubin
         2       56789     Sally Mander
    

    But notice something. Notice that all three selected fields are in the customer master file. Notice also that two of the fields–company number and customer number–are the key fields that uniquely identify each customer. In other words, why join, which creates multiple rows for each customer, then throw away duplicates? It’s unnecessary. Here’s the rewritten query:

    select c.companyno, c.customerno, c.cusnam
      from customers as c
     where exists
        (select * 
           from salesordh as s
          where s.companyno  = c.companyno
            and s.customerno = c.customerno)
     order by 1, 2
    

    DISTINCT is gone. The main query reads CUSTOMERS only. The sales order table has been relegated to a subquery. The resulting data set is the same, but performance should be better.

    To sum it up, use DISTINCT when you’re retrieving data from one table (or view) only. When two or more tables and/or views are involved, and all selected columns are from one table, you can usually get a performance advantage by converting the query to include a subquery and throwing away DISTINCT.



                         Post this story to del.icio.us
                   Post this story to Digg
        Post this story to Slashdot

    Share this:

    • Reddit
    • Facebook
    • LinkedIn
    • Twitter
    • Email

    Tags:

    Sponsored by
    DRV Tech

    Get More Out of Your IBM i

    With soaring costs, operational data is more critical than ever. IBM shops need faster, easier ways to distribute IBM applications-based data to users more efficiently, no matter where they are.

    The Problem:

    For Users, IBM Data Can Be Difficult to Get To

    IBM Applications generate reports as spooled files, originally designed to be printed. Often those reports are packed together with so much data it makes them difficult to read. Add to that hardcopy is a pain to distribute. User-friendly formats like Excel and PDF are better, offering sorting, searching, and easy portability but getting IBM reports into these formats can be tricky without the right tools.

    The Solution:

    IBM i Reports can easily be converted to easy to read and share formats like Excel and PDF and Delivered by Email

    Converting IBM i, iSeries, and AS400 reports into Excel and PDF is now a lot easier with SpoolFlex software by DRV Tech.  If you or your users are still doing this manually, think how much time is wasted dragging and reformatting to make a report readable. How much time would be saved if they were automatically formatted correctly and delivered to one or multiple recipients.

    SpoolFlex converts spooled files to Excel and PDF, automatically emailing them, and saving copies to network shared folders. SpoolFlex converts complex reports to Excel, removing unwanted headers, splitting large reports out for individual recipients, and delivering to users whether they are at the office or working from home.

    Watch our 2-minute video and see DRV’s powerful SpoolFlex software can solve your file conversion challenges.

    Watch Video

    DRV Tech

    www.drvtech.com

    866.378.3366

    Share this:

    • Reddit
    • Facebook
    • LinkedIn
    • Twitter
    • Email

    Sponsored Links

    IBS:  Free e-book: The Six Margin Killers in Wholesale Distribution
    WorksRight Software:  ZIP codes, area codes, Canadian postal codes, CASS certification, and more
    COMMON:  Join us at the Fall 2010 Conference & Expo, Oct. 4 - 6, in San Antonio, Texas

    IT Jungle Store Top Book Picks

    Easy Steps to Internet Programming for AS/400, iSeries, and System i: List Price, $49.95
    The iSeries Express Web Implementer's Guide: List Price, $49.95
    The System i RPG & RPG IV Tutorial and Lab Exercises: List Price, $59.95
    The System i Pocket RPG & RPG IV Guide: List Price, $69.95
    The iSeries Pocket Database Guide: List Price, $59.00
    The iSeries Pocket SQL Guide: List Price, $59.00
    The iSeries Pocket Query Guide: List Price, $49.00
    The iSeries Pocket WebFacing Primer: List Price, $39.00
    Migrating to WebSphere Express for iSeries: List Price, $49.00
    Getting Started With WebSphere Development Studio Client for iSeries: List Price, $89.00
    Getting Started with WebSphere Express for iSeries: List Price, $49.00
    Can the AS/400 Survive IBM?: List Price, $49.00
    Chip Wars: List Price, $29.95

    Saudi Distributor Taps VAI for Logistics Improvement nuBridges Eases i/OS Integration for Tokenized Data

    Leave a Reply Cancel reply

Volume 10, Number 17 -- May 26, 2010
THIS ISSUE SPONSORED BY:

WorksRight Software
SEQUEL Software
Botz & Associates, Inc.

Table of Contents

  • Five Steps To Monitoring Your Server Log on IBM i
  • Distinctly Speeding Up DISTINCT
  • Hey! What Happened To My Last Used Dates

Content archive

  • The Four Hundred
  • Four Hundred Stuff
  • Four Hundred Guru

Recent Posts

  • Meet The Next Gen Of IBMers Helping To Build IBM i
  • Looks Like IBM Is Building A Linux-Like PASE For IBM i After All
  • Will Independent IBM i Clouds Survive PowerVS?
  • Now, IBM Is Jacking Up Hardware Maintenance Prices
  • IBM i PTF Guide, Volume 27, Number 24
  • Big Blue Raises IBM i License Transfer Fees, Other Prices
  • Keep The IBM i Youth Movement Going With More Training, Better Tools
  • Remain Begins Migrating DevOps Tools To VS Code
  • IBM Readies LTO-10 Tape Drives And Libraries
  • IBM i PTF Guide, Volume 27, Number 23

Subscribe

To get news from IT Jungle sent to your inbox every week, subscribe to our newsletter.

Pages

  • About Us
  • Contact
  • Contributors
  • Four Hundred Monitor
  • IBM i PTF Guide
  • Media Kit
  • Subscribe

Search

Copyright © 2025 IT Jungle