• The Four Hundred
  • Subscribe
  • Media Kit
  • Contributors
  • About Us
  • Contact
Menu
  • The Four Hundred
  • Subscribe
  • Media Kit
  • Contributors
  • About Us
  • Contact
  • Guru: When An Outer Join Is An Inner Join

    February 15, 2021 Ted Holt

    When is a boy not a boy? When he’s abed! When is a door not a door? When it’s ajar! When is an outer join not an outer join? (Sorry, no dad joke here. Three dad jokes in one paragraph would have been too many, don’t you agree?) Let me answer that last question.

    In my work I often see outer joins that are not really outer joins, but inner joins. Oh, based on what I’ve heard from IBM, the query engine may treat them as outer joins, but the result set is the same as that produced by an inner join. This occurs when a query tests a column (field) from a secondary table or view (database file) for a non-null value.

    Before I show you an example, let’s look at some data. Here’s an invoice header table:

    INVOICE COMPANY CUSTNBR INVDATE
    101 1 50 2021-02-05
    102 1 33 2021-02-05
    103 2 70 2021-02-06
    104 2 53 2021-02-06

    And here’s the corresponding detail table:

    INVOICE LINE ITEM PRICE QTY
    101 1 AB-205 3.00 2
    101 2 DD-762 5.00 4
    102 1 TR-911 2.00 2
    103 1 JP-310 6.00 3
    103 2 AB-205 1.00 2
    101 3 Null null null

    Here’s the type of query I was referring to. There is a left outer join from the header to the detail, and the WHERE clause references the ITEM column, which is stored in the details.

    select h.invoice, h.company, h.CustNbr, h.InvDate,
           d.line, d.item, d.qty
      from invhdr as h
      left join invdtl as d
        on h.invoice = d.invoice
     where d.item = 'AB-205';
    

    Here’s the result set.

    INVOICE COMPANY CUSTNBR INVDATE LINE ITEM QTY
    101 1 50 2021-02-05 1 AB-205 2
    103 2 70 2021-02-06 2 AB-205 2

    Since the query specifically asks for an outer join, we’d expect to see invoices 101 through 104 inclusive, but instead we see the same results we’d seen had we used an inner join.

    So, which is it — an inner join or an outer join? And does it matter, as long as the query produces the correct results? I can’t speak for you, but it matters to me.

    First, there’s the matter of self-documentation, the idea that a human being should be able to read source code and understand what it’s doing and possibly even why it’s doing it. Testing a secondary column in an outer join is instant confusion. Did the developer who wrote that statement want an outer join? Did the developer think he wanted an outer join? Did the developer think?

    Second, there’s the matter of determining what the correct results should be. In many shops, the test data is a copy of the live data, sometimes with identifiable data changed, sometimes not. In some shops, the test data is the live data. There are two glaring problems with such test data: (1) It is rarely comprehensive, and therefore does not exercise all paths through the software and (2) there is usually far too much of it. Test data sets should be small enough that the output can be verified easily.

    It’s possible that the programmer who wrote this SELECT did a poor job of testing. The programmer may have run a quick test against the production database, saw that the numbers matched a screen or report, and moved along to the next fire.

    It’s also possible that the programmer did want an outer join, but only wanted secondary data for item AB-205. I wrote about that type of query in my younger days. If so, this statement definitely has a bug.

    As I wrote a few months ago, certain programming practices raise a little flag when I see them. Testing for non-null values with outer joins is another such practice. One good way to help the flag go up is to use correlation names. Seeing d. in front of item increases the chances that you’ll notice that the test is for a column in a secondary table.

    I should also mention that testing for null is a different story. An inner join returns only those rows where the secondary-table column is null.

    select h.invoice, h.company, h.CustNbr, h.InvDate,
           d.line, d.item, d.qty
      from invhdr as h
      join invdtl as d
        on h.invoice = d.invoice
     where d.item is null;
    
    INVOICE COMPANY CUSTNBR INVDATE LINE ITEM QTY
    101 1 50 2021-02-05 3 null null

    The outer join returns the rows with a null secondary column as well as null values for unmatched rows.

    select h.invoice, h.company, h.CustNbr, h.InvDate,
           d.line, d.item, d.qty
      from invhdr as h
      left join invdtl as d
        on h.invoice = d.invoice
     where d.item is null;
    
    INVOICE COMPANY CUSTNBR INVDATE LINE ITEM QTY
    101 1 50 2021-02-05 3 null null
    104 2 53 2021-02-06 null null null

    Effective, reliable code is not an accident. In the words of Sergeant Phil Esterhaus, “Let’s be careful out there.”

    RELATED STORIES

    On Vs. Where

    Guru: DISTINCT Can Hide A Performance Problem

    Let’s be careful out there

    Share this:

    • Reddit
    • Facebook
    • LinkedIn
    • Twitter
    • Email

    Tags: Tags: 400guru, FHG, Four Hundred Guru, IBM i

    Sponsored by
    WorksRight Software

    Do you need area code information?
    Do you need ZIP Code information?
    Do you need ZIP+4 information?
    Do you need city name information?
    Do you need county information?
    Do you need a nearest dealer locator system?

    We can HELP! We have affordable AS/400 software and data to do all of the above. Whether you need a simple city name retrieval system or a sophisticated CASS postal coding system, we have it for you!

    The ZIP/CITY system is based on 5-digit ZIP Codes. You can retrieve city names, state names, county names, area codes, time zones, latitude, longitude, and more just by knowing the ZIP Code. We supply information on all the latest area code changes. A nearest dealer locator function is also included. ZIP/CITY includes software, data, monthly updates, and unlimited support. The cost is $495 per year.

    PER/ZIP4 is a sophisticated CASS certified postal coding system for assigning ZIP Codes, ZIP+4, carrier route, and delivery point codes. PER/ZIP4 also provides county names and FIPS codes. PER/ZIP4 can be used interactively, in batch, and with callable programs. PER/ZIP4 includes software, data, monthly updates, and unlimited support. The cost is $3,900 for the first year, and $1,950 for renewal.

    Just call us and we’ll arrange for 30 days FREE use of either ZIP/CITY or PER/ZIP4.

    WorksRight Software, Inc.
    Phone: 601-856-8337
    Fax: 601-856-9432
    Email: software@worksright.com
    Website: www.worksright.com

    Share this:

    • Reddit
    • Facebook
    • LinkedIn
    • Twitter
    • Email

    Weighing The Hidden Costs Of Open Source A Conversation With Carol Woodbury on IBM i Security

    Leave a Reply Cancel reply

TFH Volume: 31 Issue: 12

This Issue Sponsored By

  • ARCAD Software
  • Precisely
  • New Generation Software
  • Trinity Guard
  • WorksRight Software

Table of Contents

  • Big Blue Rolls Out Red Hat Power Stack
  • A Conversation With Carol Woodbury on IBM i Security
  • Guru: When An Outer Join Is An Inner Join
  • Weighing The Hidden Costs Of Open Source
  • What Does This Year Look Like For IT Spending?

Content archive

  • The Four Hundred
  • Four Hundred Stuff
  • Four Hundred Guru

Recent Posts

  • The Power11 Transistor Count Discrepancies Explained – Sort Of
  • Is Your IBM i HA/DR Actually Tested – Or Just Installed?
  • Big Blue Delivers IBM i Customer Requests In ACS Update
  • New DbToo SDK Hooks RPG And Db2 For i To External Services
  • IBM i PTF Guide, Volume 27, Number 33
  • Tool Aims To Streamline Git Integration For Old School IBM i Devs
  • IBM To Add Full System Replication And FlashCopy To PowerHA
  • Guru: Decoding Base64 ASCII
  • The Price Tweaking Continues For Power Systems
  • IBM i PTF Guide, Volume 27, Numbers 31 And 32

Subscribe

To get news from IT Jungle sent to your inbox every week, subscribe to our newsletter.

Pages

  • About Us
  • Contact
  • Contributors
  • Four Hundred Monitor
  • IBM i PTF Guide
  • Media Kit
  • Subscribe

Search

Copyright © 2025 IT Jungle