• The Four Hundred
  • Subscribe
  • Media Kit
  • Contributors
  • About Us
  • Contact
Menu
  • The Four Hundred
  • Subscribe
  • Media Kit
  • Contributors
  • About Us
  • Contact
  • Guru: When An Outer Join Is An Inner Join

    February 15, 2021 Ted Holt

    When is a boy not a boy? When he’s abed! When is a door not a door? When it’s ajar! When is an outer join not an outer join? (Sorry, no dad joke here. Three dad jokes in one paragraph would have been too many, don’t you agree?) Let me answer that last question.

    In my work I often see outer joins that are not really outer joins, but inner joins. Oh, based on what I’ve heard from IBM, the query engine may treat them as outer joins, but the result set is the same as that produced by an inner join. This occurs when a query tests a column (field) from a secondary table or view (database file) for a non-null value.

    Before I show you an example, let’s look at some data. Here’s an invoice header table:

    INVOICE COMPANY CUSTNBR INVDATE
    101 1 50 2021-02-05
    102 1 33 2021-02-05
    103 2 70 2021-02-06
    104 2 53 2021-02-06

    And here’s the corresponding detail table:

    INVOICE LINE ITEM PRICE QTY
    101 1 AB-205 3.00 2
    101 2 DD-762 5.00 4
    102 1 TR-911 2.00 2
    103 1 JP-310 6.00 3
    103 2 AB-205 1.00 2
    101 3 Null null null

    Here’s the type of query I was referring to. There is a left outer join from the header to the detail, and the WHERE clause references the ITEM column, which is stored in the details.

    select h.invoice, h.company, h.CustNbr, h.InvDate,
           d.line, d.item, d.qty
      from invhdr as h
      left join invdtl as d
        on h.invoice = d.invoice
     where d.item = 'AB-205';
    

    Here’s the result set.

    INVOICE COMPANY CUSTNBR INVDATE LINE ITEM QTY
    101 1 50 2021-02-05 1 AB-205 2
    103 2 70 2021-02-06 2 AB-205 2

    Since the query specifically asks for an outer join, we’d expect to see invoices 101 through 104 inclusive, but instead we see the same results we’d seen had we used an inner join.

    So, which is it — an inner join or an outer join? And does it matter, as long as the query produces the correct results? I can’t speak for you, but it matters to me.

    First, there’s the matter of self-documentation, the idea that a human being should be able to read source code and understand what it’s doing and possibly even why it’s doing it. Testing a secondary column in an outer join is instant confusion. Did the developer who wrote that statement want an outer join? Did the developer think he wanted an outer join? Did the developer think?

    Second, there’s the matter of determining what the correct results should be. In many shops, the test data is a copy of the live data, sometimes with identifiable data changed, sometimes not. In some shops, the test data is the live data. There are two glaring problems with such test data: (1) It is rarely comprehensive, and therefore does not exercise all paths through the software and (2) there is usually far too much of it. Test data sets should be small enough that the output can be verified easily.

    It’s possible that the programmer who wrote this SELECT did a poor job of testing. The programmer may have run a quick test against the production database, saw that the numbers matched a screen or report, and moved along to the next fire.

    It’s also possible that the programmer did want an outer join, but only wanted secondary data for item AB-205. I wrote about that type of query in my younger days. If so, this statement definitely has a bug.

    As I wrote a few months ago, certain programming practices raise a little flag when I see them. Testing for non-null values with outer joins is another such practice. One good way to help the flag go up is to use correlation names. Seeing d. in front of item increases the chances that you’ll notice that the test is for a column in a secondary table.

    I should also mention that testing for null is a different story. An inner join returns only those rows where the secondary-table column is null.

    select h.invoice, h.company, h.CustNbr, h.InvDate,
           d.line, d.item, d.qty
      from invhdr as h
      join invdtl as d
        on h.invoice = d.invoice
     where d.item is null;
    
    INVOICE COMPANY CUSTNBR INVDATE LINE ITEM QTY
    101 1 50 2021-02-05 3 null null

    The outer join returns the rows with a null secondary column as well as null values for unmatched rows.

    select h.invoice, h.company, h.CustNbr, h.InvDate,
           d.line, d.item, d.qty
      from invhdr as h
      left join invdtl as d
        on h.invoice = d.invoice
     where d.item is null;
    
    INVOICE COMPANY CUSTNBR INVDATE LINE ITEM QTY
    101 1 50 2021-02-05 3 null null
    104 2 53 2021-02-06 null null null

    Effective, reliable code is not an accident. In the words of Sergeant Phil Esterhaus, “Let’s be careful out there.”

    RELATED STORIES

    On Vs. Where

    Guru: DISTINCT Can Hide A Performance Problem

    Let’s be careful out there

    Share this:

    • Reddit
    • Facebook
    • LinkedIn
    • Twitter
    • Email

    Tags: Tags: 400guru, FHG, Four Hundred Guru, IBM i

    Sponsored by
    Raz-Lee Security

    Start your Road to Zero Trust!

    Firewall Network security, controlling Exit Points, Open DB’s and SSH. Rule Wizards and graphical BI.

    Request Demo

    Share this:

    • Reddit
    • Facebook
    • LinkedIn
    • Twitter
    • Email

    Weighing The Hidden Costs Of Open Source A Conversation With Carol Woodbury on IBM i Security

    Leave a Reply Cancel reply

TFH Volume: 31 Issue: 12

This Issue Sponsored By

  • ARCAD Software
  • Precisely
  • New Generation Software
  • Trinity Guard
  • WorksRight Software

Table of Contents

  • Big Blue Rolls Out Red Hat Power Stack
  • A Conversation With Carol Woodbury on IBM i Security
  • Guru: When An Outer Join Is An Inner Join
  • Weighing The Hidden Costs Of Open Source
  • What Does This Year Look Like For IT Spending?

Content archive

  • The Four Hundred
  • Four Hundred Stuff
  • Four Hundred Guru

Recent Posts

  • Liam Allan Shares What’s Coming Next With Code For IBM i
  • From Stable To Scalable: Visual LANSA 16 Powers IBM i Growth – Launching July 8
  • VS Code Will Be The Heart Of The Modern IBM i Platform
  • The AS/400: A 37-Year-Old Dog That Loves To Learn New Tricks
  • IBM i PTF Guide, Volume 27, Number 25
  • Meet The Next Gen Of IBMers Helping To Build IBM i
  • Looks Like IBM Is Building A Linux-Like PASE For IBM i After All
  • Will Independent IBM i Clouds Survive PowerVS?
  • Now, IBM Is Jacking Up Hardware Maintenance Prices
  • IBM i PTF Guide, Volume 27, Number 24

Subscribe

To get news from IT Jungle sent to your inbox every week, subscribe to our newsletter.

Pages

  • About Us
  • Contact
  • Contributors
  • Four Hundred Monitor
  • IBM i PTF Guide
  • Media Kit
  • Subscribe

Search

Copyright © 2025 IT Jungle