Guru: Three Ways To Manage Unmatched Data

April 24, 2017 Ted Holt

Heaven forbid that I would ignore a failed RPG CHAIN (random read) operation. I always take appropriate action. Which action I take depends on the situation. The same applies to outer joins that don’t find matching data in a secondary table. Here are three ways to deal with unmatched data in an outer join using SQL.

To illustrate, let’s use three tables from an overly simplified general ledger system. The first is a table of departments into which the business is divided. The second is a chart of accounts. The third is a transaction file that feeds the general ledger. Those tables look like this:

select ID, Name from dept

ID	NAME
1	Accounting
2	Sales
3	Shipping
4	Receiving

select ID, Name from acct

ID	NAME
1000	Cash in bank
1010	Petty cash
1020	Accounts receivable
2000	Accounts payable
2010	Notes payable
3000	Owner’s equity
4000	Sales
5000	Salaries
5010	Employee benefits
5020	Taxes
5030	Office expenses

select * from glxacts

ID	DATE	DEPT	ACCOUNT	AMOUNT
1	2017-04-24	1	5030	100.00
2	2017-04-24	3	5030	150.00
3	2017-04-24	5	5090	300.00
4	2017-04-24	2	5030	200.00
5	2017-04-24	1	5000	1,000.00
6	2017-04-24	2	5000	2,000.00
7	2017-04-24	3	5000	1,500.00
8	2017-04-24	4	5000	1,250.00

Notice that the department ID and account number in transaction 3 do not match any rows in the department and chart of accounts tables.

Method 1: Deal With Null Values

By default, the database manager returns nulls for the columns of a secondary table when no row matches a row in the primary table. Here’s such a query:

select t.ID, t.date,
       t.dept, d.name as Department,
       t.account, a.name as Account,
       t.amount
  from glxacts as t
  left join dept as d on t.dept = d.ID
  left join acct as a on t.account = a.ID

ID	DATE	DEPT	DEPARTMENT	ACCOUNT	ACCOUNT	AMOUNT
1	2017-04-24	1	Accounting	5030	Office expenses	100.00
2	2017-04-24	3	Shipping	5030	Office expenses	150.00
3	2017-04-24	5	null	5090	null	300.00
4	2017-04-24	2	Sales	5030	Office expenses	200.00
5	2017-04-24	1	Accounting	5000	Salaries	1,000.00
6	2017-04-24	2	Sales	5000	Salaries	2,000.00
7	2017-04-24	3	Shipping	5000	Salaries	1,500.00
8	2017-04-24	4	Receiving	5000	Salaries	1,250.00

The department name and account name are null for row 3 in the result set. There is nothing wrong with this. You must consider that the DEPARTMENT and ACCOUNT columns may have null values, and program accordingly. RPG programs require the use of null indicator variables, which are five-digit integers. A negative value means that the corresponding column is null.

Method 2: Fill In Default Values

Dealing with nulls can be a nuisance. You can use the COALESCE, VALUE, and IFNULL functions to convert nulls to non-null values. These three functions return the first non-null value in the list of arguments (parameters). If all of the arguments are null, these functions return the null value. I prefer COALESCE because it is the most standard of the three and because it can accept more than two arguments.

select t.ID, t.date,
       t.dept, coalesce(d.name, '**INVALID**') as Department,
       t.account, coalesce(a.name, '**INVALID**') as Account,
       t.amount
  from glxacts as t
  left join dept as d on t.dept = d.ID
  left join acct as a on t.account = a.ID

ID	DATE	DEPT	DEPARTMENT	ACCOUNT	ACCOUNT	AMOUNT
1	2017-04-24	1	Accounting	5030	Office expenses	100.00
2	2017-04-24	3	Shipping	5030	Office expenses	150.00
3	2017-04-24	5	INVALID	5090	INVALID	300.00
4	2017-04-24	2	Sales	5030	Office expenses	200.00
5	2017-04-24	1	Accounting	5000	Salaries	1,000.00
6	2017-04-24	2	Sales	5000	Salaries	2,000.00
7	2017-04-24	3	Shipping	5000	Salaries	1,500.00
8	2017-04-24	4	Receiving	5000	Salaries	1,250.00

Method 3: Raise An Error

You may tell a query to cancel itself when it finds unmatched data. After all, why return multitudes of rows if the result set will be unusable?

select t.ID, t.date,
       t.dept, coalesce(d.name,
                  raise_error('88001',
                   'Bad department ' concat varchar(t.dept) concat
                   ' for ID '        concat varchar(t.ID )))      
               as Department,
       t.account, coalesce(a.name,
                     raise_error('88002',
                     'Bad account ' concat varchar(t.account) concat
                     ' for ID '     concat varchar(t.ID )))
               as Account,
       t.amount
  from glxacts as t
  left join dept as d on t.dept = d.ID
  left join acct as a on t.account = a.ID

The RAISE_ERROR function forces an SQL statement to end abnormally. It has two arguments: the SQL state to be returned and the message text to describe the error. I have placed this function as the second argument in the COALESCE functions. If the department column of the transaction table matches the department ID in at least one row of the departments table, the result set contains the department name. Otherwise, the database manager cancels the query with a SQL state value of 88001. An unmatched account number cancels with SQL state 88002. The SQL code value in those cases is -438. In the job log you’ll see messages CPF503E:

User-defined function error on member GLXACTS. The external program or service program returned SQLSTATE 88001. The text message returned from the program is: Bad department 5 for ID 3.

And SQL0438:

Message Bad department 5 for ID 3 returned from SIGNAL, RESIGNAL, or RAISE_ERROR.

These are all good ways to deal with unmatched data. It’s good to have options.

This Issue Sponsored By

Table of Contents

Content archive

Recent Posts

Subscribe

Pages

Search