• The Four Hundred
  • Subscribe
  • Media Kit
  • Contributors
  • About Us
  • Contact
Menu
  • The Four Hundred
  • Subscribe
  • Media Kit
  • Contributors
  • About Us
  • Contact
  • Guru: MERGE, Chicken, And Eggs

    September 30, 2019 Ted Holt

    Which came first: the chicken or the egg? I don’t have time to ponder such trivialities. However, I am glad to know that SQL has a way to help me with chicken-and-egg database updates, i.e., when two statements need to run but each politely needs for the other to go first.

    Suppose you support an IBM i system that keeps up with inventory. It has an item master table (physical file) that stores general information such as a description, the standard cost, and the list price of an item.

    create table ItemMaster
      (ItemNumber     char(6),
       Revision       dec(3),
       Description    char(20),
       Cost           dec(5,2),
       Price          dec(5,2),
       EffectiveFrom  date  not null,
       EffectiveThru  date  not null with default '9999-12-31',
       primary key (ItemNumber, Revision));
       
    insert into ItemMaster values
    ('AA-101', 1, 'Widget',  2.50, 3.00, '2019-01-01', '9999-12-31'),
    ('BB-202', 1, 'Doodad',  1.00, 2.00, '2019-02-01', '2019-07-31'),
    ('BB-202', 2, 'Doodad',  1.25, 2.15, '2019-08-01', '9999-12-31'),
    ('CC-303', 1, 'Thingie', 3.50, 4.25, '2019-01-01', '9999-12-31');
    

    In many systems, the item master has one row (record) for each item. You can store only one set of values at a time. One description, one cost, one price. When you update a column (field), the old value of that column is lost.

    This system, however, permits multiple rows for an item. Each row has a range of effective dates. The dates in the rows of any item do not overlap, so that we can know what the description, standard cost, and catalog price were for any item on any given date.

    The rows for an item are distinguished by a revision (version) number, which is incremented when a new revision of the item is created. Thus, there are two ways to identify the active revision of an item: by looking for the highest revision number or by looking for the expiration (effective through) date 9999-12-31.

    To change the description, cost, and/or price for an item requires two operations.

    1. Change the effective-through date of the latest revision from 9999-12-31 to the day before the new values take effect.
    2. Add a new row with the new data and an effective-through date of 9999-12-31.

    In other words, we must UPDATE the existing active row and we must INSERT a new active row.

    Think for a moment. How would you make these changes? Would you UPDATE first or INSERT first? Either way you can run into problems. If you INSERT first, you have multiple rows for an item with an expiration date of 9999-12-31. If you UPDATE first, retrieving values from the last-active row in order to generate the release number for the new row becomes more difficult because you have to look for the MAX(REVISION) for each item. This is the sort of chicken-and-egg situation I was referring to.

    And then there’s the problem of the first of the two running to completion and the second one canceling with an error, which is a very real possibility. None of this is insurmountable. I just want to point out that this sort of thing needs to be thought out carefully.

    Fortunately, there is a better way. It’s called MERGE.

    Consider a file-maintenance table of changes to be applied to the item master.

    create table ItemUpdates
      (ItemNumber   char(6),
       Description  char(20),
       Cost         dec(5,2),
       Price        dec(5,2));
      
    insert into ItemUpdates values
    ( 'AA-101', 'Widget',  3.00, 3.55),
    ( 'BB-202', 'Doodad',  1.50, 2.25);
    

    We have data — a description, a cost, and a price — for two items. These may be the same values that are in the current row for the item, or they may be new values. In this example, I chose to leave the description as it is and change only cost and price for both items.

    MERGE needs two transactions per item — one to force an UPDATE and one to force an INSERT. But we have only one row per item in the ItemUpdates table. Here’s how we generate two transactions from one.

    select m.ItemNumber, m.Revision, m.Description, m.cost, m.price
             from ItemMaster as m
             join ItemUpdates as u
               on m.ItemNumber = u.ItemNumber
            where m.EffectiveThru = '9999-12-31'
           union all
           select m.ItemNumber, m.Revision + 1, u.Description, u.cost, u.price
             from ItemMaster as m
             join ItemUpdates as u
               on m.ItemNumber = u.ItemNumber
            where m.EffectiveThru = '9999-12-31'
    

    This query transforms two rows of input into four rows of output. The first SELECT retrieves data from the current revision. We’ll use the revision number and ignore the description, cost, and price columns. The second select generates the new revision with the new description, cost, and price.

    Item Revision Description Cost Price
    AA-101 1 Widget 2.50 3.00
    AA-101 2 Widget 3.00 3.55
    BB-202 2 Doodad 1.25 2.15
    BB-202 3 Doodad 1.50 2.25

    This query becomes the source data set in MERGE. That is, it follows USING. Watch MERGE do its magic.

    merge into ItemMaster as tgt
    using (select m.ItemNumber, m.Revision, 
                  m.Description, m.cost, m.price
             from ItemMaster as m
             join ItemUpdates as u
               on m.ItemNumber = u.ItemNumber
            where m.EffectiveThru = '9999-12-31'
           union all
           select m.ItemNumber, m.Revision + 1, 
                  u.Description, u.cost, u.price
             from ItemMaster as m
             join ItemUpdates as u
               on m.ItemNumber = u.ItemNumber
            where m.EffectiveThru = '9999-12-31') as src
       on (src.ItemNumber, src.Revision) = (tgt.ItemNumber, tgt.Revision)
     when matched then
        update set tgt.EffectiveThru = '2019-09-30'
     when not matched then
        insert values (src.ItemNumber, src.Revision, src.Description,
                       src.Cost, src.Price, '2019-10-01', '9999-12-31')
    

    The UPDATE only changes one column — it updates the expiration date to one day before the day the new revision takes effect. The INSERT creates the new revision. Amazing!

    It would be more realistic to see it in a production situation, such as in an RPG program.

    dcl-s  NewEffectiveDate   date;
    dcl-s  NoExpirationDate   date    inz(d'9999-12-31');
    
    exec sql
       merge into ItemMaster as tgt
       using (select m.ItemNumber, m.Revision,
                     m.Description, m.cost, m.price
                from ItemMaster as m
                join ItemUpdates as u
                  on m.ItemNumber = u.ItemNumber
               where m.EffectiveThru = :NoExpirationDate
              union all
              select m.ItemNumber, m.Revision + 1,
                     u.Description, u.cost, u.price
                from ItemMaster as m
                join ItemUpdates as u
                  on m.ItemNumber = u.ItemNumber
               where m.EffectiveThru = :NoExpirationDate) as src
          on (src.ItemNumber, src.Revision) =
                (tgt.ItemNumber, tgt.Revision)
        when matched then
           update set tgt.EffectiveThru = :NewEffectiveDate - 1 day
        when not matched then
           insert values (src.ItemNumber, src.Revision,
                          src.Description, src.Cost, src.Price,
                          :NewEffectiveDate, :NoExpirationDate);
    

    So what’s the result? See for yourself. First, the before:

    Item Revision Description Cost Price Effective from Effective thru
    AA-101 1 Widget 2.50 3.00 2019-01-01 9999-12-31
    BB-202 1 Doodad 1.00 2.00 2019-02-01 2019-07-31
    BB-202 2 Doodad 1.25 2.15 2019-08-01 9999-12-31
    CC-303 1 Thingie 3.50 4.25 2019-01-01 9999-12-31

    Now, the after (assuming the new effective date is October 1, 2019):

    Item Revision Description Cost Price Effective from Effective thru
    AA-101 1 Widget 2.50 3.00 2019-01-01 2019-09-30
    AA-101 2 Widget 3.00 3.55 2019-10-01 9999-12-31
    BB-202 1 Doodad 1.00 2.00 2019-02-01 2019-07-31
    BB-202 2 Doodad 1.25 2.15 2019-08-01 2019-09-30
    BB-202 3 Doodad 1.50 2.25 2019-10-01 9999-12-31
    CC-303 1 Thingie 3.50 4.25 2019-01-01 9999-12-31

    MERGE came through for us again!

    A word of warning is appropriate here. When you do this type of update, you need to do it under commitment control. It is possible for the UPDATE part of MERGE to succeed and the INSERT part to fail, and vice versa. If one part succeeds and one part fails, you need to be able to ROLLBACK. I would not want a partially updated database.

    Joe Celko wrote about this sort of problem almost 11 years ago, but with a couple of differences. If you’re interested in reading more about this topic, see the link below. Anything he writes, whether book or article, is worth reading.

    RELATED RESOURCE

    Temporal Data Techniques in SQL

    Share this:

    • Reddit
    • Facebook
    • LinkedIn
    • Twitter
    • Email

    Tags: Tags: 400guru, FHG, Four Hundred Guru, IBM i, SQL

    Sponsored by
    New Generation Software

    FREE Webinar:

    Creating Great Data for Enterprise AI

    Enterprise AI relies on many data sources and types, but every AI project needs a data quality, governance, and security plan.

    Wherever and however you want to analyze your data, adopting modern ETL and BI software like NGS-IQ is a great way to support your effort.

    Webinar: June 26, 2025

    RSVP today.

    www.ngsi.com – 800-824-1220

    Share this:

    • Reddit
    • Facebook
    • LinkedIn
    • Twitter
    • Email

    As I See It: De-Meritocracy IBM i Startup Tackles “Awkward” Git Integration

    One thought on “Guru: MERGE, Chicken, And Eggs”

    • John Vriezen says:
      September 30, 2019 at 12:02 pm

      How is using this Merge technique under commitment control any different than just doing the original update and insert directly under commitment control? Also, IBM now natively supports temporal tables, so that could be an alternate way to solve this problem, although queries against older time frames would need to be done using temporal based SQL syntax.

      Reply

    Leave a Reply Cancel reply

TFH Volume: 29 Issue: 55

This Issue Sponsored By

  • iTech Solutions
  • Fresche Solutions
  • WorksRight Software
  • Manta Technologies
  • Eradani

Table of Contents

  • Power9 Prime Previews Future Power10 Memory Boost
  • IBM i Startup Tackles “Awkward” Git Integration
  • Guru: MERGE, Chicken, And Eggs
  • As I See It: De-Meritocracy
  • Hyperconverged Takes The Mantle Of Integrated Systems

Content archive

  • The Four Hundred
  • Four Hundred Stuff
  • Four Hundred Guru

Recent Posts

  • FAX/400 And CICS For i Are Dead. What Will IBM Kill Next?
  • Fresche Overhauls X-Analysis With Web UI, AI Smarts
  • Is It Time To Add The Rust Programming Language To IBM i?
  • Is IBM Going To Raise Prices On Power10 Expert Care?
  • IBM i PTF Guide, Volume 27, Number 20
  • POWERUp 2025 –Your Source For IBM i 7.6 Information
  • Maxava Consulting Services Does More Than HA/DR Project Management – A Lot More
  • Guru: Creating An SQL Stored Procedure That Returns A Result Set
  • As I See It: At Any Cost
  • IBM i PTF Guide, Volume 27, Number 19

Subscribe

To get news from IT Jungle sent to your inbox every week, subscribe to our newsletter.

Pages

  • About Us
  • Contact
  • Contributors
  • Four Hundred Monitor
  • IBM i PTF Guide
  • Media Kit
  • Subscribe

Search

Copyright © 2025 IT Jungle