• The Four Hundred
  • Subscribe
  • Media Kit
  • Contributors
  • About Us
  • Contact
Menu
  • The Four Hundred
  • Subscribe
  • Media Kit
  • Contributors
  • About Us
  • Contact
  • Merge Into the Synchronization Fast Lane with DB2 for i 7.1

    September 22, 2010 Michael Sansoterra

    One of the tedious chores that many developers face is keeping many different data sources synchronized. It is common for mid- to large-sized companies to have multiple software packages with requirements to keep synchronized copies of the item master, customer master, employee master, and other master files.

    Other common synchronization tasks involve keeping aggregate tables of detailed data or even synchronizing DB2 data with a relational or non-relational remote data source, including remote databases, Web services, flat files, etc. Fortunately, to help with all of this, IBM has updated DB2 for i 7.1 to include support for the SQL MERGE statement.

    MERGE (part of the SQL:2008 standard) is a statement that combines the functionality of INSERT and UPDATE statements into a single statement, generally for the purpose of synchronizing data. MERGE makes use of a target table (or updateable view) and a source query to do the synchronization.

    A Simple Example

    Let’s look at a quick example of how we might use MERGE to synchronize exact copies of an EMPLOYEE master file (from the DB2 for i EMPLOYEE sample table), which are placed in two different libraries (a.k.a., schemas). These files have a single column primary key: EMPNO. We’ll also assume this is a unidirectional synchronization, with the source library being merged into the target library. In the past, a complete synchronization of the tables would be done with three SQL statements:

    /* Step 1 - Update Target Rows from Source Rows */
    UPDATE TARGETLIB.EMPLOYEE AS TARGET
    SET ROW=(SELECT * 
               FROM SOURCELIB.EMPLOYEE AS SOURCE
              WHERE TARGET.EMPNO=SOURCE.EMPNO)
    WHERE EXISTS
    (SELECT * 
       FROM SOURCELIB.EMPLOYEE AS SOURCE
      WHERE TARGET.EMPNO=SOURCE.EMPNO)
    
    /* Step 2 - Remove Target Rows that are not in Source */
    DELETE FROM TARGETLIB.EMPLOYEE AS TARGET
    WHERE NOT EXISTS
    (SELECT *
       FROM SOURCELIB.EMPLOYEE AS SOURCE
      WHERE TARGET.EMPNO=SOURCE.EMPNO)
    
    /* Step 3 - Insert New Rows from Source into Target */
    INSERT INTO TARGETLIB.EMPLOYEE
    SELECT *
      FROM SOURCELIB.EMPLOYEE AS SOURCE
    WHERE NOT EXISTS
    (SELECT *
       FROM TARGETLIB.EMPLOYEE AS TARGET
      WHERE TARGET.EMPNO=SOURCE.EMPNO)
    

    Now let’s take a look at a sample MERGE statement that is similar to the above statements:

    MERGE INTO TARGETLIB.EMPLOYEE AS EMPLOYEE_TARGET
    USING SOURCELIB.EMPLOYEE AS EMPLOYEE_SOURCE
    
    ON (Employee_Target.EmpNo=Employee_Source.EmpNo)
    
    WHEN MATCHED THEN
        UPDATE SET
        Employee_Target.FIRSTNME =Employee_Source.FIRSTNME,
        Employee_Target.MIDINIT  =Employee_Source.MIDINIT,
        Employee_Target.LASTNAME =Employee_Source.LASTNAME,
        Employee_Target.WORKDEPT =Employee_Source.WORKDEPT,
        Employee_Target.PHONENO  =Employee_Source.PHONENO,
        Employee_Target.HIREDATE =Employee_Source.HIREDATE,
        Employee_Target.JOB      =Employee_Source.JOB,
        Employee_Target.EDLEVEL  =Employee_Source.EDLEVEL,
        Employee_Target.SEX      =Employee_Source.SEX,
        Employee_Target.BIRTHDATE=Employee_Source.BIRTHDATE,
        Employee_Target.SALARY   =Employee_Source.SALARY,
        Employee_Target.BONUS    =Employee_Source.BONUS,
        Employee_Target.COMM     =Employee_Source.COMM
    
    WHEN NOT MATCHED THEN
        INSERT (EMPNO,FIRSTNME,MIDINIT,LASTNAME,WORKDEPT,PHONENO,
                HIREDATE,JOB,EDLEVEL,SEX,BIRTHDATE,SALARY,BONUS,COMM)
        VALUES(
        Employee_Source.EMPNO,Employee_Source.FIRSTNME,
        Employee_Source.MIDINIT,Employee_Source.LASTNAME,
        Employee_Source.WORKDEPT,Employee_Source.PHONENO,
        Employee_Source.HIREDATE,Employee_Source.JOB,
        Employee_Source.EDLEVEL,Employee_Source.SEX,
        Employee_Source.BIRTHDATE,Employee_Source.SALARY,
        Employee_Source.BONUS,Employee_Source.COMM)
    
    ELSE IGNORE;
    

    Let’s examine the components of MERGE in a little more detail.

    MERGE INTO begins the statement and is followed by an updateable table or view, known as the “target.”

    The USING keyword specifies a “table reference, ” which can include just about any query, but does not include a common table expression. This is considered the source data for the merge.

    The ON keyword specifies the correlation between the source and the target, usually a unique row identifier. However, you can also specify additional criteria here, such as limiting the MERGE to employees within a certain department, etc.

    WHEN MATCHED THEN is used to indicate the operation to be performed when the ON clause condition has been met. In this case, an abbreviated UPDATE statement, using only a SET assignment clause, is used within the MERGE statement to set all the target row columns to be equal to the corresponding source row columns.

    WHEN NOT MATCHED THEN indicates the operation to be done when a source row does not have a matching row in the target table. Typically, this is an abridged INSERT/VALUES statement. Please note that the table name is not specified because the target table is automatically used by MERGE. For the instance when the source and the target should always contain an exact number of matching rows, the SIGNAL statement can be coded to indicate an error condition.

    While this example was somewhat trivial, please note that multiple WHEN MATCHED and NOT MATCHED clauses can be specified with additional criteria. These clauses will be evaluated in the order specified in the statement. This is similar to the evaluation of a CASE statement; once the first condition is true the specified operation will be evaluated with the remainder being ignored.

    ELSE IGNORE specifies that nothing further is to be done when the matching condition for all prior WHEN clauses are false.

    Please note when using MERGE that the synchronization is a uni-directional proposition–always from the source to the target. In the example above, you cannot update the source table based on data in the target. If you need two-way synchronization, then you will need a second MERGE statement with the source and target reversed.

    Deletion Considerations

    I said MERGE is “similar” to the three SQL statements because while DB2 can add new rows to the target and can update existing rows in the target, it cannot DELETE rows in the target for the circumstance where there is no matching row in the source. However, it can delete rows from the target in the case where there is a matching row from the source. I mention this because SQL Server’s implementation of MERGE has the option to delete rows from the target when there is no matching row in the source.

    I initially expected DB2’s MERGE to be able to perform a target DELETE when there is no matching source row. However this behavior, I learned, is SQL Server specific and not part of the ANSI SQL standard. Bummer!

    A DELETE statement is allowed in the WHEN MATCHED clause and can be used to remove rows from the target table when a match is found. However, a DELETE usually isn’t done unless additional criteria are specified (unless you want to delete all rows in the target when there is a matching source row). For example, if we wanted to remove MERGED employees from the target when the department is T01 (temporary employees), we could code a WHEN MATCHED THEN clause to the above MERGE statement like this:

    WHEN MATCHED AND Employee_Source.WORKDEPT='T01' THEN
        DELETE
    

    Because order of evaluation is important, this specific WHEN MATCHED would have to be placed prior to the more generic WHEN MATCHED clause shown earlier.

    If temporary employees are not supposed to be in the source data, an error condition can be artificially created using the SIGNAL statement like this:

    WHEN MATCHED AND Employee_Source.WORKDEPT='T01' THEN
        SIGNAL SQLSTATE '70001' 
           SET MESSAGE_TEXT='Temporary Employees Found'
    

    Another Example

    Let’s take a look at one more example. Consider the following Sales Summary table that summarizes customer sales by year and customer ID:

    CREATE TABLE SalesSummaryByYear (
    SalesYear      SMALLINT NOT NULL,
    CustomerId     INT NOT NULL,
    OrderQty       INT,
    OrderLineValue DEC(19,4),
    NumberLines    INT,
    RankBySales    INT,
    RankByOrders   INT,
    CreatedOn      TIMESTAMP,
    LastUpdatedOn  TIMESTAMP,
    CONSTRAINT pk_SalesSummaryByYear 
               PRIMARY KEY (SalesYear,CustomerId))
    

    You can use the MERGE statement to INSERT new and UPDATE existing rows in this table:

    MERGE INTO SalesSummaryByYear Target
    USING (
    SELECT YEAR(OrderDate) AS SalesYear, CustomerId,
               COUNT(DISTINCT SOD.SalesOrderId) As NumberOrders,
               SUM(OrderQty) AS OrderQty,
               SUM(UnitPrice*OrderQty*(1-UnitPriceDiscount)) 
                   AS OrderLineValue,
               COUNT(*) AS NumberLines,
               RANK() OVER(PARTITION BY CustomerId 
                           ORDER BY SUM(UnitPrice*OrderQty*
                               (1-UnitPriceDiscount)) DESC) 
                        AS RankBySales,
               RANK() OVER(PARTITION BY CustomerId 
                           ORDER BY COUNT(DISTINCT SOD.SalesOrderId) DESC)
                        AS RankByOrders
      FROM SalesOrderDetail SOD
      JOIN SalesOrderHeader SOH ON SOH.SalesOrderId=SOD.SalesOrderId
    GROUP BY YEAR(OrderDate),CustomerId
    ) Source
    ON Target.SalesYear=Source.SalesYear
    AND Target.CustomerId=Source.CustomerId
    
    WHEN MATCHED THEN
    UPDATE SET
    SalesYear=Source.SalesYear,
    CustomerId=Source.CustomerId,
    OrderQty=Source.OrderQty,
    OrderLineValue=Source.OrderLineValue,
    NumberLines=Source.NumberLines,
    RankBySales=Source.RankBySales,
    RankByOrders=Source.RankByOrders,
    LastUpdatedOn=CURRENT_TIMESTAMP
    
    WHEN NOT MATCHED THEN
    INSERT (SalesYear,CustomerId,
    OrderQty,OrderLineValue,NumberLines,
    RankBySales,RankByOrders,CreatedOn)
    VALUES (Source.SalesYear,Source.CustomerId,
    Source.OrderQty,Source.OrderLineValue,Source.NumberLines,
    Source.RankBySales,Source.RankByOrders,CURRENT_TIMESTAMP)
    ;
    

    In this example, a complex query is used as MERGE’s source. If the sales order history was extremely large, a WHERE clause could be added to limit the query’s sales to only the past year or two, which should be OK because sales orders over one year old are generally never updated.

    Using this technique is beneficial because:

    • The entire order history doesn’t need to be materialized.
    • A single statement handles inserts and updates into the target table.

    If you need to synchronize with remote data sources, table functions can be used to aid with this task, if the situation is appropriate. For example, if table function TFXML reads customer master data from an XML source (such as a Web service or an integrated file system (IFS) file), then a MERGE statement like this is valid:

    MERGE INTO MYDATA.CUSTOMER AS TARGET
    USING (SELECT * FROM TABLE(MYPGM.TFXML()) AS SOURCE
    ON (SOURCE.CustomerId=TARGET.CustomerId)
    ...
    

    Other Considerations

    One other important thing to know about MERGE concerns how it acts when running within a transaction. By default, the statement will run as an ATOMIC statement, that is, all of the modifications must succeed or the entire statement will fail and no changes are made. However, NOT ATOMIC may also be specified. In this case the statement is processed row by row, and all of the rows successfully processed up to the point of an error will be committed. Further, when NOT ATOMIC is specified, an additional option to continue processing (i.e., ignore the error) or stop processing is available.

    Benefits of using MERGE include:

    • Only one DML statement is required to perform INSERT and UPDATE statement functions (and in some limited cases DELETE).
    • Complex source query expressions do not have to be repeated in individual SQL statements.
    • This statement has been implemented by other major database vendors including Microsoft (SQL Server 2008) and Oracle (10g Release 2).
    • There is a high degree of granularity in how modifications are handled within a transaction by specifying ATOMIC or NOT ATOMIC.
    • Source code maintainability may be increased because a developer will be forced to examine the implications to both an INSERT and UPDATE when changes are made.
    • In my opinion, MERGE provides a cleaner syntax when updating a table based on another query or view when compared to using an UPDATE statement with correlated subqueries in the SET and WHERE clauses.

    In summary, MERGE is another one of those great new statements that makes the coding of a common task easier and squeezes even more work out of DB2.

    Michael Sansoterra is a DBA for Broadway Systems in Grand Rapids, Michigan. Send your questions or comments for Mike via the IT Jungle Contact page.



                         Post this story to del.icio.us
                   Post this story to Digg
        Post this story to Slashdot

    Share this:

    • Reddit
    • Facebook
    • LinkedIn
    • Twitter
    • Email

    Tags:

    Sponsored by
    DRV Tech

    Get More Out of Your IBM i

    With soaring costs, operational data is more critical than ever. IBM shops need faster, easier ways to distribute IBM applications-based data to users more efficiently, no matter where they are.

    The Problem:

    For Users, IBM Data Can Be Difficult to Get To

    IBM Applications generate reports as spooled files, originally designed to be printed. Often those reports are packed together with so much data it makes them difficult to read. Add to that hardcopy is a pain to distribute. User-friendly formats like Excel and PDF are better, offering sorting, searching, and easy portability but getting IBM reports into these formats can be tricky without the right tools.

    The Solution:

    IBM i Reports can easily be converted to easy to read and share formats like Excel and PDF and Delivered by Email

    Converting IBM i, iSeries, and AS400 reports into Excel and PDF is now a lot easier with SpoolFlex software by DRV Tech.  If you or your users are still doing this manually, think how much time is wasted dragging and reformatting to make a report readable. How much time would be saved if they were automatically formatted correctly and delivered to one or multiple recipients.

    SpoolFlex converts spooled files to Excel and PDF, automatically emailing them, and saving copies to network shared folders. SpoolFlex converts complex reports to Excel, removing unwanted headers, splitting large reports out for individual recipients, and delivering to users whether they are at the office or working from home.

    Watch our 2-minute video and see DRV’s powerful SpoolFlex software can solve your file conversion challenges.

    Watch Video

    DRV Tech

    www.drvtech.com

    866.378.3366

    Share this:

    • Reddit
    • Facebook
    • LinkedIn
    • Twitter
    • Email

    Sponsored Links

    PowerTech:  FREE Webinar! Reduce the Cost and Effort of IBM i Auditing. Sept. 29, 10 a.m. CT
    looksoftware:  RPG OA & Beyond Webinar. Sept 28 & 29. Enter to win an Amazon Kindle™
    COMMON:  Join us at the Fall 2010 Conference & Expo, Oct. 4 - 6, in San Antonio, Texas

    IT Jungle Store Top Book Picks

    Easy Steps to Internet Programming for AS/400, iSeries, and System i: List Price, $49.95
    The iSeries Express Web Implementer's Guide: List Price, $49.95
    The System i RPG & RPG IV Tutorial and Lab Exercises: List Price, $59.95
    The System i Pocket RPG & RPG IV Guide: List Price, $69.95
    The iSeries Pocket Database Guide: List Price, $59.00
    The iSeries Pocket SQL Guide: List Price, $59.00
    The iSeries Pocket Query Guide: List Price, $49.00
    The iSeries Pocket WebFacing Primer: List Price, $39.00
    Migrating to WebSphere Express for iSeries: List Price, $49.00
    Getting Started With WebSphere Development Studio Client for iSeries: List Price, $89.00
    Getting Started with WebSphere Express for iSeries: List Price, $49.00
    Can the AS/400 Survive IBM?: List Price, $49.00
    Chip Wars: List Price, $29.95

    Is RFID Heyday Just Around the Corner? Power 720: Same Entry Price, But More Room to Grow at Less Cost

    Leave a Reply Cancel reply

Volume 10, Number 28 -- September 22, 2010
THIS ISSUE SPONSORED BY:

WorksRight Software
SEQUEL Software
Twin Data Corporation

Table of Contents

  • Get Thee to the Web, Part 3
  • Merge Into the Synchronization Fast Lane with DB2 for i 7.1
  • Changing i/OS Password Expiration Settings

Content archive

  • The Four Hundred
  • Four Hundred Stuff
  • Four Hundred Guru

Recent Posts

  • Meet The Next Gen Of IBMers Helping To Build IBM i
  • Looks Like IBM Is Building A Linux-Like PASE For IBM i After All
  • Will Independent IBM i Clouds Survive PowerVS?
  • Now, IBM Is Jacking Up Hardware Maintenance Prices
  • IBM i PTF Guide, Volume 27, Number 24
  • Big Blue Raises IBM i License Transfer Fees, Other Prices
  • Keep The IBM i Youth Movement Going With More Training, Better Tools
  • Remain Begins Migrating DevOps Tools To VS Code
  • IBM Readies LTO-10 Tape Drives And Libraries
  • IBM i PTF Guide, Volume 27, Number 23

Subscribe

To get news from IT Jungle sent to your inbox every week, subscribe to our newsletter.

Pages

  • About Us
  • Contact
  • Contributors
  • Four Hundred Monitor
  • IBM i PTF Guide
  • Media Kit
  • Subscribe

Search

Copyright © 2025 IT Jungle