• The Four Hundred
  • Guru: Reading Nested XML Using SQL

    August 31, 2020 Jonathan M. Heinz

    XML is a data-interchange format, not a relational database management system. For this reason, using SQL to query XML data can be challenging, because data that would be stored in two relational tables is placed in one XML element. To put it another way, detail data is nested under the header data.

    I would like to share a way of using SQL to extract nested data from an XML file. I found this method useful when testing a change to a process that creates XML to be sent to customers. I can use this SQL to quickly check that the process is populating the XML correctly.

    The XML contained invoice information with detailed product rows. I needed to extract the product rows for each invoice number. At first glance, this should be easy, but it isn’t, because of the repeated product rows.

    Here is some simplified example XML that has multiple nested elements. It has two invoices to be sent to two customers, and each invoice contains two product rows. In real life it could contain thousands of invoices and each invoice might have thousands of product rows.

    <?xml version='1.0' encoding='UTF-8'?>
    <INVOICES>
      <INVOICE>
        <HEADER>
         <INVOICE_ID>123456</INVOICE_ID>
           <CURRENCY>
               <CODE>EUR</CODE>
           </CURRENCY>
        </HEADER>
        <RECEIVER>
           <CUSTOMER_INFORMATION>
              <CUSTOMER_NAME>COMPANY1</CUSTOMER_NAME>
              <CUSTOMER_ID>00000001</CUSTOMER_ID>
              <ADDRESS>
                 <STREET_ADDRESS1>Lane 1</STREET_ADDRESS1>
                 <STREET_ADDRESS2></STREET_ADDRESS2>
                 <STREET_ADDRESS3></STREET_ADDRESS3>
                 <POSTAL_CODE>10</POSTAL_CODE>
                 <COUNTRY>UK</COUNTRY>
              </ADDRESS>
           </CUSTOMER_INFORMATION>
        </RECEIVER>
        <ROWS>
           <ROW>
           <ROW_NUMBER>1</ROW_NUMBER>
               <PRODUCT>
                  <PRODUCT_ID>111112</PRODUCT_ID>
                  <PRODUCT_NAME>Some good stuff</PRODUCT_NAME>
              </PRODUCT>
               <ROW_TOTAL>
                 <AMOUNT SIGN="+" VAT="EXCLUDED">10.000</AMOUNT>
               </ROW_TOTAL>
           </ROW>
           <ROW>
           <ROW_NUMBER>2</ROW_NUMBER>
               <PRODUCT>
                 <PRODUCT_ID>111114</PRODUCT_ID>
                 <PRODUCT_NAME>Some other good stuff</PRODUCT_NAME>
             </PRODUCT>
               <ROW_TOTAL>
                 <AMOUNT SIGN="+" VAT="EXCLUDED">5.350</AMOUNT>
               </ROW_TOTAL>
           </ROW>
        </ROWS>
      </INVOICE>
      <INVOICE>
        <HEADER>
         <INVOICE_ID>123457</INVOICE_ID>
           <CURRENCY>
               <CODE>EUR</CODE>
           </CURRENCY>
        </HEADER>
        <RECEIVER>
          <CUSTOMER_INFORMATION>
              <CUSTOMER_NAME>COMPANY2</CUSTOMER_NAME>
              <CUSTOMER_ID>00000002</CUSTOMER_ID>
              <ADDRESS>
                 <STREET_ADDRESS1>Lane 2</STREET_ADDRESS1>
                 <STREET_ADDRESS2></STREET_ADDRESS2>
                 <STREET_ADDRESS3></STREET_ADDRESS3>
                 <POSTAL_CODE>20</POSTAL_CODE>
                 <COUNTRY>UK</COUNTRY>
              </ADDRESS>
          </CUSTOMER_INFORMATION>
        </RECEIVER>
        <ROWS>
           <ROW>
           <ROW_NUMBER>1</ROW_NUMBER>
               <PRODUCT>
                  <PRODUCT_ID>111112</PRODUCT_ID>
                  <PRODUCT_NAME>Some good stuff</PRODUCT_NAME>
              </PRODUCT>
               <ROW_TOTAL>
                 <AMOUNT SIGN="+" VAT="EXCLUDED">10.000</AMOUNT>
               </ROW_TOTAL>
           </ROW>
           <ROW>
           <ROW_NUMBER>2</ROW_NUMBER>
               <PRODUCT>
                  <PRODUCT_ID>111115</PRODUCT_ID>
                  <PRODUCT_NAME>Some more good stuff</PRODUCT_NAME>
               </PRODUCT>
               <ROW_TOTAL>
                 <AMOUNT SIGN="+" VAT="EXCLUDED">5.350</AMOUNT>
               </ROW_TOTAL>
           </ROW>
        </ROWS>
      </INVOICE>
    </INVOICES>
    

    I ran the following queries using Run SQL Scripts in Access Client Solutions (ACS). You must use commitment control when working with XML; otherwise, the system responds with SQL state 42926 (LOB and XML locators are not allowed with COMMIT(*NONE)). If your connection is not set to use commitment control, issue the following command before running the queries.

    set transaction isolation level read committed;
    

    Extracting the invoice ID and customer information (the header data) is easily done with the following SQL query.

    select a.* from xmltable('INVOICES/INVOICE'
      passing (xmlparse(document get_xml_file('/SomeDir/INVOICE.XML')))
      columns
        InvoiceID varchar(6) Path 'HEADER/INVOICE_ID',
        CustomerName varchar(20) Path 'RECEIVER/CUSTOMER_INFORMATION/CUSTOMER_NAME',
        CustomerNumber varchar(10) Path 'RECEIVER/CUSTOMER_INFORMATION/CUSTOMER_ID'
     ) as a;
    
    INVOICEID  CUSTOMERNAME  CUSTOMERNUMBER
    123456     COMPANY1      00000001
    123457     COMPANY2      00000002

    This is quite straightforward. What if I want to return all the product rows? That’s just as easy.

    select a.* from xmltable ('INVOICES/INVOICE/ROWS/ROW'
     Passing ( xmlparse(document get_xml_file('/SomeDir/INVOICE.XML')))
     columns
        ROWNUM  integer  Path 'ROW_NUMBER',
        PRODUCTID varchar(20) Path 'PRODUCT/PRODUCT_ID',
        PRODUCTNAME varchar(50) Path 'PRODUCT/PRODUCT_NAME'
     ) as a;
    
    ROWNUM  PRODUCTID  PRODUCTNAME
    1       111112     Some good stuff
    2       111114     Some other good stuff
    1       111112     Some good stuff
    2       111115     Some more good stuff

    However, what if I want to know the ID of the invoice each product belongs to? This presents a problem. I can’t just add the product ID and product name to the first query, because the product is a repeating group. The database responds with SQL state 10507 (An XPath expression has a type that is not valid for the context in which the expression occurs).
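
    For illustration, a query along the following lines runs into this problem. The exact statement that triggered the error is not shown here, so treat this as an assumed reconstruction: the PRODUCTID path matches two PRODUCT_ID elements for each invoice in the sample XML, and a varchar column cannot hold more than one value.

```sql
-- Assumed reconstruction of the failing attempt (not the author's exact statement).
-- Each INVOICE has two ROW elements, so the ProductID path returns a
-- sequence of two items per invoice, which does not fit a scalar column.
select a.* from xmltable('INVOICES/INVOICE'
  passing (xmlparse(document get_xml_file('/SomeDir/INVOICE.XML')))
  columns
    InvoiceID  varchar(6)  Path 'HEADER/INVOICE_ID',
    ProductID  varchar(20) Path 'ROWS/ROW/PRODUCT/PRODUCT_ID'
 ) as a;
```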

    One way around this obstacle is to use a CTE (Common Table Expression) with the XML data type.

    with f1 as
       (select *
           from xmltable ('INVOICES/INVOICE'
                 passing ( xmlparse(document get_xml_file('/SomeDir/INVOICE.XML')))
                 columns
                    InvoiceID   varchar(6)   Path 'HEADER/INVOICE_ID',
                    ALLROWS     xml          Path 'ROWS')),
    f2 as
       (select f1.*, p.*
          from f1,    
               xmltable ('ROWS/ROW' 
                passing ALLROWS
                columns 
                   ROWNUM   integer   Path 'ROW_NUMBER',
                   PRODUCTID varchar(20)  Path 'PRODUCT/PRODUCT_ID',
                   PRODUCTNAME varchar(50) Path 'PRODUCT/PRODUCT_NAME') as P)
    select f2.InvoiceID, f2.RowNum, f2.ProductID, f2.ProductName from f2;
    
    INVOICEID  ROWNUM  PRODUCTID  PRODUCTNAME
    123456     1       111112     Some good stuff
    123456     2       111114     Some other good stuff
    123457     1       111112     Some good stuff
    123457     2       111115     Some more good stuff

    I’ve defined two common table expressions, F1 and F2. F1 retrieves the invoice ID as a single column and all the product information as a second column, which I named ALLROWS.

    The second CTE, F2, selects the InvoiceID from F1 and also extracts the product information from the ALLROWS column in F1. The result is similar to what we would get by querying joined header and detail tables. If you select all columns from F2, you can see the XML column.
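
    The same pattern should extend to the other values nested under each ROW. The sketch below also pulls the row amount and its VAT attribute out of the ALLROWS column. The column names AMOUNT and VATSTATUS and the types decimal(11, 3) and varchar(10) are assumptions chosen for illustration; the paths come from the sample XML above, and the `@VAT` step is ordinary XPath attribute syntax.

```sql
-- Sketch only: AMOUNT and VATSTATUS (and their types) are assumed names,
-- not part of the original queries.
with f1 as
   (select *
      from xmltable ('INVOICES/INVOICE'
             passing ( xmlparse(document get_xml_file('/SomeDir/INVOICE.XML')))
             columns
                InvoiceID   varchar(6)   Path 'HEADER/INVOICE_ID',
                ALLROWS     xml          Path 'ROWS')),
f2 as
   (select f1.InvoiceID, p.*
      from f1,
           xmltable ('ROWS/ROW'
             passing ALLROWS
             columns
                ROWNUM     integer         Path 'ROW_NUMBER',
                PRODUCTID  varchar(20)     Path 'PRODUCT/PRODUCT_ID',
                -- element text converted to a numeric column
                AMOUNT     decimal(11, 3)  Path 'ROW_TOTAL/AMOUNT',
                -- attribute of the AMOUNT element
                VATSTATUS  varchar(10)     Path 'ROW_TOTAL/AMOUNT/@VAT') as P)
select f2.InvoiceID, f2.RowNum, f2.ProductID, f2.Amount, f2.VatStatus from f2;
```

    Selecting only f1.InvoiceID in F2, rather than f1.*, keeps the ALLROWS XML column out of the result once it has served its purpose.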

    Using SQL to extract nested data from XML takes a little imagination, but it isn’t impossible. It isn’t even difficult.

    RELATED STORY

    DB2 For i XMLTABLE, Part 1: Convert XML to Tabular Data


    Tags: 400guru, Common Table Expression, CTE, FHG, Four Hundred Guru, IBM i, SQL, XML


    One thought on “Guru: Reading Nested XML Using SQL”

    • Christian Wolff says:
      September 24, 2020 at 5:01 am

      Why not use an XPath expression that refers to the appropriate parent element to get the invoice ID, as in the statement below?
      select a.* from xmltable ('INVOICES/INVOICE/ROWS/ROW'
      Passing ( xmlparse(document get_xml_file('/SomeDir/INVOICE.XML')))
      columns
      InvoiceID varchar(6) Path '../../HEADER/INVOICE_ID',
      ROWNUM integer Path 'ROW_NUMBER',
      PRODUCTID varchar(20) Path 'PRODUCT/PRODUCT_ID',
      PRODUCTNAME varchar(50) Path 'PRODUCT/PRODUCT_NAME'
      ) as a;


TFH Volume: 30 Issue: 52


Copyright © 2021 IT Jungle
