• The Four Hundred
  • Subscribe
  • Media Kit
  • Contributors
  • About Us
  • Contact
Menu
  • The Four Hundred
  • Subscribe
  • Media Kit
  • Contributors
  • About Us
  • Contact
  • Guru: Reading Nested XML Using SQL

    August 31, 2020 Jonathan M. Heinz

    XML is a data-interchange format, not a relational database management system. For this reason, using SQL to query XML data can be challenging, as what would be stored in two relational tables are placed in one element of XML. To put it another way, detail data is nested under the header data.

    I would like to share a way of using SQL to extract nested data from an XML file. I found this method useful when testing a change to a process that creates XML to be sent to customers. I can use this SQL to quickly check that the process is populating the XML correctly.

    The XML contained invoice information with detailed product rows. I needed to extract the product rows for each invoice number. At first glance, this should be easy, but it isn’t, because of the repeated product rows.

    Here is some simplified example XML that has multiple nested elements. It has two invoices to be sent to two customers, and each invoice contains two product rows. In real life it could contain thousands of invoices and each invoice might have thousands of product rows.

    <?xml version='1.0' encoding='UTF-8'? >
    <INVOICES>
      <INVOICE>
        <HEADER>
         <INVOICE_ID>123456</INVOICE_ID>
           <CURRENCY>
               <CODE>EUR</CODE>
           </CURRENCY>
        </HEADER>
        <RECEIVER>
           <CUSTOMER_INFORMATION>
              <CUSTOMER_NAME>COMPANY1</CUSTOMER_NAME>
              <CUSTOMER_ID>00000001</CUSTOMER_ID>
              <ADDRESS>
                 <STREET_ADDRESS1>Lane 1</STREET_ADDRESS1>
                 <STREET_ADDRESS2></STREET_ADDRESS2>
                 <STREET_ADDRESS3></STREET_ADDRESS3>
                 <POSTAL_CODE>10</POSTAL_CODE>
                 <COUNTRY>UK</COUNTRY>
              </ADDRESS>
           </CUSTOMER_INFORMATION>
        </RECEIVER>
        <ROWS>
           <ROW>
           <ROW_NUMBER>1</ROW_NUMBER>
               <PRODUCT>
                  <PRODUCT_ID>111112</PRODUCT_ID>
                  <PRODUCT_NAME>Some good stuff</PRODUCT_NAME>
              </PRODUCT>
               <ROW_TOTAL>
                 <AMOUNT SIGN="+" VAT="EXCLUDED">10.000</AMOUNT>
               </ROW_TOTAL>
           </ROW>
           <ROW>
           <ROW_NUMBER>2</ROW_NUMBER>
               <PRODUCT>
                 <PRODUCT_ID>111114</PRODUCT_ID>
                 <PRODUCT_NAME>Some other good stuff</PRODUCT_NAME>
             </PRODUCT>
               <ROW_TOTAL>
                 <AMOUNT SIGN="+" VAT="EXCLUDED">5.350</AMOUNT>
               </ROW_TOTAL>
           </ROW>
        </ROWS>
      </INVOICE>
      <INVOICE>
        <HEADER>
         <INVOICE_ID<123457</INVOICE_ID>
           <CURRENCY>
               <CODE>EUR</CODE>
           </CURRENCY>
        </HEADER>
        <RECEIVER>
          <CUSTOMER_INFORMATION>
              <CUSTOMER_NAME>COMPANY2</CUSTOMER_NAME>
              <CUSTOMER_ID>00000002</CUSTOMER_ID>
              <ADDRESS>
                 <STREET_ADDRESS1>Lane 2</STREET_ADDRESS1>
                 <STREET_ADDRESS2></STREET_ADDRESS2>
                 <STREET_ADDRESS3></STREET_ADDRESS3>
                 <POSTAL_CODE>20</POSTAL_CODE>
                 <COUNTRY>UK</COUNTRY>
              </ADDRESS>
          </CUSTOMER_INFORMATION>
        </RECEIVER>
        <ROWS>
           <ROW>
           <ROW_NUMBER>1</ROW_NUMBER>
               <PRODUCT>
                  <PRODUCT_ID>111112</PRODUCT_ID>
                  <PRODUCT_NAME>Some good stuff</PRODUCT_NAME>
              </PRODUCT>
               <ROW_TOTAL>
                 <AMOUNT SIGN="+" VAT="EXCLUDED">10.000</AMOUNT>
               </ROW_TOTAL>
           </ROW>
           <ROW>
           <ROW_NUMBER>2</ROW_NUMBER>
               <PRODUCT>
                  <PRODUCT_ID>111115</PRODUCT_ID>
                  <PRODUCT_NAME>Some more good stuff</PRODUCT_NAME>
               </PRODUCT>
               <ROW_TOTAL>
                 <AMOUNT SIGN="+" VAT="EXCLUDED">5.350</AMOUNT>
               </ROW_TOTAL>
           </ROW>
        </ROWS>
      </INVOICE>
    </INVOICES>
    

    I ran the following queries using Run SQL Scripts in Access Client Solutions (ACS). You must use commitment control when working with XML, otherwise the system will respond with SQL state 42926 (LOB and XML locators are not allowed with COMMIT(*NONE)). If your connection is not set to use commitment control, issue the following command before running the queries.

    set transaction isolation level read committed;
    

    Extracting the invoice ID and customer information (the header data) is easily done with the following SQL query.

    select a.* from xmltable('INVOICES/INVOICE'
      passing (xmlparse(document get_xml_file('/SomeDir/INVOICE.XML')))
      columns
        InvoiceID varchar(6) Path 'HEADER/INVOICE_ID',
        CustomerName varchar(20) Path 'RECEIVER/CUSTOMER_INFORMATION/CUSTOMER_NAME',
        CustomerNumber varchar(10) Path 'RECEIVER/CUSTOMER_INFORMATION/CUSTOMER_ID'
     ) as a;
    
    INVOICEID CUSTOMERNAME CUSTOMERNUMBER
    123456 COMPANY1 00000001
    123457 COMPANY2 00000002

    This is quite straightforward. What if I want to return all the product rows? That’s just as easy.

    select a.* from xmltable ('INVOICES/INVOICE/ROWS/ROW'
     Passing ( xmlparse(document get_xml_file('/SomeDir/INVOICE.XML')))
     columns
        ROWNUM  integer  Path 'ROW_NUMBER',
        PRODUCTID varchar(20) Path 'PRODUCT/PRODUCT_ID',
        PRODUCTNAME varchar(50) Path 'PRODUCT/PRODUCT_NAME'
     ) as a;
    
    ROWNUM PRODUCTID PRODUCTNAME
    1 111112 Some good stuff
    2 111114 Some other good stuff
    1 111112 Some good stuff
    2 111115 Some more good stuff

    However, what if we want to know the ID of the invoice each product belongs to? This presents a problem. We can’t just add the product ID and product name to the first query because the product is a repeating group. The database responds with SQL state 10507. (An XPath expression has a type that is not valid for the context in which the expression occurs.)

    One way around this obstacle is to use a CTE (Common Table Expression) with the XML data type.

    with f1 as
       (select *
           from xmltable ('INVOICES/INVOICE'
                 passing ( xmlparse(document get_xml_file('/SomeDir/INVOICE.XML')))
                 columns
                    InvoiceID   varchar(6)   Path 'HEADER/INVOICE_ID',
                    ALLROWS     xml          Path 'ROWS')),
    f2 as
       (select f1.*, p.*
          from f1,    
               xmltable ('ROWS/ROW' 
                passing ALLROWS
                columns 
                   ROWNUM   integer   Path 'ROW_NUMBER',
                   PRODUCTID varchar(20)  Path 'PRODUCT/PRODUCT_ID',
                   PRODUCTNAME varchar(50) Path 'PRODUCT/PRODUCT_NAME') as P)
    select f2.InvoiceID, f2.RowNum, f2.ProductID, f2.ProductName from f2;
    
    INVOICEID ROWNUM PRODUCTID PRODUCTNAME
    123456 1 111112 Some good stuff
    123456 2 111114 Some other good stuff
    123457 1 111112 Some good stuff
    123457 2 111115 Some more good stuff

    I’ve defined two common table expressions, F1 and F2. F1 retrieves the invoice ID as a single column and all the product information as a second column, which I named ALLROWS.

    The second CTE, F2, selects the InvoiceID from F1 and also extracts the product information from the ALLROWS column in F1. The result is similar to what we would get by querying joined header and detail tables. If you select all columns from F2, you can see the XML column.

    Using SQL to extracting nested data from an XML takes a little imagination, but it isn’t impossible. It isn’t even difficult.

    RELATED STORY

    DB2 For i XMLTABLE, Part 1: Convert XML to Tabular Data

    Share this:

    • Reddit
    • Facebook
    • LinkedIn
    • Twitter
    • Email

    Tags: Tags: 400guru, Common Table Expression, CTE, FHG, Four Hundred Guru, IBM i, SQL, XML

    Sponsored by
    ARCAD Software

    [Webinar] Trends for 2026: ARCAD Software’s strategic vision

    Between the acceleration of artificial intelligence, constant pressure to modernize existing systems, and ever-increasing security requirements, 2026 is shaping up to be a decisive year for legacy platforms.

    At the start of this new year, this webinar offers strategic insight into the future of these critical environments, which are at the heart of information systems.

    Join Philippe Magne, CEO of ARCAD Software, as he shares his analysis of the major trends and structural issues facing organizations:

    • DevSecOps: What are the current trends in DevOps transformation?
    • Generative artificial intelligence: What are the concrete use cases and measurable benefits for application development and maintenance?
    • Critical application security: How to respond to growing and sophisticated threats?
    • Cloud and hybridization: How do legacy applications fit into current cloud strategies?

    Save your seat for March 24 at 11 AM EDT!

    Share this:

    • Reddit
    • Facebook
    • LinkedIn
    • Twitter
    • Email

    COVID-19 Tracking Comes to iSeries Central Can You Get There From Here? The Road to the Cloud

    One thought on “Guru: Reading Nested XML Using SQL”

    • Christian Wolff says:
      September 24, 2020 at 5:01 am

      Why not using an x-path-expression to refer to the appropriate parent element to get the Invoice ID like in the statement below:
      select a.* from xmltable (‘INVOICES/INVOICE/ROWS/ROW’
      Passing ( xmlparse(document get_xml_file(‘/SomeDir/INVOICE.XML’)))
      columns
      InvoiceID varchar(6) Path ‘../../HEADER/INVOICE_ID’,
      ROWNUM integer Path ‘ROW_NUMBER’,
      PRODUCTID varchar(20) Path ‘PRODUCT/PRODUCT_ID’,
      PRODUCTNAME varchar(50) Path ‘PRODUCT/PRODUCT_NAME’
      ) as a;

      Reply

    Leave a Reply Cancel reply

TFH Volume: 30 Issue: 52

This Issue Sponsored By

  • iTech Solutions
  • COMMON
  • UCG Technologies
  • ARCAD Software
  • Rocket Software

Table of Contents

  • IBM’s Possible Designs For Power10 Systems
  • Can You Get There From Here? The Road to the Cloud
  • Guru: Reading Nested XML Using SQL
  • COVID-19 Tracking Comes to iSeries Central
  • IBM Clarifies Utility Pricing, Adds Solution Edition For New Entry Power Server

Content archive

  • The Four Hundred
  • Four Hundred Stuff
  • Four Hundred Guru

Recent Posts

  • No Joke: Big Memory And Flash Price Hikes Coming April 1
  • Strategic Topics To Think About For 2026, Part 2
  • Guru: IBM i Job Log Detective Brings Structure To Job Log Analysis In VS Code
  • IBM Launches Hybrid Cloud Backup Product With Cobalt Iron
  • IBM i PTF Guide, Volume 28, Number 10
  • Why You Need To Think About Offsite Data Protection
  • IBM Gets Bob 1.0 Off The Ground
  • You Store The Crown Jewels In A Safe, Not In A Bucket
  • More Power Systems Withdrawals, And Some From Red Hat, Too
  • Price Increases Are Here, Or Pending, And For Sure For Memory

Subscribe

To get news from IT Jungle sent to your inbox every week, subscribe to our newsletter.

Pages

  • About Us
  • Contact
  • Contributors
  • Four Hundred Monitor
  • IBM i PTF Guide
  • Media Kit
  • Subscribe

Search

Copyright © 2025 IT Jungle