DB2 For i XMLTABLE, Part 2: Using Namespaces And IFS XML Files
July 11, 2012 Michael Sansoterra
In Part 1 of this series, I introduced the DB2 for i developer’s newest friend, the XML “slicer and dicer” table function: XMLTABLE. XMLTABLE is powerful because it can read an XML document and convert its content to a tabular format that can participate in a query with relational data.
The developer has to specify the mapping between the XML data and the intended tabular format by using XPath expressions that instruct XMLTABLE how to map the data from XML into rows and columns. In this article, I’d like to cover two additional topics: parsing XML documents containing namespaces, and reading XML data directly from an IFS file (instead of using a DB2 XML column).
The sample XML data I used in Part 1 was relatively simple and did not use namespaces. However, in many cases XML documents implement namespaces as a way of grouping similar elements and differentiating duplicate element names that mean different things. For example, take a look at this XML:
<store> <info> <name>Computer Warehouse</name> <category /> </info> <owner> <name>Mike S</name> <address /> </owner> <product> <name>21 Speed Bicycle</name> <id>X100</id> </product> </store>
Notice there are three <name> elements: one falls under the <info> element, one falls under <product>, and the other is a child of the <owner> element. Each of these <name> elements has a different meaning. Namespaces can be used to make each of these <name> elements universally unique. Moreover, since namespaces group related elements you can, for instance, create XPath expressions that only pick out elements belonging to a particular namespace.
To illustrate how to use namespaces with XMLTABLE, I’m going to use the contact table from the Microsoft AdventureWorks 2005 sample database. (I exported this table from SQL Server to DB2 for i.) This table contains an XML column called AdditionalContactInfo. Here is an example of the XML column’s content:
<AdditionalContactInfo >http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/ ContactInfo" red">http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/ ContactRecord" red">http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/ ContactTypes"> <act:telephoneNumber> <act:number>425-555-1112</act:number> <act:SpecialInstructions>Call only after 5:00 p.m.</act:SpecialInstructions> </act:telephoneNumber> Note that the customer has a secondary home address. <act:homePostalAddress> <act:Street>123 Oak</act:Street> <act:Cit>Seattle</act:City> <act:StateProvince>WA</act:StateProvince> <act:PostalCode>98001</act:PostalCode> <act:CountryRegion>USA</act:CountryRegion> <act:SpecialInstructions>If correspondence to the primary address fails, try this one. </act:SpecialInstructions> </act:homePostalAddress> Customer provided additional email address. <act:eMail> <act:eMailAddress>firstname.lastname@example.org</act:eMailAddress> <act:SpecialInstruction> For urgent issues, do not send e-mail. Instead use this emergency contact phone <act:telephoneNumber> <act:number>425-555-1111</act:number> </act:telephoneNumber> . </act:SpecialInstructions> </act:eMail> <crm:ContactRecord date="2001-06-02Z">This customer is interested in purchasing high-end bicycles for his family. The customer contacted Michael in sales. </crm:ContactRecord> </AdditionalContactInfo>
The root element <AdditionalContactInfo> contains namespace references (as identified by >
Street VARCHAR(64) PATH ‘act:homePostalAddress/act:Street’,
As opposed to the XMLTABLE example in Part 1, this time the first argument given to XMLTABLE is a list of namespaces wrapped by the XMLNAMESPACES function. The three namespaces shown in the SQL code are the same three specified in the XML document. Observe that the namespaces are aliased (AS “ci”, etc.). The alias provides a shorthand way of referring to the namespace.
Reviewing the XPath expressions, notice that each element name is prefixed with the namespace alias as shown here:
'$doc/ci:AdditionalContactInfo' PASSING Contacts. AdditionalContactInfo AS "doc" Phone VARCHAR(32) PATH 'act:telephoneNumber/ act:number',
As you can see, each element is now prefixed with the namespace alias (namespace alias:element). XMLTABLE will only extract the requested element if it belongs to the specified namespace.
Incidentally, if you’re wondering about this XPath expression:
Number_Child_Elements INTEGER PATH 'count(child::*)',
The previous code sample simply refers to the built-in XPath “count” function, which in this case is being asked to return the number of child elements within the <AdditionalContactInfo> element. It is only shown here to demonstrate many of the XPath core functions are supported by DB2 for i.
Additionally, if an XML document only has one namespace, or if you only require elements from one namespace, you can assign a DEFAULT namespace that eliminates the need to specify the redundant namespace prefix in the XPath expressions. In the Adventure Works 2005 sample database, the Store Table has a DEMOGRAPHICS XML column containing simple documents that look like this:
<StoreSurvey > <AnnualSales>300000</AnnualSales> <AnnualRevenue>30000</AnnualRevenue> <BankName>International Bank</BankName> <BusinessType>BM</BusinessType> <YearOpened>1970</YearOpened> <Specialty>Road</Specialty> <SquareFeet>7000</SquareFeet> <Brands>AW</Brands> <Internet>T1</Internet> <NumberEmployees>2</NumberEmployees> </StoreSurvey>
XMLTABLE can parse this XML using a default namespace:
SELECT xmldata.*,store.* FROM store JOIN XMLTABLE( XMLNAMESPACES(DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/ adventure-works/StoreSurvey'), '$doc/StoreSurvey' PASSING store.Demographics AS "doc" COLUMNS SquareFeet INTEGER PATH 'SquareFeet', …etc… ) AS XMLDATA ON 1=1;
Now, when an element is requested without a namespace, the default namespace is assumed.
Until now, the XMLTABLE examples used data from an XML column in a database table. However, XMLTABLE can also parse data from a variable or from an XML flat file stored on the IFS. For example, if an “additional contact info” XML document was stored on the IFS in a file called ContactInfo.xml, it could be read directly from the IFS and passed into XMLTABLE as follows:
SELECT XML_SHRED.* FROM (VALUES( XMLPARSE(DOCUMENT GET_XML_FILE('/xml/ContactInfo.xml')) )) AS XMLFILE(CONTACT), XMLTABLE( XMLNAMESPACES( 'http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/ ContactInfo' as "ci", 'http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/ ContactTypes' as "act", 'http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/ ContactRecord' as "crm"), '$doc/ci:AdditionalContactInfo' PASSING XMLFILE.CONTACT AS "doc" COLUMNS Phone VARCHAR(32) PATH 'act:telephoneNumber/act:number', /* More columns go here */ PostalCode VARCHAR(256) PATH 'act:homePostalAddress/act:PostalCode') AS XML_SHRED ;
The syntax is a little hairy until it is examined piece by piece. The FROM clause consists of a single row VALUES clause and an implicit CROSS JOIN to the results of the XMLTABLE table function.
The VALUES clause contains a single XML column that has been aliased as CONTACT and will be used to feed the data into the XMLTABLE function. The column expression invokes the new GET_XML_FILE built-in function to directly read the content of an IFS file. However, the result data type of the GET_XML_FILE function is a BLOB locator that cannot be fed directly into the XMLTABLE function. Therefore, the XMLPARSE function is employed to cast the data from a BLOB locator to XML. Now, the IFS XML file is ready to be passed to XMLTABLE just like in the first example.
For those who are wondering if we even need the VALUES clause, the short answer appears to be yes, at least for now. I attempted to only use the XMLTABLE function in the FROM clause like this:
'$doc/ci:AdditionalContactInfo' PASSING XMLPARSE (DOCUMENT GET_XML_FILE('/tmp/ContactInfo.xml')) AS "doc"
However, this statement failed although I’m not sure it should have. I was testing this with DB2 i7.1 PTF Group level 14; if this is a bug hopefully it will be fixed in the future.
Overall, XMLTABLE offers a versatile mechanism for decomposing XML documents contained in DB2 tables, variables, or IFS files, and allows programmers to utilize the tabular result set in DB2 for i queries.