Volume 4, Number 31 -- September 15, 2004

XML Validation with Regular Expressions

Hey, David:


The code for this article is available for download.


I want to set the date field in a schema to YYYYMMDD format and use this to validate an XML document. Will I have to write any Java code to support this? I also want to set the format of an amount to numbers, with 18 digits followed by a three-digit decimal value. The separator can be either a comma or a decimal point.

 

--Satish


From your description, it looks like you can use XML Schema's pattern-matching capability. With pattern matching, you supply a regular expression that is matched against a value to determine whether the value is allowed. Regular expressions are powerful and flexible, but they are also cryptic and can be difficult to understand and test.

You can use the following patterns to validate your date and numeric values.

 
((19|20)\d\d)(0[1-9]|1[012])(0[1-9]|1[0-9]|2[0-9]|3[01])

\d{18}[\.,]\d{3}

The first regular expression matches a date in year, month, and day sequence without separators, for the years 1900 through 2099. However, it does not take the month or year into account when it checks the day; it only ensures that the day is between 01 and 31. Building a regular expression that rejects an invalid date like 20040230 is possible, but the only way I know of to do this is too long to be practical. I won't describe every part of this expression, but the pipe (|) separates alternatives (an or), the square brackets ([]) define a class of characters, and \d matches any digit.
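The limitation described above can be seen with a short Java sketch. It is an illustration only, not part of the downloadable code: the class and method names are made up for this example, and it uses java.util.regex and java.time rather than a schema validator. The first method applies the pattern to the whole value, as a schema pattern would; the second adds a calendar check to catch dates like 20040230.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.regex.Pattern;

// Hypothetical helper class, named for this example only.
class DatePatternCheck {
    // The date pattern from the article, written as a Java string literal.
    static final Pattern YYYYMMDD = Pattern.compile(
        "((19|20)\\d\\d)(0[1-9]|1[012])(0[1-9]|1[0-9]|2[0-9]|3[01])");

    // matches() requires the whole value to match, as a schema pattern does.
    static boolean matchesPattern(String value) {
        return YYYYMMDD.matcher(value).matches();
    }

    // Stricter check: the value must also be a real calendar date.
    static boolean isRealDate(String value) {
        if (!matchesPattern(value)) {
            return false;
        }
        try {
            // BASIC_ISO_DATE parses yyyyMMdd and rejects impossible days.
            LocalDate.parse(value, DateTimeFormatter.BASIC_ISO_DATE);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] samples = {"20040915", "20040230", "20041301", "2004-09-15"};
        for (String v : samples) {
            System.out.println(v + " pattern=" + matchesPattern(v)
                + " realDate=" + isRealDate(v));
        }
    }
}
```

A check like isRealDate would have to run in your own code after schema validation, because the pattern alone cannot express it in a practical way.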

The second regular expression is reasonably simple. Because a schema pattern must match the entire value, it only allows values made up of exactly 18 digits, followed by a period (.) or a comma (,), followed by exactly three more digits. The \d indicates a digit, and the 18 in curly braces ({18}) specifies how many digits are required. The character class in square brackets ([]) allows a single period or comma. Outside a character class an unescaped period matches any character, so it is escaped here with a backslash (\); inside a character class the escape is not strictly required, but it is harmless. The \d{3} indicates that the value must end in three digits. There are lots of resources available on the Internet that describe how to create and use regular expressions, like Regular-Expressions.info.
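If you want to try the amount pattern outside a schema, a minimal Java sketch along these lines should do. The class and method names are hypothetical. One caveat: Java's \d matches only the ASCII digits 0 through 9 by default, while XML Schema's \d accepts any Unicode decimal digit, so the two can differ on non-ASCII input.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, named for this example only.
class AmountPatternCheck {
    // The amount pattern from the article, written as a Java string literal.
    static final Pattern AMOUNT = Pattern.compile("\\d{18}[\\.,]\\d{3}");

    // matches() requires the whole value to match, as a schema pattern does.
    static boolean isValidAmount(String value) {
        return AMOUNT.matcher(value).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "123456789012345678.123", // 18 digits, period, 3 digits
            "123456789012345678,123", // 18 digits, comma, 3 digits
            "12345678901234567.123",  // only 17 leading digits
            "123456789012345678.12",  // only 2 decimal digits
            "123456789012345678;123"  // wrong separator
        };
        for (String v : samples) {
            System.out.println(v + " valid=" + isValidAmount(v));
        }
    }
}
```

Note that the pattern demands exactly 18 leading digits, so smaller amounts would need leading zeros. If you actually want up to 18 digits, a quantifier such as \d{1,18} would be the place to start.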

Fortunately, there are utilities that allow you to enter and test regular expressions. I used the Regex Coach to test the regular expressions for your specialized date and numeric values.

You use a regular expression in a schema like this.

<xsd:element name="amount" type="amountType"/>

<xsd:simpleType name="amountType">
  <xsd:restriction base="xsd:token">
    <xsd:pattern value="\d{18}[\.,]\d{3}"/>
  </xsd:restriction>
</xsd:simpleType>

For date validation in an XML document, I would consider changing the format to the ISO 8601 date format, YYYY-MM-DD, which is defined by the International Organization for Standardization (ISO) and is directly supported by XML Schema. To use that support, declare the element with the built-in xsd:date type.

<xsd:element name="date" type="xsd:date"/>

You can see how these instructions work in the satish.xsd schema, created to test your example.

Once you have a schema, you will need to apply it to your XML document. One of the simplest ways to do that is to write a short Java program that receives the names of the schema and XML document as parameters and applies the validation. See "Improved XML Validation with Schemas," which describes how to use an XML schema to validate XML documents.
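To give a rough idea of what such a program does, here is a minimal sketch that uses the standard javax.xml.validation API. It is not the ValidDocument program from the article (which works with JDOM and Xerces), and the schema below is a made-up stand-in for satish.xsd with a hypothetical payment root element. To keep the example self-contained, the schema and documents are inline strings rather than files named on the command line.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical class, named for this example only.
class InlineSchemaCheck {
    // Hypothetical schema for illustration; it is not the article's satish.xsd.
    static final String XSD =
          "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>"
        + "<xsd:element name='payment'>"
        + "<xsd:complexType><xsd:sequence>"
        + "<xsd:element name='date' type='xsd:date'/>"
        + "<xsd:element name='amount' type='amountType'/>"
        + "</xsd:sequence></xsd:complexType>"
        + "</xsd:element>"
        + "<xsd:simpleType name='amountType'>"
        + "<xsd:restriction base='xsd:token'>"
        + "<xsd:pattern value='\\d{18}[\\.,]\\d{3}'/>"
        + "</xsd:restriction>"
        + "</xsd:simpleType>"
        + "</xsd:schema>";

    // Compile the schema; a failure here means the schema itself is broken.
    static Schema compileSchema() {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            return factory.newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("Schema did not compile", e);
        }
    }

    // Returns true when the document is well formed and valid against XSD.
    static boolean isValid(String xml) {
        try {
            compileSchema().newValidator()
                .validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // not well formed, or a validation error was reported
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String good = "<payment><date>2004-09-15</date>"
            + "<amount>123456789012345678,123</amount></payment>";
        String bad = "<payment><date>20040915</date>"
            + "<amount>12.5</amount></payment>";
        System.out.println("good valid=" + isValid(good));
        System.out.println("bad valid=" + isValid(bad));
    }
}
```

A real program would read the schema and document from files or the classpath and would report the validation message instead of just returning false.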

I have included an updated version of the ValidDocument Java program mentioned in that article. The updated version locates the XML document and XML schema on the classpath, using the getResourceAsStream method found on the class loader. The XML schema is located with an entity resolver that is called when the parser looks for the related schema. These changes mean that you don't have to pass the full path or a relative path to ValidDocument at run time, and the schema can even be in a JAR file. If your document is in a package, you would include the package directories separated by forward slashes (/), like org/iseriestoolkit/xml/mySchema.xsd. If you have a directory named java/xml with the XML document and XML schema in it, you would call ValidDocument from Qshell (QSH) like this:

export -s CLASSPATH=.:/java/xml/jdom/build/jdom.jar:/java/xml/xerces/xercesImpl.jar:/java/xml/xerces/xml-apis.jar
java ValidDocument satish.xml satish.xsd
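The classpath lookup described above might be sketched as follows. This is not the code from ValidDocument; the class name, the baseDir parameter, and the rule of keeping only the last segment of the system identifier are all assumptions made for illustration. The resolver returns null when the resource is not found, which tells the parser to fall back to its default lookup.

```java
import java.io.InputStream;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical resolver, named for this example only.
class ClasspathEntityResolver implements EntityResolver {
    // Assumed convention: a classpath directory such as
    // "org/iseriestoolkit/xml/", or "" for the classpath root.
    private final String baseDir;

    ClasspathEntityResolver(String baseDir) {
        this.baseDir = baseDir;
    }

    // Keep only the file name, because parsers often expand a relative
    // system identifier into a full URL before calling the resolver.
    static String lastSegment(String systemId) {
        int slash = systemId.lastIndexOf('/');
        return slash < 0 ? systemId : systemId.substring(slash + 1);
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        if (systemId == null) {
            return null;
        }
        String resource = baseDir + lastSegment(systemId);
        InputStream in = ClasspathEntityResolver.class.getClassLoader()
            .getResourceAsStream(resource);
        if (in == null) {
            return null; // not on the classpath; use the default lookup
        }
        InputSource source = new InputSource(in);
        source.setSystemId(systemId);
        return source;
    }
}
```

A resolver like this would be registered with the parser before parsing, for example through XMLReader.setEntityResolver or DocumentBuilder.setEntityResolver. Because the class loader does the lookup, the same code works whether the schema sits in a directory or inside a JAR file.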

You didn't mention what type of process would call this program, but you can call Java directly from an RPG IV program, CL, or a Qshell script. The article "More on XML Schema Validation with RPG" explains how to call this validation from an RPG IV program.

--David Morris



Editors: Howard Arner, Joe Hertvik, Ted Holt,
Shannon O'Donnell, Kevin Vandever
Managing Editor: Shannon Pastore
Contributing Editors: Joel Cochran, Wayne O. Evans, Raymond Everhart,
Bruce Guetzkow, Marc Logemann, David Morris
Publisher and Advertising Director: Jenny Thomas
Advertising Sales Representative: Kim Reed
Contact the Editors: To contact anyone on the IT Jungle Team
Go to our contacts page and send us a message.



Copyright © 1996-2008 Guild Companies, Inc. All Rights Reserved.
Guild Companies, 50 Park Terrace East, Suite 8F, New York, NY 10034