• The Four Hundred
  • XML Validation with Regular Expressions

    September 15, 2004

    Hey, David:

    The code for this article is available for download.

    I want to set the date field in a schema to YYYYMMDD format and use this to validate an XML document. Will I have to write any Java code to support this? I also want to set the format of an amount to a number with 18 digits followed by a 3-digit decimal value. The separator can be either a comma or a decimal point.

     

    –Satish

    From your description, it looks like you can use XML Schema’s pattern-matching capability. With pattern matching, you supply a regular expression that is matched against a value to determine whether the value is allowed. Regular expressions are very powerful and flexible. The problem with regular expressions is that they are cryptic and can be difficult to understand and test.

    You can use the following patterns to validate your date and numeric values.

     
    ((19|20)\d\d)(0[1-9]|1[012])(0[1-9]|1[0-9]|2[0-9]|3[01])
    
    \d{18}[\.,]\d{3}
    

    The first regular expression will match a date in year, month, and day sequence without separators. However, this regular expression does not take into account the month or year when it looks at the day, and only ensures that the day is between 01 and 31. Building a regular expression that prevents an invalid date like 20040230 is possible, but the only way I know of to do this is too long to be practical. I won’t describe every part of this expression, but the pipe (|) indicates an or, the square brackets ([]) define a class of characters, and the \d indicates that any digit is allowed.
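To see the limitation described above, here is a minimal Java sketch. It assumes Java 8 or later, and the class and method names are made up for illustration. It runs the date pattern through java.util.regex, with the backslashes doubled for a Java string literal, and then applies a strict calendar check with java.time. Java's regular expression dialect is close to XML Schema's but not identical. XML Schema patterns must match the whole value, so the sketch uses matches() rather than find().

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the article's download.
final class DatePatternCheck {
    // The YYYYMMDD pattern, with \d written as \\d for Java.
    static final Pattern YYYYMMDD = Pattern.compile(
        "((19|20)\\d\\d)(0[1-9]|1[012])(0[1-9]|1[0-9]|2[0-9]|3[01])");

    // True when the whole value matches the pattern.
    static boolean matchesPattern(String value) {
        return YYYYMMDD.matcher(value).matches();
    }

    // True only when the value also names a real calendar date.
    static boolean isRealDate(String value) {
        if (!matchesPattern(value)) {
            return false;
        }
        try {
            // BASIC_ISO_DATE reads yyyyMMdd and rejects impossible dates.
            LocalDate.parse(value, DateTimeFormatter.BASIC_ISO_DATE);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        for (String s : new String[] {"20040915", "20040230", "20041301"}) {
            System.out.println(s + " pattern=" + matchesPattern(s)
                + " calendar=" + isRealDate(s));
        }
    }
}
```

The value 20040230 passes the pattern but fails the calendar check, which is the gap the regular expression alone cannot close.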

    The second regular expression is reasonably simple: it only allows values that start with 18 digits, followed by a period (.) or a comma (,), followed by three additional digits. The \d indicates a digit, and the 18 in braces ({18}) gives the required count. The character class in square brackets ([]) allows a single period or comma. Because the period is a reserved character in regular expressions, it is escaped with a backslash (\); inside a character class the escape is optional but harmless. The \d{3} indicates that the value must end in three digits. There are lots of resources available on the Internet that describe how to create and use regular expressions, like Regular-Expressions.info.
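The amount pattern can be tried out in Java the same way. This is only a sketch; the class and method names are invented, and the conversion to BigDecimal is an assumption about how you might use the value after it has passed validation.

```java
import java.math.BigDecimal;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the article's download.
final class AmountPatternCheck {
    // 18 digits, one period or comma, then 3 digits.
    static final Pattern AMOUNT = Pattern.compile("\\d{18}[\\.,]\\d{3}");

    // True when the whole value matches the pattern.
    static boolean matchesPattern(String value) {
        return AMOUNT.matcher(value).matches();
    }

    // Converts a validated amount to a BigDecimal, treating a comma
    // separator the same as a period.
    static BigDecimal toBigDecimal(String value) {
        if (!matchesPattern(value)) {
            throw new IllegalArgumentException("Not in 18.3 format: " + value);
        }
        return new BigDecimal(value.replace(',', '.'));
    }

    public static void main(String[] args) {
        System.out.println(matchesPattern("000000000000001234.500"));
        System.out.println(matchesPattern("1234.500"));
        System.out.println(toBigDecimal("000000000000001234,500"));
    }
}
```

Note that the pattern requires exactly 18 leading digits, so shorter amounts must be padded with zeros to pass.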

    Fortunately, there are utilities that allow you to enter and test regular expressions. I used the Regex Coach to test the regular expressions for your specialized date and numeric values.

    You use a regular expression in a schema like this.

    <xsd:element name="amount" type="amountType"/>
    
    <xsd:simpleType name="amountType">
      <xsd:restriction base="xsd:token">
        <xsd:pattern value="\d{18}[\.,]\d{3}"/>
      </xsd:restriction>
    </xsd:simpleType>
    

    For date validation in an XML document, I would consider changing the format to an International Organization for Standardization (ISO) date formatted like YYYY-MM-DD, which is directly supported by XML Schema. To use that support, you declare the element with the standard xsd:date type provided by XML Schema.

    <xsd:element name="date" type="xsd:date"/>
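If the data already arrives as YYYYMMDD, it has to be rewritten before it can be validated as xsd:date. One possible conversion in Java, assuming Java 8 or later and using an invented class name, looks like this:

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of the article's download.
final class IsoDateConverter {
    // Turns "20040915" into "2004-09-15". Impossible dates such as
    // "20040230" cause a DateTimeParseException.
    static String toIsoDate(String compact) {
        LocalDate date = LocalDate.parse(compact, DateTimeFormatter.BASIC_ISO_DATE);
        return date.format(DateTimeFormatter.ISO_LOCAL_DATE);
    }

    public static void main(String[] args) {
        System.out.println(toIsoDate("20040915"));
    }
}
```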
    

    You can see how these instructions work in the satish.xsd schema, created to test your example.

    Once you have a schema, you will need to apply it to your XML document. One of the simplest ways to do that is to write a short Java program that receives the names of the schema and XML document as parameters and applies the validation. See “Improved XML Validation with Schemas,” which describes how to use an XML schema to validate XML documents.
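As a rough idea of what such a program involves, here is a sketch that uses the javax.xml.validation API that ships with Java 5 and later. This is not the ValidDocument program, which is built on JDOM and Xerces. The schema is held in a string only to keep the example self-contained, and the class name is invented.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical example, not the article's ValidDocument program.
final class SchemaPatternDemo {
    // A cut-down schema holding only the amount element.
    static final String SCHEMA =
        "<xsd:schema xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\">"
        + "<xsd:element name=\"amount\" type=\"amountType\"/>"
        + "<xsd:simpleType name=\"amountType\">"
        + "<xsd:restriction base=\"xsd:token\">"
        + "<xsd:pattern value=\"\\d{18}[\\.,]\\d{3}\"/>"
        + "</xsd:restriction>"
        + "</xsd:simpleType>"
        + "</xsd:schema>";

    // Returns true when the document validates against SCHEMA.
    static boolean isValid(String xml) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema =
                factory.newSchema(new StreamSource(new StringReader(SCHEMA)));
            Validator validator = schema.newValidator();
            try {
                validator.validate(new StreamSource(new StringReader(xml)));
                return true;
            } catch (SAXParseException e) {
                // Thrown for documents that are malformed or break the schema.
                return false;
            }
        } catch (SAXException | IOException e) {
            // The schema itself could not be compiled or read.
            throw new IllegalStateException("Schema setup failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<amount>123456789012345678.123</amount>"));
        System.out.println(isValid("<amount>12.5</amount>"));
    }
}
```

With no error handler installed, the validator throws on the first error, which is why a caught SAXParseException is treated as "not valid" here.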

    I have included an updated version of the ValidDocument Java program mentioned in that article that locates the XML document and XML schema on the classpath. To do this, I use the getResourceAsStream method found on the class loader. The XML schema is located with an entity resolver that is called when the parser looks for the related schema. These changes mean that you don’t have to pass the full path or a relative path to ValidDocument at run time, and the schema can even be in a JAR file. If your document is in a package, you would include the package directories separated by forward slashes (/), like org/iseriestoolkit/xml/mySchema.xsd. If you have a directory named java/xml with the XML document and XML schema in it, you would call ValidDocument from Qshell (QSH) like this:
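The classpath lookup can be sketched roughly as follows. This is not the code from the download. The class and method names are invented, and reducing the system ID to its last path segment is a simplifying assumption made for this sketch.

```java
import java.io.InputStream;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical helper, not the article's ValidDocument program.
final class ClasspathResources {
    // Opens a resource such as "satish.xml" or
    // "org/iseriestoolkit/xml/mySchema.xsd" from the classpath.
    static InputStream open(String name) {
        InputStream in = ClassLoader.getSystemClassLoader().getResourceAsStream(name);
        if (in == null) {
            throw new IllegalArgumentException("Not on the classpath: " + name);
        }
        return in;
    }

    // Maps a system ID to a classpath resource with the same file name.
    // Returns null when nothing is found, so the parser falls back to
    // its default lookup.
    static InputSource resolve(String systemId) {
        if (systemId == null) {
            return null;
        }
        // Assumption: only the file name is used for the lookup.
        String name = systemId.substring(systemId.lastIndexOf('/') + 1);
        InputStream in = ClassLoader.getSystemClassLoader().getResourceAsStream(name);
        if (in == null) {
            return null;
        }
        InputSource source = new InputSource(in);
        source.setSystemId(systemId);
        return source;
    }

    // An entity resolver that a SAX-based parser can call when it looks
    // for the schema.
    static EntityResolver resolver() {
        return (publicId, systemId) -> resolve(systemId);
    }
}
```

Because the lookup goes through the class loader, the same code works whether the files sit in a directory on the classpath or inside a JAR file.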

    export -s CLASSPATH=.:/java/xml/jdom/build/jdom.jar:/java/xml/xerces/xercesImpl.jar:/java/xml/xerces/xml-apis.jar
    java ValidDocument satish.xml satish.xsd
    

    You didn’t mention what type of process would call this program, but you can call Java directly from an RPG IV program, CL, or a Qshell script. The article “More on XML Schema Validation with RPG” explains how to call this validation from an RPG IV program.

    –David Morris

Copyright © 2025 IT Jungle