• The Four Hundred
  • Subscribe
  • Media Kit
  • Contributors
  • About Us
  • Contact
Menu
  • The Four Hundred
  • Subscribe
  • Media Kit
  • Contributors
  • About Us
  • Contact
  • Guru: In Pursuit Of Reasonable Data

    March 9, 2020 Ted Holt

    My awareness of the indifference of computers toward the reasonableness of data dawned when I enrolled in the university. The band director told me to sign up for band and jazz band for zero semester hours. He said the stupid computer wouldn’t know that no one takes a course for zero hours, wouldn’t flag me to obtain permission from the dean to carry an overload, yet would grant me the credit. He was right.

    Love ‘em or hate ‘em, we live with them, we learn to get around them when they limit us, and we deal with them when they make mistakes. Except that you and I know that computers don’t make mistakes. As someone has said, “He who blames the computer is making two mistakes, one of which is blaming the computer.”

    For a computer to do its job properly, two things are necessary: reliable, dependable, bug-free software, and reasonable data. I wrote about the latter recently, and I wish to continue with that topic today.

    Domains

    You have undoubtedly heard the general database terms that IBM encourages us to use these days: table instead of physical file, row instead of record, column instead of field. Let me introduce another relational database term: domain. To quote a textbook I found online, “a domain is a set of acceptable values that a column is allowed to contain.” I don’t believe anyone could give a better definition of domain.

    Enforcing a set of acceptable values is easy when those acceptable values are few in number. We need only add an appropriate constraint, and the database will not accept invalid data. Notice the ID and Type columns in the following table definition.

    create or replace table EmployeeMaster
       for system name EmpMast
      ( ID      dec(5) primary key check (ID > 0),
        Name    varchar(16),
        Type    char(1)    check (Type in ('S', 'H', 'T', 'P')),
        Salary  dec(9,2),
        Wage    dec(5,2)
       )
    

    The acceptable values for the ID and Type columns can easily be listed. Given enough time, I could list all the positive five-digit whole numbers. I could list the four permitted values of Type much more quickly.

    But what about a column for which the number of acceptable values is impractical or impossible to list? What we do about that?

    We look for a pattern.

    Let’s say that we work for a company that maintains inventory. Each inventory item has an ID number. The first question to ask ourselves is “what is the domain of item numbers?” Or to say it in plain English, “What does an item number look like?”

    In our pretend company, an item number is one of the letters A, B, or C, followed by four digits.

    Some valid item numbers Some invalid item numbers
    A2157 2A157
    B3200 C32
    C0000 BR549

    The following check constraint enforces this pattern:

    create or replace table . . .
     item char(5)  
       check (translate(item, 'AA000000000','BC123456789') = 'A0000'),
    . . .
    

    The TRANSLATE function returns the first value after the characters in the second value have replaced the corresponding characters in the third value. In this example, all Bs and Cs are converted to As and digits 1 through 9 are converted to zeros. Everything else, including As and zeros, remains unconverted. If the converted value is A0000, the item number is in the domain.

    Suppose item numbers have two patterns. Some follow the convention just mentioned. Others consist of two of the letters A, B and C in any combination followed by three digits. Here’s the check constraint:

    item char(5) 
     check (translate(item, 'AA000000000','BC123456789')
                in ('A0000', 'AA000'))
    

    TRANSLATE converts the characters to the corresponding characters and the result can be either of two values.

    Let’s consider one more example. Let’s check a name for reasonableness. What’s the domain for a name of a person?

    In English, names consist of letters and two punctuation marks: hyphen and apostrophe. Some people have names with an embedded blank. There’s no way to check for all possible names, but we can check for a pattern. Here’s one possibility:

    name char(25)
    check (translate(name, ' ',
      'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz-''')
        = ' '),
    

    TRANSLATE replaces all uppercase letters, all lowercase letters, hyphens and apostrophes with blanks. Since the second parameter is shorter than the third parameter, the system pads the second parameter with enough blanks to make it as long as the third parameter. If the name has a character that is not found in the third parameter, TRANSLATE will not convert that character to blank and TRANSLATE will return a non-blank value.

    This doesn’t completely enforce reasonableness, of course. It would allow a name like JXnD’pt—-k, but the computer can only do so much.

    Is the Best Yet to Come?

    Some other relational database management systems allow the use of regular expressions in constraints. For example, you might do this to check for the A0000 pattern:

    item char(5) check (regexp_like(item,'[A-Z][0-9]{4}'))
    

    While DB2 for i supports regular expressions in other contexts, it doesn’t let us use them in constraints — yet. I suspect that it’s only a matter of time until IBM adds this support. Regular expressions would allow us to be even more specific about pattern matches than TRANSLATE does, but TRANSLATE will have to do for now.

    Midrange shops have traditionally enforced the reasonableness of data through validation routines in file maintenance and data entry programs. This is not an effective strategy for keeping an accurate database. If it were, I would not have seen so much bad data over the past decades. We need constraints and triggers in addition to the validation routines.

    RELATED STORIES

    Guru: Better Check Constraints

    Database Design – 2nd Edition – Chapter 7 The Relational Data Model (Adrienne Watt)

    Share this:

    • Reddit
    • Facebook
    • LinkedIn
    • Twitter
    • Email

    Tags: Tags: 400guru, FHG, Four Hundred Guru, IBM i

    Sponsored by
    ARCAD Software

    [Live Webinar] Rewrite your Synon in Java? It could be… a mistake!

    Converting from Synon CA 2E to a modern language? There are many good reasons to take this step. Beyond the critical shortage of Synon skills, applications developed in this 4GL environment can no longer take advantage of key technology advances in the IBM i operating system. And though a leader in its day, the Synon CASE tool is incompatible with modern DevOps practices and concurrent development.

    Do you think the best way to move away from Synon CA 2E is by rewriting everything in Java?

    Think again. A full rewrite is risky, expensive, and slow — often leading to years of redevelopment and countless functional regressions.

    Join Philippe Magne, CEO of ARCAD Software and Ray Bernardi, senior solution architect to learn why a conversion to Free Form RPG is a risk-free and high-performance option that leverages the architecture of the original Synon application – retaining all the reliability and security advantages of the host platform.

    In our Webinar, we will share a secure, automated migration path for business applications developed in Synon CA 2E:

    • 100% automated conversion to modern Free Form RPG and DDL (SQL)
    • Guaranteed conversion accuracy with test automation process
    • Possible modernization of the user interface during the project
    • Risk free, fixed price Modernization as a Service (MaaS)
    • Modern full DevOps framework for delivery

    Leverage the competitive advantage within your Synon application.

    Register Today!

    Share this:

    • Reddit
    • Facebook
    • LinkedIn
    • Twitter
    • Email

    As I See It: In the Shadow of The West Wing Thoroughly Modern: Giving IBM i Developers A Helping Hand

    Leave a Reply Cancel reply

TFH Volume: 30 Issue: 17

This Issue Sponsored By

  • ProData Computer Services
  • Fresche Solutions
  • Fresche Solutions
  • WorksRight Software
  • Raz-Lee Security

Table of Contents

  • How GraphQL Can Improve IBM i APIs
  • Thoroughly Modern: Giving IBM i Developers A Helping Hand
  • Guru: In Pursuit Of Reasonable Data
  • As I See It: In the Shadow of The West Wing
  • Big Blue Cuts Deals On Entry Power Systems Iron

Content archive

  • The Four Hundred
  • Four Hundred Stuff
  • Four Hundred Guru

Recent Posts

  • Stacking Up Power10 And Power11 Systems Price/Performance
  • Where Infor Is Headed With Its ERPs For IBM i
  • Rocket Delivers More DevOps Capabilities For IBM i
  • A Few More Power Systems Announcements Before Year End
  • IBM i PTF Guide, Volume 27, Number 46
  • Bang For The Buck On Entry Power10 And Power11 Machines
  • A Hardware Refresh Is The Perfect Time To Re-Evaluate Your HA/DR Strategy
  • Fresche Taps AI For New RPG-To-Java Conversion Tool
  • Gartner Raises 2025 IT Spending Forecast, Puts Out 2026 Prediction
  • IBM i PTF Guide, Volume 27, Number 45

Subscribe

To get news from IT Jungle sent to your inbox every week, subscribe to our newsletter.

Pages

  • About Us
  • Contact
  • Contributors
  • Four Hundred Monitor
  • IBM i PTF Guide
  • Media Kit
  • Subscribe

Search

Copyright © 2025 IT Jungle