• The Four Hundred
  • Subscribe
  • Media Kit
  • Contributors
  • About Us
  • Contact
Menu
  • The Four Hundred
  • Subscribe
  • Media Kit
  • Contributors
  • About Us
  • Contact
  • Data Scrubbing Functions In DB2 For i

    November 17, 2015 Hey, Mike

    Imagine a character database field that stores a phone number. No formatting rules are involved, so the values in the column vary such as 8370155738, 837/015-5738, 837-015.5738, etc. We’d like to write an RPG program that allows the user to enter a number, formatted or not, onto the screen and if that string of numbers is found in the table’s phone column, show it to the user. Is there a way to use SQL to strip the non-numeric characters from the phone field, and select the record if the result matches the user input?

    –Brad

    Hi, Brad.

    Scrubbing data is a common task for database applications, and fortunately DB2 for i has plenty of built in functions to help you with this task. There are two ways to approach this problem: using the new regular expression functionality in DB2 for i (available in IBM i 7.1 and 7.2 with recent group PTFs); or, for older IBM i installations, using the REPLACE and TRANSLATE functions.

    If you are on IBM i 7.1 TR9 or 7.2 TR1, the REGEXP_REPLACE function can be used to clean unwanted characters from a phone number as easy as this:

    WITH CUSTOMER_DATA AS (
    SELECT *
      FROM (VALUES(1,'PH:800/234-1412'),
                  (2,' 8370155738'),
                  (3,'837/015-5738'),
                  (4,'837-015.5738'),
                  (5,'7218675309 x243')
           ) x(ID,PHONE_NO)
    )
    SELECT LEFT(REGEXP_REPLACE(PHONE_NO,'[^d]',''),10) PHONE
      FROM CUSTOMER_DATA
    

    The pattern [^d] instructs the REGEXP_REPLACE function to identify all non-digit characters within the phone number and then the function replaces them with an empty string so that they effectively disappear.

    The query result is:

    Phone

    8002341412

    8470155738

    8470155738

    8470155738

    7218675309

    If your DB2 for i level doesn’t offer the REGEX_REPLACE function, other options are available. The simplest is to nest multiple REPLACE functions around the phone number to remove any specific unwanted characters. For example, consider this expression that removes all hyphens, spaces and parenthesis from column PHONE_NO:

    SELECT REPLACE(
             REPLACE(
               REPLACE(
                 REPLACE(PHONE_NO,'-',''),
                ')',''),
              '(',''),
            ' ','') AS PHONE_NO 
       FROM CUSTOMER_DATA
    

    The only problem with this approach is if you find other characters embedded in the phone number (such as X, ., /), you need to add additional REPLACE functions to handle them.

    As an alternative, a single instance of the TRANSLATE function can be used to change all non-numeric characters within a phone number to a space. Thereafter, the REPLACE function can be employed to remove all of the spaces from the phone number. The following example illustrates this:

    WITH CUSTOMER_DATA AS (
    SELECT *
      FROM (VALUES(1,'PH:800/234-1412'),
                  (2,' 8370155738'),
                  (3,'837/015-5738'),
                  (4,'837-015.5738'),
                  (5,'7218675309 x243')
           ) x(ID,PHONE_NO)
    )
    SELECT LEFT(REPLACE(TRANSLATE(PHONE_NO,' ',
    'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz
    ~`|"''!@#$%^&*()-=+[]{}
    ;:.,<>?/'),' ',''),10) AS PHONE
      FROM CUSTOMER_DATA
    

    Because you will likely need to use this same function twice in the same SQL statement (once to scrub the table data and once to scrub the phone number the user enters on screen), it may be worthwhile to encapsulate it in a scalar user-defined function:

    CREATE OR REPLACE FUNCTION CLEAN_PHONE_NO (@PHONE_NO VARCHAR(24))
    RETURNS CHAR(10)
    LANGUAGE SQL
    NO EXTERNAL ACTION
    DETERMINISTIC
    NOT FENCED
    SET OPTION COMMIT=*NONE
    BEGIN
        RETURN LEFT(REPLACE(TRANSLATE(@PHONE_NO,' ', 
    	'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz
    	~`|"''!@#$%^&*()-=+[]{}
    	;:.,<>?/'),
    ' ',''),10);
    END
    

    Once the function is built, it is easy to compare the phone number in the table (PHONE_NO column) with the RPG host variable SEARCH_PHONE_NO once they’re both cleaned up:

    /Exec SQL
        SELECT *
          FROM CUSTOMER_DATA
         WHERE CLEAN_PHONE_NO(PHONE_NO)=CLEAN_PHONE_NO(:SEARCH_PHONE_NO);
    

    Or, to go back to the REGEX_REPLACE example:

    /Exec SQL
        SELECT *
          FROM CUSTOMER_DATA
         WHERE LEFT(REGEXP_REPLACE(PHONE_NO,'[^d]',''),10)=
               LEFT(REGEXP_REPLACE(:SEARCH_PHONE_NO,'[^d]',''),10);
    

    If you want the RPG variable cleaned up within the program for other purposes (aside from the SQL lookup), you can scrub the user’s input (variable SEARCH_PHONE_NO) with an embedded SET statement in RPG:

    /Exec SQL
        SET :SEARCH_PHONE_NO=CLEAN_PHONE_NO(:SEARCH_PHONE_NO);
    

    On a related topic, if users are going to do frequent searches on the phone number, it may be a good idea to periodically clean-up the phone number column and put an index on it:

    UPDATE CUSTOMER_DATA
    SET PHONE_NO=CLEAN_PHONE_NO(PHONE_NO);
    
    CREATE INDEX IDX_CUSTOMER_DATA_PHONE ON CUSTOMER_DATA (PHONE_NO);
    

    If PHONE_NO is cleaned up in the table, this SELECT statement can take advantage of the index:

    SELECT *
      FROM CUSTOMER_DATA
     WHERE PHONE_NO=CLEAN_PHONE_NO(:SEARCH_PHONE_NO)
    

    Disclaimer: The “scrub” functionality shown here only allows for a 10-character phone number. You may need to tweak the code if you want to allow for international phone numbers, a leading 1 for long distance, etc.

    Michael Sansoterra is a DBA for Broadway Systems in Grand Rapids, Michigan. Send your questions or comments for Mike via the IT Jungle Contact page.

    RELATED STORY

    Native Regular Expressions In DB2 For i 7.1 And 7.2

    Share this:

    • Reddit
    • Facebook
    • LinkedIn
    • Twitter
    • Email

    Tags:

    Sponsored by
    New Generation Software

    FREE Webinar:

    Creating Great Data for Enterprise AI

    Enterprise AI relies on many data sources and types, but every AI project needs a data quality, governance, and security plan.

    Wherever and however you want to analyze your data, adopting modern ETL and BI software like NGS-IQ is a great way to support your effort.

    Webinar: June 26, 2025

    RSVP today.

    www.ngsi.com – 800-824-1220

    Share this:

    • Reddit
    • Facebook
    • LinkedIn
    • Twitter
    • Email

    Sponsored Links

    NGS:  Nov. 18 FREE Webinar: Migration Alternatives for Query/400 Users
    United Computer Group:  VAULT400 BaaS delivers secure cloud backup and DR solutions
    BCD:  IBM i Webinar with Jon Paris - RPG OA: So Misunderstood! November 18 at 1pm EST

    HelpSystems’ Application Integration Begins With GUI ERP Upgrades: Love ‘Em or Leave ‘Em?

    One thought on “Data Scrubbing Functions In DB2 For i”

    • ajay says:
      October 17, 2018 at 12:32 pm

      I want to find out to remove spaces between character field ( ABC DE EFF CA). I cannot find one answer? in AS400 SQL .

      Reply

    Leave a Reply Cancel reply

Volume 15, Number 24 -- November 17, 2015
THIS ISSUE SPONSORED BY:

ProData Computer Services
Connectria
WorksRight Software

Table of Contents

  • An Introduction to Processing XML with RPG, Part 4
  • Formatting Dates with SQL, Take 3
  • Data Scrubbing Functions In DB2 For i

Content archive

  • The Four Hundred
  • Four Hundred Stuff
  • Four Hundred Guru

Recent Posts

  • Public Preview For Watson Code Assistant for i Available Soon
  • COMMON Youth Movement Continues at POWERUp 2025
  • IBM Preserves Memory Investments Across Power10 And Power11
  • Eradani Uses AI For New EDI And API Service
  • Picking Apart IBM’s $150 Billion In US Manufacturing And R&D
  • FAX/400 And CICS For i Are Dead. What Will IBM Kill Next?
  • Fresche Overhauls X-Analysis With Web UI, AI Smarts
  • Is It Time To Add The Rust Programming Language To IBM i?
  • Is IBM Going To Raise Prices On Power10 Expert Care?
  • IBM i PTF Guide, Volume 27, Number 20

Subscribe

To get news from IT Jungle sent to your inbox every week, subscribe to our newsletter.

Pages

  • About Us
  • Contact
  • Contributors
  • Four Hundred Monitor
  • IBM i PTF Guide
  • Media Kit
  • Subscribe

Search

Copyright © 2025 IT Jungle