• The Four Hundred
  • Subscribe
  • Media Kit
  • Contributors
  • About Us
  • Contact
Menu
  • The Four Hundred
  • Subscribe
  • Media Kit
  • Contributors
  • About Us
  • Contact
  • Guru: Use SQL to Generate Random Data

    June 17, 2019 Ted Holt

    Suppose I needed to generate a large database table with random data in order to adequately test the performance of an SQL query. Suppose that, for security reasons, I was not allowed to copy the production version of the table. Suppose that I needed a way to generate a lot — and I do mean a lot! — of random data. Suppose this scenario is not mere supposition.

    Before an SQL query goes into production, it should be tested against a production-like dataset. Running a query against a test dataset of 25 rows (records) can produce unpleasant surprises when it’s installed into production and forced to face a production dataset of thousands or even millions of rows. Obviously we can’t key that much data, so how do we generate it? We use SQL. Let me give you an example.

    Suppose you wish to test a query against a single table. (That’s not realistic, but it will keep the example small and I can illustrate the techniques.) You already have a copy of your table, and you just need to load data into it. However, since this is an illustration, here’s the statement to create my illustrative table.

    create table mylib.Cust24
      ( ID       dec(5),
        Name     char(20),
        Type     dec(1),
        Balance  dec(5,2),
        DueDate  date,
      primary key (ID));
    

    You need data for these columns (fields). You might just use a sequential number for ID, the primary key. If so, a recursive common table expression generates the sequence.

    with list (IDNbr) as 
      (values(1) 
        union all
       select IDNbr + 1
         from list
        where IDNbr < 25)
    select IDNbr from list
    

    This generates 25 rows with sequential ID numbers. You’d generate many more, of course.

    Now put random data into the other columns. The Type column should have an integer from 1 to 3.

    dec(rand()*3+1, 1,0)
    

    The rand() function generates a random number in the range zero to one. Multiplying that number by three changes it to a random number in the range zero to three. Adding one to that puts it into the range one to four. (Chances of the number being four are slim to none. I’ve never seen a random number generator return a zero or a one, only numbers in between them.) The dec() function removes the decimal places, leaving a whole number.

    The Balance column is a five-digit packed-decimal number. If you want a positive number, you can use an expression much like the one for the Type column.

    dec(rand()*999, 5, 2)
    

    If you want some negative numbers as well, adjust the expression to give you some.

    dec(rand()*1000, 5, 2)-500
    

    A random date isn’t hard to get. Use the rand() function to generate a number of days to adjust a reference date, such as the current date.

    current date + int(rand()*100) days
    

    This expression adds up to 100 days to the current date.

    I’ve saved the tricky part for last. You need to generate a random Name value. The rand() function only returns numeric values. How do you translate those numeric values into character strings? It’s slightly messy, but not difficult. Here’s one possibility.

    with list (IDNbr, RandName) as 
      (values(1,         dec(rand()*100000,5,0)) 
        union all
       select IDNbr + 1, dec(rand()*100000,5,0)
         from list
        where IDNbr < 25)
    select IDNbr,
       substr('ABCDEFGHIJKLMNOPQRSTUVWYXZ abcdefghijklmnopqrstuvwxyz',
                  int(mod(RandName       ,53)+1), 1) concat
       substr('ABCDEFGHIJKLMNOPQRSTUVWYXZ abcdefghijklmnopqrstuvwxyz', 
                  int(mod(RandName*0.1   ,53)+1), 1) concat
       substr('ABCDEFGHIJKLMNOPQRSTUVWYXZ abcdefghijklmnopqrstuvwxyz',
                  int(mod(RandName*0.01  ,53)+1), 1) concat
       substr('ABCDEFGHIJKLMNOPQRSTUVWYXZ abcdefghijklmnopqrstuvwxyz',
                  int(mod(RandName*0.001 ,53)+1), 1) concat
       substr('ABCDEFGHIJKLMNOPQRSTUVWYXZ abcdefghijklmnopqrstuvwxyz',
                  int(mod(RandName*0.0001,53)+1), 1)
    

    Let’s break that down a piece at a time.

    with list (IDNbr, RandName) as 
      (values(1,         dec(rand()*100000,5,0)) 
        union all
       select IDNbr + 1, dec(rand()*100000,5,0)
         from list
        where IDNbr < 25)
    

    RandName is a five-digit packed-decimal random number. You can use RandName to generate subscripts into an array of characters.

    substr('ABCDEFGHIJKLMNOPQRSTUVWYXZ abcdefghijklmnopqrstuvwxyz',
           int(mod(RandName,53)+1), 1)
    

    The mod() function divides RandName by 53, the number of characters in the literal, returning a number from zero to 52. Adding one gives me a valid subscript value to the string. Substr() returns one character from the literal. You have one random character, which is either a letter or space.

    You can generate more characters by dividing RandName by other powers of ten.

    substr('ABCDEFGHIJKLMNOPQRSTUVWYXZ abcdefghijklmnopqrstuvwxyz', 
               int(mod(RandName*0.1 ,53)+1), 1)
    substr('ABCDEFGHIJKLMNOPQRSTUVWYXZ abcdefghijklmnopqrstuvwxyz',
               int(mod(RandName*0.01,53)+1), 1)
    

    Here’s the entire query.

    with list (IDNbr, RandName) as 
      (values(1,         dec(rand()*100000,5,0)) 
        union all
       select IDNbr + 1, dec(rand()*100000,5,0)
         from list
        where IDNbr < 25)
    select IDNbr, 
       substr('ABCDEFGHIJKLMNOPQRSTUVWYXZ abcdefghijklmnopqrstuvwxyz',
                  int(mod(RandName       ,53)+1), 1) concat
       substr('ABCDEFGHIJKLMNOPQRSTUVWYXZ abcdefghijklmnopqrstuvwxyz', 
                  int(mod(RandName*0.1   ,53)+1), 1) concat
       substr('ABCDEFGHIJKLMNOPQRSTUVWYXZ abcdefghijklmnopqrstuvwxyz',
                  int(mod(RandName*0.01  ,53)+1), 1) concat
       substr('ABCDEFGHIJKLMNOPQRSTUVWYXZ abcdefghijklmnopqrstuvwxyz',
                  int(mod(RandName*0.001 ,53)+1), 1) concat
       substr('ABCDEFGHIJKLMNOPQRSTUVWYXZ abcdefghijklmnopqrstuvwxyz',
                  int(mod(RandName*0.0001,53)+1), 1) as Name,
       dec(rand()*3+1, 1,0) as Type,
       dec(rand()*1000, 5, 2) as Balance,
       current date + int(rand()*100) days as DueDate
      from list;
    

    Does it work? You be the judge.

    select * from cust24;
    

    I’ve included links to two articles from the IBM Knowledge Center for your benefit. (See the first two links under Related Stories.) They’re not for Db2 for i, but you may find them helpful. Be aware that the queries may not work on your system. One example generates character data by putting the mod() function in the second parameter of the translate() function, and Db2 for i does not allow that (yet).

    One shortcoming of using a large table of random data is that you may not be assured that all cases get tested. But you would have already done that test with a small dataset that contains all the data needed to fully exercise your code. The purpose of testing against random data is performance, not accuracy.

    RELATED STORIES

    How to populate a table with randomly generated test data

    Generate test data using SQL

    RAND

    Share this:

    • Reddit
    • Facebook
    • LinkedIn
    • Twitter
    • Email

    Tags: Tags: 400guru, DB2 for i, FHG, Four Hundred Guru, IBM i, SQL

    Sponsored by
    Krengeltech

    When it comes to consuming web APIs on your IBM i, your options often boil down to one of two things:

    First, you end up having to rely on a variety of open source and non-RPG solutions. This adds developer complexity, taking away time that could have been better spent invested in other projects. Of course, open source software is free, but generally comes at the cost of no professional support, which adds an element of risk in your production environment. RXS is completely professionally supported, and is complemented by a staff of trained IBM i developers who can address your nuanced development challenges, head on.

    Second, if you choose not to pursue an open-source solution, you’re often left having to shake up your current program architecture with proprietary software, external dependencies, and partial RPG implementations – many of which are sub-par compared to RPG-XML Suite’s wide range of features. RXS aims to simplify the efforts of developers with tools like code generators, useful commands, and subprocedures written in 100% RPG – no Java. Because they are entirely RPG, the RXS subprocedures are easy to add to new or existing ILE programs and architecture, helping to cut your development time. RPG-XML Suite offers powerful capabilities in an accessible, easy-to-implement format.

    With RPG-XML Suite, you can accomplish a variety of complex tasks, such as:

    • Calling REST and SOAP web services from your IBM i
    • Offering APIs from your IBM i
    • Creating JSON & XML
    • Parsing JSON & XML
    • Text manipulation, Base64 encoding/decoding, CCSID handling, hashing and encryption functions, and more.

    To try RXS for yourself, we recommend a free proof of concept, which not only gives you access to all of RPG-XML Suite’s subprocedures and utilities but also includes a tailor-made software demonstration that can be used as a starting point for your future API implementations.

    For a free proof of concept, contact us at sales@krengeltech.com, or visit our website for more information.

    Share this:

    • Reddit
    • Facebook
    • LinkedIn
    • Twitter
    • Email

    Four Hundred Monitor, June 17 Meet IBM’s New Security Architect for IBM i

    Leave a Reply Cancel reply

TFH Volume: 29 Issue: 37

This Issue Sponsored By

  • ProData Computer Services
  • New Generation Software
  • CNX
  • MITEC
  • iTech Solutions

Table of Contents

  • Pricing Revealed For IBM i Slices On IBM Cloud
  • Meet IBM’s New Security Architect for IBM i
  • Guru: Use SQL to Generate Random Data
  • Four Hundred Monitor, June 17
  • IBM i PTF Guide, Volume 21, Number 24

Content archive

  • The Four Hundred
  • Four Hundred Stuff
  • Four Hundred Guru

Recent Posts

  • IBM i 7.3 TR12: The Non-TR Tech Refresh
  • IBM i Integration Elevates Operational Query and Analytics
  • Simplified IBM i Stack Bundling Ahead Of Subscription Pricing
  • More Price Hikes From IBM, Now For High End Storage
  • Big Blue Readies Power10 And IBM i 7.5 Training for Partners
  • IBM Delivers More Out-of-the-Box Security with IBM i 7.5
  • Groundhog Day For Malware
  • IBM i Community Reacts to IBM i 7.5
  • Four Hundred Monitor, May 11
  • IBM i PTF Guide, Volume 24, Number 19

Subscribe

To get news from IT Jungle sent to your inbox every week, subscribe to our newsletter.

Pages

  • About Us
  • Contact
  • Contributors
  • Four Hundred Monitor
  • IBM i PTF Guide
  • Media Kit
  • Subscribe

Search

Copyright © 2022 IT Jungle

loading Cancel
Post was not sent - check your email addresses!
Email check failed, please try again
Sorry, your blog cannot share posts by email.