• The Four Hundred
  • Guru Classic: The Efficiency of Varying Length Character Variables

    August 14, 2019 Jon Paris

    Remember the bad old days when dinosaurs still roamed the earth and the only way to build strings in RPG involved playing silly games with arrays? Or worse still, obscure combinations of MOVE operations? Thankfully those days are far behind us — although sadly there are still a few RPG/400 dinosaurs coding away!

    RPG IV introduced many powerful new string handling options, such as the %TRIMx family of BIFs, but even now there are capabilities in the language that few programmers fully exploit. One of my favorites is variable length fields. This lack of familiarity made this tip an obvious choice to update in an age where we are frequently tasked with building CSV, JSON, XML, and HTML strings. There are many good reasons to use these varying length fields in such cases, but in this tip we’re going to focus mainly on performance.

    For those of you unfamiliar with varying length fields, definition (A) shows how one is declared.

    (A)   dcl-s  varyField  varChar(256)  Inz;
    

    Under the covers, varying length fields have two components: the current length, held as a 2-byte integer in the first two positions, followed by the actual data. In today’s RPG they are defined by the VARCHAR keyword. Back in the days of D specs they were identified by adding the keyword VARYING to a regular character definition. Strictly speaking, the length prefix is not always 2 bytes. Version 6 raised the maximum field lengths, with the result that while varying length fields of up to 65,535 characters have a 2-byte prefix, longer fields need a 4-byte prefix to hold the length. The compiler picks the right size for you, so in most code there is no need to be concerned with this.
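
    As a rough illustration of the point about prefix sizes, here is a minimal sketch. The field names are invented for this example, and the optional second parameter of VARCHAR is only needed if you want to request a particular prefix size rather than let the compiler choose.

    ```rpgle
           dcl-s  shortVary   varchar(256)     Inz;  // 2-byte length prefix
           dcl-s  hugeVary    varchar(100000)  Inz;  // too long for 2 bytes: 4-byte prefix
           dcl-s  forcedVary  varchar(256: 4)  Inz;  // 4-byte prefix requested explicitly

           dcl-s  dataPtr     pointer;

           // %Len always reports the current content length, never the prefix
           shortVary = 'ABC';        // %Len(shortVary) is now 3

           // If you need the address of the data portion (for example, to
           // pass to an API), %Addr with *DATA skips over the prefix
           dataPtr = %Addr(shortVary: *DATA);
    ```

    In everyday string building none of this is visible; it only matters when a field's storage is mapped or passed by address.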

    You should train yourself to always code the INZ keyword to ensure that the length field is set correctly. This is critical when varying length fields are incorporated in data structures. Why? Because by default, data structures are initialized to spaces (hex 40), and two spaces (X'4040') interpreted as a 2-byte field length give a value of 16,448, which causes havoc!
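
    A minimal sketch of the difference, using made-up structure and subfield names:

    ```rpgle
           // Without Inz the whole structure starts out as blanks, so the
           // length prefix of riskyName begins life as X'4040'
           dcl-ds  riskyDS;
              riskyName  varchar(50);
           end-ds;

           // With Inz on the structure each subfield is initialized according
           // to its own type, so the varying subfields start with length zero
           dcl-ds  safeDS  Inz;
              safeName  varchar(50);
              safeCity  varchar(30);
           end-ds;
    ```

    Coding INZ on the data structure itself covers every subfield, which is usually simpler than remembering it for each one.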

    Whenever the content of a varying length field is changed, the compiler automatically adjusts the associated length to reflect the new content. Note that you should always use %TRIMx when loading data from a fixed length field into a varying length field, otherwise any trailing and/or leading blanks will be counted in the field length. Any time you want to know how long the field is, use the %LEN() built-in function to obtain the current value.
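
    The following sketch shows the effect of forgetting the trim. The field names are hypothetical and chosen only for this illustration.

    ```rpgle
           dcl-s  fixedName   char(20)     Inz('Paris');
           dcl-s  varyName    varchar(20)  Inz;
           dcl-s  currentLen  int(10);

           // Untrimmed: all 20 characters, trailing blanks included, are copied
           varyName   = fixedName;
           currentLen = %Len(varyName);     // 20

           // Trimmed: only the significant characters are copied
           varyName   = %TrimR(fixedName);
           currentLen = %Len(varyName);     // 5
    ```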

    Now that we’ve reviewed the basics of variable length fields, let’s see how they can be used to boost the performance of some types of string operation. Take a look at the following two pieces of code. Both of them build a string of 100 comma separated values. At first glance there is very little difference in the logic, but would you believe that the second one can run hundreds or even thousands of times faster?

           dcl-c  baseString  'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
    
           dcl-s  fixedField   char(10);
           dcl-s  longFixed    char(2000);
           dcl-s  longVarying  varchar(2000)  Inz;
           dcl-s  i            int(10);    // loop counters
           dcl-s  j            int(10);
    
           // Version 1: fixed length target
           For i = 1 to 10;
              For j = 1 to 10;
    (B)          fixedField = %Subst(baseString: i: j );
    (C)          longFixed = %TrimR(longFixed) + ',' + fixedField;
              EndFor;
           EndFor;
    
           // Version 2: varying length target
           For i = 1 to 10;
              For j = 1 to 10;
                 fixedField = %Subst(baseString: i: j );
    (D)          longVarying += ',' + %TrimR(fixedField);
              EndFor;
           EndFor;
    

    The reason is simple. The second one (D) makes use of a varying length field to build up the result string! This difference in speed is easy to understand if you think about what is going on under the hood. The first version (C) uses a fixed length target string so these are the steps that take place:

    1. Work out where the last non-space character is.
    2. Add the comma in the next position.
    3. Add the content of fixedField in the next and subsequent positions.
    4. If longFixed is not yet full, add blanks to fill it.

    This process is repeated for each new value added to the string. Notice that having carefully padded the string with blanks (step 4), the very next thing we do (step 1) is to work out how many there are so that we can ignore them!

    Contrast this with the mechanics of the second version using the variable length field (D):

    1. Increment the field length by 1, and place the comma in that position.
    2. Determine the length of the field to add (i.e., ignoring trailing spaces).
    3. Copy the new data in, starting at position (field length + 1), and increase the field length by the number of characters copied.
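
    The steps above can be followed by watching the current length change across a single append. This is only a sketch with invented field names; the comments show the value %Len would return at each point.

    ```rpgle
           dcl-s  csvLine   varchar(100)  Inz;
           dcl-s  newValue  char(10)      Inz('ABC');

           // csvLine starts empty: %Len(csvLine) = 0
           csvLine += ',';                 // step 1: %Len(csvLine) = 1
           csvLine += %TrimR(newValue);    // steps 2 and 3: %Len(csvLine) = 4

           // csvLine now holds ',ABC'. Nothing beyond position 4 was
           // examined or blank filled.
    ```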

    Much simpler! And the resulting speed differences can be staggering. In tests I ran while preparing this tip, even with a target field length as small as 256 characters, the varying length field version took only half the time of the fixed length version. When I raised the field length to 25,600, which is a much more realistic size when building a CSV, HTML or XML string, the speed difference rose to 1,300 to 1!
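
    If you want to try the comparison on your own system, a timing harness along the following lines should do. This is a sketch rather than the exact test used for the figures quoted here: the repeat count, the timing fields and the use of DSPLY to report the results are all assumptions made for illustration, and the numbers you see will depend on your hardware and release.

    ```rpgle
           dcl-c  baseString  'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
           dcl-c  REPEATS     1000;        // assumed repeat count, adjust to taste

           dcl-s  fixedField   char(10);
           dcl-s  longFixed    char(25600);
           dcl-s  longVarying  varchar(25600)  Inz;
           dcl-s  i            int(10);
           dcl-s  j            int(10);
           dcl-s  r            int(10);
           dcl-s  startTime    timestamp;
           dcl-s  fixedUsec    int(20);    // elapsed microseconds, fixed version
           dcl-s  varyUsec     int(20);    // elapsed microseconds, varying version

           // Time the fixed length version
           startTime = %Timestamp();
           For r = 1 to REPEATS;
              longFixed = *Blanks;
              For i = 1 to 10;
                 For j = 1 to 10;
                    fixedField = %Subst(baseString: i: j);
                    longFixed = %TrimR(longFixed) + ',' + fixedField;
                 EndFor;
              EndFor;
           EndFor;
           fixedUsec = %Diff(%Timestamp(): startTime: *MSeconds);

           // Time the varying length version
           startTime = %Timestamp();
           For r = 1 to REPEATS;
              longVarying = '';
              For i = 1 to 10;
                 For j = 1 to 10;
                    fixedField = %Subst(baseString: i: j);
                    longVarying += ',' + %TrimR(fixedField);
                 EndFor;
              EndFor;
           EndFor;
           varyUsec = %Diff(%Timestamp(): startTime: *MSeconds);

           Dsply ('Fixed   (microseconds): ' + %Char(fixedUsec));
           Dsply ('Varying (microseconds): ' + %Char(varyUsec));

           *InLR = *On;
    ```

    Changing the declared length of both target fields is the interesting experiment: the cost of the fixed version grows with the size of the field, while the varying version only ever touches the data actually in use.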

    Another point to consider is that the code shown above (C, i.e., the “slow” version) is already much more efficient than much of the code I have seen in customers’ programs. The two variants below are both very common and both even less efficient. In the first case (E) the field being added is being trimmed of blanks, which are immediately added back if it does not fill the target field! In the second case (F) the separation of the two functions means that the calculations for the effective length of the target field and the subsequent blank filling occur twice for each loop. You can imagine what that does to the speed. And yes, I have seen cases where people combine both E and F!

    (E)        longFixed = %TrimR(longFixed) + ',' + %TrimR(fixedField); 
    
    (F)        longFixed = %TrimR(longFixed) + ','; 
               longFixed = %TrimR(longFixed) + fixedField; 
    

    That’s all for this first look at variable length fields. In this follow-on tip, I describe their uses and abuses in the database.

    P.S. For those of you wondering what the purpose of the code at (B) is, it is simply used to generate fields of different effective lengths (one to 10 characters) to act as the test data to be added to the target string.

    Jon Paris is one of the world’s foremost experts on programming on the IBM i platform. A frequent author, forum contributor, and speaker at User Groups and technical conferences around the world, he is also an IBM Champion and a partner at Partner400 and System i Developer. He hosts the RPG & DB2 Summit twice per year with partners Susan Gantner and Paul Tuohy.

    RELATED STORIES

    Variable-Length Database Fields Better Use Disk Space

    The Geezer’s Guide to Free-Form RPG, Part 2: Data Structures and More

    Four Reasons RPG Geezers Should Care About The New Free-Form RPG

    Tags: 400guruclassic, BIF, CSV, FHGC, Four Hundred Guru Classic, HTML, IBM i, JSON, RPG IV, RPG/400, XML

    2 thoughts on “Guru Classic: The Efficiency of Varying Length Character Variables”

    • Chris Chambers says:
      August 15, 2019 at 5:08 am

      Any chance you can do same article for COBOL Jon ?

      Reply
      • Jon Paris says:
        August 22, 2019 at 4:13 pm

        You’d have to ask Ted about that, Chris. Unfortunately COBOL still doesn’t have variable length field support, unless there is a recent update with which I am not familiar.

        That said they can be handled in COBOL – just not as simply and cleanly.

        P.S. Sorry for the delay in responding – for some reason I am not being notified of comments on my articles.

        Reply

TFH Volume: 29 Issue: 46

Copyright © 2022 IT Jungle
