• The Four Hundred
  • Guru Classic: The Efficiency of Varying Length Character Variables

    August 14, 2019 Jon Paris

    Remember the bad old days when dinosaurs still roamed the earth and the only way to build strings in RPG involved playing silly games with arrays? Or worse still, obscure combinations of MOVE operations? Thankfully those days are far behind us — although sadly there are still a few RPG/400 dinosaurs coding away!

    RPG IV introduced many powerful new string handling options, such as the %TRIMx family of BIFs, but even now there are capabilities in the language that few programmers fully exploit. One of my favorites is variable length fields. This lack of familiarity made this tip an obvious choice to update in an age where we are frequently tasked with building CSV, JSON, XML, and HTML strings. There are many good reasons to use these varying length fields in such cases, but in this tip we’re going to focus mainly on performance.

    For those of you unfamiliar with varying length fields, the following definition (A) shows how they are defined.

    (A)   dcl-s  varyField  varChar(256)  Inz;
    

    Under the covers, varying length fields have two components: the current length, held as a 2-byte integer in the first two positions, followed by the actual data. In today’s RPG they are defined with the VARCHAR keyword. Back in the days of D-specs they were identified by adding the keyword VARYING to a regular character definition. Strictly speaking, the length is not always two bytes. Version 6 raised the maximum field lengths, so while varying length fields of up to 65,535 characters have a 2-byte length, longer fields need a 4-byte length to hold the larger value. The compiler takes care of this, so the programmer normally has no need to be concerned with it.
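
    To make the point about the length prefix concrete, here is a minimal sketch. The field names are hypothetical, and it assumes a compiler at Version 6 or later, where VARCHAR accepts an optional second parameter (2 or 4) for the prefix size.

    ```rpgle
    **free
    // Sketch only: all field names are hypothetical.
    dcl-s shortVary   varchar(256)    inz;   // 2-byte length prefix by default
    dcl-s longVary    varchar(100000) inz;   // over 65,535 characters, so a 4-byte prefix
    dcl-s forcedVary  varchar(256: 4) inz;   // second parameter requests a 4-byte prefix

    shortVary = 'ABC';
    longVary  = 'ABC';

    // %Len works the same way regardless of the prefix size.
    dsply ('shortVary length: ' + %char(%len(shortVary)));   // current length is 3
    dsply ('longVary length: '  + %char(%len(longVary)));    // current length is 3

    *inlr = *on;
    ```

    The code that uses the fields is identical in all three cases, which is why the prefix size rarely matters in practice.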

    You should train yourself to always code the INZ keyword to ensure that the length field is set correctly. This is critical when varying length fields are incorporated in data structures. Why? Because by default, data structures are initialized to spaces (hex 40), and two spaces interpreted as a 2-byte field length (hex 4040, or 16,448) cause havoc!
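
    The data structure case might look something like the sketch below. The structure and subfield names are hypothetical. It shows the two usual ways of getting a valid starting length: INZ on the subfield, or INZ on the data structure as a whole.

    ```rpgle
    **free
    // Sketch only: structure and subfield names are hypothetical.
    dcl-ds riskyRec qualified;           // no INZ: storage starts out as blanks (hex 40)
       custName  varchar(50);            // so the length prefix starts out as hex 4040
    end-ds;

    dcl-ds safeRec qualified;
       custName  varchar(50) inz;        // INZ on the subfield: length starts at zero
    end-ds;

    dcl-ds safeRec2 qualified inz;       // INZ on the data structure covers every subfield
       custName  varchar(50);
       city      varchar(30);
    end-ds;

    safeRec.custName += 'Paris';         // safe: appends to an empty field
    dsply ('Length: ' + %char(%len(safeRec.custName)));   // current length is 5

    *inlr = *on;
    ```

    riskyRec is declared only to show what to avoid. Nothing in the sketch reads or appends to it.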

    Whenever the content of a varying length field is changed, the compiler automatically adjusts the associated length to reflect the new content. Note that you should always use %TRIMx when loading data from a fixed length field into a varying length field, otherwise any trailing and/or leading blanks will be counted in the field length. Any time you want to know how long the field is, use the %LEN() built-in function to obtain the current value.
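
    A short sketch of the difference that %TrimR makes when loading from a fixed length field. The field names are hypothetical.

    ```rpgle
    **free
    // Sketch only: field names are hypothetical.
    dcl-s fixedName  char(20)    inz('Paris');
    dcl-s varName    varchar(20) inz;

    varName = fixedName;               // all 20 characters are copied, blanks included
    dsply ('Untrimmed: ' + %char(%len(varName)));   // current length is 20

    varName = %trimr(fixedName);       // only the 5 significant characters are copied
    dsply ('Trimmed: ' + %char(%len(varName)));     // current length is 5

    %len(varName) = 0;                 // %Len can also be assigned to, here emptying the field

    *inlr = *on;
    ```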

    Now that we’ve reviewed the basics of variable length fields, let’s see how they can be used to boost the performance of some types of string operation. Take a look at the following two pieces of code. Both of them build a string of 100 comma separated values. At first glance there is very little difference in the logic, but would you believe that the second one can run hundreds or even thousands of times faster?

           dcl-c  baseString  'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
    
           dcl-s  fixedField   char(10);
           dcl-s  longFixed    char(2000);
           dcl-s  longVarying  varchar(2000) inz;
           dcl-s  i            int(10);       // loop counters
           dcl-s  j            int(10);
    
           // Version 1: fixed length target
           For i = 1 to 10;
              For j = 1 to 10;
    (B)          fixedField = %Subst(baseString: i: j );
    (C)          longFixed = %TrimR(longFixed) + ',' + fixedField;
              EndFor;
           EndFor;
    
           // Version 2: varying length target
           For i = 1 to 10;
              For j = 1 to 10;
                 fixedField = %Subst(baseString: i: j );
    (D)          longVarying += ',' + %TrimR(fixedField);
              EndFor;
           EndFor;
    

    The reason is simple. The second one (D) makes use of a varying length field to build up the result string! This difference in speed is easy to understand if you think about what is going on under the hood. The first version (C) uses a fixed length target string so these are the steps that take place:

    1. Work out where the last non-space character is.
    2. Add the comma in the next position.
    3. Add the content of fixedField in the next and subsequent positions.
    4. If longFixed is not yet full, add blanks to fill it.

    This process is repeated for each new value added to the string. Notice that having carefully padded the string with blanks (step 4), the very next thing we do (step 1) is to work out how many there are so that we can ignore them!

    Contrast this with the mechanics of the second version using the variable length field (D):

    1. Increment the field length by 1, and place the comma in that position.
    2. Determine the length of the field to add (i.e., ignoring trailing spaces).
    3. Copy the new data in, starting at the field length + 1 position, and increase the field length by the number of characters copied.

    Much simpler! And the resulting speed differences can be staggering. In tests I ran while preparing this tip, even with a target field length as small as 256 characters, the varying length field version took only half the time of the fixed length version. When I raised the field length to 25,600, which is a much more realistic size when building a CSV, HTML or XML string, the speed difference rose to 1,300 to 1!
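
    If you want to try the comparison on your own system, a timing harness along the following lines is one way to do it. This is a sketch, not the test program behind the figures above: repeatCount, the message texts and the use of %Timestamp with %Diff are all my assumptions for illustration. The clock resolution may be coarser than a microsecond, so raise repeatCount if the numbers come out too small to compare.

    ```rpgle
    **free
    // Hypothetical timing harness: repeatCount and the messages are illustrative only.
    dcl-c baseString   'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
    dcl-c repeatCount  100;

    dcl-s fixedField   char(10);
    dcl-s longFixed    char(25600);
    dcl-s longVarying  varchar(25600) inz;
    dcl-s i            int(10);
    dcl-s j            int(10);
    dcl-s r            int(10);
    dcl-s startTime    timestamp;
    dcl-s fixedUsec    int(20);
    dcl-s varyUsec     int(20);

    // Fixed length target, as in (C)
    startTime = %timestamp();
    for r = 1 to repeatCount;
       longFixed = *blanks;                    // start each pass with an empty target
       for i = 1 to 10;
          for j = 1 to 10;
             fixedField = %subst(baseString: i: j);
             longFixed = %trimr(longFixed) + ',' + fixedField;
          endfor;
       endfor;
    endfor;
    fixedUsec = %diff(%timestamp(): startTime: *mseconds);   // *MSECONDS = microseconds

    // Varying length target, as in (D)
    startTime = %timestamp();
    for r = 1 to repeatCount;
       %len(longVarying) = 0;                  // start each pass with an empty target
       for i = 1 to 10;
          for j = 1 to 10;
             fixedField = %subst(baseString: i: j);
             longVarying += ',' + %trimr(fixedField);
          endfor;
       endfor;
    endfor;
    varyUsec = %diff(%timestamp(): startTime: *mseconds);

    dsply ('Fixed (usec): '   + %char(fixedUsec));
    dsply ('Varying (usec): ' + %char(varyUsec));

    *inlr = *on;
    ```

    Changing the 25600 in both declarations lets you see how the gap moves with the size of the target field.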

    Another point to consider is that the code shown above (C, i.e., the “slow” version) is already far more efficient than much of the code I have seen in customers’ programs. The two variants below are both very common and both even less efficient. In the first case (E), the field being added is trimmed of blanks, which are immediately added back if the result does not fill the target field! In the second case (F), splitting the work into two statements means that the calculation of the effective length of the target field and the subsequent blank filling both occur twice on each pass through the loop. You can imagine what that does to the speed. And yes, I have seen cases where people combine both E and F!

    (E)        longFixed = %TrimR(longFixed) + ',' + %TrimR(fixedField); 
    
    (F)        longFixed = %TrimR(longFixed) + ','; 
               longFixed = %TrimR(longFixed) + fixedField; 
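
    For comparison, here is a sketch of how (E) or (F) might be reworked around a varying length field. The test for an empty target, which avoids a leading comma, and the final copy into longFixed are my additions for illustration and are not part of the code shown earlier.

    ```rpgle
    **free
    // Sketch only: the empty-target test and the final copy are illustrative additions.
    dcl-s fixedField   char(10) inz('ABC');
    dcl-s longFixed    char(2000);
    dcl-s longVarying  varchar(2000) inz;

    // Inside the loop: append a separator only when something is already there,
    // then append the trimmed value. No scan of the target and no blank fill.
    if %len(longVarying) > 0;
       longVarying += ',';
    endif;
    longVarying += %trimr(fixedField);

    // After the loop: if a fixed length result is still required,
    // copy once so that the blank padding happens a single time.
    longFixed = longVarying;

    *inlr = *on;
    ```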
    

    That’s all for this first look at variable length fields. In a follow-on tip, I describe their uses and abuses in the database.

    P.S. For those of you wondering what the purpose of the code at (B) is, it is simply used to generate fields of different effective lengths (one to 10 characters) to act as the test data to be added to the target string.

    Jon Paris is one of the world’s foremost experts on programming on the IBM i platform. A frequent author, forum contributor, and speaker at User Groups and technical conferences around the world, he is also an IBM Champion and a partner at Partner400 and System i Developer. He hosts the RPG & DB2 Summit twice per year with partners Susan Gantner and Paul Tuohy.

    RELATED STORIES

    Variable-Length Database Fields Better Use Disk Space

    The Geezer’s Guide to Free-Form RPG, Part 2: Data Structures and More

    Four Reasons RPG Geezers Should Care About The New Free-Form RPG


    Tags: 400guruclassic, BIF, CSV, FHGC, Four Hundred Guru Classic, HTML, IBM i, JSON, RPG IV, RPG/400, XML


    2 thoughts on “Guru Classic: The Efficiency of Varying Length Character Variables”

    • Chris Chambers says:
      August 15, 2019 at 5:08 am

      Any chance you can do same article for COBOL Jon ?

      • Jon Paris says:
        August 22, 2019 at 4:13 pm

        You’d have to ask Ted about that, Chris. Unfortunately COBOL still doesn’t have variable length field support (unless there is a recent update with which I am not familiar).

        That said they can be handled in COBOL – just not as simply and cleanly.

        P.S. Sorry for the delay in responding – for some reason I am not being notified of comments on my articles.


TFH Volume: 29 Issue: 46


Copyright © 2025 IT Jungle