Volume 8, Number 31 -- September 10, 2008

The Efficiency of Varying Length Character Variables

Published: September 10, 2008

by Jon Paris

Remember the bad old days when dinosaurs still roamed the earth and the only way to build strings in RPG involved playing silly games with arrays? Or worse still, obscure combinations of MOVE operations? Thankfully those days are far behind us--although sadly there are still a few RPG/400 dinosaurs coding away!

RPG IV introduced many powerful new string handling options, such as the %TRIMx family of BIFs, but even now there are capabilities in the language that few programmers fully exploit. One of my favorites is variable length fields. There are many good reasons to use these fields, but in this tip we're going to focus mainly on performance.

For those of you unfamiliar with varying length fields, the following D specs show how they are defined and illustrate the constituent parts.

Varying length fields have two components: the current length that is represented by a 2-byte integer in the first two positions, followed by the actual data. They are differentiated from regular character fields by the use of the keyword "Varying." (See (A) in the code that follows.)

You should train yourself to always code the INZ keyword to ensure that the length field is set correctly. This is critical when varying length fields are incorporated in data structures. Why? Because by default, data structures are initialized to spaces (hex 40) and that causes havoc when interpreted as the field length! Two blanks (hex 4040) read as a 2-byte integer give a length of 16,448, far beyond the 256 bytes declared here. At (B) and (C) in the code example that follows, I have defined the two components as separate fields--overlaying varyField--to demonstrate the layout.

     D varyingStruct   DS
(A)  D  varyField                   256a   Varying Inz
       // The subfields below are defined only to show the layout
(B)  D   length                       5i 0 Overlay(varyField)
(C)  D   data                       256a   Overlay(varyField: *Next)

Whenever the content of a varying length field is changed, the length is adjusted automatically to reflect the new content. Note that you should always use %TrimR (or another member of the %Trimx family) when loading data from a fixed length field into a varying length field, otherwise any trailing blanks will be counted in the field length. Any time you want to know how long the field is, use the %Len() built-in function to obtain the current value.
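
To make the trailing-blank point concrete, here is a minimal sketch. The field names (fixedName, varyName, and curLen) are invented for this illustration and do not appear elsewhere in this tip.

```rpgle
     D fixedName       S             20a   Inz('Jon')
     D varyName        S             20a   Varying Inz
     D curLen          S             10i 0
      /Free
        // Straight assignment: all 20 bytes, blanks included, are copied
        varyName = fixedName;
        curLen = %Len(varyName);       // curLen is 20

        // Trimmed assignment: only the significant characters are copied
        varyName = %TrimR(fixedName);
        curLen = %Len(varyName);       // curLen is 3

        *InLR = *On;
      /End-Free
```

The only difference between the two assignments is the %TrimR, yet the first leaves 17 trailing blanks inside the "current" data, and anything later appended to varyName would land after them.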

Now that we've reviewed the basics of variable length fields, let's see how they can be used to boost the performance of some types of string operations. Take a look at the following two pieces of code. Both of them build a string of 100 comma-separated values. At first glance there is very little difference in the logic, but would you believe that the second one can run hundreds or even thousands of times faster?

       For i = 1 to 10;
         For j = 1 to 10;
(D)        fixedField = %Subst(baseString: i: j );         
(E)        longFixed = %TrimR(longFixed) + ',' + fixedField; 
         EndFor;
       EndFor;

       For i = 1 to 10;
         For j = 1 to 10;
           fixedField = %Subst(baseString: i: j ); 
(F)        longVarying += ',' + %TrimR(fixedField);
         EndFor;
       EndFor;

The reason is simple. The second one (F) makes use of a varying length field to build up the result string! This difference in speed is easy to understand if you think about what is going on under the hood. The first version (E) uses a fixed length target string so these are the steps that take place:

  1. Work out where the last non-space character is.
  2. Add the comma in the next position.
  3. Add the content of fixedField in the next and subsequent positions.
  4. If longFixed is not yet full, add blanks to fill it.

This process is repeated for each new value added to the string. Notice that having carefully padded the string with blanks (4), the very next thing we do (1) is to work out how many there are so that we can ignore them!

Contrast this with the mechanics of the version using the variable length field (F):

  • Increment the field length by 1, and place the comma in that position.
  • Determine the length of the field to add (i.e., ignoring trailing spaces).
  • Copy the new data in, starting at the position one past the current field length, and increase the field length by the number of characters copied.

Much simpler! And the resulting speed differences can be staggering. In tests I ran while preparing this tip, even with a target field length as small as 256 characters, the varying length field version took only half the time of the fixed length version. When I raised the field length to 25,600, which is a much more realistic size when building a CSV, HTML or XML string, the speed difference rose to 1,300 to 1!
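
If you want to try the comparison on your own system, the following is a rough sketch of a timing program. It is not the program used for the tests described above. The field definitions, the sizes, the content of baseString, and the timing fields (startTime, fixedUs, varyingUs) are all assumptions made for this illustration; only the two loops are taken from the code at (E) and (F).

```rpgle
     D baseString      S             20a   Inz('ABCDEFGHIJKLMNOPQRST')
     D fixedField      S             10a
     D longFixed       S          25600a
     D longVarying     S          25600a   Varying Inz
     D i               S             10i 0
     D j               S             10i 0
     D startTime       S               z
     D fixedUs         S             20i 0
     D varyingUs       S             20i 0
      /Free
        // Version (E): fixed length target
        startTime = %Timestamp();
        For i = 1 to 10;
          For j = 1 to 10;
            fixedField = %Subst(baseString: i: j);
            longFixed = %TrimR(longFixed) + ',' + fixedField;
          EndFor;
        EndFor;
        fixedUs = %Diff(%Timestamp(): startTime: *MSeconds);

        // Version (F): varying length target
        startTime = %Timestamp();
        For i = 1 to 10;
          For j = 1 to 10;
            fixedField = %Subst(baseString: i: j);
            longVarying += ',' + %TrimR(fixedField);
          EndFor;
        EndFor;
        varyingUs = %Diff(%Timestamp(): startTime: *MSeconds);

        // Elapsed times in microseconds
        Dsply ('Fixed:   ' + %Char(fixedUs));
        Dsply ('Varying: ' + %Char(varyingUs));
        *InLR = *On;
      /End-Free
```

A single pass of 100 appends may well finish faster than the timestamp can resolve, so to get meaningful numbers you will probably need to wrap each pair of loops in an outer loop (clearing the target field each time) and compare the totals.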

Another point to consider is that the code shown above (E) is already much more efficient than much of the code I have seen in customers' programs. The two variants below are both very common and both even less efficient. In the first case (G) the field being added is being trimmed of blanks, which are immediately added back if it does not fill the target field! In the second case (H) the separation of the two functions means that the calculations for the effective length of the target field and the subsequent blank filling occur twice for each loop. You can imagine what that does to the speed. And yes, I have seen cases where people combine both G and H!

(G)        longFixed = %TrimR(longFixed) + ',' + %TrimR(fixedField); 

(H)        longFixed = %TrimR(longFixed) + ','; 
           longFixed = %TrimR(longFixed) + fixedField; 

That's all for this first look at variable length fields. In a future tip we'll look at their uses and abuses in the database.

P.S. For those of you wondering what the purpose of the code at (D) is, it is simply used to generate fields of different effective lengths (one to 10 characters) to act as the test data to be added to the target string.


Jon Paris is one of the world's most knowledgeable experts on programming on the System i platform. Paris cut his teeth on the System/38 way back when, and in 1987 he joined IBM's Toronto software lab to work on the COBOL compilers for the System/38 and System/36. He also worked on the creation of the COBOL/400 compilers for the original AS/400s back in 1988, and was one of the key developers behind RPG IV and the CODE/400 development tool. In 1998, he left IBM to start his own education and training firm, a job he does to this day with his wife, Susan Gantner--also an expert in System i programming. Paris and Gantner, along with Paul Tuohy, are co-founders of System i Developer, which hosts the new RPG & DB2 Summit conference. Send your questions or comments for Jon to Ted Holt via the IT Jungle Contact page.




Senior Technical Editor: Ted Holt
Technical Editor: Joe Hertvik
Contributing Technical Editors: Edwin Earley, Brian Kelly, Michael Sansoterra
Publisher and Advertising Director: Jenny Thomas
Advertising Sales Representative: Kim Reed
Copyright © 1996-2008 Guild Companies, Inc. All Rights Reserved.
Guild Companies, Inc., 50 Park Terrace East, Suite 8F, New York, NY 10034
