|Editors:||Ted Holt||Managing Editor:||Mari Barrett|
|Howard Arner||Technical Editor:||David Morris|
Free Advice From Howard:
Most of the consulting work that I do involves helping companies with SQL access issues against their OS/400 legacy systems. Typically, these companies are doing Active Server Pages or Java Server Pages to display information from the system via the Web. This week, I want to share one of the most common performance problems that I see in my practice and how you can avoid sending me or another consultant money by proactively solving the problem.
The issue is something I call "lazy-fare" programming. When you program using record level access, you are used to opening a physical file or logical file and having all of the fields available to your program just by referencing them. Usually, when RPG programmers construct SQL statements, they tend to use the SQL * character to indicate to SQL to return all fields from the physical file. (A lot of SQL programmers are guilty of this shorthand, too, by the way.) So, the following statement might be found in an ASP or JSP page to list customers that match an inquiry screen:
SELECT * FROM CUSMAS WHERE CUSNAME LIKE 'ABC%'
Now, this is fine and dandy SQL. If there is a logical file associated with the physical file with the CUSNAME field as the first key, it will probably return records quite quickly. However, the number of records per second that a query can return is directly effected by the number of fields in the physical file. To wit: If CUSMAS has 10 fields, the query will run really fast, probably more than 2,000 records per second. But, if CUSMAS has 150 fields, the query will return noticeably slower, probably under 750 records per second.
<<Communications and memory>>
It is important to understand what is happening during the processing of the SQL statement on both the client and the server to understand why there is such a huge disparity in performance outlined in the example above. I'll tell you all about it after the advertisement below.
OK. Let's get back to SQL programming. On the client side, whether you are using a Java program or a Visual Basic program with Active-X Data Objects (ADO), the following steps occur when you attempt to run the above query:
In the preparation phase, the query is sent to the server and the server parses the SQL statement to determine what information (fields) the client wants and where the information comes from (physical file). The AS/400 query optimizer then looks at the physical file and its associated logical files to determine the best way to get the data that you want. Once this is done, the optimizer has what is called an execution plan. Having a plan on how to get the data, the AS/400 then gets information from the system catalogue about each field that your query is asking to return and sends this information (metadata) back to the client application. The metadata consists of the name of the field, the data type, length, etc. Once the client application gets the metadata, it constructs objects in memory to hold the returned query information.
At this point, the client can send the AS/400 an execution request. The execution request causes the AS/400 to look for the data that satisfies the query request by using the execution plan that it formulated. Once it finds records that satisfy the request, the AS/400 sends an acknowledgement message to the client that the execution request is complete.
Having received the notification, the client can then request that the AS/400 return one or more records from the open query using by issuing a fetch request. The AS/400 will acknowledge this request with one or more records from the open data path. (That one or more caveat depends on how the client issues the fetch request and the size of the fetch buffers. But that is a different column…) Now that you understand what is happening during the execution of an SQL query, lets examine why SELECT * is a bad idea.
If you use SELECT * and a physical file contains only ten fields, then only 10 pieces of metadata are returned to the client application. Then, the client application only has to allocate 10 buffers to hold the returned fields. The AS/400 only has to read 10 fields of information per record and only has to transform 10 fields from internal representation to pretty client-side representation. Now, contrast this with a physical file that contains 150 fields of information. The AS/400 has to look up and return metadata on 150 fields. The client has to interpret 150 pieces of metadata to allocate objects or buffers to hold 150 pieces of information. The AS/400 has to read 150 fields and transform them from internal storage format to pretty client format. Also, since the number of fields is larger, the size of the records is larger, so fewer records fit into a transmission buffer causing more packets to be sent and acknowledged in transferring the information. Easy to understand, right?
Ok, so why is SELECT * bad? It's like my momma used to tell me, only put on your plate what you are going to eat. Most of the programs I see that have SELECT * do not require all of the fields in the physical file, just 5 or 10. Why squander your resources gathering data that you are not going to look at anyway? Name your fields and you increase the throughput of your information, thereby increasing the scalability of your system on both the client and server side.
If you don’t believe me, here are a few numbers from the real world. I had a client that was doing the above-mentioned customer listing web page. His query was a SELECT * going against a physical file with 179 columns of information. His program was only using 15 fields of information from the result set. Before he changed the query, the execution phase of the query took .4 seconds and the records per second clocked in at 250. After changing the query to only ask for the fields required, the execution phase went from .4 seconds to .1 seconds (less metadata and client memory allocation), and the records per second returned from the AS/400 went to over 1,200. Also, the time it took for JSP to render the page went from 2.2 seconds down to .9 seconds.
Remember, less is sometime more.
To unsubscribe, change your email address, or sign up for any of Guild Companies, Inc's free email newsletters, visit http://www.itjungle.com. Hit the SUBSCRIBE button on the homepage and it will lead you to our online subscription system.
When you sign up for one of our e-newsletters, you can be assured that your e-mail address will NEVER be sold to an outside company.
Please contact Timothy Prickett Morgan at
Phone: 212 942 5818
When I am using Interactive SQL/400 (i.e., STRSQL) and need to run a CL command, I have to exit SQL, run the command, and re-enter SQL. Is there any way to run system commands within an interactive SQL session?
Here's the deal. IBM has not provided a way to run CL commands from an SQL session, but you can create a stored procedure to run them. Here are two examples to get you started.
The first one creates a stored procedure, called EXCCMD, to run program QCMDEXC:
CREATE PROCEDURE mylib/exccmd
Once you have created this stored procedure, you will be able to run CL commands using the SQL CALL command:
call exccmd ('wrkoutq garbage',15)
Here's the SQL command to create a stored procedure called CMD, which brings up a pop-up window like the one you get when you press F21 from Screen Entry Utility (SEU):
CREATE PROCEDURE mylib/cmd
To run CL commands, type the following command in your interactive SQL session:
To resume entering SQL commands, press Enter with an empty command line, or press F12.
One of the great things about the OS/400 community is that it is indeed a community. We may be all working from our cubicles, but we are all connected and trying to figure out how to best employ the computer technology at our disposal. There are more than a few ways to skin any cat, and if you have a clever and unique answer to a problem that one of our Midrange Gurus has solved, we'd love to hear from you. This newsletter is an open dialog, and we value your input as well as your readership.
It goes without saying--but we'll say it anyway--that your hard technical questions pertaining to real world problems are equally valuable as a foundation for this newsletter as are your programming insights. We hope you find all the editions of Midrange Guru valuable, and we are going to work hard to make sure that they are.
If you have a tough problem, our gurus can probably help. Their mailboxes are always open.
* Email Ted Holt at firstname.lastname@example.org
* Email Howard Arner at email@example.com
Copyright © 1996-2008 Guild Companies, Inc. All Rights Reserved. This document may be redistributed freely and enthusiastically by email, but only in its unedited form. Thanks for your cooperation. Midrange Guru is a registered trademark of Guild Companies, Inc. IBM, AS/400, iSeries, OS/400, and eServer registered trademarks of International Business Machines Corp. All other product names are trademarked or copyrighted by their respective holders.