Performance of the SQL LIKE Predicate
December 6, 2002 Timothy Prickett Morgan
I have heard that SQL’s LIKE predicate cannot take advantage of indexes. Can I use indexes to speed up a query that uses LIKE?
LIKE has long had the reputation of being a poor performer, probably because of the way it has been implemented on various platforms. In some cases, using LIKE in a query guarantees a database scan (e.g., reading the entire table).
If the expression begins with a wildcard character–an underscore ( _ ) to represent a single character, a percent sign (%) to represent multiple characters–then the query will require a database scan in order to select the correct records. However, if the expression begins with some other character, and there is an index on the field that is being tested, then the query governor may be able to use the index.
The process of speeding up a query containing LIKE is no different from the process of speeding up any query. One method is to use Visual Explain, which is part of the iSeries Navigator (formerly known as Operations Navigator), to tell you what steps the query engine has to take to retrieve the data.
Another method is to run the SQL query interactively in debug. Then take a look at the job log. If you see messages like the following, you may be able to speed up the query.
Access path built for file MYFILE. Access path suggestion for file MYFILE.
The first message tells you that the query engine built an access path to fulfill the query. The second message indicates that you may be able to improve performance of the query by building a certain access path. Press F1 on the message to see the second-level text. That’s where the recommendation will be.
If you see the following message, then the query engine has used an access path.
Access path of file MYFILE2 was used by query.
To get back to your original question, I know that the query engine will use an access path to answer a LIKE condition because I used such a query when working up this example, and the job log messages shown above came from the test session. In my test, the LIKE predicate was “WHERE MYFIELD LIKE ‘5%’.
When I changed the literal from 5% to %5%, however, the job log showed the following messages:
All access paths were considered for file RANDATA. Access path built for file RANDATA.
This is a good technique to learn because it works with SQL, Query, Open Query File (OPNQRYF) command, and anything else (including non-IBM software) that uses the query engine. Visual Explain is good, but it only works with SQL queries.
What is it? Why do you need it?
CASS stands for “Coding Accuracy Support System.” This is a test developed by the U.S. Postal Service to determine whether ZIP Code software can accurately assign ZIP Codes to mailing addresses. CASS-certified software is intended to improve the accuracy of carrier route, 5-digit ZIP, ZIP+4, and delivery point codes.
CASS certified means the software has passed the test. When you use CASS software to update and maintain your mailing file, then your mailing file becomes CASS certified. What does this mean to you?
CASS certification is the first step in qualifying for postage discounts. Look at the mail you receive at home and at work; you will see that a lot of it was mailed for less than the regular 37 cent rate.
Despite all the glitz and glamour of the Internet, email, and such, the wheels of America’s economy are lubricated by the ordinary envelope and the U.S. Postal Service.
If your company sends out a lot of mail, there is a good chance CASS certification can cut your postage expense. There are other steps you must perform to get these discounts, but CASS is the first step.
Your mail must be printed, sorted, and then packaged according to postal regulations. It takes some effort, but the potential savings make it worthwhile.
Exactly what happens when you process your mailing files with CASS software? The software breaks the address down into its individual elements: state, city, street name, and so forth. Then the information is compared to the national ZIP+4 database. If a match is found, the ZIP Code, ZIP+4, delivery point, and carrier route are assigned. Also the delivery line is standardized to comply with Postal Service preferences.
If an address can’t be matched, no action is taken. This has the positive effect of allowing you to identify addresses in your mailing files which possibly can’t be delivered, will be delayed in delivery, or at the very minimum need some minor correction to one of the address elements.
Using CASS certified software has many side benefits. You will be able to identify addresses that are potentially undeliverable. It has been estimated that as much as 30% of all advertising mail is never delivered.
That means the costs of postage, printing, paper, and overhead for undelivered mail are wasted. That means if you spend $100,000 on a direct mail campaign, as much as $30,000 could be completely wasted.
CASS software provides a number of intangible benefits. With CASS certification and bar coding there is a good likelihood that that your mail will be delivered sooner. If your invoices are delivered one day sooner, you may receive payment one day sooner.
If you CASS certify your mailing file, you can move on to the next step which is postal automation. This means applying a POSTNET bar code.
POSTNET is that little row of tall and short bars you see on your mail. POSTNET is a special bar code used by the Postal Service to allow automatic sorting machines to work.
It has been reported that bar code standard class (the old third class) mail gets delivered about as soon as first class, instead of the usual 2 or 3 weeks.
The bottom line is that using CASS software will help you keep your customers’ address information in much better shape than they would be otherwise. Your mail may be delivered sooner at less cost and that means more bottom-line dollars for your company.
If you would like more information about CASS and CASS software,
Are FedEx, United Parcel, and Airborne charges
Every bad address can result in a $5 to $10 charge. Even seemingly small addressing errors such as misspelled street names, or missing suite numbers can result in the same penalty as if the address were totally wrong.
Our PER/ZIP4 addressing matching software will match your addresses to the national ZIP+4 postal database. Based on the match PER/ZIP4, will update your addresses to postal standards. Our sophisticated address logic can add missing directionals, correct minor misspellings, and standardize the address format. If PER/ZIP4 can’t match the address, then you know that there is likely a serious error in the address that needs to be corrected.
PER/ZIP4 can be used in a batch mode to update an entire file or interactively to update individual address one at a time. We provide a search function so that you can search the national ZIP+4 database to verify a specific address.
Visit our Web site www.worksright.com for more information and to order a free, no-hassle, 30-day trial. Or call WorksRight Software, Inc., at 601-856-8337.