Stuff
OS/400 Edition
Volume 1, Number 10 -- May 23, 2002

The Art of Globbing


by Ted Holt

If you do any Qshell scripting at all, you will eventually have to learn about "globbing," Qshell's ability to expand expressions containing wildcard characters into lists of file names. Before Qshell interprets a command or passes arguments to a utility, it examines the command string for special characters and replaces expressions with lists of file names. Globbing is good, in that it gives you an easy way to work with groups of files while writing only a little bit of code.

Before I go any further, let me share a bit of trivia. According to The Jargon File, the term "glob" comes from a subprogram that expanded wildcards in old versions of the Unix shell.

Wildcard Characters

The first thing you need to know about globbing is that certain characters are considered "wildcards"--that is, these characters stand in for other characters. There are only a few wildcards:

  • The asterisk (*) stands for zero or more characters.

  • The question mark (?) stands for exactly one character.

  • The brackets ([ ]) indicate a group of characters or a range of characters.

  • The exclamation point (!), sometimes called a bang, negates a group or a range.

Here are some examples of globbing using the list directory contents (ls) utility. I have chosen ls because globbing is used so frequently when running ls in interactive Qshell sessions.

ls  d*      list file names that begin with a lowercase d
ls  D*      list file names that begin with an uppercase D
ls  [dD]*   list file names that begin with either a lowercase or an uppercase d

Let's pause for a minute to understand something about Qshell and file names in the root system of the IFS: Case is significant--sort of. The ls utility distinguishes between uppercase and lowercase letters. However, other utilities, such as cat, do not. If you create a file called dog, you can use any of the following commands to display its contents on the screen:

cat  dog
cat  DOG
cat  Dog

So case does matter, at least in certain situations.

Back to our examples.

ls *.txt list file names that end with a period and the letters txt

Time for another digression. Here's something else you need to know about the root file system, especially if you've been using IBM-compatible PCs. When IBM introduced the PC, some 20 years ago, file names were divided into two parts--a 1- to 8-character base name, followed by a 0- to 3-character extension. When referencing the file, the two parts were separated by a period (.), but the period was not part of the file name. This convention was strictly enforced until Microsoft released Windows 95. Since then, the rules have been relaxed, and a file name on a PC can contain many more letters and even more than one period.

The root system of the IFS is more like the new PC convention than the original PC convention. Properly speaking, there is no basename and extension. File names can contain periods, and a period can be followed by zero to three characters, but the three characters do not constitute an extension, per se. Nevertheless, I commonly refer to such appendages as "extensions," so I might be heard to say (incorrectly) that the preceding example lists the names of files with a txt extension.

ls my*.txt file names that begin with my and end with .txt, with any number of intervening characters

Remember that file names can contain blanks. However, blanks are used to separate arguments. To include a blank in an expression, precede it with a backslash (\):

ls a\ *.bak file names that begin with an a and a blank, and end with .bak, with any number of intervening characters
ls [dj]* file names that begin with either a lowercase d or a lowercase j
ls [d-j]* file names that begin with lowercase letters d, e, f, g, h, i, or j
ls [d-jD-J]* file names that begin with d through j, regardless of case
ls [!d-j]*.qsh file names that begin with any letter except lowercase d through lowercase j, followed by any characters, ending with .qsh
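If you would like to watch these patterns expand without touching real files, here is a minimal sketch. It assumes a POSIX-style shell and a scratch directory (the directory name and the file names are made up for illustration), and it uses echo instead of ls so that you see exactly what the shell hands to the utility. I have avoided names that differ only in case, since the root file system may not let them coexist.

```sh
# Sketch: build a scratch directory and let the shell expand each pattern.
LC_ALL=C; export LC_ALL                  # keep [d-j] ranges in plain ASCII order
demo=${TMPDIR:-/tmp}/globdemo.$$         # hypothetical scratch directory
mkdir -p "$demo" && cd "$demo" || exit 1
: > dog.txt; : > emu.txt; : > jay.qsh; : > kiwi.qsh
: > "a 1.bak"; : > my_notes.txt

echo *.txt          # dog.txt emu.txt my_notes.txt
echo my*.txt        # my_notes.txt
echo a\ *.bak       # a 1.bak
echo [dj]*          # dog.txt jay.qsh
echo [d-j]*         # dog.txt emu.txt jay.qsh
echo [!d-j]*.qsh    # kiwi.qsh
```

Remove the scratch directory when you are done experimenting.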

It is possible for IFS files (at least the ones in the root system) to have the bracket characters and hyphens (-) in their names. If you want to include these special characters in an expression, distinguish them from the group and range expressions by context.

The following commands list file names that have a hyphen or the letter r, in either case, in the second position.

ls ?[-rR]*
ls ?[rR-]*	

Qshell knows that the hyphen does not indicate a range, since no character precedes the hyphen in the first case or follows the hyphen in the second.

Don't depend on this behavior. Let's change the command slightly by removing the question mark (?).

ls [-rR]*

This tells Qshell that we want to see file names that begin with a hyphen, a lowercase r, or an uppercase r. If there is no file whose name begins with a hyphen, this command will probably work correctly. But if there is such a file, Qshell expands the expression as usual, and the file name reaches ls looking exactly like a command option. If the file name happens to be a valid option, ls will use the option; otherwise it will choke.

For example, assume the directory contains a file called -l (hyphen lowercase ell) and two file names that begin with r. Since the little ell option produces a directory listing in long format, executing the previous command gives you a list of files that begin with r or R in long format, because, after globbing, this is what the command looks like:

ls -l  readdata.qsh  Ruff.txt 
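One common way to sidestep this trap in POSIX-style shells is to anchor the pattern to the current directory, so that no expanded name can begin with a hyphen. The sketch below assumes a scratch directory with the three files from the example; I have not verified it on every Qshell release, so treat it as an illustration rather than a guarantee. The echo in front of ls simply shows the command line after globbing.

```sh
LC_ALL=C; export LC_ALL                  # fixed sort order for the demo
demo=${TMPDIR:-/tmp}/globdash.$$         # hypothetical scratch directory
mkdir -p "$demo" && cd "$demo" || exit 1
: > ./-l                                 # a file literally named -l
: > readdata.qsh
: > Ruff.txt

# After globbing, the file name -l is indistinguishable from an option:
echo ls [-rR]*           # ls -l Ruff.txt readdata.qsh

# Prefixing the pattern with ./ keeps every expanded name from starting with a hyphen:
echo ls ./[-rR]*         # ls ./-l ./Ruff.txt ./readdata.qsh
```

The order of the names depends on the collating sequence in effect, which is why the output above differs slightly from the order shown in the article.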

Here's an example that lists the file names that begin with an opening bracket:

ls [*

How do you think Qshell handles the following command?

ls [abc]

I'll tell you what it does: If one or more files exist named a, b, or c, Qshell lists their file names. However, if none of them exists, Qshell looks for a file called [abc]. If that file exists, Qshell lists its name. Go figure. The moral of the story is that it is probably best not to put special globbing characters in file names.
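The underlying rule, at least in POSIX-style shells, is that a pattern that matches nothing is passed along unchanged. Here is a small sketch of that rule, assuming an empty scratch directory (the directory name is hypothetical):

```sh
demo=${TMPDIR:-/tmp}/globlit.$$          # hypothetical, initially empty directory
mkdir -p "$demo" && cd "$demo" || exit 1

echo [abc]        # nothing matches, so the pattern survives as is: [abc]
: > b             # create a file named b
echo [abc]        # now the pattern matches a file: b
```

So when ls receives the literal string [abc], it is only doing what it would do with any other file name.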

Globbing can expand the capabilities of a command in some ways. For example, the ls command does not normally show the names of files that begin with a period, but you can use globbing to make it do so.

ls  -d  .*	file names that begin with a period

The -d switch prevents ls from showing the contents of the current directory (represented by a single period) or the parent directory (represented by two periods).
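As a rough illustration, the sketch below shows that a bare asterisk skips names that begin with a period, and that a slightly longer pattern can pick up the hidden files while leaving out the . and .. entries altogether. The pattern .[!.]* is a common shell idiom, not something specific to Qshell, and the file names are invented for the demo. Note that it also skips any name that begins with two periods.

```sh
demo=${TMPDIR:-/tmp}/globdot.$$          # hypothetical scratch directory
mkdir -p "$demo" && cd "$demo" || exit 1
: > .profile
: > visible.txt

echo *          # visible.txt  (names beginning with a period are skipped)
echo .[!.]*     # .profile     (a period followed by any character except a period)
```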

Here's one last example:

ls ?r*.o??	file names that begin with any character followed by an r, followed by zero or more characters, followed by a period and a lowercase o, followed by any two characters

Other Substitution Characters

Be aware that Qshell also interprets other characters that are not part of the globbing process.

  • The tilde (~) stands for the user's home directory.

  • A single period (.) stands for the current directory.

  • Two periods (..) stand for the parent directory.

For example, to change to the parent directory of the one from which you're currently working, use this command:

cd ..

To move file var1.qsh from the current directory to the parent directory and rename it to temppy.qsh in the process, use this command:

mv  var1.qsh  ../temppy.qsh

The following example shows how you could copy an IFS file from your home directory to the directory you're currently working from:

cp  ~/var1.qsh   .
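To try all three characters together without disturbing your real home directory, you might use something like the following sketch. It assumes a POSIX-style shell and points HOME at a throwaway directory for the duration of the demo; the directory names work and sub are made up.

```sh
HOME=${TMPDIR:-/tmp}/globhome.$$     # pretend home directory, for this demo only
export HOME
mkdir -p "$HOME/work/sub" || exit 1
: > "$HOME/var1.qsh"

cd ~/work/sub || exit 1              # the tilde expands to $HOME
cp ~/var1.qsh .                      # copy from home into the current directory
mv var1.qsh ../temppy.qsh            # move up one level, renaming on the way
cd ..                                # now in the parent directory, $HOME/work
ls                                   # lists sub and temppy.qsh
```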

Notice that the interpretation of these characters depends on context. A period may stand for the current directory, or it may be a character in a file name. Similarly, OS/400 will not prevent you from creating files with names like a..b.c or a~~~d. But, as with the globbing characters, you may want to avoid using these characters in file names.

Preventing Globbing

The following echo command displays the file names that begin with a lowercase p and end with .qsh, with any number of intervening characters:

echo	p*.qsh

But what if you want to display the six-character string p*.qsh? In that case, you have to tell Qshell you don't want globbing. Here are three ways to do that:

  • Put the expression in double or single quotation marks.
    echo "p*.qsh"
    echo 'p*.qsh'
    
  • Precede a substitution character with a backslash.
    echo p\*.qsh
    
  • Turn off globbing by setting the noglob option. This may be specified in two ways.
    set -f
    set -o noglob
    
    While the noglob option is set, Qshell does not glob any expressions it encounters. To turn globbing back on, use the same commands but change the minus signs (-) to plus signs (+):
    set +f
    set +o noglob
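The three techniques can be compared side by side with a sketch like this one. It assumes a scratch directory holding two files whose names match p*.qsh; the directory and file names are invented for the demo.

```sh
LC_ALL=C; export LC_ALL                  # fixed sort order for the demo
demo=${TMPDIR:-/tmp}/globoff.$$          # hypothetical scratch directory
mkdir -p "$demo" && cd "$demo" || exit 1
: > pay.qsh
: > post.qsh

echo p*.qsh        # pay.qsh post.qsh   (globbing in effect)
echo "p*.qsh"      # p*.qsh             (double quotation marks)
echo 'p*.qsh'      # p*.qsh             (single quotation marks)
echo p\*.qsh       # p*.qsh             (backslash)

set -f             # turn the noglob option on
echo p*.qsh        # p*.qsh
set +f             # turn it back off
echo p*.qsh        # pay.qsh post.qsh
```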
    

    Globbing with Other Commands

    Globbing is a function of the Qshell, not of the commands or utilities. When you run a command with a globbed expression, the shell expands the expression before the command ever sees it.

    This is true of the Qshell utilities and your own scripts. You might want to try a few little experiments.

    Create a text file with the following lines in it (I will refer to it as myscript.qsh):

    echo $# parms were passed 
    echo They are $@
    

    Run this script in an interactive Qshell session by typing its name, followed by a globbed expression:

    myscript.qsh   *.csv
    

    When I ran this on the Netshare400 machine, with two csv files in my working directory, the script generated the following output:

    2 parms were passed          
    They are a file.csv cust.csv
    

    The first line of output tells me that myscript.qsh thought that I had keyed two parameters, even though I keyed only one. In other words, globbing expanded the expression to a list of two values. The second line of output listed two file names--a file.csv (yes, it has an embedded blank in the file name) and cust.csv.

    As far as myscript.qsh was concerned, it was as if I had typed the following command:

    myscript.qsh "a file.csv" cust.csv
    

    If I "escape" the asterisk when typing in the command, Qshell does not carry out globbing:

    myscript.qsh \*.csv
    

    Qshell responds with the following output:

    1 parms were passed          
    They are a file.csv cust.csv
    

    Script myscript.qsh sees only one parameter, with the value *.csv. The backslash escape character was a message to Qshell, and was not passed along to the script as part of the parameter. So why did the echo utility not display *.csv? Because the $@ in the script is not quoted, Qshell globbed the expanded value inside the script before echo saw it.
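If you want a script to report its parameters exactly as it received them, one approach in POSIX-style shells is to quote "$@". The sketch below uses a hypothetical variant of the script, which I have called showargs.qsh, and runs it through sh so that it does not depend on execute permissions or the search path; on your system you may prefer to run it by name, as in the examples above.

```sh
LC_ALL=C; export LC_ALL                  # fixed sort order for the demo
demo=${TMPDIR:-/tmp}/globargs.$$         # hypothetical scratch directory
mkdir -p "$demo" && cd "$demo" || exit 1
: > "a file.csv"
: > cust.csv

# Hypothetical variant of the article's script. Quoting "$@" keeps each
# parameter in one piece and stops the shell from globbing it a second time.
cat > showargs.qsh <<'EOF'
echo "$# parms were passed"
for p in "$@"; do
  echo "parm: $p"
done
EOF

sh showargs.qsh *.csv      # 2 parms: "a file.csv" and "cust.csv"
sh showargs.qsh \*.csv     # 1 parm: the literal string *.csv
```

With the quoted form, the second command reports *.csv itself instead of the two file names.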

    Glob a Work of Art Today

    Have I given you something to think about? I hope so. Globbing is confusing, at least to me, but the more I work with it, the more I understand it. Globbing is not an exact science, which is why I named this article "The Art of Globbing." Have fun globbing, and please let me know of any idiosyncrasies you stumble across, at tholt@itjungle.com.


    Last Updated: 5/22/02
    Copyright © 1996-2008 Guild Companies, Inc. All Rights Reserved.