Stuff
OS/400 Edition
Volume 1, Number 11 -- June 6, 2002

Qshell Pattern Matching


by Ted Holt

In "The Art of Globbing," I explained how Qshell expands patterns of strings into lists of file names. Pattern matching is used for more than globbing. In this article, I explain some other ways that Qshell uses pattern-matching techniques. If you have not read "The Art of Globbing," I encourage you to do so. While I repeat some of the high points, I do not repeat everything I covered in that article.


Meta Characters

A pattern consists of two types of characters. Normal characters stand for themselves. Meta characters, also called wild cards, represent other characters.

The four meta characters are as follows:

  • the asterisk (*), which represents zero or more characters

  • the question mark (?), which represents exactly one character

  • the brackets ([ ]), which allow you to specify ranges or groups of characters

  • the exclamation point (!), also called a bang, used within brackets to indicate negation of a pattern

A pattern is a combination of normal characters and meta characters. Normally the pattern will contain at least one meta character.
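As a quick recap of how each meta character behaves, here is a minimal sketch that uses globbing in a scratch directory. The directory name, the file names, and the variable names are made up for the illustration, and I am assuming the usual touch, mkdir, and rm utilities are available.

```shell
# Work in a scratch directory so the demonstration touches nothing else.
# The directory and file names are made up for this sketch.
start=$PWD
demo=${TMPDIR:-/tmp}/metademo.$$
mkdir -p "$demo" && cd "$demo" || exit 1
touch a1.txt a2.txt b1.txt ab.csv

star=$(echo *.txt)         # asterisk: zero or more characters
quest=$(echo a?.txt)       # question mark: exactly one character
brack=$(echo [ab]1.txt)    # brackets: one character from the group
bang=$(echo [!a]*)         # bang: one character that is not in the group

printf '%s\n' "$star" "$quest" "$brack" "$bang"

# clean up
cd "$start" && rm -r "$demo"
```

The asterisk pattern picks up all three .txt files, the question mark pattern only the two whose names start with a, the bracket pattern a1.txt and b1.txt, and the negated bracket pattern only b1.txt.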

Case

One of the most common places to use patterns is in the case construct. Case is similar to if-then-else, in that it provides a way to make a Qshell script select between mutually exclusive paths of execution. However, case differs from if in an important way. Case compares a string to patterns in order to make decisions, whereas if checks the exit status of Qshell commands. (For more information about if, see "Success or Failure in Qshell Scripts.")

Here is the syntax of the case construct:

case word in
	pattern ) list ;;
...
esac

The word is normally a variable of some sort. Often it's a positional parameter. The pattern is a pattern string, or two or more pattern strings connected with the or operator, which is a vertical bar (|). A pattern is followed by a right parenthesis, which separates the pattern from a list of commands. The list is terminated with two semicolons (;;).

Everything from the pattern to the two semicolons constitutes a condition block, and you may have many condition blocks. End the case construct with esac, which is case spelled backward.
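Before looking at a full script, here is a small sketch of the syntax with three condition blocks. The variable names answer and reply are hypothetical; answer stands in for something a user might have typed.

```shell
# answer is a hypothetical variable standing in for user input
answer=Yes

case $answer in
   [Yy] | [Yy]es ) reply=affirmative ;;
   [Nn] | [Nn]o  ) reply=negative ;;
   *             ) reply=unknown ;;
esac

echo "$reply"
```

The vertical bar lets one condition block accept either a single letter or the whole word, and the brackets accept either case of the first letter.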

Here's a script that makes good use of the case construct. I call it countf.qsh. It tells me how many of each type of file I have in a directory, classifying files according to "extension"; that is, a period (.) followed by other characters.

#! /bin/qsh
# count files by file type

# initialize counters
archives=0
backups=0
classes=0
csvs=0
htmls=0
javas=0
scripts=0
texts=0
others=0
total=0

# select directory
dir=${1:-$PWD}

# classify all files by "extension"
for fname in $dir/*
do
   let total=total+1
   case $fname in
      *.bak          ) let backups=backups+1 ;;
      *.qsh          ) let scripts=scripts+1 ;;
      *.txt | *.text ) let texts=texts+1 ;;
      *.zip | *.jar  ) let archives=archives+1 ;;
      *.csv          ) let csvs=csvs+1 ;;
      *.java         ) let javas=javas+1 ;;
      *.class        ) let classes=classes+1 ;;
      *.htm | *.html ) let htmls=htmls+1 ;;
      *              ) let others=others+1 ;;
   esac
done

# print summary of file types
printf '====================\n'
printf 'Directory: %s\n' $dir
printf '====================\n'
printf '    CSV files: %5d\n' $csvs
printf '   HTML files: %5d\n' $htmls
printf ' Java classes: %5d\n' $classes
printf ' Java sources: %5d\n' $javas
printf 'Shell scripts: %5d\n' $scripts
printf '   Text files: %5d\n' $texts
printf 'Archive files: %5d\n' $archives
printf ' Backup files: %5d\n' $backups
printf '       Others: %5d\n' $others
printf '====================\n'
printf '        Total: %5d\n' $total
printf '====================\n'
exit

Here's a brief explanation of how this script works. First, it initializes a set of counters to zero. Next, it sets variable dir to the first parameter if the first parameter is defined, or to the current directory (also called the present working directory) if the first parameter is not defined.

The for loop in the next section of code processes one file name at a time from the selected directory. As it retrieves a file name, the case structure increments a counter, based on the final characters of the file name. Notice the vertical bars in three of the conditions, meaning that the file name can match either of two patterns. Notice also that the last pattern in the case structure is a single asterisk, which matches anything. This is a "catch all" for files whose names don't match any of the previous patterns.
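The order of the condition blocks matters, because case runs only the first block whose pattern matches. The following sketch shows why a more specific pattern has to come before a more general one and why the lone asterisk belongs last. The function name classify and the sample file names are hypothetical.

```shell
# classify is a hypothetical helper; it prints a label for a file name
classify() {
   case $1 in
      *.tar.gz ) echo archive ;;      # must precede *.gz to be reachable
      *.gz     ) echo compressed ;;
      *        ) echo other ;;        # catch-all, so it goes last
   esac
}

classify backup.tar.gz
classify notes.gz
classify readme
```

If the *.gz block came first, backup.tar.gz would be reported as compressed and the *.tar.gz block would never run.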

If I type countf.qsh in an interactive Qshell session, I see something like the following on the display:

====================   
Directory: /home/jsmith 
====================
    CSV files:     3
   HTML files:     2
 Java classes:     2
 Java sources:     2
Shell scripts:    58
   Text files:    18
Archive files:     3
 Backup files:     0
       Others:    42
====================
        Total:   130
====================

Parameter Expansion

Parameter expansion is the process of extracting a value from a parameter. In its simplest form, a parameter number or variable name is preceded by a dollar sign ($) and the entire value of the parameter is extracted.

echo $1
echo $dir

There are four parameter expansion expressions that include pattern matching. They allow you to extract a modified value from a parameter:

  • ${parameter%word} Remove Smallest Suffix Pattern.

  • ${parameter%%word} Remove Largest Suffix Pattern.

  • ${parameter#word} Remove Smallest Prefix Pattern.

  • ${parameter##word} Remove Largest Prefix Pattern.

The only way I know how to explain this is to give you an illustration.

Suppose that variable gname has the value customer.csv.bak. Notice that there are two periods in the value.

Here are the values that the four parameter expansions would yield:

${gname%.*}

The single percent sign (%) tells Qshell to return the parameter value without the smallest suffix (group of characters from the end) that matches the pattern. The pattern is .*, which means a period followed by zero or more characters. Two sequences of characters meet that criterion: .csv.bak and .bak. The latter is the smaller, so the result of the parameter expansion is customer.csv.

${gname%%.*}

The double percent sign tells Qshell to return the parameter value without the largest suffix that matches the pattern. The result of the parameter expansion is customer.

${gname#*.}

The single pound sign (#) tells Qshell to return the parameter value without the smallest prefix (group of characters from the beginning) that matches the pattern. The pattern is *., which means a group of zero or more characters followed by a period. Two sequences of characters meet that criterion: customer. and customer.csv. (each ending with a period). The former is the smaller, so the result of the parameter expansion is csv.bak.

${gname##*.}

The double pound sign tells Qshell to return the parameter value without the largest prefix that matches the pattern. The result of the parameter expansion is bak.
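The four expansions can be tried side by side in a short sketch like the following. The variable gname comes from the illustration above; the other variable names, and the path used at the end, are made up for the example.

```shell
gname=customer.csv.bak

short_suffix=${gname%.*}      # smallest suffix .bak removed
long_suffix=${gname%%.*}      # largest suffix .csv.bak removed
short_prefix=${gname#*.}      # smallest prefix customer. removed
long_prefix=${gname##*.}      # largest prefix customer.csv. removed

printf '%s\n' "$short_suffix" "$long_suffix" "$short_prefix" "$long_prefix"

# the same operators split a path (hypothetical path) into its two parts
path=/home/jsmith/customer.csv
dirpart=${path%/*}            # everything before the last slash
filepart=${path##*/}          # everything after the last slash

printf '%s\n' "$dirpart" "$filepart"
```

The first printf prints customer.csv, customer, csv.bak, and bak, one per line. The second prints /home/jsmith and customer.csv.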

Here's a script that uses two parameter expansions that contain pattern matching. I call it backup.qsh. It copies all files that have a certain suffix to corresponding files with the .bak suffix instead.

#! /bin/qsh                                  
# Copy all files with a certain "extension"  
# to .bak files                              
                                              
# make sure first parameter was passed
# if not, send message to stderr
if [ -z "$1" ]
   then echo "Usage: ${0#$PWD/} pattern" >&2
        exit 1
fi

# copy files to backup files
# (quotes keep names that contain blanks in one piece)
for fn in *."$1"
   do  cp  "$fn"  "${fn%.$1}.bak"
   done
                                              
exit 0

This script requires one parameter: the suffix of the files to be backed up. If I run this script but do not pass any parameters, the script sends a message to the standard error device. Notice the parameter expansion following the word Usage. If this script is running from the same directory in which it is stored, the current directory name is removed from the message. The user sees this:

Usage: backup.qsh pattern

Not this:

Usage: /home/JSMITH/backup.qsh pattern

If the current directory is not the one in which the script resides, the pattern match fails and the complete parameter zero is displayed.
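Both outcomes can be seen without running the script at all. In this sketch, the variables script, same_dir, and other_dir are sample values standing in for parameter zero and for two possible current directories. They are assumptions for the illustration, not values taken from a real system.

```shell
# sample values standing in for $0 and $PWD
script=/home/JSMITH/backup.qsh
same_dir=/home/JSMITH
other_dir=/home/JSMITH/data

short=${script#$same_dir/}     # prefix matches, so it is removed
full=${script#$other_dir/}     # prefix does not match, so nothing changes

echo "Usage: $short pattern"
echo "Usage: $full pattern"
```

When the prefix pattern does not match, the expansion simply returns the whole value, which is why the message falls back to the full path.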

The other instance of pattern matching within parameter expansion is found inside the for loop. The copy (cp) command copies a file to another file with the .bak suffix. For example, suppose parameter one has the value csv and the directory contains a file called customer.csv. The cp command copies customer.csv to customer.bak. Notice the pattern .$1 after the percent sign. The percent sign tells Qshell to remove the smallest suffix that matches a period followed by the value of parameter one, which leaves everything before that final period. The script then appends .bak to what remains.
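To see how the target name is built without copying anything, the expansion can be isolated in a small helper. The function name bakname and the sample file names are hypothetical.

```shell
# bakname is a hypothetical helper
# $1 = file name, $2 = suffix without the period
bakname() {
   echo "${1%.$2}.bak"
}

bakname customer.csv csv
bakname orders.2002.csv csv
bakname notes.txt csv
```

Only the final .csv is stripped, so orders.2002.csv becomes orders.2002.bak. When the suffix does not match, as with notes.txt, nothing is removed and .bak is simply appended.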

Other Pattern-Matching Commands

These are a few more places where pattern matching is used in Qshell. The concept is the same, so I will not provide examples.

  • grep -F

  • fgrep (same as grep -F)

  • find (-name and -path options)

  • pax

  • ajar
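For readers who would like one example anyway, here is a minimal sketch of find with the -name option. The scratch directory and file names are made up, and I am assuming the usual mkdir, touch, wc, and rm utilities are available. The point to notice is the quoting: the pattern is enclosed in single quotes so that the shell passes it to find unexpanded instead of globbing it first.

```shell
# build a small, made-up directory tree to search
demo=${TMPDIR:-/tmp}/finddemo.$$
mkdir -p "$demo/sub" || exit 1
touch "$demo/customer.csv" "$demo/sub/orders.csv" "$demo/notes.txt"

# the quotes keep the shell from expanding *.csv before find sees it
found=$(find "$demo" -name '*.csv' | wc -l)
echo "CSV files found: $found"

# clean up
rm -r "$demo"
```

Because find descends into subdirectories, it counts both customer.csv and sub/orders.csv.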

I hope you see that pattern matching is not hard. Like anything else, it just takes getting used to. If you write Qshell scripts, you will probably use pattern matching structures often.





    Last Updated: 6/6/02
    Copyright © 1996-2008 Guild Companies, Inc. All Rights Reserved.