Qshell Pattern Matching

by Ted Holt

In "The Art of Globbing," I explained how Qshell expands patterns of strings into lists of file names. Pattern matching is used for more than globbing. In this article, I explain some other ways that Qshell uses pattern-matching techniques. If you have not read "The Art of Globbing," I encourage you to do so. While I repeat some of the high points, I do not repeat everything I covered in that article.
Meta Characters

A pattern consists of two types of characters. Normal characters stand for themselves. Meta characters, also called wild cards, represent other characters. The four meta characters are as follows:
A pattern is a combination of normal characters and meta characters. Normally the pattern will contain at least one meta character.

Case

One of the most common places to use patterns is in the case construct. Case is similar to if-then-else in that it lets a Qshell script select among mutually exclusive paths of execution. However, case differs from if in an important way: case evaluates patterns in order to make decisions, whereas if checks the exit status of Qshell commands. (For more information about if, see "Success or Failure in Qshell Scripts.")

Here is the syntax of the case construct:

case word in
   pattern ) list ;;
   ...
esac

The word is normally a variable of some sort. Often it's a positional parameter. The pattern is a pattern string, or two or more pattern strings connected with the or operator, which is a vertical bar ( | ). A pattern is followed by a right parenthesis [ ) ], which separates the pattern from a list of commands. The list is terminated with two semicolons (;;). Everything from the pattern to the two semicolons constitutes a condition block, and you may have many condition blocks. End the case construct with esac, which is case spelled backward.

Here's a script that makes good use of the case construct. I call it countf.qsh. It tells me how many of each type of file I have in a directory, classifying files according to "extension"; that is, a period (.) followed by other characters.
#! /bin/qsh
# count files by file type
# initialize counters
archives=0
backups=0
classes=0
csvs=0
htmls=0
javas=0
scripts=0
texts=0
others=0
total=0
# select directory
dir=${1:-$PWD}
# classify all files by "extension"
for fname in "$dir"/*
do
   let total=total+1
   case $fname in
      *.bak )           let backups=backups+1 ;;
      *.qsh )           let scripts=scripts+1 ;;
      *.txt | *.text )  let texts=texts+1 ;;
      *.zip | *.jar )   let archives=archives+1 ;;
      *.csv )           let csvs=csvs+1 ;;
      *.java )          let javas=javas+1 ;;
      *.class )         let classes=classes+1 ;;
      *.htm | *.html )  let htmls=htmls+1 ;;
      * )               let others=others+1 ;;
   esac
done
# print summary of file types
printf '====================\n'
printf 'Directory: %s\n' "$dir"
printf '====================\n'
printf '    CSV files: %5d\n' $csvs
printf '   HTML files: %5d\n' $htmls
printf ' Java classes: %5d\n' $classes
printf ' Java sources: %5d\n' $javas
printf 'Shell scripts: %5d\n' $scripts
printf '   Text files: %5d\n' $texts
printf 'Archive files: %5d\n' $archives
printf ' Backup files: %5d\n' $backups
printf '       Others: %5d\n' $others
printf '====================\n'
printf '        Total: %5d\n' $total
printf '====================\n'
exit
Here's a brief explanation of how this script works. First, it initializes a set of counters to zero. Next, it sets variable dir to the first parameter if the first parameter is defined, or to the current directory (also called the present working directory) if the first parameter is not defined.

The for loop in the next section of code processes one file name at a time from the selected directory. For each file name, the case structure increments a counter based on the final characters of the name. Notice the vertical bars in three of the conditions, meaning that the file name can match either of two patterns. Notice also that the last pattern in the case structure is a single asterisk, which matches anything. This is a "catch all" for files whose names don't match any of the previous patterns.

If I type countf.qsh in an interactive Qshell session, I see something like the following on the display:
====================
Directory: /home/jsmith
====================
    CSV files:     3
   HTML files:     2
 Java classes:     2
 Java sources:     2
Shell scripts:    58
   Text files:    18
Archive files:     3
 Backup files:     0
       Others:    42
====================
        Total:   130
====================
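The pattern logic can also be tried in isolation, without counting anything. The following is a minimal sketch, not part of countf.qsh: the function name classify is hypothetical, and the code is written in portable shell syntax on the assumption that Qshell accepts shell functions defined this way. It shows the or operator and the catch-all asterisk, and it shows that patterns are tested from top to bottom, with the first match winning.

```shell
#!/bin/sh
# Hypothetical helper (not part of countf.qsh): print the category
# that a file name falls into, using the same kind of case patterns.
classify() {
  case "$1" in
    *.txt | *.text ) echo text   ;;   # either pattern may match
    *.htm | *.html ) echo html   ;;
    *.bak )          echo backup ;;
    * )              echo other  ;;   # catch-all: matches anything
  esac
}

classify notes.txt          # prints: text
classify index.html         # prints: html
classify customer.csv.bak   # prints: backup
classify README             # prints: other
```

Because only the end of the name is constrained by patterns such as *.bak, a name like customer.csv.bak is classified by its final suffix.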
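Parameter Expansion

Parameter expansion is the process of extracting a value from a parameter. In its simplest form, a parameter number or variable name is preceded by a dollar sign ($) and the entire value of the parameter is extracted.

echo $1
echo $dir

There are four parameter expansion expressions that include pattern matching. They allow you to extract a modified value from a parameter:

${parameter%pattern}    remove the smallest suffix that matches the pattern
${parameter%%pattern}   remove the largest suffix that matches the pattern
${parameter#pattern}    remove the smallest prefix that matches the pattern
${parameter##pattern}   remove the largest prefix that matches the pattern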
The only way I know how to explain this is to give you an illustration. Suppose that variable gname has the value customer.csv.bak. Notice that there are two periods in the value. Here are the values that the four parameter expansions would yield:
${gname%.*}
The single percent sign (%) tells Qshell to return the parameter value without the smallest suffix (group of characters from the end) that matches the pattern. The pattern is .*, which means a period followed by zero or more characters. Two sequences of characters meet that criterion: .csv.bak and .bak. The latter is the smaller, so the result of the parameter expansion is customer.csv.
${gname%%.*}
The double percent sign tells Qshell to return the parameter value without the largest suffix that matches the pattern. The result of the parameter expansion is customer.
${gname#*.}
The single pound sign (#) tells Qshell to return the parameter value without the smallest prefix (group of characters from the beginning) that matches the pattern. The pattern is *., which means a group of zero or more characters followed by a period. Two sequences of characters meet that criterion: customer. and customer.csv. (each including its trailing period). The former is the smaller, so the result of the parameter expansion is csv.bak.
${gname##*.}
The double pound sign tells Qshell to return the parameter value without the largest prefix that matches the pattern. The result of the parameter expansion is bak.

Here's a script that uses two parameter expansions that contain pattern matching. I call it backup.qsh. It copies all files that have a certain suffix to corresponding files with the .bak suffix instead.
#! /bin/qsh
# Copy all files with a certain "extension"
# to .bak files
# make sure first parameter was passed
# if not, send message to stderr
if [ -z "$1" ]
then
   echo "Usage: ${0#$PWD/} pattern" >&2
   exit 1
fi
# copy files to backup files
for fn in *."$1"
do
   cp "$fn" "${fn%.$1}.bak"
done
exit 0
This script requires one parameter: the suffix of the files to be backed up. If I run this script but do not pass any parameters, the script sends a message to the standard error device. Notice the parameter expansion following the word Usage. If the script is running from the same directory in which it is stored, the current directory name is removed from the message. The user sees this:

Usage: backup.qsh pattern

Not this:

Usage: /home/JSMITH/backup.qsh pattern

If the current directory is not the one in which the script resides, the pattern match fails and the complete parameter zero is displayed.

The other instance of pattern matching within parameter expansion is found inside the for loop. The copy (cp) command copies a file to another file with the .bak suffix. For example, suppose parameter one has the value csv and the directory contains a file called customer.csv. The cp command copies customer.csv to customer.bak. Notice the expansion ${fn%.$1}. It tells Qshell to return the file name without its trailing period and the suffix held in parameter one; the script then appends .bak to what remains.

Other Pattern-Matching Commands

These are a few more places where pattern matching is used in Qshell. The concept is the same, so I will not provide examples.
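Returning briefly to parameter expansion: the four expressions, and the name derivation used in backup.qsh, can be checked side by side. This is a minimal sketch in portable shell syntax. The variable gname and its value come from the illustration earlier in the article, while the function backup_name is a hypothetical helper introduced only for this check and is not part of backup.qsh.

```shell
#!/bin/sh
gname=customer.csv.bak

echo "${gname%.*}"     # smallest matching suffix removed: customer.csv
echo "${gname%%.*}"    # largest matching suffix removed:  customer
echo "${gname#*.}"     # smallest matching prefix removed: csv.bak
echo "${gname##*.}"    # largest matching prefix removed:  bak

# Hypothetical helper: print the name that a file would be copied to,
# given the file name ($1) and the suffix to replace ($2).
backup_name() {
  echo "${1%.$2}.bak"
}

backup_name customer.csv csv    # prints: customer.bak
```

Echoing the derived name before running a real copy is a cheap way to confirm that a pattern removes exactly the text intended.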
I hope you see that pattern matching is not hard. Like anything else, it just takes getting used to. If you write Qshell scripts, you will probably use pattern-matching structures often.

References
Last Updated: 6/6/02 Copyright © 1996-2008 Guild Companies, Inc. All Rights Reserved.