|
Circumventing Two Limitations of CPYTOIMPF
Published: July 22, 2009
by Ted Holt
Dear Intelligent Power-on-i Professional:
I deem it grand of IBM to give us great tools that relieve us of the need to write programs to accomplish mundane, quotidian tasks. The Copy to Import File (CPYTOIMPF) command is such an animal. However, nothing's perfect on this planet, and CPYTOIMPF is no exception. Here are two annoyances or problems (depending on your point of view), and how to get around them.
Strip Out Unwanted Blanks
I'm using CPYTOIMPF to build a CSV file. CPYTOIMPF leaves lots of blank space in the stream file. Is there any way to tell CPYTOIMPF that I don't want the extra space?
In a word, no. However, Qshell's SED utility can compress out the unwanted blanks. First, let's create the white-space-laden CSV file, using the usual convention: commas separate the fields and string values are in quotation marks.
CPYTOIMPF FROMFILE(QIWS/QCUSTCDT)
TOSTMF(custcdt.temp)
MBROPT(*REPLACE)
RCDDLM(*LF)
STRDLM('"')
FLDDLM(', ')
The data looks like this:
938472 ,"Henning ","G K","4859 Elm Ave ","Dallas","TX",75217 ,5000
,3 ,37.00 ,.00
839283 ,"Jones ","B D","21B NW 135 St","Clay ","NY",13041 ,400
,1 ,100.00 ,.00
392859 ,"Vine ","S S","PO Box 79 ","Broton","VT",5046 ,700
,1 ,439.00 ,.00
938485 ,"Johnson ","J A","3 Alpine Way ","Helen ","GA",30545 ,9999
,2 ,3987.50 ,33.50
397267 ,"Tyron ","W E","13 Myrtle Dr ","Hector","NY",14841 ,1000
,1 ,.00 ,.00
389572 ,"Stevens ","K L","208 Snow Pass","Denver","CO",80226 ,400
,1 ,58.75 ,1.50
846283 ,"Alison ","J S","787 Lake Dr ","Isle ","MN",56342 ,5000
,3 ,10.00 ,.00
475938 ,"Doe ","J W","59 Archer Rd ","Sutter","CA",95685 ,700
,2 ,250.00 ,100.00
693829 ,"Thomas ","A N","3 Dove Circle","Casper","WY",82609 ,9999
,2 ,.00 ,.00
593029 ,"Williams","E D","485 SE 2 Ave ","Dallas","TX",75218 ,200
,1 ,25.00 ,.00
192837 ,"Lee ","F L","5963 Oak St ","Hector","NY",14841 ,700
,2 ,489.50 ,.50
583990 ,"Abraham ","M T","392 Mill St ","Isle ","MN",56342 ,9999
,3 ,500.00 ,.00
Enter the Qshell environment and key the following command:
sed -e 's# *\",#\",#g' -e 's# *,#,#g' custcdt.temp > custcdt.CSV
SED reads from custcdt.temp and writes to custcdt.csv, replacing the contents of the output file. The two "e" switches precede two substitution commands that SED is to carry out on each line of input.
In the first substitution command, the search expression.blank, asterisk, backslash, quotation mark, comma. is between the first and second pound signs, and means zero or more blanks followed by a quotation mark and comma. The replacement string is between the second and third pound signs, and contains only a quotation mark and comma. This expression gets rid of the extra blanks after the last name, street address, and city name (second, fourth, and fifth fields).
The second substitution replaces zero or more blanks followed by a comma with a single comma. This deletes the blanks that precede the numeric fields.
The "g" at the end of each regular expression tells Qshell to replace all occurrences of the search argument in a line, not just the first one.
The data looks like this:
938472,"Henning","G K","4859 Elm Ave","Dallas","TX",75217,5000,3,
37.00,.00
839283,"Jones","B D","21B NW 135 St","Clay","NY",13041,400,1,
100.00,.00
392859,"Vine","S S","PO Box 79","Broton","VT",5046,700,1,
439.00,.00
938485,"Johnson","J A","3 Alpine Way","Helen","GA",30545,9999,2,
3987.50,33.50
397267,"Tyron","W E","13 Myrtle Dr","Hector","NY",14841,1000,1,
.00,.00
389572,"Stevens","K L","208 Snow Pass","Denver","CO",80226,400,1,
58.75,1.50
846283,"Alison","J S","787 Lake Dr","Isle","MN",56342,5000,3,
10.00,.00
475938,"Doe","J W","59 Archer Rd","Sutter","CA",95685,700,2,
250.00,100.00
693829,"Thomas","A N","3 Dove Circle","Casper","WY",82609,9999,2,
.00,.00
593029,"Williams","E D","485 SE 2 Ave","Dallas","TX",75218,200,1,
25.00,.00
192837,"Lee","F L","5963 Oak St","Hector","NY",14841,700,2,
489.50,.50
583990,"Abraham","M T","392 Mill St","Isle","MN",56342,9999,3,
500.00,.00
Onward to the second question!
Two-Character Field Separators
Here's one I've never seen before. Our corporate IT department wants us to transmit a file in which variable-length fields are separated by a two-character combination. What was wrong with CSV files? Anyway, I've got plenty to do, so I tried to use Copy to Import File (CPYTOIMPF). Guess what? The Field Delimiter (FLDDLM) parameter only allows one character. Any suggestions?
I think we can make this work. First, run your CPYTOIMPF command, using a single character for a delimiter. Try to use a character that is not found in the data. In this example, I use the backslash character to separate the fields.
CPYTOIMPF FROMFILE(QIWS/QCUSTCDT)
TOSTMF('custcdt.temp')
MBROPT(*REPLACE)
RCDDLM(*CRLF)
STRDLM(*NONE)
FLDDLM('\')
The data in stream file custcdt.temp looks like this:
938472 \Henning \G K\4859 Elm Ave \Dallas\TX\75217 \5000 \3
\37.00 \.00
839283 \Jones \B D\21B NW 135 St\Clay \NY\13041 \400 \1
\100.00 \.00
392859 \Vine \S S\PO Box 79 \Broton\VT\5046 \700 \1
\439.00 \.00
938485 \Johnson \J A\3 Alpine Way \Helen \GA\30545 \9999 \2
\3987.50 \33.50
397267 \Tyron \W E\13 Myrtle Dr \Hector\NY\14841 \1000 \1
\.00 \.00
389572 \Stevens \K L\208 Snow Pass\Denver\CO\80226 \400 \1
\58.75 \1.50
846283 \Alison \J S\787 Lake Dr \Isle \MN\56342 \5000 \3
\10.00 \.00
475938 \Doe \J W\59 Archer Rd \Sutter\CA\95685 \700 \2
\250.00 \100.00
693829 \Thomas \A N\3 Dove Circle\Casper\WY\82609 \9999 \2
\.00 \.00
593029 \Williams\E D\485 SE 2 Ave \Dallas\TX\75218 \200 \1
\25.00 \.00
192837 \Lee \F L\5963 Oak St \Hector\NY\14841 \700 \2
\489.50 \.50
583990 \Abraham \M T\392 Mill St \Isle \MN\56342 \9999 \3
\500.00 \.00
Then use QShell's stream editor, SED, to convert the single delimiter to a two-character delimiter. I'll illustrate by replacing the backslash separator with a question mark and a vertical bar.
sed 's/ *\\/?|/g' custcdt.temp > custcdt.txt
The data looks like this:
938472?|Henning?|G K?|4859 Elm Ave?|Dallas?|TX?|75217?|5000?|3?
|37.00?|.00
839283?|Jones?|B D?|21B NW 135 St?|Clay?|NY?|13041?|400?|1?
|100.00?|.00
392859?|Vine?|S S?|PO Box 79?|Broton?|VT?|5046?|700?|1?
|439.00?|.00
938485?|Johnson?|J A?|3 Alpine Way?|Helen?|GA?|30545?|9999?|2?
|3987.50?|33.50
397267?|Tyron?|W E?|13 Myrtle Dr?|Hector?|NY?|14841?|1000?|1?
|.00?|.00
389572?|Stevens?|K L?|208 Snow Pass?|Denver?|CO?|80226?|400?|1?
|58.75?|1.50
846283?|Alison?|J S?|787 Lake Dr?|Isle?|MN?|56342?|5000?|3?
|10.00?|.00
475938?|Doe?|J W?|59 Archer Rd?|Sutter?|CA?|95685?|700?|2?
|250.00?|100.00
693829?|Thomas?|A N?|3 Dove Circle?|Casper?|WY?|82609?|9999?|2?
|.00?|.00
593029?|Williams?|E D?|485 SE 2 Ave?|Dallas?|TX?|75218?|200?|1?
|25.00?|.00
192837?|Lee?|F L?|5963 Oak St?|Hector?|NY?|14841?|700?|2?
|489.50?|.50
583990?|Abraham?|M T?|392 Mill St?|Isle?|MN?|56342?|9999?|3?
|500.00?|.00
Let's break that SED command down into components.
sed 's/ *\\/?|/g' custcdt.temp > custcdt.txt
SED reads custcdt.temp and writes the output to custcdt.txt. The single greater-than symbol tells the system to replace the data in file custcdt.txt if it already exists.
The substitution expression is messy, so let's work through it. The part between the first and second forward slashes is the search value. The blank followed by an asterisk means one or more blanks. The two backslashes mean one backslash. (The first backslash is an escape character.)
The part between the second and third forward slashes is the replacement value--a question mark and a vertical bar. The "g" following the last slash stands for "global". That is, replace all occurrences of the search string, not just the first one of each line.
Running the Commands
To put it all together, embed the CPYTOIMPF and QSH commands in a CL program. Here's how the second example would look:
PGM
... More stuff ...
CPYTOIMPF FROMFILE(QIWS/QCUSTCDT) +
TOSTMF('custcdt.temp') +
MBROPT(*REPLACE) RCDDLM(*CRLF) STRDLM(*NONE) FLDDLM('\')
QSH CMD('sed ''s/ *\\/?|/g'' custcdt.temp > custcdt.txt')
... More stuff ...
ENDPGM
Notice the doubled apostrophes in the regular expression of the SED command.
Post this story to del.icio.us
Post this story to Digg
Post this story to Slashdot
|