Circumventing Two Limitations of CPYTOIMPF

July 22, 2009 Ted Holt

Dear Intelligent Power-on-i Professional:

I deem it grand of IBM to give us great tools that relieve us of the need to write programs to accomplish mundane, quotidian tasks. The Copy to Import File (CPYTOIMPF) command is such an animal. However, nothing’s perfect on this planet, and CPYTOIMPF is no exception. Here are two annoyances or problems (depending on your point of view), and how to get around them.

Strip Out Unwanted Blanks

I’m using CPYTOIMPF to build a CSV file. CPYTOIMPF leaves lots of blank space in the stream file. Is there any way to tell CPYTOIMPF that I don’t want the extra space?

In a word, no. However, Qshell’s SED utility can compress out the unwanted blanks. First, let’s create the white-space-laden CSV file, using the usual convention: commas separate the fields and string values are in quotation marks.

CPYTOIMPF FROMFILE(QIWS/QCUSTCDT)
          TOSTMF(custcdt.temp)
          MBROPT(*REPLACE)
          RCDDLM(*LF)
          STRDLM('"')
          FLDDLM(', ')

The data looks like this:

938472  ,"Henning ","G K","4859 Elm Ave ","Dallas","TX",75217  ,5000  
,3  ,37.00   ,.00
839283  ,"Jones   ","B D","21B NW 135 St","Clay  ","NY",13041  ,400   
,1  ,100.00  ,.00
392859  ,"Vine    ","S S","PO Box 79    ","Broton","VT",5046   ,700   
,1  ,439.00  ,.00
938485  ,"Johnson ","J A","3 Alpine Way ","Helen ","GA",30545  ,9999  
,2  ,3987.50 ,33.50
397267  ,"Tyron   ","W E","13 Myrtle Dr ","Hector","NY",14841  ,1000  
,1  ,.00     ,.00
389572  ,"Stevens ","K L","208 Snow Pass","Denver","CO",80226  ,400   
,1  ,58.75   ,1.50
846283  ,"Alison  ","J S","787 Lake Dr  ","Isle  ","MN",56342  ,5000  
,3  ,10.00   ,.00
475938  ,"Doe     ","J W","59 Archer Rd ","Sutter","CA",95685  ,700   
,2  ,250.00  ,100.00
693829  ,"Thomas  ","A N","3 Dove Circle","Casper","WY",82609  ,9999  
,2  ,.00     ,.00
593029  ,"Williams","E D","485 SE 2 Ave ","Dallas","TX",75218  ,200   
,1  ,25.00   ,.00
192837  ,"Lee     ","F L","5963 Oak St  ","Hector","NY",14841  ,700   
,2  ,489.50  ,.50
583990  ,"Abraham ","M T","392 Mill St  ","Isle  ","MN",56342  ,9999  
,3  ,500.00  ,.00

Enter the Qshell environment and key the following command:

sed -e 's# *",#",#g' -e 's# *,#,#g' custcdt.temp > custcdt.CSV

SED reads from custcdt.temp and writes to custcdt.csv, replacing the contents of the output file. The two “e” switches precede two substitution commands that SED is to carry out on each line of input.

In the first substitution command, the search expression.blank, asterisk, backslash, quotation mark, comma. is between the first and second pound signs, and means zero or more blanks followed by a quotation mark and comma. The replacement string is between the second and third pound signs, and contains only a quotation mark and comma. This expression gets rid of the extra blanks after the last name, street address, and city name (second, fourth, and fifth fields).

The second substitution replaces zero or more blanks followed by a comma with a single comma. This deletes the blanks that precede the numeric fields.

The “g” at the end of each regular expression tells Qshell to replace all occurrences of the search argument in a line, not just the first one.

The data looks like this:

938472,"Henning","G K","4859 Elm Ave","Dallas","TX",75217,5000,3,
37.00,.00
839283,"Jones","B D","21B NW 135 St","Clay","NY",13041,400,1,
100.00,.00
392859,"Vine","S S","PO Box 79","Broton","VT",5046,700,1,
439.00,.00
938485,"Johnson","J A","3 Alpine Way","Helen","GA",30545,9999,2,
3987.50,33.50
397267,"Tyron","W E","13 Myrtle Dr","Hector","NY",14841,1000,1,
.00,.00
389572,"Stevens","K L","208 Snow Pass","Denver","CO",80226,400,1,
58.75,1.50
846283,"Alison","J S","787 Lake Dr","Isle","MN",56342,5000,3,
10.00,.00
475938,"Doe","J W","59 Archer Rd","Sutter","CA",95685,700,2,
250.00,100.00
693829,"Thomas","A N","3 Dove Circle","Casper","WY",82609,9999,2,
.00,.00
593029,"Williams","E D","485 SE 2 Ave","Dallas","TX",75218,200,1,
25.00,.00
192837,"Lee","F L","5963 Oak St","Hector","NY",14841,700,2,
489.50,.50
583990,"Abraham","M T","392 Mill St","Isle","MN",56342,9999,3,
500.00,.00

Onward to the second question!

Two-Character Field Separators

Here’s one I’ve never seen before. Our corporate IT department wants us to transmit a file in which variable-length fields are separated by a two-character combination. What was wrong with CSV files? Anyway, I’ve got plenty to do, so I tried to use Copy to Import File (CPYTOIMPF). Guess what? The Field Delimiter (FLDDLM) parameter only allows one character. Any suggestions?

I think we can make this work. First, run your CPYTOIMPF command, using a single character for a delimiter. Try to use a character that is not found in the data. In this example, I use the backslash character to separate the fields.

CPYTOIMPF FROMFILE(QIWS/QCUSTCDT)
          TOSTMF('custcdt.temp')
          MBROPT(*REPLACE)
          RCDDLM(*CRLF)
          STRDLM(*NONE)
          FLDDLM('')

The data in stream file custcdt.temp looks like this:

938472  Henning G K4859 Elm Ave DallasTX75217  5000  3  
37.00   .00
839283  Jones   B D21B NW 135 StClay  NY13041  400   1  
100.00  .00
392859  Vine    S SPO Box 79    BrotonVT5046   700   1  
439.00  .00
938485  Johnson J A3 Alpine Way Helen GA30545  9999  2  
3987.50 33.50
397267  Tyron   W E13 Myrtle Dr HectorNY14841  1000  1  
.00     .00
389572  Stevens K L208 Snow PassDenverCO80226  400   1  
58.75   1.50
846283  Alison  J S787 Lake Dr  Isle  MN56342  5000  3  
10.00   .00
475938  Doe     J W59 Archer Rd SutterCA95685  700   2  
250.00  100.00
693829  Thomas  A N3 Dove CircleCasperWY82609  9999  2  
.00     .00
593029  WilliamsE D485 SE 2 Ave DallasTX75218  200   1  
25.00   .00
192837  Lee     F L5963 Oak St  HectorNY14841  700   2  
489.50  .50
583990  Abraham M T392 Mill St  Isle  MN56342  9999  3  
500.00  .00

Then use QShell’s stream editor, SED, to convert the single delimiter to a two-character delimiter. I’ll illustrate by replacing the backslash separator with a question mark and a vertical bar.

sed 's/ */?|/g' custcdt.temp > custcdt.txt

The data looks like this:

938472?|Henning?|G K?|4859 Elm Ave?|Dallas?|TX?|75217?|5000?|3?
|37.00?|.00
839283?|Jones?|B D?|21B NW 135 St?|Clay?|NY?|13041?|400?|1?
|100.00?|.00
392859?|Vine?|S S?|PO Box 79?|Broton?|VT?|5046?|700?|1?
|439.00?|.00
938485?|Johnson?|J A?|3 Alpine Way?|Helen?|GA?|30545?|9999?|2?
|3987.50?|33.50
397267?|Tyron?|W E?|13 Myrtle Dr?|Hector?|NY?|14841?|1000?|1?
|.00?|.00
389572?|Stevens?|K L?|208 Snow Pass?|Denver?|CO?|80226?|400?|1?
|58.75?|1.50
846283?|Alison?|J S?|787 Lake Dr?|Isle?|MN?|56342?|5000?|3?
|10.00?|.00
475938?|Doe?|J W?|59 Archer Rd?|Sutter?|CA?|95685?|700?|2?
|250.00?|100.00
693829?|Thomas?|A N?|3 Dove Circle?|Casper?|WY?|82609?|9999?|2?
|.00?|.00
593029?|Williams?|E D?|485 SE 2 Ave?|Dallas?|TX?|75218?|200?|1?
|25.00?|.00
192837?|Lee?|F L?|5963 Oak St?|Hector?|NY?|14841?|700?|2?
|489.50?|.50
583990?|Abraham?|M T?|392 Mill St?|Isle?|MN?|56342?|9999?|3?
|500.00?|.00

Let’s break that SED command down into components.

sed 's/ */?|/g' custcdt.temp > custcdt.txt

SED reads custcdt.temp and writes the output to custcdt.txt. The single greater-than symbol tells the system to replace the data in file custcdt.txt if it already exists.

The substitution expression is messy, so let’s work through it. The part between the first and second forward slashes is the search value. The blank followed by an asterisk means one or more blanks. The two backslashes mean one backslash. (The first backslash is an escape character.)

The part between the second and third forward slashes is the replacement value–a question mark and a vertical bar. The “g” following the last slash stands for “global”. That is, replace all occurrences of the search string, not just the first one of each line.

Running the Commands

To put it all together, embed the CPYTOIMPF and QSH commands in a CL program. Here’s how the second example would look:

PGM

... More stuff ...

CPYTOIMPF FROMFILE(QIWS/QCUSTCDT) +
    TOSTMF('custcdt.temp') +
    MBROPT(*REPLACE) RCDDLM(*CRLF) STRDLM(*NONE) FLDDLM('')

QSH  CMD('sed ''s/ */?|/g'' custcdt.temp > custcdt.txt')

... More stuff ...

ENDPGM

Notice the doubled apostrophes in the regular expression of the SED command.

                     Post this story to del.icio.us
               Post this story to Digg
    Post this story to Slashdot