Creating Pivot Tables on the iSeries
by Bruce Guetzkow
The code for this article is available for download.
If you've ever used spreadsheet software, there's a good chance that you've created a pivot table. Pivot tables allow you to convert rows of data into columns of data, which may be more meaningful to the end user. Creating pivot tables on the iSeries can be a complicated process. The commands demonstrated here will greatly simplify that task.
A Pivot Table Scenario
Before we get to the commands, let's first describe a situation where a pivot table comes in handy. Suppose that you have a Sales History file with the following data elements:
- Date Sold
- Sales Region
- Sales Quantity
- Sales Amount
I've kept the file simple for illustration purposes. If you were to create a report using RPG or Query/400 you would most likely see the data listed exactly as it appears in the database, with a single report line for each date and region, or possibly summarized by date and region. While this information is valuable, it might be more valuable if you could see the region totals side-by-side for each date. A pivot table allows you to summarize the sales information and create quantity and amount fields for each region, based on the number of regions in your data sample.
If your data had sales history for four regions, you could then create a report with the following columns:
- Date Sold
- Sales Quantity--Region 1
- Sales Quantity--Region 2
- Sales Quantity--Region 3
- Sales Quantity--Region 4
- Sales Quantity--Total for All Regions
- Sales Amount--Region 1
- Sales Amount--Region 2
- Sales Amount--Region 3
- Sales Amount--Region 4
- Sales Amount--Total for All Regions
With information presented in this fashion it would be easy to see region comparisons to make business decisions based on sales information. If at some point another region is added, recreating the pivot table to include data referencing all five regions would add columns for quantity and amount for the new region.
The CRTPVTTBL Command
In order to create a file to hold the pivoted data, you need to know how many regions are represented in the data sample and create the appropriate DDS. You then need to create a program or collection of SQL statements to summarize the information into the new file. The CRTPVTTBL (Create Pivot Table) command does all of that for you.
The CRTPVTTBL command has the following parameters:
- SRCDTA (Source Data File): This is the input file that you have selected to pivot. It can be any physical or logical file.
- SRCDTAMBR (Source Data Member): For multi-member files, you can specify the member to use as input.
- ROWFLDS (Row Fields): You can specify up to three fields to summarize your data by.
- COLFLDS (Column Fields): This is the field that the source data is pivoted on.
- DTAFLDS (Data Fields): Up to five data fields can be specified that will be summarized. In addition, you can specify a heading to be used for each field.
- PVTTBL (Pivot Table File): This is the name of the file that will be generated as a result of this command.
- PVTTBLRPL (Pivot Table Replace): If you have already created a file named as indicated in the previous parameter, you can indicate that the file is to be deleted and recreated (*YES). Specifying *NO will prevent you from continuing if the file already exists unless you also change the file name.
- LODTBL (Load Pivot Table Source): Specify a source file name to contain SQL statements used in loading data from the source data file into the pivot table file. The default is QTEMP/QSQLSRC, which will be deleted at the end of the current job (batch or interactive).
- LODTBLMBR (Load Pivot Table Member): This is the source member that will hold the SQL statements. The default is *PVTTBL which will be replaced with the pivot table file name.
- LODTBLRPL (Load Pivot Table Member Replace): You can indicate whether to replace an existing source member (*NO). As with the previous "replace" parameter, if the source member already exists, you cannot execute the command if you specify *NO.
- LODPVTTBL (Load Pivot Table with Data): Specify whether to load data into the pivot table after it is created (*YES).
Do not confuse the term "source data" with source code that is stored in source physical files. When speaking of pivot tables, "source data" refers to the data from which the pivot table is generated.
Let's see how to create the pivot table described above using the CRTPVTTBL command. The Source Data File is the Sales History file. You can leave the default (*FIRST) for the Source Data Member, or specify any member in the file if there is more than one. We have a single Row Field, Date Sold. The Column Field is the Region field. For Data Fields specify Sales Quantity and Sales Amount.
You can specify any valid OS/400 file name for the Pivot Table Name, in any library. If the pivot table file does not exist, the Pivot Table Replace parameter is ignored. If you plan to keep the SQL statements generated, specify a source file other than one in QTEMP and any member name that you find appropriate. I chose a default member name the same as the pivot table being created as an obvious link. Again, the Replace parameter is only used if the source member already exists.
The resulting command would be something like this:
DTAFLDS((SALESQTY 'Sales Qty') (SALESAMT 'Sales Amt'))
Under the Covers
The command processing program (CPP) for this command is CLLE source member CRTPVTTBL. It begins by parsing input parameters for the Source Data, Pivot Table and Pivot Table Source files. It then resolves the LODTBLMBR parameter if *PVTTBL was specified. Next each of the field names are placed into their own data fields.
An OVRDBF (Override Database File) command makes sure that the correct source file and member are referenced from this point forward. Two Query Management Queries (QMQRY) are run to count the number of column values and then to create a file containing those values. If an existing pivot table is to be re-created, it is now deleted.
A source file is created if library QTEMP has been specified. A member is added and text is specified. This will be explained a bit later. Now it's time to create the pivot table.
REXX to the Rescue
I've used REXX (REstructured eXtended eXecutor language) to dynamically build and execute SQL statements to create the pivot table and add field text and column headings. Unlike languages like RPG or COBOL, REXX is not compiled and does not rely on predefined variable lengths. Each execution of the REXX procedure adapts the variables to whatever size is needed. It can also execute CL commands or SQL statements.
Executing SQL makes REXX a very powerful tool; however it has some restrictions on the SQL statements that are allowed. I initially thought that these restrictions were a limitation, but they ended up being an unexpected advantage. While the CREATE TABLE and LABEL ON statements used to create the physical file and add headings work fine from within REXX, INSERT and UPDATE statements needed to load data into the pivot table need special consideration.
I found that it was much easier to write the INSERT and UPDATE statements to a source member and execute them using the RUNSQLSTM (Run SQL Statements) command. By saving these statements in the source file and member specified in the LODTBL and LODTBLMBR parameters, the pivot table can be reloaded at a later date without the overhead of physically creating the pivot table file.
For this reason I split out loading the pivot table with data into its own command: LODPVTTBL (Load Pivot Table). Executing this command is the last step in the CRTPVTTBL CPP, if *YES is specified on the LODPVTTBL parameter.
Loading the Pivot Table
The LODPVTTBL command has the following parameters:
- LODTBL (Load Pivot Table Source): Specify a source file name that contains SQL statements used in loading data from the source data file into the pivot table file.
- LODTBLMBR (Load Pivot Table Member): This is the source member that holds the SQL statements.Sales Region
If you specified a source file in library QTEMP on the CRTPVTTBL command, it will be deleted at the end of the command and will not be available for subsequent loads of the pivot table. Also, the LODPVTTBL command does not clear the pivot table before loading. You will need to do that yourself.
Putting it together
Both commands have validity checker programs to ensure that all values are valid. I suggest creating the commands before creating the program objects as the CRTPVTTBL program references the LODPVTTBL command. If the command does not exist, program CRTPVTTBL will not compile.
Here are the steps needed to create the objects:
Before compiling source members CRTPVTTBL and CRTPVTTBLV, change the value of variable &REXSRCLIB in each to specify the library where your REXX source is located.
Remember, the REXX source does not need to be compiled.
Points to Ponder
Creating pivot tables can be a great way to look at data in a new way. However, be careful what you ask for . . . you just might get it. Remember that the number of fields in your pivot table is equal to:
(number of row fields + (number of data fields * (number of unique values for column
field + 1) ) ).
If you specify 3 row fields and 5 data fields and there are 100 unique values for your column field, your pivot table will contain 508 fields, so be sure that you want what you are asking for.
Bruce Guetzkow has programmed on the AS/400 and iSeries since 1990, in manufacturing, distribution, and other industries. He is currently the IS director at United Credit Service in Elkhorn, Wisconsin. Click here to contact Bruce by e-mail.