Admin Alert: When System Job Tables Attack, Part II
October 1, 2008 Joe Hertvik
Last issue, I discussed the ins and outs of i5/OS system job tables and how job table overflows can hurt your system. This week, I’ll provide some advice on how to detect and delete excessive jobs that are cluttering your system. Next week, I’ll show you how to maintain the job tables and how to check for job table damage.
What Are System Job Tables, Again?
System job tables are internal system objects that i5/OS uses to track partition jobs. There can be up to 10 job tables on a partition, and the maximum number of system jobs for a partition is designated in the Maximum Number of Jobs (QMAXJOB) system value.
As the number of partition jobs approaches the system’s QMAXJOB value, your system job tables start to fill up and a number of issues can occur. These issues include slow backups, failure to accept new jobs, high DASD usage due to a large number of spooled files, performance issues, and IPL problems. The closer the job count gets to QMAXJOB, the more likely you are to experience these issues.
You can check your current system job table status by running the Display Job Tables (DSPJOBTBL) command, which generates a screen display that looks like this.
                           Display Job Tables

 Permanent job structures:            Temporary job structures:
   Initial . . . . :   30               Initial . . . . :   20
   Additional  . . :   10               Additional  . . :   10
   Available . . . :   72480            Available . . . :   583
   Total . . . . . :   126625
   Maximum . . . . :   163520

                     ---------------------Entries----------------------
 Table       Size       Total    Available    In-use    Other
   1      16752384      16352            0     16352        0
   2      16749312      16352           99     16253        0
   3      16749312      16352         1718     14634        0
   4      16749312      16352        11136      5216        0
   5      16749312      16352        15891       461        0
   6      16749312      16352        16312        40        0
   7      16749312      16352        15833       519        0
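As a rough illustration of how the numbers on this display relate to one another, the following Python sketch works out how full each job table is and how close the total number of permanent job structures is to the maximum. The figures are copied from the sample display. The function and variable names are invented for this example, and treating Total divided by Maximum as a "percent of maximum" figure is my assumption, not an IBM-documented formula.

```python
# Figures copied from the sample DSPJOBTBL display.
# All names here are illustrative; this is not an IBM-supplied interface.
PERMANENT_TOTAL = 126625
PERMANENT_MAXIMUM = 163520

# (table number, total entries, available entries, in-use entries)
JOB_TABLES = [
    (1, 16352, 0, 16352),
    (2, 16352, 99, 16253),
    (3, 16352, 1718, 14634),
    (4, 16352, 11136, 5216),
    (5, 16352, 15891, 461),
    (6, 16352, 16312, 40),
    (7, 16352, 15833, 519),
]


def percent(part, whole):
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


def table_fill_report(tables):
    """Map each table number to the percentage of its entries that are in use."""
    return {number: percent(in_use, total)
            for number, total, _available, in_use in tables}


if __name__ == "__main__":
    # Assumption: Total / Maximum approximates how close the partition is to its limit.
    overall = percent(PERMANENT_TOTAL, PERMANENT_MAXIMUM)
    print(f"Permanent structures vs. maximum: {overall}%")
    for number, fill in table_fill_report(JOB_TABLES).items():
        print(f"Table {number}: {fill}% in use")
```

On the sample data, the first three tables are nearly full while the later ones are mostly empty, which is the pattern you would expect when old completed jobs are holding on to entries.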
The specifics on reading this display are contained in last week’s column. Generally, you can detect the following situations when examining the Permanent Job Entries on the DSPJOBTBL screen.
I’ll cover how to handle the first two situations this week and talk about the third situation next week.
What Causes Job Table Usage To Run High?
The number one reason that system job table problems occur is that organizations keep too many spooled files on a system. It may be that system applications overproduce spooled file output, or that spooled files aren’t efficiently pruned from the system by printing or deletion, or that your organization likes to hang on to its system output. Whatever the reason, too many jobs with unprocessed spooled files can cause performance problems, and getting control of spooled file output is critical to getting your system back in shape.
The easiest way to check for excessive spooled files is to browse through your output queues by using the Work with Output Queues (WRKOUTQ) command, like this:
WRKOUTQ OUTQ(*ALL)
This will display all the output queues on your system and how many spooled files are in each output queue. From here, it’s fairly easy to identify which output queues contain too many spooled files. Here’s a sample WRKOUTQ display that I recently ran on one of my machines that had excessive job table entries.
                        Work with All Output Queues

 Type options, press Enter.
   2=Change   3=Hold   4=Delete   5=Work with   6=Release   8=Description
   9=Work with Writers   14=Clear

 Opt   Queue        Library       Files   Writer       Status
       QDKT         QGPL              0                RLS
       QKROUTQ      QGPL              0                RLS
       QPFROUTQ     QGPL              0                RLS
       QPRINT       QGPL         321969                RLS
       QPRINTH      QGPL              0                RLS
       QPRINTM      QGPL             28                RLS
       QPRINTS      QGPL              0                RLS
       QPRINT1      QGPL              0                RLS
       QPRINT2      QGPL            387                RLS
       QUERY        QGPL            830                RLS
A screen like this makes it easy to identify where excessive job table entries are coming from. The hard part is deciding which spooled files to delete. Some users can be notoriously fussy when it comes to removing their spooled file output. It’s helpful if your shop has a policy that it will only keep spooled files for a certain time period, perhaps 30 days. With a policy in place, you can set up routines to automatically delete spooled files that are older than a certain date. To help with that task, I published a generic routine for selectively deleting spooled files according to any criteria you wish. After spooled files are deleted, the jobs associated with the spooled files will be removed from the system and your available permanent job structure entries will go up. If you want to use the brute force method, you can also increase your available entries by simply clearing overcrowded output queues when you find them.
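The age-based retention policy described above can be sketched in a few lines of Python. This is only a minimal illustration of the selection logic, not the generic routine mentioned in the text. The SpooledFile record, its field names, and the sample entries are all hypothetical, and the sketch reports what it would delete rather than deleting anything.

```python
from datetime import date, timedelta
from typing import NamedTuple


class SpooledFile(NamedTuple):
    """Hypothetical record describing one spooled file; not an IBM data structure."""
    name: str
    job: str
    outq: str
    created: date


def expired_files(files, today, keep_days=30):
    """Return the spooled files created more than keep_days before today."""
    cutoff = today - timedelta(days=keep_days)
    return [f for f in files if f.created < cutoff]


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    sample = [
        SpooledFile("QSYSPRT", "NIGHTLY", "QPRINT", date(2008, 8, 15)),
        SpooledFile("QPJOBLOG", "PAYROLL", "QEZJOBLOG", date(2008, 9, 25)),
    ]
    for f in expired_files(sample, today=date(2008, 10, 1)):
        # A real routine would issue a delete request here; this sketch only reports.
        print(f"Would delete {f.name} from job {f.job} in {f.outq}")
```

Keeping the selection step separate from the delete step makes it easy to review the candidate list with fussy users before anything is actually removed.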
It’s also worthwhile to check the number of entries in the QEZJOBLOG output queue. On busy systems, tens of thousands of job logs can be kept on the system, taking up disk storage and filling up the system job tables. In the i5/OS CLEANUP options, IBM provides a setting that lets you designate how many days you should keep job logs and other system output. Job logs older than the designated number of days are automatically deleted. Follow these steps to check and reduce a partition’s “number of days to keep job logs and other system output” value.
1. Call the Cleanup Tasks menu by executing the following Go to Menu (GO) command:
GO CLEANUP
2. Take option 1, Change cleanup options, from the menu that appears. You can change the number of days to keep values on this screen.
By reducing this value to a reasonable number, you are setting up an automatic routine to prune your system job tables by deleting excessive job logs and the jobs they are associated with.
Keeping Your Spooled Files and Eating Them, Too
There’s a second way to take care of excessive permanent structure entries caused by jobs that contain unprinted and undeleted spooled files. In i5/OS V5R2 and above, IBM offers a system value called Spooled file action (QSPLFACN). You can set QSPLFACN to one of two values: *KEEP (its default) or *DETACH. When set to *KEEP, spooled files are associated with the job that produced them. After completion, the system table entries for jobs containing spooled files remain active with a status of completed.
Two things happen when you change QSPLFACN to *DETACH. First, when a job ends, all of its spooled files are detached from the job. Second, once the spooled files are detached, the job itself is removed and its permanent job structure entry is recycled back to the system. *DETACH processing takes effect immediately for any job that becomes active after the change occurs. It does not affect jobs that ended before the change occurred.
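To make the difference between the two settings concrete, here is a small Python model of the behavior described above. It is a simplified sketch built only from the description in this column: the Job class and both function names are hypothetical, and the model assumes every job became active after the system value was last changed.

```python
from dataclasses import dataclass


@dataclass
class Job:
    """Hypothetical, simplified view of a job, for illustration only."""
    name: str
    ended: bool
    spooled_files: int  # spooled files the job produced that still exist


def holds_table_entry(job, qsplfacn):
    """Return True if the job still occupies a permanent job structure entry."""
    if qsplfacn not in ("*KEEP", "*DETACH"):
        raise ValueError("QSPLFACN must be *KEEP or *DETACH")
    if not job.ended:
        return True  # a running job always needs its entry
    if qsplfacn == "*DETACH":
        return False  # files are detached at job end, so the entry is recycled
    return job.spooled_files > 0  # *KEEP: the entry lingers while output remains


def entries_in_use(jobs, qsplfacn):
    """Count the jobs that would still hold an entry under the given setting."""
    return sum(holds_table_entry(job, qsplfacn) for job in jobs)


if __name__ == "__main__":
    jobs = [
        Job("ACTIVE", ended=False, spooled_files=0),
        Job("REPORTS", ended=True, spooled_files=3),
        Job("CLEAN", ended=True, spooled_files=0),
    ]
    for setting in ("*KEEP", "*DETACH"):
        print(setting, entries_in_use(jobs, setting))
```

In this toy model, the only job that behaves differently under the two settings is the ended job that still has spooled files, which is exactly the kind of job that fills the tables under *KEEP.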
Detaching spooled files from their originating jobs can be a great boon in keeping your system job tables under control, because it will keep your job tables relatively small. It can also cause problems in locating spooled file output through i5/OS job commands. After changing QSPLFACN to *DETACH, you will no longer be able to find completed jobs and their associated spooled files by using any of the following job commands:
Work with Job (WRKJOB)
Work with Submitted Jobs (WRKSBMJOB)
Work with User Jobs (WRKUSRJOB)
The bottom line is that setting QSPLFACN to *DETACH is not a great move if you need to locate spooled files by examining the jobs that created them (and this includes looking for job logs). However, if your spooled files are more device-oriented than job-oriented (such as barcode labels or packing lists) and you don’t care about finding the jobs that created them, QSPLFACN can help you skirt most of the issues with system job table overflow.
A Correction From My Last Column (Already)
In last week’s article, I stated that i5/OS has no mechanism for changing the threshold value at which the CPI1468 message is sent. I was wrong. Reader C. Barbie wrote in to tell me that IBM is offering two PTFs to add this capability to i5/OS V5R4 and V6.1 systems:
V5R4 – PTF SI29585
Once these PTFs are applied, you can create a two-digit decimal data area called QMAXJOBPCT that contains a new lower threshold value to use for sending out the CPI1468 error message. If you want the system to send the message when system job table usage reaches 80 percent, for example, you can create QMAXJOBPCT by using the following Create Data Area (CRTDTAARA) command.
CRTDTAARA DTAARA(QSYS/QMAXJOBPCT) TYPE(*DEC) LEN(2 0) VALUE(80)
To activate the new threshold, you would do one of two things. You can either IPL the system or change the QMAXJOB system value. The new QMAXJOBPCT data area value takes effect as soon as one of these two actions is completed.
Too Much for Two Articles
Unfortunately, I don’t have enough time this week to cover my final topics: how to maintain system job tables and how to check for job table damage. I’ll fill you in on this valuable information next issue.
About Our Testing Environment
Configurations described in this article were tested on an i5 550 box running i5/OS V5R4. Many of the commands are also available in earlier versions of the operating system running on iSeries or AS/400 machines. If a command is present in earlier versions of the i5/OS or OS/400 operating systems, you may notice some variations in the pre-V5R4 copies of these commands. These differences may be due to command improvements that have occurred from release to release.