Admin Alert: Is It a Performance Issue or a Throughput Issue?
Published: January 11, 2012
by Joe Hertvik
It's common for Power i users to complain their batch jobs are running too slowly. But is system capability responsible for slow batch throughput or could the problem be caused by poor work management procedures? This week, let's look at a few scenarios where users say their batch jobs are running too slowly and discuss what, if anything (short of a hardware upgrade), can help speed up batch processing.
Getting To the Bottom of Slow Batch Processing
Users who feel that their batch work isn't completing in a timely manner may blame that perceived slowness on the system hardware. This may or may not be true. Slowness covers a lot of issues, and you can often boil batch throughput problems down to the following work management items.
- Lack of memory in batch job subsystems
- System traffic
- Nightly or weekend job streams not running in their allocated time windows
Slowness in these areas can often be confused with poor system performance, where people may think the system isn't powerful enough to run its assigned workload. Some people may even argue that a system upgrade is in order to allow the system to keep up with its work. Before you start spec'ing out a POWER7 upgrade to improve batch throughput, look at these possible bottlenecks and solutions for improving throughput and user perceptions about system speed.
Work Management Issue #1: Lack of System Memory Assigned to Batch Subsystems
If you have a fair amount of system memory, slowness might be eased by reallocating that memory among your subsystem storage pools. Check your system to see whether lack of system memory may be hampering batch processing. For a good primer on memory analysis using IBM's Performance Adjuster, see this article on using the i5/OS Performance Adjuster to Better Manage Memory.
I've also had a lot of luck using Midrange Performance Group's Performance Navigator software to create "State of the Union" reports that detail system performance issues. These reports are easy to understand and graphical (good for helping overview-oriented executives understand issues).
If you're looking for just a one-time analysis that doesn't require a software purchase, you can also contact your business partner (BP) to see if they provide free or discounted performance analysis services. I once worked with a BP who provided free system performance analyses, as needed. Another business partner arranged for a well-known software company to analyze my system at a discounted rate. Regardless of how you get this information, performance reports are helpful either for justifying new hardware purchases or for determining that the system is doing fine and running within acceptable parameters.
To ensure that your batch subsystems are getting enough memory to run jobs efficiently, I like to perform the following steps on my systems:
- Move each of my critical batch subsystems into its own shared storage pool. This ensures that batch jobs in that subsystem aren't competing with jobs in other subsystems for memory. For an example of how to do this, check out this article on moving a subsystem into its own shared storage pool.
- Turn on the IBM i Performance Adjuster to automatically move memory between subsystem storage pools. Performance Adjuster comes with the system. It can help you perform limited analysis, but it really shines at shifting system memory between subsystem storage pools as workloads change. Check out this article for information on tuning i5/OS storage pools for performance improvements.
Work Management Issue #2: Check Your Throughput
Batch system processing is like comedy. Sometimes it all comes down to timing. Slow batch processing might be caused by system traffic, not system performance. The user who complains that their batch job or query is taking too long may merely be caught in line behind a long-running job. You may have batch system traffic jams when some of these situations exist.
- All production batch jobs are submitted to the QBATCH subsystem.
- Your QBATCH subsystem is configured to only service one job queue at a time.
- Programmers doing work on the system submit batch jobs to the same job queues that your users submit jobs to or that production jobs are submitted to.
- Scheduled jobs and on-demand jobs (including queries) are submitted to the same job queues.
- Many jobs in the QBATCH subsystem run more than 15 minutes, holding up other work.
Any of these situations can convince your users that the system is "running slow" when their work may just be parked behind other long-running jobs. To increase throughput, I usually recommend the following steps.
- Set up a query subsystem with its own job queue and storage pool, and direct all query jobs to that subsystem. This will prevent long-running on-demand queries from competing with production jobs.
- Increase the number of job queues servicing the QBATCH subsystem and configure your jobs so that each job queue only services certain types of jobs. Change scheduling and job submission entries to send jobs to their own specialized job queues. This will allow similar types of jobs to have their own processing environment, cutting down competition between various user jobs. You could segment job queues by department (one job queue for accounting, one for manufacturing, one for inventory, etc.), by function (one job queue for invoices, one for purchase orders, one for producing shipping documentation, etc.), or by some other dividing line.
- Consider creating job queues and subsystems for processing long-running jobs (a slow job queue), fast-running jobs (a quick job queue), and emergency or high-priority jobs (a right-now job queue). Change job submission entries as necessary to submit jobs to each job queue. The quick job queues will allow your fast-running jobs to complete sooner. The slow job queues will let lumbering jobs do their thing without holding up other users. And the right-now job queues will allow you to process items on an emergency basis or allow high-priority jobs to move ahead of other jobs. The right-now job queue can even be set up with its own subsystem that runs its batch jobs at a higher priority than other batch jobs.
For more information on creating multiple job queues, see part one and part two of my articles on better subsystem throughput through multiple job queues. For information on creating a right-now job queue and subsystem, see this article on creating a high-priority batch subsystem.
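To see why splitting job queues changes what users experience, here is a minimal Python sketch of the traffic jam described above. It is only a model, not anything that runs on IBM i: the job names, run times, and the `finish_times` helper are all hypothetical, every job is assumed to be submitted at the same moment, and each queue is assumed to release one job at a time. It also ignores the fact that jobs running side by side compete for CPU and memory.

```python
import heapq
from collections import namedtuple

# Hypothetical jobs: name, run time in minutes, and a rough "long"/"short" label.
Job = namedtuple("Job", ["name", "minutes", "kind"])


def finish_times(jobs, max_active=1):
    """Model one FIFO job queue that lets max_active jobs run at once.

    All jobs are assumed to be submitted at minute 0. Returns a dict of
    job name -> minute at which that job completes.
    """
    free_at = [0] * max_active          # minute at which each run slot frees up
    heapq.heapify(free_at)
    finished = {}
    for job in jobs:                    # jobs leave the queue in submission order
        start = heapq.heappop(free_at)  # wait for the earliest free slot
        end = start + job.minutes
        finished[job.name] = end
        heapq.heappush(free_at, end)
    return finished


submitted = [
    Job("MONTHEND", 90, "long"),
    Job("INVOICE", 5, "short"),
    Job("QUERY1", 45, "long"),
    Job("PICKLIST", 3, "short"),
]

# Everything through one queue, one job at a time.
single = finish_times(submitted)

# Same jobs split into a slow queue and a quick queue, one active job each.
slow = finish_times([j for j in submitted if j.kind == "long"])
quick = finish_times([j for j in submitted if j.kind == "short"])
split = {**slow, **quick}

for job in submitted:
    print(f"{job.name:<9} one queue: {single[job.name]:>3} min   "
          f"split queues: {split[job.name]:>3} min")
```

In this made-up workload the five-minute invoice job finishes at minute 95 when it sits behind the month-end job, and at minute 5 when it has a quick queue to itself. Nothing about the hardware changed between the two runs, which is the point: the user's "slow system" was a queueing problem.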
Work Management Issue #3: Job Streams Not Running Within Their Allocated Time Frames
In my shop, we have overnight end-of-day (EOD) jobs that perform additional processing on orders and must always be completed between 10 p.m. and 6 a.m. A lot of work happens during that time frame and we have sometimes had trouble making sure that our batch work is done by start of business the next day. If you're in the same situation where long-running job streams must be finished in a specific time frame, consider using the following strategies.
- Analyze the job stream and determine if there are jobs that are currently running sequentially (one at a time) that can be changed to run concurrently. The job stream may have originally been set up for single-threaded processing, but you may find that some jobs can now be run side-by-side. This can allow more jobs to finish in a faster time frame.
- Look for unnecessary commands or Delay Job commands (DLYJOB) within job streams. Analyze the CL code that runs your scheduled jobs, particularly if the code was written several years ago. You may be surprised to find commands or call statements that are no longer relevant to your company. I was surprised one time to find some CL programs that had DLYJOB statements in them that were no longer needed and delayed some jobs by 10 to 15 minutes. If you can tighten your code, you may be able to shorten the run time for your overall job stream.
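Before rearranging a job stream, it can help to estimate how much time concurrency would actually buy. The Python sketch below is one rough way to do that on paper: list each job's typical run time and the jobs it truly depends on, then compare the one-at-a-time total with the longest dependency chain. The job names, run times, and dependencies here are invented for illustration, and the estimate assumes the system has enough capacity to run independent jobs side by side without slowing them down or hitting object lock conflicts.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

WINDOW_MINUTES = 8 * 60  # the 10 p.m. to 6 a.m. window from the text

# Hypothetical job stream: name -> (run minutes, jobs that must finish first).
stream = {
    "BACKUP":   (90,  []),
    "ORDERS":   (150, ["BACKUP"]),
    "INVOICES": (100, ["ORDERS"]),
    "PICKLIST": (80,  ["ORDERS"]),
    "SALESRPT": (90,  ["BACKUP"]),
    "PURGE":    (30,  ["INVOICES", "PICKLIST"]),
}


def sequential_minutes(stream):
    """Total run time when every job runs one at a time."""
    return sum(minutes for minutes, _ in stream.values())


def concurrent_minutes(stream):
    """Run time if each job starts as soon as its prerequisites finish."""
    graph = {name: deps for name, (_, deps) in stream.items()}
    finish = {}
    # static_order() yields each job only after all of its prerequisites.
    for name in TopologicalSorter(graph).static_order():
        minutes, deps = stream[name]
        start = max((finish[d] for d in deps), default=0)
        finish[name] = start + minutes
    return max(finish.values())


one_at_a_time = sequential_minutes(stream)
side_by_side = concurrent_minutes(stream)

print(f"Sequential: {one_at_a_time} min, fits window: {one_at_a_time <= WINDOW_MINUTES}")
print(f"Concurrent: {side_by_side} min, fits window: {side_by_side <= WINDOW_MINUTES}")
```

With these sample numbers the sequential stream needs 540 minutes and overruns the 480-minute window, while the concurrent version needs 370 minutes. Real savings will be smaller whenever the concurrent jobs compete for the same memory, disk, or files, so treat the result as a best case.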
Of Course, a Hardware Upgrade Could Also Help
After all this, be aware that a hardware upgrade may still be necessary if you're suffering from low job throughput or slow response time. However, be sure to do your homework first to ensure that you've done everything you can to speed up system processing before adding more hardware.
Creating a High-Priority Batch Subsystem
Tuning i5/OS Storage Pools for Performance
Better Subsystem Throughput Via Multiple Job Queues, Part One
Better Subsystem Throughput Via Multiple Job Queues, Part Two
Using i5/OS Performance Adjuster to Better Manage Memory
Moving a Subsystem into its own Shared Pool