• The Four Hundred
  • Guru: IBM i Save File Compression Options

    April 1, 2019 Michael Sansoterra

    As I finished populating some test tables with a large volume of data on a small and transient IBM i partition in the cloud, I thought life was good. But my countenance fell as I realized the tables plus OS hogged over 70 percent of the disk space. I wondered how to get all the data into a single save file for safe keeping.

    The buzzer in my mind was loud and clear: it ain’t gonna work, you don’t have enough room. As I loathed the thought of using multiple save files to save my test data, I remembered that most save commands have a data compression (DTACPR) parameter. I had never used it, so I decided to try it with Save Library (SAVLIB) to see how well it worked. I executed SAVLIB with DTACPR(*HIGH) and was pleased that the compression was good enough to let me save the entire test library with about 7 percent of the system’s storage to spare.

    IBM offers three software compression levels (*LOW, *MEDIUM, and *HIGH) alongside the *NO and *YES values. Here is how IBM’s documentation describes each value:

    • *NO — No data compression is performed.
    • *YES — If the save is to tape and the target device supports compression, hardware compression is performed. If compression is not supported, or if the save data is written to optical media or to a save file, software compression is performed. Low software compression is used for all devices except optical DVD, which uses medium software compression.
    • *LOW — If the save operation is to a save file or optical, software data compression is performed with the SNA algorithm. Low compression is usually faster and the compressed data is usually larger than if medium or high compression is used.
    • *MEDIUM — If the save operation is to a save file or optical, software data compression is performed with the TERSE algorithm. Medium compression is usually slower than low compression but faster than high compression. The compressed data is usually smaller than if low compression is used and larger than if high compression is used.
    • *HIGH — If the save operation is to a save file or optical, software data compression is performed with the LZ1 algorithm. High compression is usually slower and the compressed data is usually smaller than if low or medium compression is used.

    These are all older compression algorithms, and I had only heard of LZ1.

    I decided to go back and compare the available compression options. I used the Save Object (SAVOBJ) command to save an 8GB CUSTOMER table into a save file as follows:

    SAVOBJ OBJ(CUSTOMER)       +
           LIB(MYDATA)         +
           DEV(*SAVF)          +
           OBJTYPE(*FILE)      +
           SAVF(QGPL/MYSAVF)   +
           CLEAR(*REPLACE)     +
           DTACPR(*NO)
    

    I cleared and re-used the same save file (SAVF) with each test. The results are shown in the table below with the variations of the data compression option:

    DTACPR Option   Avg CPU % Utilization   SAVOBJ Duration (mm:ss)   SAVF Size (bytes)   % of Original Size
    *NO             4%                      5:41                      8774656000          (baseline)
    *HIGH           40%                     13:11                     5687762944          64.8%
    *MEDIUM         33%                     10:46                     5701132288          65.0%
    *LOW            14%                     3:42                      6383755264          72.8%

    This test was done on a Power9 cloud partition running IBM i 7.3 with two vCPUs, 4GB of RAM and 200GB of disk.

    The average CPU utilization in the table isn’t a high-precision metric; it was basically me eyeballing the Work with System Activity (WRKSYSACT) display and watching the average CPU utilization over time. Even though the system wasn’t doing much besides these save tests, there is still some CPU cost to run everything: this machine varied between 0.5 percent and 1.5 percent while “idle.” The majority of the CPU was definitely due to the compression operation.

    The table demonstrates that requesting the *HIGH or *MEDIUM compression level can be quite expensive in terms of CPU, though admittedly this machine only had two vCPUs. Even so, you certainly would want to make sure your system has enough CPU capacity before running a save command (SAVnnn) with one of these compression options.

    For my customer table, there wasn’t much difference in space savings between *HIGH and *MEDIUM compression (only about 0.2 percentage points). While the *LOW option wasn’t as efficient (its save file was about 8 percentage points larger than with *HIGH), it was the fastest of all the methods, finishing even sooner than the uncompressed save (3:42 vs. 5:41). If time is of the essence, beware: the *HIGH and *MEDIUM options took roughly twice as long as a save without compression (13:11 and 10:46 vs. 5:41).
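
    The size and duration comparisons in this paragraph can be recomputed from the raw figures in the results table. The following is a minimal Python sketch (Python is not part of the original test, and the names RESULTS, to_seconds, and summarize are illustrative) that treats the *NO save file as the baseline:

```python
# Raw figures copied from the SAVOBJ results table in the article.
RESULTS = {
    "*NO":     {"duration": "5:41",  "savf_bytes": 8_774_656_000},
    "*HIGH":   {"duration": "13:11", "savf_bytes": 5_687_762_944},
    "*MEDIUM": {"duration": "10:46", "savf_bytes": 5_701_132_288},
    "*LOW":    {"duration": "3:42",  "savf_bytes": 6_383_755_264},
}


def to_seconds(mm_ss: str) -> int:
    """Convert a 'minutes:seconds' string such as '13:11' to seconds."""
    minutes, seconds = mm_ss.split(":")
    return int(minutes) * 60 + int(seconds)


def summarize(results: dict, baseline: str = "*NO") -> dict:
    """Return each option's size % and duration ratio relative to the baseline."""
    base_bytes = results[baseline]["savf_bytes"]
    base_secs = to_seconds(results[baseline]["duration"])
    summary = {}
    for option, row in results.items():
        summary[option] = {
            "pct_of_baseline": round(100 * row["savf_bytes"] / base_bytes, 1),
            "time_vs_baseline": round(to_seconds(row["duration"]) / base_secs, 2),
        }
    return summary


SUMMARY = summarize(RESULTS)
for option, stats in SUMMARY.items():
    print(f"{option:8} {stats['pct_of_baseline']:6.1f}% of *NO size, "
          f"{stats['time_vs_baseline']:.2f}x the *NO duration")
```

    Swapping in the numbers from your own save tests is just a matter of editing the RESULTS dictionary.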

    Of course, your results may vary, depending on how conducive your data objects are to compression. Data with many repetitive elements typically compresses well. Admittedly, my test “CUSTOMER” table had a bunch of random characters in it, so odds are you can expect a better compression ratio for “normal” data.
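
    To see why repetitive data compresses so much better than random data, here is a small illustrative Python sketch. It uses zlib (DEFLATE) from the Python standard library, which is not one of the algorithms behind DTACPR, and the sample data is synthetic, so treat it only as a demonstration of the general principle:

```python
import random
import zlib

SIZE = 200_000  # bytes per sample; an arbitrary size chosen for this demo

# Sample 1: highly repetitive, record-like data (synthetic).
pattern = b"ACME CORP      LONDON    GB 0001 "
repetitive = (pattern * (SIZE // len(pattern) + 1))[:SIZE]

# Sample 2: pseudo-random bytes, seeded so the run is repeatable.
rng = random.Random(42)
random_like = bytes(rng.getrandbits(8) for _ in range(SIZE))


def compressed_pct(data: bytes, level: int = 9) -> float:
    """Return the compressed size as a percentage of the original size."""
    return 100 * len(zlib.compress(data, level)) / len(data)


REPETITIVE_PCT = compressed_pct(repetitive)
RANDOM_PCT = compressed_pct(random_like)

print(f"repetitive data: {REPETITIVE_PCT:.1f}% of original size")
print(f"random data:     {RANDOM_PCT:.1f}% of original size")
```

    The repetitive sample shrinks to a tiny fraction of its size, while the random sample barely changes, which is the same effect a table full of random characters has on a save operation.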

    I decided to do a second test to see how well “compressible” data, such as a large plain text file, would do. I downloaded the free list of Great Britain postal codes from the Geonames.org website. I unzipped it to /tmp/GB_full.txt on the IBM i and used the Save Object (SAV) command to save this text file from the IFS to a save file.

    SAV DEV('/qsys.lib/qgpl.lib/mysavf.file')   +
        OBJ(('/tmp/GB_full.txt'))               +
        CLEAR(*REPLACE)                         +
        DTACPR(*HIGH)
    

    This table contains the various save file sizes depending on the selected data compression (DTACPR):

    File Description                 File Size (bytes)   % of Original Size
    Uncompressed file                173821160
    Zip file (original download)     13946129            8.0%
    Save file, no compression        184705024           106.3%
    Save file, high compression      18907136            10.9%
    Save file, medium compression    32538624            18.7%
    Save file, low compression       173039616           99.6%

    I did not include duration or CPU% for this test, because the elapsed time of the save operation wasn’t significant. I’m glad I did this test because this result is quite a bit different from the first test with respect to how well the various compression levels performed.

    Zip compression was the clear winner compared to the IBM i’s older compression algorithms. Keep in mind, you can use the jar command using QSHELL for zipping/unzipping IFS files. If you don’t mind searching the internet, a number of utilities and other compression formats (including 7z and tar) can also be used from QSHELL to compress IFS files, if getting significant size reduction or sharing data without a save file is your primary goal. If needed, you could always place the zip file into a save file to have the best of both worlds!
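
    As one example of such an alternative, here is a minimal sketch that zips a single file with Python’s standard zipfile module and reports the resulting size. It assumes Python 3 is installed on the system (an assumption, since the article itself only mentions the jar command), and the helper name zip_one_file and the sample data are made up for illustration:

```python
import os
import tempfile
import zipfile


def zip_one_file(source_path: str, zip_path: str) -> float:
    """Zip one file with DEFLATE; return the zip size as a % of the source size."""
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # Store only the file name, not the full directory path.
        archive.write(source_path, arcname=os.path.basename(source_path))
    return 100 * os.path.getsize(zip_path) / os.path.getsize(source_path)


# Demo on a throwaway text file with synthetic, postal-code-like rows.
with tempfile.TemporaryDirectory() as work_dir:
    sample = os.path.join(work_dir, "sample.txt")
    with open(sample, "w", encoding="utf-8") as handle:
        for i in range(20_000):
            handle.write(f"GB\tAB{i % 100:02d} {i % 10}XX\tSample Place\n")

    zip_path = os.path.join(work_dir, "sample.zip")
    ZIP_PCT = zip_one_file(sample, zip_path)
    with zipfile.ZipFile(zip_path) as archive:
        NAMES = archive.namelist()

print(f"zip file is {ZIP_PCT:.1f}% of the original size; members: {NAMES}")
```

    On a real system you would call zip_one_file with an IFS path such as /tmp/GB_full.txt instead of the temporary demo file.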

    Unlike the first compression demo, there was quite a bit of difference between the resulting file sizes for the different compression types. Whereas *LOW compression was quite useful in the prior test (the save file was 72.8 percent of its uncompressed size), with the plain text file *LOW accomplished almost nothing (99.6 percent of the original file size).

    In conclusion, when saving data to a save file, it pays to experiment to gauge the cost (CPU utilization and duration) vs benefit (disk space savings) of using a particular data compression option. Don’t forget, the optimal settings will depend on your data set (for example, program objects vs table data and journal receivers, plain text data vs binary data, etc.) so remember to test for each variation. If you’re only concerned with compressing IFS data, then other compression options are available.
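
    One possible way to put numbers on that cost/benefit trade-off is sketched below in Python, using the figures from the first test. The "CPU-seconds" value is a rough approximation invented for this sketch (average CPU utilization multiplied by elapsed time), and since the utilization figures were only eyeballed, the output is a ballpark at best:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SaveRun:
    option: str
    avg_cpu_pct: float  # eyeballed average CPU % from the first test
    seconds: int        # SAVOBJ duration converted to seconds (5:41 -> 341)
    savf_bytes: int     # resulting save file size

    @property
    def cpu_seconds(self) -> float:
        """Very rough CPU cost: average utilization times elapsed time."""
        return self.avg_cpu_pct / 100 * self.seconds


RUNS = [
    SaveRun("*NO", 4, 341, 8_774_656_000),
    SaveRun("*HIGH", 40, 791, 5_687_762_944),
    SaveRun("*MEDIUM", 33, 646, 5_701_132_288),
    SaveRun("*LOW", 14, 222, 6_383_755_264),
]


def rank_by_benefit(runs, baseline="*NO"):
    """Rank options by MB saved per extra CPU-second versus the baseline."""
    base = next(run for run in runs if run.option == baseline)
    rows = []
    for run in runs:
        if run.option == baseline:
            continue
        saved_mb = (base.savf_bytes - run.savf_bytes) / 1_000_000
        extra_cpu = run.cpu_seconds - base.cpu_seconds
        if extra_cpu <= 0:
            continue  # no extra CPU cost, so the ratio is not meaningful
        rows.append((run.option, saved_mb, extra_cpu, saved_mb / extra_cpu))
    return sorted(rows, key=lambda row: row[3], reverse=True)


RANKING = rank_by_benefit(RUNS)
for option, saved_mb, extra_cpu, ratio in RANKING:
    print(f"{option:8} saved {saved_mb:7.0f} MB for ~{extra_cpu:5.1f} extra "
          f"CPU-seconds ({ratio:.1f} MB per CPU-second)")
```

    A different data set, or a metric weighted toward elapsed time rather than CPU, could easily produce a different ranking, which is why testing each variation matters.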

    RELATED STORY

    Save Object (SAVOBJ)

    Tags: 400guru, FHG, Four Hundred Guru, IBM i, IFS, Power9, qshell

    3 thoughts on “Guru: IBM i Save File Compression Options”

    • John Tappin says:
      April 1, 2019 at 7:43 am

      It is worth considering that a smaller saved object can result in faster recovery times, and a lot less storage for multiple backup copies. Using compression is one answer.

      Choosing not to save access paths can also dramatically reduce save time, storage of the backup and therefore CPU time, albeit at the expense of a lot of extra time and CPU when restoring. This may be OK for a small test system.

      When dealing with recovery, though, time is usually more of a precious resource than CPU in my experience.

    • Steven says:
      April 1, 2019 at 9:32 am

      Also, for the CL fans, at 7.2 and later there are CPYTOARCF and CPYFRMARCF for zipping and unzipping files.

    • David Dolphin says:
      May 3, 2020 at 1:40 am

      I used CPYTOARCF on V7R3 and it zipped a library to the IFS. However, the CPYFRMARCF does not allow me to nominate a library to unzip the files. It gives an error “CPFA0A2 Information passed to this operation was not valid”.

      The “TODIR” parameter seems to only accept a directory name which is useless for restoring library objects.

      The help text for the command gives an example
      CPYFRMARCF FROMARCF('/MYDIR/MyArchiveFile.zip')
      TODIR('/QSYS.LIB/MYLIB.LIB/')
      RPLDTA(*YES)
      but doesn’t seem to want to do that. I cannot find any Google or IBM information on this problem other than comments from those who have the same problem.

TFH Volume: 29 Issue: 21

Copyright © 2025 IT Jungle