• The Four Hundred
Guru: Finding Large Files With Python

    July 15, 2019 Mike Larsen

    It’s always a good idea to purge files that aren’t needed any longer. Chances are that you already have procedures in place to purge data from Db2 files and tables, but what about files that reside in the IFS? Do you have a good solution for keeping the IFS clean?

    Perhaps you have old order files stored in the IFS. If you work for a large company, these types of files can accumulate quickly. I’ve written processes in the past to remove files from the IFS using RPG, but I’d like to offer an alternative. I’m going to show how you can use Python to search a directory in the IFS, display the attributes of each file it finds, and then delete that file. To take it one step further, I’ll show how you can search for files with names that meet a certain criterion. In this example, I’m going to filter for text files.

    This story contains code, which you can download here.

    Figure 1. IFS folder with various types of documents.

    With the goal set, let’s jump right into the code. I start by importing some Python modules, as seen in the following piece of code, to help me with various tasks.

    from datetime import datetime
    import os
    import fnmatch
    

    The ‘datetime’ module helps me display a nicely formatted date. In this example, I’m displaying the last modified date. Without formatting, I’d get an Epoch timestamp (Figure 2), a count of seconds since January 1, 1970, which is not easily deciphered (at least for me anyway).

    Figure 2. Example of an Epoch date.
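
    To make the difference concrete, here is a minimal sketch that prints a raw Epoch value and then the same value formatted. The timestamp and the function name format_epoch are made up for illustration, and the sketch uses ‘datetime.fromtimestamp’ with an explicit UTC time zone, which is a swapped-in alternative to the call used in this article’s script.

```python
from datetime import datetime, timezone


def format_epoch(timestamp):
    """Turn seconds since January 1, 1970 (UTC) into a readable date."""
    d = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    return d.strftime('%d %b %Y')


# Hypothetical value, the kind of number a file's st_mtime attribute holds
raw = 1563148800

print(raw)                # the raw Epoch value, hard to read at a glance
print(format_epoch(raw))  # 15 Jul 2019 (month name assumes the default locale)
```

    The month abbreviation comes from the active locale, so the exact text could differ on a system configured for another language.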

    The ‘os’ module will help me read the contents of an IFS directory. At the end of the process, I also use it to delete the file from the IFS.

    Finally, the ‘fnmatch’ module lets me match file names against a pattern, so I only process text files, which is what I’ve chosen to target in this example.
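
    As a quick illustration of how that pattern matching behaves, the sketch below filters a list of made-up file names with the same ‘*.txt’ pattern used later in the script. The names are hypothetical and nothing is read from the IFS.

```python
import fnmatch

# Hypothetical file names, used only to show how the pattern behaves
names = ['orders.txt', 'orders.csv', 'readme.txt', 'notes.txt.bak']

# '*.txt' matches names that END in .txt, not names that merely contain it
text_files = [name for name in names if fnmatch.fnmatch(name, '*.txt')]

print(text_files)  # ['orders.txt', 'readme.txt']
```

    Note that ‘notes.txt.bak’ is skipped because the pattern must match the whole name, so only names ending in .txt qualify.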

    To format the date, I created a function that will be executed for each file I read in the directory.

    def convert_date(timestamp):
        # Interpret the Epoch timestamp as UTC and format it like '15 Jul 2019'
        d = datetime.utcfromtimestamp(timestamp)
        formatted_date = d.strftime('%d %b %Y')

        return formatted_date
    

    The next (and final) section of code does the heavy lifting. I’ll show the entire snippet of code, followed by an explanation.

    # Scan the directory; the with block closes the scan when the loop ends
    with os.scandir('/home/MLARSEN/test_folder/') as dir_entries:
        for entry in dir_entries:
            # Only consider regular files whose names end in .txt
            if entry.is_file() and fnmatch.fnmatch(entry.name, '*.txt'):
                info = entry.stat()

                # I just picked an arbitrary file size for which to look
                if info.st_size > 207:
                    print(f'{entry.path}\t {entry.name}\t Last Modified:    {convert_date(info.st_mtime)}\t Size in bytes: {info.st_size}')

                    os.remove(entry.path)
    

    I start by scanning the directory that holds my files. In a production process, you’d likely soft code the path, but I hard coded it here to make the example more readable.

    Next, I loop through the directory entries and perform a few checks. I want to ensure that the directory entry is indeed a file (versus a directory or other entity) and also that it’s a text file (its name ends in .txt). If both conditions are true, I grab the attributes of the file and check the size. In my example, I’m looking for files that are larger than 207 bytes. That’s just a made-up number; you can use whatever threshold you like, and you might also want to soft code it.
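
    One way to soft code the path, the pattern, and the size is to pass them in as function parameters. The following is a minimal sketch under that assumption; the function name find_large_files and its parameter names are hypothetical and are not part of the downloadable script. It only collects matching files and does not delete anything.

```python
import fnmatch
import os


def find_large_files(directory, pattern='*.txt', min_size=207):
    """Return (path, size) pairs for matching files larger than min_size bytes."""
    matches = []
    with os.scandir(directory) as entries:
        for entry in entries:
            # Skip directories and anything whose name does not fit the pattern
            if entry.is_file() and fnmatch.fnmatch(entry.name, pattern):
                size = entry.stat().st_size
                if size > min_size:
                    matches.append((entry.path, size))
    return matches
```

    A call such as find_large_files('/home/MLARSEN/test_folder/', min_size=1024) would then look for text files over 1,024 bytes in the folder used in this article, and the values could just as easily come from a configuration file or command line arguments.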

    I print some of the file attributes to the terminal, then delete the file using ‘os.remove’. That’s it! With a few lines of code, I’ve built a very powerful process.
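
    Because ‘os.remove’ deletes files permanently, it may be worth trying a process like this without deleting anything first. Here is a minimal sketch of that idea; the function name purge and the dry_run flag are assumptions added for illustration and do not appear in the script above.

```python
import os


def purge(paths, dry_run=True):
    """Delete each path, or only report it when dry_run is True."""
    removed = []
    for path in paths:
        if dry_run:
            # Report what would happen, but leave the file in place
            print(f'Would delete: {path}')
        else:
            os.remove(path)
            removed.append(path)
    return removed
```

    Running with the default dry_run=True prints the candidates, and switching to dry_run=False performs the deletes once the list looks right.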

    Now that the script is built, it’s time to test it out. To run Python scripts, I like to use SSH Terminal from ACS (Figure 3).

    Figure 3. SSH Terminal.

    When I use SSH Terminal, it opens a PuTTY session for me where I can execute my script (Figure 4).

    Figure 4. Execute Python script.

    When I execute the script, it returns three files that met the criteria I specified. I chose to print the file attributes to the terminal for illustrative purposes before the process deletes the files. When I go back to view the IFS from ACS (Figure 5), I see the three files have been deleted.

    Figure 5. IFS file listing.

    With a small amount of code, I built a very productive piece of software. Using RPG to perform this task, as I have in the past, was just fine, but doing it in Python was a lot easier and the code was much more concise. The complete code for the Python script used in this article is available for download.


    Tags: 400guru, ACS, DB2, FHG, Four Hundred Guru, IBM i, IFS, Python, RPG


    3 thoughts on “Guru: Finding Large Files With Python”

    • Reynaldo Dandreb Medilla says:
      July 16, 2019 at 10:22 am

      another awesome one Mike, your usual fan Dandreb

      • Mike says:
        July 19, 2019 at 12:50 pm

        Thank you, Dandreb. There is more on the way!

    • Bob Cagle says:
      August 6, 2019 at 2:55 pm

      How do we get the SSH Terminal option in ACS? I have the latest version and do not see it listed…


TFH Volume: 29 Issue: 41


Copyright © 2025 IT Jungle