• The Four Hundred
    Guru: Use SQL To Find Duplicate Source Code

    March 12, 2018 Ted Holt

    According to Brian Tracy, “good habits are hard to develop but easy to live with; bad habits are easy to develop but hard to live with. The habits you have and the habits that have you will determine almost everything you achieve or fail to achieve.” This is as true in programming as in anything else we may do.

    Unfortunately, even those of us who strive for good work habits often have to follow the work of people who did not. One bad habit I come across occasionally is known in software engineering as WET solutions. WET stands for “write everything twice” or “we enjoy typing” or “waste everyone’s time.” The antidote is the DRY principle: “don’t repeat yourself.”

    Not long ago I had to modify a 13,000-line RPG program, the sort of thing that is beyond the capacity of my little brain to comprehend. I could tell there was repetition in the code, but how did I find it? I used SQL.

    It may seem strange to use SQL for source code, but source code is data. It’s output from a programmer and input to a compiler. Since it’s stored in source physical files, using SQL to query it — and even to modify it — is a cinch.

    A source physical file has three fields, which the Display File Field Description (DSPFFD) command will show you. They are: SRCSEQ (sequence number), SRCDAT (change date), and SRCDTA (source data). You will probably ignore the source date.
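
    If you would rather stay in SQL than run DSPFFD, the catalog should be able to show the same three fields. This is only a sketch: it assumes the QSYS2.SYSCOLUMNS catalog view is available to you, and SOMELIB and SOMEFILE are placeholder names for your own library and source physical file.

    -- List the fields of a source physical file.
    -- SOMELIB and SOMEFILE are placeholders, not real objects.
    select column_name, data_type, length, numeric_scale
      from qsys2.syscolumns
     where table_schema = 'SOMELIB'
       and table_name   = 'SOMEFILE'
     order by ordinal_position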

    To query a source member, create an alias. If you query the source physical file itself, you will access the first member, which is not the first member alphabetically, but the one that was added first. It will likely not be the member you want.

    create or replace alias qtemp.tempalias
     for somelib.somefile(somembr)
    

    In this example, I cleverly named the alias TEMPALIAS and put it in the QTEMP library. When you reference TEMPALIAS in an SQL statement, the database manager will access member SOMEMBR in source physical file SOMEFILE in library SOMELIB.
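
    Before going further, it may be worth confirming that the alias points at the member you expect. A minimal check, assuming the TEMPALIAS alias created above, is to read the first few lines in sequence order:

    -- Peek at the first ten lines of the member behind the alias.
    select srcseq, srcdat, srcdta
      from qtemp.tempalias
     order by srcseq
     fetch first 10 rows only

    Because the alias lives in QTEMP, it goes away when the job ends. If you want to remove it sooner, DROP ALIAS QTEMP.TEMPALIAS should do it.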

    Now let’s look for duplicate code.

    with source as 
       (select s.srcseq, s.srcdta
          from qtemp.tempalias as s
         where substr(s.srcdta,7,1) <> '*'
           and substr(s.srcdta,8  ) <> ' '
           and substr(s.srcdta,6,1) =  ' ')
     select a.srcseq, b.srcseq, a.srcdta, b.srcdta
       from source as a
       join source as b
         on trim(a.srcdta) = trim(b.srcdta)
        and a.srcseq < b.srcseq
    

    I began with a common table expression, SOURCE, to select the records that I want to include in the query. The important part of this expression is the WHERE clause, because that’s where you specify which lines of source code to include. I remove comment lines (an asterisk in column 7) and lines that are blank from column 8 onward. In this example, I also included a condition to select only rows with a blank in column 6, which in the 13,000-line program meant free-form calculations only. The WHERE clause varies widely depending on the type of source code you are analyzing (fixed-form RPG, free-form RPG, DDS, CL, etc.) and the preferences of the person or persons who wrote the code.

    In the main query, I joined the source member to itself, looking for lines that matched but with different sequence numbers. By selecting records where the sequence number in the primary file was less than the sequence number in the secondary file, I reduced the size of the result set and yet found the duplicate code I was looking for.
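
    The self-join returns one row per pair, so a line that appears n times produces n(n-1)/2 rows. If that is more detail than you need, one alternative (my variation, not the query above) is to group the lines and count them. The sketch below reuses the same SOURCE expression; the column names CODE_LINE, OCCURRENCES, FIRST_SEQ, and LAST_SEQ are made up for illustration.

    with source as
       (select s.srcseq, s.srcdta
          from qtemp.tempalias as s
         where substr(s.srcdta,7,1) <> '*'
           and substr(s.srcdta,8  ) <> ' '
           and substr(s.srcdta,6,1) =  ' ')
     -- One row per distinct line that occurs more than once.
     select trim(srcdta) as code_line,
            count(*)     as occurrences,
            min(srcseq)  as first_seq,
            max(srcseq)  as last_seq
       from source
      group by trim(srcdta)
     having count(*) > 1
      order by occurrences desc, first_seq

    The most-repeated lines come out first, which gives a quick idea of where to start looking. The pairwise query is still the one to use when you need every location.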

    In this example, I used the TRIM function in the join in case the person who copied the code from one spot in the program to another shifted the code. In other situations, you may not want to trim the blanks. Your query doesn’t have to be perfect — you’re not running your business on it. It only has to find duplicate code.
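
    How strict the comparison is depends entirely on the join condition. As one possible variation, assuming you also want to treat lines that differ only in letter case as duplicates, you could fold both sides to upper case:

    with source as
       (select s.srcseq, s.srcdta
          from qtemp.tempalias as s
         where substr(s.srcdta,7,1) <> '*'
           and substr(s.srcdta,8  ) <> ' '
           and substr(s.srcdta,6,1) =  ' ')
     select a.srcseq, b.srcseq, a.srcdta, b.srcdta
       from source as a
       join source as b
         -- Ignore both leading/trailing blanks and letter case.
         -- Use rtrim() instead of trim() if indentation should count.
         on upper(trim(a.srcdta)) = upper(trim(b.srcdta))
        and a.srcseq < b.srcseq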

    Of course, you can also use SQL to look for code that was duplicated between two source members. You need only create two aliases.
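
    Here is a rough sketch of the two-member case. The alias names ALIAS1 and ALIAS2 and the member names MBR1 and MBR2 are placeholders of mine; substitute your own. The WHERE clause is the same kind of filter as before and will need adjusting for your source type.

    create or replace alias qtemp.alias1
     for somelib.somefile(mbr1);

    create or replace alias qtemp.alias2
     for somelib.somefile(mbr2);

    -- Lines that appear in both members, ignoring comments and blank lines.
    select a.srcseq as seq1, b.srcseq as seq2, a.srcdta
      from qtemp.alias1 as a
      join qtemp.alias2 as b
        on trim(a.srcdta) = trim(b.srcdta)
     where substr(a.srcdta,7,1) <> '*'
       and substr(a.srcdta,8  ) <> ' ';

    Notice that the comparison of sequence numbers is gone. Sequence numbers in two different members have nothing to do with one another, so every matching pair is of interest.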

    You use SQL to help other people do their work. Why not use SQL to help you do yours?

    RELATED STORY

    Don’t Repeat Yourself

    Tags: 400guru, CL, DDS, FHG, Four Hundred Guru, Guru, IBM i, RPG, SQL

TFH Volume: 28 Issue: 19

Copyright © 2025 IT Jungle