Thoroughly Modern: DevOps Refactoring Of RPG Applications with RDi
January 11, 2021 Ray Everhart
Regardless of your current situation – namely, the size and age of your codebase – there is great value to be gained from refactoring. In this article, I will explain what refactoring is, provide the business justification, and describe some refactoring best practices.
So what is refactoring? In the strictest sense of the word, refactoring is improving the quality of code without changing what it does. Refactoring is not about enhancements or bug fixes; it is about code quality, making the code more efficient and maintainable. Refactoring also improves readability, which makes QA and debugging go much more smoothly. And while refactoring doesn’t remove bugs, it can certainly help prevent them in the future.
Code refactoring is important if you want to avoid the dreaded code rot. Code rot results from duplicate code, patches made to patches and other programming discrepancies. Having a lot of different developers writing in their own styles can also contribute to code rot, as there is often a lack of cohesion to the overall code.
The reality is, everyone wants quality code, so why haven’t you refactored your IBM i RPG applications? Maybe you’re not sure where to start, maybe you’re not sure how to do it, or even why you should. That’s what we’re going to be covering here. Depending on your situation – operating system version, the version of RPG you have to support, management directives concerning free format, etc. – here is a list of some of the technical areas to consider for your code refactoring effort:
- Table structures that are still program described, and you just want to get to DDL
- Printer files that are program described and you want to move them into DDS
- Free-format RPG not supporting everything in fixed format (e.g., MOVEL or MOVE)
- Indicators that are just numbers and would be better as named indicators
- Variables scattered throughout the code that could be declared at the beginning of the code
- Six- or eight-character variable names that could be longer to make it easier to understand what’s going on inside the code
- RPG Cycle code that would be better as linear main modules
- Global variables that you want to switch to local variables for easier debugging
- Existence of subroutines when your future state is moving toward sub-procedures
- Record level access that might be changed to SQL
- Abstraction of all IO into one program so that there is only one place to change if the underlying database structure changes.
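As a tiny illustration of two of these targets – named indicators and longer variable names – here is a free-format sketch; the names are hypothetical:

```rpgle
**free
// Before: a numbered indicator (*IN71) and cryptic short names (CB, CH).
// After: a named indicator and descriptive variable names.
dcl-s customerIsOnHold ind inz(*off);
dcl-s customerBalance packed(11 : 2) inz(0);

if customerBalance > 1000;
  customerIsOnHold = *on;   // reads as a statement of intent, not a number
endif;
```

The code now documents itself: a maintainer can tell what the indicator means without hunting for the SETON that set it.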
This is just a short list of some of the key areas to target. There are a lot of things you can do that will improve your code, but what should you do? What should you spend your effort on?
What’s The Business Case For Refactoring?
Any refactoring effort should contribute to some measurable outcome, such as increased agility. These are the kinds of things that CXOs and IT leadership will understand and appreciate. What if after this refactoring, you can say: “Future maintenance to this code is going to be reduced by 50%. Bugs are going to be reduced by 80%. I can reduce the amount of code that I have and then make sure that business rules are consistently enforced because I’m going to consolidate my redundant logic.” These are all key measurables that leaders can get behind and support.
Guiding Principles For Refactoring
When you’re building a case for refactoring, you need to find your North Star, your guiding principle, for doing all of this. The things that you focus on while refactoring will further maximize the business value of this asset, this software that you have that provides competitive advantage, that gives you the ability to deliver what your customers need.
Refactoring code is guided by four principles:
- Improving maintainability, so that it will take less time to make a change and get it into production.
- Improving resilience, so that you know when there’s an issue, where it is and how to quickly correct it.
- Improving reusability, so that functionality is isolated to a single occurrence and can be reused by as many programs as necessary. That way, changes take place in one program, not 100.
- Improving testability by becoming more modular and eliminating monolithic code that must all be tested with every small change to ensure nothing was broken by the change.
And that’s how we determine the business justification to refactor – we provide a clear-cut value statement of the work that we’re about to do, with a clear path for defined outcomes.
What Should You Refactor?
When evaluating options for refactoring, you should think about your entire IBM i application portfolio. There may be some programs that you never change and that don’t offer enough value to justify refactoring or converting. There’s no maintenance benefit to making a program easier to maintain if you never change it, and no testing benefit if you’re not changing it.
So, think about the high-value targets. Which programs are the hardest to change? Every shop has a program that nobody wants to work on because it’s hard and takes a long time. That’s an ideal candidate. Which programs provide your company’s competitive edge, deliver distinctive value and need to be maintained? That’s an area you want to invest in, one that allows you to continue to compete and to adapt quickly. Another ideal candidate is a program with a lot of bugs. It has a lot of bugs because it’s hard to maintain, and maybe the original intent of the program no longer aligns with the business process. It’s covered in patches, so refactoring will help you break it down into smaller pieces before you improve it.
Which programs get changed most often? And which programs are keeping you from meeting your business needs? With the pandemic, everyone is acutely aware that we need to be able to change the way we do business very quickly. What programs are keeping you from adjusting how you do business? What programs are keeping you from implementing new things that would drive extra value to your company? Those are the ideal candidates for doing this refactoring work.
Refactoring Best Practices
There are three main best practices:
- Test early and test often.
- Make incremental changes.
- Make no functional changes while you’re refactoring.
Best practice 1: Test early and test often
You need to think about testing before you even begin to refactor, as you need to identify all of the objects that are going to be part of the test. What do you need in order to test a program? What files, what data areas, what data queues?
You also need to make sure that the test can be isolated – you don’t want any files getting changed by mistake that could impact clients or other developers’ work, and certainly not production. Fresche’s X-Datatest is a tool that greatly simplifies testing and test data management.
Then you need to take a look at the code and see what kind of testing you can do. The important thing is to make sure that the refactoring process changes nothing in the way the program works. You need a base level test that tells you that this is what it did before you started, and you can run it again to make sure that there are no differences.
The test needs to be reusable – this is not something that you want to do manually. You want to have this process set up in your test environment. Get the objects you need into a library so that when it comes time you can restore them to the test environment, execute the test, and report on the differences from the previous execution.
You also need to make sure that your testing script is complete. Using IBM’s Code Coverage tools in Rational Developer for i is a great way to understand how good your test process is because it will tell you line-by-line what was executed. You can identify as a result of that coverage that there may not be enough data in your tables to support your test cases. What if you have logic that’s triggered on a particular customer class and you don’t have that class in any of your test data? Well, that block of code will never get executed. Likewise, you may need to ensure that an interactive program takes all possible paths through the application. Making sure that you have the data that you need is very important to this testing cycle that you’ll use over and over again.
I’ve included additional information about testing at the end of this article.
Best practice 2: Make incremental changes
In refactoring, you’re going to make small changes and you’re going to test, so that’s the pattern. I have a number of different phases in this section. I put them in order of the way that I would approach it, but these are the kinds of things you want to consider. After every type of activity, after every phase, you have regression testing because you want to make sure that none of the functionality was inadvertently changed. The reason to have multiple phases is to be able to deliver value in each phase. The ability to stop refactoring when priorities change and still deliver some improvement is key to demonstrating the business value of what you are doing.
Phase 1: Convert RPG to free format

This improves maintainability, but not resilience, reusability or testability, so it’s more of a stepping stone – the thing that has to be done first. Another issue: there are patterns in fixed-format RPG that aren’t part of free-format RPG. You can’t do a MOVE, so what are you going to do when you hit opcodes for which there is no free-format equivalent? The conversion tools that come with RDi (and other tools as well) can help with this, but you have to keep these unsupported patterns in mind.
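To make the MOVE/MOVEL problem concrete, here is a small free-format sketch showing the usual replacements; the variable names are hypothetical:

```rpgle
**free
dcl-s source char(10) inz('ABCDE');
dcl-s target char(3);
dcl-s amount packed(7 : 2) inz(123.45);
dcl-s digits char(7);

// Fixed-format MOVEL copied characters left-adjusted; for character
// fields, plain EVAL does the same, truncating on the right:
target = source;                  // target = 'ABC'

// MOVE of a number into a character field copied its digits; the
// %editc built-in with the 'X' edit code is the usual replacement:
digits = %editc(amount : 'X');    // digits = '0012345'
```

MOVE and MOVEL also handled numeric/character conversions implicitly; in free format, built-in functions such as %editc, %char and %dec make those conversions explicit.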
Phase 2: Introduce sub-procedures into the code

Relatively small and self-contained, sub-procedures advance all four refactoring principles: maintainability, resilience, reusability and testability. This is a high-value activity. The main activities involved in this phase:

- Remove the RPG Cycle and use a named entry procedure
- Convert subroutines into sub-procedures
  - Introduce error handling and instrumentation via templates or snippets (for reference, see the presentation I did in the COMMON webcast library)
  - Define work and counter variables locally to the sub-procedure
- Run regression tests
- Create unit tests

You now have sub-procedures that are encapsulated: everything is defined within the sub-procedure or its interface, so you can write unit tests over it. Unit tests are a great way to ramp up your testing speed. Once you have them, you’ll immediately know if something you did changed the way the code works.
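As a sketch of what this phase can produce – assuming IBM i 7.2 or later, where prototypes are optional for procedures defined in the same module, and with hypothetical names – here is a subroutine rewritten as a sub-procedure with a linear main:

```rpgle
**free
ctl-opt main(ProcessOrders);   // linear main: no RPG Cycle

// Before: an EXSR CalcTax subroutine reading and setting global fields.
// After: a self-contained sub-procedure with an explicit interface.
dcl-proc CalcTax;
  dcl-pi *n packed(11 : 2);
    orderAmount packed(11 : 2) const;
    taxRate packed(5 : 4) const;
  end-pi;

  // The work variable is local, so it cannot leak state between calls
  dcl-s tax packed(11 : 2);

  tax = orderAmount * taxRate;
  return tax;
end-proc;

dcl-proc ProcessOrders;
  dcl-s orderTotal packed(11 : 2) inz(100.00);
  dcl-s taxDue packed(11 : 2);

  taxDue = CalcTax(orderTotal : 0.0825);
end-proc;
```

Because CalcTax touches nothing outside its parameter list, a unit test can call it with known inputs and assert on the return value.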
Phase 3: Perform database abstraction

Maintainability, resilience, reusability and testability are all checked off by this activity. Separating database access from the program lets you implement changes to the underlying database more quickly, ensures that all database rules are enforced consistently, and makes testing easier: once you test the database abstraction program, all you have to do is integration testing on the programs that use it. The main activities:

- Move file declarations and file access to separate procedures in a separate module
- Add compiler directives to make prototype and parameter declarations available via /copy
- Run regression tests

Any other program that uses this same IO pattern now needs to be updated, but you’ve written the routine once; you just have to call it from all of the other programs that need it.
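A minimal sketch of what one such IO procedure might look like, assuming a keyed file CUSTFILE with a record format CUSTREC (all names are hypothetical); the prototype would be shared with callers via /copy:

```rpgle
**free
ctl-opt nomain;

// Hypothetical IO module: the only code that touches CUSTFILE.
dcl-f custfile keyed usage(*input) qualified;

dcl-proc GetCustomer export;
  dcl-pi *n ind;                              // *on when the customer exists
    custNo packed(7 : 0) const;
    cust likerec(custfile.custrec : *input);  // caller receives the row
  end-pi;

  chain custNo custfile.custrec cust;
  return %found(custfile);
end-proc;
```

If the table later moves to SQL access, only this procedure changes; every caller keeps the same GetCustomer interface.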
Phase 4: Convert global variables and data structures to templates

At this point you have linear procedures with no main processing and a named routine for the entry point. Now you need to get rid of as many global variables as you can. The main activities:

- Add the TEMPLATE keyword to every global variable and data structure
- Add declarations to each sub-procedure that references the template for variables or for parameters
- Run regression tests
- Create unit tests for sub-procedures
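A brief sketch of the TEMPLATE/LIKEDS pattern, with hypothetical names:

```rpgle
**free
// TEMPLATE turns a global definition into a reusable type; it allocates
// no storage, so nothing can accidentally read or write it.
dcl-ds address_t qualified template;
  street char(30);
  city char(20);
  postalCode char(10);
end-ds;

dcl-proc FormatAddress;
  dcl-pi *n char(70);
    addr likeds(address_t) const;   // each caller passes its own local copy
  end-pi;

  return %trimr(addr.street) + ', '
       + %trimr(addr.city) + ' '
       + addr.postalCode;
end-proc;
```

Each sub-procedure now declares its own LIKEDS copy, so state lives in locals and parameters rather than in shared globals, which also makes debugging easier.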
Phase 5: Make the code easier to read and maintain

Maybe you have constants that appear in multiple programs; those should be in a copy file. Maybe you have literals that you want to switch over to named constants to make the code easier to read. Use named indicators if you choose to continue to use indicators. Maybe you’ll start using long variable names – RDi has some great ways to change all of the variable names in your code and do it consistently. This is part of what will be shown in the next blog post.
Phase 6: Carry out reorganization and further encapsulation

This phase takes extra effort, which is why phase 5 tried to make things easier to read first. Some of the main activities in phase 6:

- Create sub-procedures for each application requirement
- Create unit tests
- Create service programs
- Run regression tests

You may be in phase 6 for quite some time, because once again, you want to be able to justify the value of what you’re doing. You may decide that getting through phase 4 is all you need right now; in some cases, getting to phase 2 is enough. Phase 6 covers a broad range of activities: create a punch list of the different goals you want to achieve within the program and, based on their value, work through them opportunistically.
Best practice 3: Make no functional changes while you’re refactoring
This is going to be the hardest best practice to enforce and likely the thing you fight with most: no new features and no bug fixes during refactoring. If you make a functional change, your whole testing process has to change, because you no longer have that initial starting point. You have to redo the baseline work and capture a new base state so that future refactoring has a known result to compare against.
No functional changes is one of the most important rules when you’re refactoring. You don’t want this to turn into the project that never ends, which is why it’s important to cycle through this, do a small change, test it, verify it. And if you have to stop right there, you can promote that change to production and then come back to it later. So that’s the important thing that you need to take away from this – small incremental changes and repeated testing are what’s necessary to ensure success when you’re refactoring.
A few words about comparing files. What are your options for testing? Since the goal is to build a library of reusable, automated tests, you are either going to use a tool like X-Datatest or write your own tools. There are a number of options on IBM i. There’s QSYS2.COMPARE_FILE, available on IBM i 7.4. There’s the Compare Physical File Member (CMPPFM) command, but that’s really designed to compare source physical file members. There’s also the cmp utility in Qshell – that’s another option you can use to build your testing routine. But let’s spend a few minutes on an approach for testing using the EXCEPT clause in SQL.
Suppose you have two tables, MyData1 and MyData2, both with the same format and some records. Select the rows from MyData1 EXCEPT the rows from MyData2: that returns anything in MyData1 that isn’t in MyData2. Then UNION it with the same comparison in the reverse order, MyData2 compared to MyData1, since each table could have rows that the other lacks. The union gives you a result set showing which rows differ. In this case, the only column selected is the key, so that you can drill down later, but this is a very easy way to compare tables against each other.
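A sketch of that statement, assuming the key column is named KEYCOL:

```sql
-- Rows that differ between MyData1 and MyData2, in either direction.
-- KEYCOL is an assumed key column name.
( SELECT keycol FROM MyData1
  EXCEPT
  SELECT keycol FROM MyData2 )
UNION
( SELECT keycol FROM MyData2
  EXCEPT
  SELECT keycol FROM MyData1 )
```

The parentheses matter: EXCEPT and UNION have equal precedence and are evaluated left to right, so without them the statement would not compute the difference in both directions.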
Ideally, you want to automate this technique by creating an SQL script that creates a view that does that EXCEPT clause and compares the before and after tables for every table that is in the test environment.
- Create an SQL script that:
  - Creates a view that compares the before/after tables for each table used in your test
  - Extracts the count of differences detected for each file and summarizes them in a “Results” file
- Run the script each time you test
- Check the Results file to see if there have been any changes
- Use the specific view to drill down into any file with differences
Here’s how the pieces of the SQL script fit together. First, create the view: it’s the same EXCEPT/UNION statement as before, with a CREATE OR REPLACE VIEW at the top. Next, create the Results table. This is where I write out the name of the file and how many differences were detected; I’ve added a third column, a timestamp, so that when I’m looking at this file, I know when a particular script was run. Finally, populate the table: it’s a simple matter of inserting into Results any differences that were detected. In my example, comparing the two copies of my data table found three differences.
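Putting those pieces together, here is a sketch of the script; the table, view and column names are illustrative:

```sql
-- 1. A view holding the differences, in both directions, for one table pair:
CREATE OR REPLACE VIEW MyData_Diff AS
  ( SELECT keycol FROM MyData1
    EXCEPT
    SELECT keycol FROM MyData2 )
  UNION
  ( SELECT keycol FROM MyData2
    EXCEPT
    SELECT keycol FROM MyData1 );

-- 2. The Results table: file name, difference count and a run timestamp:
CREATE TABLE Results (
  file_name  VARCHAR(10),
  diff_count INTEGER,
  run_time   TIMESTAMP DEFAULT CURRENT TIMESTAMP
);

-- 3. Record the number of differences detected for this file:
INSERT INTO Results (file_name, diff_count)
  SELECT 'MYDATA', COUNT(*) FROM MyData_Diff;
```

You would repeat the view and INSERT for each table in the test environment, then query Results after every run.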
So now, as a result of this, I can look at the Results table, or even summarize all of the differences and ask: did I find any differences at all? There’s no end to how you can set this up to help you do comparisons. You could also create more views that tell you how many rows were added, how many were deleted and how many were changed, because those are all things you’ll want to keep track of while you’re doing the test. Once you’ve established that the program is working the way you intend, you know that’s the set of tables you want to use going forward.
This has given you a very brief overview of how to set up automated testing. There’s still a lot that needs to be done, but keep in mind that having a way to automate your testing is going to be crucial to refactoring. And the effort that you put into automating testing is going to pay dividends over and over again because every time you have to change, you can run the test automatically to quickly validate that nothing was broken by any recent change.
In an upcoming post on the Fresche blog, I’m going to describe how to use RDi to accomplish what I’ve talked about in this post. See you there!
Ray Everhart is senior product manager of X-Analysis at Fresche Solutions. Everhart has spent years helping IBM i companies by assessing their RPG, COBOL and CA 2E (Synon) applications and processes to improve business outcomes. As product manager for X-Analysis, he works closely with IBM i customers to understand their business goals and technical needs in order to drive innovation within the product suite.