Git Is A Whole Lot More Than A Code Repository
March 16, 2022 Jeff Tickner
It is funny to think that two of the most transformative technologies to hit the datacenter in the past several decades are based on projects created by Linus Torvalds. The first, of course, is the Linux kernel, which is the heart of the Linux operating system and which first rolled out in 1991 but didn’t become a real server-class platform until the Dot Com Boom in the late 1990s. The second thing that Torvalds created, out of necessity to help better manage the development and patching of the open source Linux kernel, was the Git repository and version control system.
Torvalds created Git in several days in April 2005, and within two weeks the first merge of multiple branches of development was done on Git itself. From that point on, Git managed its own development, and by the end of the month it was shown to be patching the Linux code at speed. By June, the Linux kernel development project was on Git, and Torvalds tapped Junio Hamano to be Git’s maintainer – a job Hamano still holds to this day.
Git has become an industry standard on most platforms, and aside from its many advantages in terms of version control and concurrent development, it also holds the key to the “generational transition” that we face on IBM i. There are open source people at companies who are wondering why the IBM i platform is not using Git, and the auditors are wondering the same thing. We have reached a point where failing to implement Git on IBM i is a recognized business risk.
There is a reason for that.
Source control and the repository are not the big problem in application development. The challenge is the build – actually putting all the source changes in place so the code can be compiled. To phrase that another way: Getting code into the repository is easy. Getting it back out is the real problem, and it is not just a technical one, but a philosophical one that takes some getting used to.
There is a radical difference between the way software change management traditionally works on the IBM i platform and the way Git and its open source tools work. One such tool is Better Object Builder, or BOB, a build system created specifically for the IBM i platform to work with Git. BOB is an open source package created by S4i Systems, and IBM and others now contribute to it. There are other build systems, but most of them, like BOB, are based on Make, which uses a text file to store instructions on how to compile objects from the repository. And they are very different from traditional software change management on the IBM i platform. However, to scale on IBM i, these tools require additional automation to handle the specifics of the platform. For enterprise applications on IBM i, it is not practical to hand-maintain makefiles, given the risk of error that this can incur. A higher level of automation is required. For example, in the ARCAD system, application metadata and dependency knowledge are automatically updated by the build and used to optimize the build and test automation capabilities of the tool. Only impacted components are recompiled, in the correct order that is required on IBM i.
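To make the idea of recompiling only impacted components in the correct order more concrete, here is a minimal Python sketch. It does not show how BOB, Make, or ARCAD work internally. The object names and the `DEPENDS_ON` map are hypothetical, and the functions `impacted` and `rebuild_order` are invented for illustration. The sketch assumes the dependency information is already known and has no cycles.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical dependency map: each object -> the objects it is compiled against.
DEPENDS_ON = {
    "CUSTOMER.FILE": set(),
    "ORDERS.FILE": set(),
    "ORDENTRY.PGM": {"CUSTOMER.FILE", "ORDERS.FILE"},
    "INVOICE.PGM": {"ORDERS.FILE"},
    "CUSTRPT.PGM": {"CUSTOMER.FILE"},
}


def impacted(changed, depends_on):
    """Return the changed objects plus everything that depends on them, transitively."""
    result = set(changed)
    grew = True
    while grew:
        grew = False
        for obj, deps in depends_on.items():
            if obj not in result and deps & result:
                result.add(obj)
                grew = True
    return result


def rebuild_order(changed, depends_on):
    """List the impacted objects so that each one comes after what it depends on."""
    todo = impacted(changed, depends_on)
    # Keep only the edges between impacted objects; untouched objects are not rebuilt.
    graph = {obj: depends_on.get(obj, set()) & todo for obj in todo}
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    print(rebuild_order({"ORDERS.FILE"}, DEPENDS_ON))
```

With only `ORDERS.FILE` changed, the file comes first, followed by the two programs that use it in either order, and `CUSTRPT.PGM` is left alone. Hand-maintained makefiles have to encode the same relationships manually, which is where the risk of error comes in.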
The difference is that the software change management tools originally created for the IBM i platform were what we call pessimistic with regard to source changes, while Git and the add-on tools for it from the open source world are optimistic about source changes. In a pessimistic world, when a programmer works on a piece of software, that source member is locked so no one else can mess with it, and if concurrent changes are made, they are overlaid on top of each other with the newest code winning out. It is up to the developers to keep track of what is going on inside of the source members, and the promotion is just taking the current members and compiling them.
In the open source world, concurrent development means many programmers are all making changes to source members at the same time, as was happening with the Linux kernel back in 2005 when Torvalds got frustrated and created his own source version control. That meant Git needed an automatic merge function that could see where code has changed within a source member across different branches of code, merge it all together, and assume, optimistically, that it will all work out. You merge, you don’t overwrite. Traditional software change management works at the member level: you have no idea what has changed in the code, but you can see who changed the member and when. Git manages at the line level, so you can see all line changes – and additions of code between lines in the gold copy of the code – and merge them all into a new set of code.
Now here is where Git is very powerful. Because Git is watching code at a line level and is tracking all of the changes at that granularity in source members, it can spot conflicts when the same line is changed in more than one branch and flag that as a potential problem. More importantly, Git offers what we think of as an unlimited Undo button. When something gets messed up, you can walk the code back to a known good state. You just can’t do that with traditional software change management on the IBM i platform.
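The line-level merge and conflict detection described above can be sketched as a simplified three-way merge. This is not Git's actual merge implementation: it is a toy version that uses Python's standard `difflib` to align each edited copy against the common base, and the function names `_match_map` and `merge3` are invented for illustration. The rule it applies is the optimistic one: if only one side changed a region, take that side; if both sides changed it differently, flag a conflict.

```python
from difflib import SequenceMatcher


def _match_map(base, other):
    """Map each base line index to the index of its unchanged copy in `other`."""
    mapping = {}
    matcher = SequenceMatcher(None, base, other, autojunk=False)
    for a, b, size in matcher.get_matching_blocks():
        for k in range(size):
            mapping[a + k] = b + k
    return mapping


def merge3(base, ours, theirs):
    """Merge two edited copies of `base`; return (merged_lines, conflict_count)."""
    in_ours, in_theirs = _match_map(base, ours), _match_map(base, theirs)
    # Sync points: base lines that survive unchanged on both sides.
    sync = [i for i in range(len(base)) if i in in_ours and i in in_theirs]
    merged, conflicts = [], 0
    b = o = t = 0
    for i in sync + [None]:  # None is a sentinel for the end of all three files
        if i is None:
            b_end, o_end, t_end = len(base), len(ours), len(theirs)
        else:
            b_end, o_end, t_end = i, in_ours[i], in_theirs[i]
        b_chunk, o_chunk, t_chunk = base[b:b_end], ours[o:o_end], theirs[t:t_end]
        if o_chunk == t_chunk:      # both sides agree (or neither changed)
            merged += o_chunk
        elif o_chunk == b_chunk:    # only theirs changed this region
            merged += t_chunk
        elif t_chunk == b_chunk:    # only ours changed this region
            merged += o_chunk
        else:                       # both changed it differently: flag it
            conflicts += 1
            merged += ["<<<<<<< ours", *o_chunk, "=======", *t_chunk, ">>>>>>> theirs"]
        if i is not None:
            merged.append(base[i])
            b, o, t = b_end + 1, o_end + 1, t_end + 1
    return merged, conflicts
```

For example, if one developer changes the second line of a four-line member while another changes the last line and appends a new one, `merge3` combines both edits with no conflicts. If both developers change the second line in different ways, the result contains one conflict block for a person to resolve.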
There is another big difference between the traditional IBM i software change management approaches and the Git way. With the IBM i platform, if you put a fix in and something goes wrong, you roll back to the older code that was working. But with open source development, everybody is being agile, everybody is making changes, and you just keep rolling forward, in this case with a fix for whatever went wrong because you don’t want to lose all of those new features everybody was working on.
Now, branches are where Git gets really interesting. Branches are collections of versions of source code being worked on by individuals or groups of developers, and just as Git can merge changes to a source member at the line level, Git can also manage merges of many branches of code into the master branch, or trunk, of code. Branches are not releases, per se, but when branches are merged into the gold standard trunk, the result is what would be used to create a release of the full piece of software.
To be precise, a branch is not really a complete copy of the code, but a lightweight pointer to a commit in the shared history of changes, and that means you can create many, many branches as part of a large software project and still manage it. There is no performance penalty for having many branches, and if you want to cherry pick changes and apply them, you can do that, too. And Git scales up, too. So if you have a big project with 1,500 changes in 1,000 members, you don’t have to open all of those files. You can do a pull request and get a summary of all the changes in a branch, see the deltas, and do approval and peer review of all of those code changes a lot more quickly.
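In Git's own data model, a branch is stored as a named reference to a commit, which is why creating one is cheap. The toy Python model below illustrates that idea and the "walk it back" Undo mentioned earlier. It is only a sketch: the `Commit` and `Repo` classes are hypothetical, they are not Git's API, and merge commits with two parents are left out for brevity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Commit:
    """One recorded state of the code, linked to the state it was built on."""
    ident: str
    message: str
    parent: Optional["Commit"] = None


class Repo:
    """Toy repository: a branch is just a name pointing at a commit."""

    def __init__(self):
        self.branches = {"master": Commit("c0", "initial gold copy")}

    def branch(self, name, start="master"):
        # Creating a branch copies one reference; no source is duplicated.
        self.branches[name] = self.branches[start]

    def commit(self, branch, ident, message):
        # A new commit points back at the old tip, then the branch moves forward.
        self.branches[branch] = Commit(ident, message, self.branches[branch])

    def history(self, branch):
        node, idents = self.branches[branch], []
        while node is not None:
            idents.append(node.ident)
            node = node.parent
        return idents

    def undo(self, branch, steps=1):
        # Walk the branch pointer back to an earlier, known good commit.
        node = self.branches[branch]
        for _ in range(steps):
            if node.parent is None:
                break
            node = node.parent
        self.branches[branch] = node


if __name__ == "__main__":
    repo = Repo()
    repo.branch("fix-123")                       # hypothetical branch name
    repo.commit("fix-123", "c1", "change pricing rule")
    print(repo.history("fix-123"), repo.history("master"))
    repo.undo("fix-123")
    print(repo.history("fix-123"))
```

Work on the hypothetical `fix-123` branch never touches `master` until a merge happens, and undoing a change only moves a pointer, which is what makes having many branches manageable.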
This branching capability of Git is what allows for relatively risk-free product management that assumes – and downright encourages – concurrent development on a large scale, meaning across lots of people and lots of code. That Undo function can mitigate the risk of concurrent development. And after all this merging is done inside of Git, you can invoke ARCAD to automatically recompile the changed RPG and COBOL code on the IBM i platform and deal with all of the dependencies there. Once that build is complete, you are back in the normal promotion process that traditional change management provides.
What is vital in the implementation of Git on IBM i is to smooth the transition from the old traditional way of managing changes to the new Git models. This transition requires a deep integration with Git, flexible enough to handle a variety of different workflows that adapt to the IBM i teams and processes in place. Here at ARCAD Software, for eight years now we have been refining the ARCAD integration with Git to achieve that deep level of integration, which enables traditional IBM i teams and those already familiar with Git to collaborate easily and share a common code repository across all their development platforms. Using the ARCAD system, our customers work with a mix of 5250 and RDi developers branching and merging smoothly over Git.
For tips and techniques as to how to make a success of Git on IBM i, read our White Paper:
[White Paper] Driving DevOps Maturity on IBM i
Jeff Tickner is chief technology officer for North America at ARCAD Software.
This content is sponsored by ARCAD Software.