    Guru: AI Pair Programming In RPG With Continue

    March 10, 2025 Gregory Simmons

    In my last article, I shared a brief introduction to the GitHub Copilot extensions. These extensions provide an easy way to get up and running with an AI coding assistant to aid you in your RPG development. Being cloud-based, it’s lightweight in terms of system resources, but it does cost a little money per month, per user.

    For this article, I would like to share with you what I have learned about a newer extension for VS Code named Continue. Continue runs locally on your PC, is 100 percent free, and as of this writing, is the leading open-source AI code assistant.

    While at TechXchange this past October, I had the pleasure of meeting Adam Shedivy, a software developer for IBM. He was generous with his time and gave me my first introduction to a self-hosted AI code assistant with the Continue extension.

    To get started, you need to download a tool that allows you to run large language models (LLMs) locally on your computer. There are several, but for better or worse, I chose Ollama. It seems to be pretty popular, runs on Windows, macOS, and Linux, and is open source. You can download Ollama here: https://ollama.com/download/windows
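
    Once the installer finishes, you can verify that Ollama is available by opening a terminal and checking its version (the exact number you see will differ):

    ollama --version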

    Then, in VS Code, go to the extensions marketplace, search for, and install the Continue extension. Next, I installed one of the LLMs to use. You can learn about the different variants within the Qwen 2.5 Coder series of models here: https://ollama.com/library/qwen2.5-coder I’m currently using both the 1.5b and 7b versions. To install the 7b version, I ran this command in a PowerShell terminal within VS Code: ollama run qwen2.5-coder:7b
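
    If you would rather download a model without immediately dropping into an interactive chat, the pull command does just the download step. For example, to grab both sizes ahead of time:

    ollama pull qwen2.5-coder:1.5b
    ollama pull qwen2.5-coder:7b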

    These LLMs can get quite large, so I don’t recommend downloading one while connected via your phone’s hotspot and waiting for a flight at LaGuardia Airport (guilty). Wait until you’re home or in the office, wherever you’ve got a good, snappy connection. Once done, you can double-check that the LLM is loaded with the list command:
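
    ollama list

    The list output shows each model you have downloaded, along with its size and modified timestamp, so you can confirm the download completed.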

    Next, I added the model into the Continue extension:

    On the ‘Add Chat model’ screen, change the provider to Ollama and change the model to Autodetect, then click Connect. Then, in the dropdown list, you will see an option for autodetect – qwen2.5-coder:7b. In this screen capture, you can see that I have been tinkering with some other models, which have been autodetected as well.
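
    If you prefer to skip the wizard and edit the configuration by hand, the equivalent chat model entry in Continue’s config.json looks roughly like this (the title is just a display label of your choosing):

    {
      "models": [
        {
          "title": "Qwen2.5-Coder 7B",
          "provider": "ollama",
          "model": "qwen2.5-coder:7b"
        }
      ]
    }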

    Now, you’re ready to ask the model anything! The Continue extension supports starting a question with ‘@’ and a subject to add context. This can greatly improve the usefulness of the answers the model returns for you. For example, if you want the response to your request to be contextually based on Db2 SQL, you could start your request with @Db2i. There is a lot to explore here, but perhaps we’ll save that for another time.
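
    For instance, a request might look something like this (the table name here is only a made-up example):

    @Db2i Write an SQL statement that returns the ten customers with the highest order totals from the ORDERS table.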

    The LLM for autocomplete can, and probably should, be different than the one used in chat. Based on Adam’s insight and confirmed by my own experimentation, a smaller LLM returns autocomplete suggestions more quickly, while a larger LLM can offer a more robust response when chatting with it. And from what I have seen thus far, the 1.5b version of Qwen2.5-coder offers code completion suggestions that are just as good as the other versions. And the speed difference is very noticeable!

    To get ‘tab to complete’ functionality similar to GitHub Copilot, you need to edit your Continue config.json file. In VS Code, press F1, then type config.json and press Enter. Add a new line and paste this snippet of JSON:

    {
      "tabAutocompleteModel": {
        "title": "Qwen2.5-Coder 1.5B", 
        "provider": "ollama",
        "model": "qwen2.5-coder:1.5b"
      }
    }
    

    After saving that change, I opened up a source member in my library, and code completion suggestions were fairly quick as well as pretty accurate.

    A discussion of running LLMs locally on your PC would be incomplete without mentioning performance. When I began exploring running Ollama and LLMs locally, I started out by going for what I perceived as the ‘most powerful’ one within the Qwen offering and installed qwen2.5-coder:32b. My laptop is a Dell Precision 3570 with a 12th Gen Intel Core i7-1255U at 1.70 GHz and 64GB of RAM, and I never had enough patience to wait for a response from the 32b LLM. I then loaded the 7b LLM and started getting responses, but I wanted to see if I could speed it up.

    Anytime you’re talking about performance, you need a ‘measuring stick’. To set the baseline and test for an improvement in performance, I returned to my PowerShell terminal within VS Code and instructed Ollama to run the model in verbose mode. Then I asked it my test question: ‘How much rain is the equivalent of 6 inches of snow?’
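
    If you would like to reproduce this on your own machine, the command and prompt looked roughly like this (>>> is Ollama’s interactive prompt):

    ollama run qwen2.5-coder:7b --verbose
    >>> How much rain is the equivalent of 6 inches of snow?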

    I received an interesting response discussing the density of the snow, the weight of water, etc. Since I ran the model with the --verbose switch, I also got some statistics on its performance. The two I focused on were total duration and eval rate:

    total duration:    40.2912856s
    eval rate:         4.63 tokens/s

    Okay, good, that set my baseline of measurement. Now, on a fresh install, these LLMs in the Qwen family were using half of the cores available and no more than half of the system RAM. There’s a lengthy discussion about this here: https://github.com/ollama/ollama/issues/2496. But I thought I would try adjusting the num_thread parameter to match my number of cores: 10.
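
    One quick way to experiment with that setting, without building a custom model, is to set it from inside the interactive session itself; something along these lines:

    ollama run qwen2.5-coder:7b --verbose
    >>> /set parameter num_thread 10

    If you want the change to stick, a Modelfile containing a PARAMETER num_thread 10 line (built with ollama create) makes it permanent under a model name of your choosing.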

    Then I asked the exact same question and, while the response was a little different, it still talked about the weight of rainfall, the density of snow, etc. The metrics showed a considerable improvement:

    total duration:    27.6182394s
    eval rate:         6.82 tokens/s

    This gives us a good starting point. We now know how to load and use the Continue extension, how to evaluate the various LLMs, and how to tweak the performance. I encourage you to do your own research on your system, try the different LLMs and find which combination works best for you.

    When compared to a paid code assistant such as Copilot, Continue is completely free, but it does take a little more setup and may also inspire you to upgrade your RAM. One upside, however, for those who are concerned about processing AI requests in the cloud (as with Copilot): Continue allows you to keep all of your AI requests local.

    Until next time, happy (assisted) coding.

    Gregory Simmons is a Project Manager with PC Richard & Son. He started on the IBM i platform in 1994, graduated with a degree in Computer Information Systems in 1997 and has been working on the OS/400 and IBM i platform ever since. He has been a registered instructor with the IBM Academic Initiative since 2007, an IBM Champion and holds a COMMON Application Developer certification. When he’s not trying to figure out how to speed up legacy programs, he enjoys speaking at technical conferences, running, backpacking, hunting, and fishing.

    RELATED STORIES

    Guru: AI Pair Programming In RPG With GitHub Copilot

    Guru: RPG Receives Enumerator Operator

    Guru: RPG Select Operation Gets Some Sweet Upgrades

    Guru: Growing A More Productive Team With Procedure Driven RPG

    Guru: With Procedure Driven RPG, Be Precise With Options(*Exact)

    Guru: Testing URLs With HTTP_GET_VERBOSE

    Guru: Fooling Around With SQL And RPG

    Guru: Procedure Driven RPG And Adopting The Pillars Of Object-Oriented Programming

    Guru: Getting Started With The Code 4 i Extension Within VS Code

    Guru: Procedure Driven RPG Means Keeping Your Variables Local

    Guru: Procedure Driven RPG With Linear-Main Programs

    Guru: Speeding Up RPG By Reducing I/O Operations, Part 2

    Guru: Speeding Up RPG By Reducing I/O Operations, Part 1

    Guru: Watch Out For This Pitfall When Working With Integer Columns
