Guru: Read a Data Area As a One-Row Table with SQL, Take Two
August 26, 2019 Ted Holt
Fifteen years ago, reader W.G. asked me about the possibility of treating a data area as a one-row table (a physical file with one record) in an SQL query. The question intrigued me because in my System/36 days, I had often wished that I could access the local data area (LDA) as a one-record data file in a query.
Today, thanks to Scott Forstie and his team at IBM, I update my response to W.G. with more information. It’s not that the technique I presented in 2004 is outdated — it’s as relevant as ever — but that the fine folks at IBM have given us yet another tool, and to my way of thinking, you can never have too many tools.
The tool is a user-defined table function (UDTF) called DATA_AREA_INFO. My UDTF — GetPlantInfo — accesses a certain data area, one with the name of PLANT. IBM’s table function accesses any data area. My table function returns a single row with columns named NUMBER and LOCATION. DATA_AREA_INFO returns one column named DATA_AREA_VALUE, which you must break into separate values. GetPlantInfo accepts no arguments (or parameters, if you prefer), because it doesn’t need any. DATA_AREA_INFO accepts three arguments:
- The name of the data area. Special values *LDA, *PDA and *GDA are supported.
- The library that contains the data area. Special values *LIBL and *CURLIB are supported.
- A YES/NO value that tells whether or not to ignore errors.
Retrieving the plant ID number and plant location from the data area is easy.
select substr(data_area_value, 1, 4) as Number, substr(data_area_value, 5, 20) as Location from table(qsys2.data_area_info('PLANT', '*LIBL'));
Let’s look at how you might use DATA_AREA_INFO as I used it in the original article. Let’s use the PLANT data area in a query for row selection. This is the query I published 15 years ago:
select pxact.PlantID, pxact.name, plant.location from pxact join table(GeTPlantInfo()) as plant on plant.number = pxact.PlantID
When this query runs, it selects only the PXACT rows that have the ID of the plant specified in the data area.
Here’s the same query using DATA_AREA_INFO:
select substr(da.data_area_value, 1, 4) as Number, pxact.name, substr(da.data_area_value, 5, 20) as Location from pxact join table(qsys2.data_area_info('PLANT', '*LIBL')) as da on substr(da.data_area_value, 1, 4) = pxact.PlantID;
A little messy, perhaps, but nothing we can’t deal with. How might we clean it up?
One way would be to use a common table expression to take all the messy substringing out of the main query.
with plant as (select substr(data_area_value, 1, 4) as Number, substr(data_area_value, 5, 20) as Location from table(qsys2.data_area_info('PLANT', '*LIBL'))) select pxact.PlantID, pxact.name, plant.location from pxact join plant on plant.number = pxact.PlantID;
There’s by no means anything wrong with this approach. However, since this sort of thing is likely to be used throughout a system, why not make something more permanent, such as a user-defined table function just for plant information?
create function GetPlantInfo () returns table (Number char(4), Location char(20)) language sql reads sql data no external action deterministic disallow parallel return select substr(data_area_value, 1, 4) as Number, substr(data_area_value, 5, 20) as Location from table(qsys2.data_area_info('PLANT', '*LIBL'))
Yes, folks, we’re come full circle and arrived in 2004. We can run the same query we ran then. The difference is that we’ve used DATA_AREA_INFO instead of RPG to build our own UDTF.
Regardless of whether you create a UDTF over DATA_AREA_INFO or choose to do the substringing, it’s probably a good idea to retrieve the data area values only once in a program. Here’s a stored procedure (a program) that needs the plant number and location in various queries. It uses the GetPlantInfo (either my original one or the new one built over DATA_AREA_INFO) to place the plant ID number and location into variables, which can be used in the SQL statements that follow.
create or replace procedure MyProc begin declare v_PlantID char( 4); declare v_PlantLocation char(20); select plant.Number, plant.Location into v_PlantID, v_PlantLocation from table(GeTPlantInfo()) as plant; . . . statements use v_PlantID and v_Location . . . end;
There you have it. Two ways to access a data area in SQL queries.
For more information, visit DeveloperWorks. Also, Simon Hutchinson wrote about this topic recently on his blog, RPGPGM.COM. He also covered the DATA_AREA_INFO view, which is beyond the scope of this article.
Finally, Nick Litton took my original code and modified it for his purposes. I am always thrilled when someone takes something I’ve done and improves it, modifies it for other purposes, or both.