fhg
Volume 7, Number 43 -- December 12, 2007

Use SQL User-Defined Functions to Avoid Data Repetition Problems

Published: December 12, 2007

by Michael Sansoterra

By now, most System i and AS/400 developers know they can enhance the SQL language in DB2 for i5/OS by creating their own user-defined functions (UDFs). In this tip I'd like to focus on one common problem encountered when writing queries and how a simple UDF template can often solve this problem.

By way of review, in case you're not familiar with them, UDFs are pieces of code that can bolt onto SQL. UDFs usually contain business logic for doing complex tasks that would be difficult or cumbersome to do using the SQL built-in functions. There are a variety of UDFs, but for now we will only consider a scalar UDF, which is a UDF that accepts zero or more input parameters and returns a single output parameter.

Now let's review this typical problem encountered when writing queries. To begin, a user requests a simple report that is formatted for Excel.

Take the following sales order example:

Select OrderId,OrderDate, 
       CustomerId,ItemId,ItemDesc,Qty,ExtAmount
  From Order O
  Join OrderDetail D On D.OrderId=O.OrderId
Where D.ItemType='ELECTRONIC'
   And D.Qty>10

You finish the query, dump the data to Excel and you're done in under half an hour. Now the user comes back and wants to see all the serial numbers associated with each item number. You rewrite the query:

   Select OrderId,OrderDate, 
          CustomerId,ItemId,ItemDesc,Qty,ExtAmount,
          SerialNo
     From Order O
     Join OrderDetail D On D.OrderId=O.OrderId
Left Join Serial S On S.OrderId=D.OrderId
                  And S.ItemId=D.ItemId
    Where D.ItemType='ELECTRONIC'
      And D.Qty>10

Now the user is getting the desired serial numbers. However, the unintended side effect is that the detail line is getting repeated for each serial number. If a quantity of 11 is sold for an item, then theoretically there should be 11 serial numbers that give you the same detail line repeated 11 times. The order detail data is the primary focus of the query so repeating detail data to show serial numbers just creates a mess.

What can we do about this? Well, one solution is to let the user eliminate the duplicates in Excel, which can be time consuming. If you have a fancy Excel creation utility you may have the luxury of removing the duplicates, but usually that requires extra programming.

Another possible solution is to combine this ancillary repetitive data all into one column. In this scenario the serial number data would be grouped into one large column, each serial number being concatenated together by a delimiter of some kind. This solution will keep the order detail data unique so that data isn't repeated due to a join with the serial number file. A UDF is well suited to perform this task.

Shown below is a sample UDF written with the SQL procedural language that will receive the "OrderId" and "ItemId" as parameters. The serial numbers for the given order and item combination will be joined together (delimited by a space) and returned in one large variable character (VARCHAR) column.

Create Function GetSerialList
(@OrderId Int,
 @ItemId  Int)
Returns VarChar(1024)
Language SQL
Reads SQL Data
Set Option Commit=*None, UsrPrf=*Owner, DatFmt=*ISO 
Begin
    Declare @List VarChar(1024) Not Null Default '';
    Declare @i    Int           Not Null Default 0;

    For SerialList AS SerialNumbers Cursor For
    Select RTrim(SerialNo) As Serial
      From Serial
     Where OrderId=@OrderId
       And ItemId=@ItemId
    Do 
        Set @i=i+1;
        If @i>1 Then 
            Set @List=@List||'' ';
        End If;
        Set @List=@List||Serial; 
    End For; 
    Return List;
End

This function uses the SQL FOR statement as a control statement that provides a template for loop processing. Every row returned from the FOR statement's cursor is processed by any statements a developer wants to sandwich between the DO and END FOR statements. These statements can make use of declared variables or columns defined in the cursor's SELECT clause. If you have column expressions in your SELECT, make sure you assign alias names so that they can be used within the row processing code.

In our sample code, the only necessary functions to perform are to increment a counter variable (@i) and to append the serial number (column Serial) from the cursor to the @List variable. When all rows are processed, the FOR statement is exited automatically and the @List variable is returned as the function's result. In fact, this template can be summarized as follows in four easy steps:

  1. Declare counter and list variable
  2. Declare FOR statement with appropriate cursor
  3. Within the FOR statement build a delimited list
  4. Return the list

Now back to our original query. We can use the GETSERIALLIST UDF to give us all serial numbers in a single column:

Select OrderId,OrderDate, 
       CustomerId,ItemId,ItemDesc,Qty,ExtAmount,
       GetSerialList(OrderId,ItemId) As SerialNos
  From Order O
  Join OrderDetail D On D.OrderId=O.OrderId
Where D.ItemType='ELECTRONIC'
   And D.Qty>10

Now instead of duplicating rows we simply create a large column for holding repetitive data. Of course, depending on the type and length of repetitive data, jamming it all in a single column may not always work. However, users I have worked with most often welcome this solution. A few more examples are:

  • Combining multi-line order comments into a single column
  • Combining item numbers into a delimited list for an order summary query
  • Combining release dates for a blanket purchase order line item
  • Creating a list of possible vendors that can supply a given item

Using this simple UDF template to combine repetitive data into a single column often provides an easy and effective way to deliver data to the user without having to muck up a query with additional joins.

Michael Sansoterra is a programmer/analyst for i3 Business Solutions, an IT services firm based in Grand Rapids, Michigan. You can contact him through our contact page.




                     Post this story to del.icio.us
               Post this story to Digg
    Post this story to Slashdot


Sponsored By
HELP/SYSTEMS

Discover Robot/SECURITY, the i5/OS security monitoring
and auditing software, from Help/Systems

                                                            · Profile Exchange
                                                            · Exit Point Monitoring
                                                            · System Authority Auditing
                                                            · QAUDJRN Monitoring
                                                            · System Security Analysis

Robot/SECURITY is the only security software that
combines five critical System i security tools in one package.

Visit our Web site at http://www.helpsystems.com/400g-sec/
to learn more about Robot/SECURITY.


Senior Technical Editor: Ted Holt
Technical Editors: Howard Arner, Joe Hertvik, Shannon O'Donnell, Kevin Vandever
Contributing Technical Editors: Joel Cochran, Wayne O. Evans, Raymond Everhart,
Bruce Guetzkow, Brian Kelly, Marc Logemann, David Morris
Publisher and Advertising Director: Jenny Thomas
Advertising Sales Representative: Kim Reed
Contact the Editors: To contact anyone on the IT Jungle Team
Go to our contacts page and send us a message.

Sponsored Links

looksoftware:  Present your core System i applications in Outlook, Google and Notes
COMMON:  Join us at the annual 2008 conference, March 30 - April 3, in Nashville, Tennessee
NowWhatJobs.net:  NowWhatJobs.net is the resource for job transitions after age 40


 

IT Jungle Store Top Book Picks

The System i RPG & RPG IV Tutorial and Lab Exercises: List Price, $59.95
The System i Pocket RPG & RPG IV Guide: List Price, $69.95
The iSeries Pocket Database Guide: List Price, $59.00
The iSeries Pocket Developers' Guide: List Price, $59.00
The iSeries Pocket SQL Guide: List Price, $59.00
The iSeries Pocket Query Guide: List Price, $49.00
The iSeries Pocket WebFacing Primer: List Price, $39.00
Migrating to WebSphere Express for iSeries: List Price, $49.00
iSeries Express Web Implementer's Guide: List Price, $59.00
Getting Started with WebSphere Development Studio for iSeries: List Price, $79.95
Getting Started With WebSphere Development Studio Client for iSeries: List Price, $89.00
Getting Started with WebSphere Express for iSeries: List Price, $49.00
WebFacing Application Design and Development Guide: List Price, $55.00
Can the AS/400 Survive IBM?: List Price, $49.00
The All-Everything Machine: List Price, $29.95
Chip Wars: List Price, $29.95


 
The Four Hundred
Database Tool Maker Joins the System i Market

State of the System i: Other Software Makers Weigh In

IDC Says Server Buyers Weigh Economy and Power in Q3

As I See It: What's Past Is Prologue

The Linux Beacon
AMD Stalled by a Bug in Barcelona Opterons

Red Hat Goes Grid and Real Time with Enterprise MRG Distro

IDC Says Server Buyers Weigh Economy and Power in Q3

As I See It: What's Past Is Prologue

Four Hundred Stuff
Above Security Takes i5/OS Log Aggregation to Heart

Shield's Remote Journal-Based DR Solution Matures at V2R1

Putting the 'i' Back Into PCI

Pat Townsend and BalaBit Pair Up to Cover System i Logs

Big Iron
Sine Nomine Shows Off Solaris on System z

Top Mainframe Stories From Around the Web

Chats, Webinars, Seminars, Shows, and Other Happenings

System i PTF Guide
December 8, 2007: Volume 9, Number 49

December 1, 2007: Volume 9, Number 48

November 24, 2007: Volume 9, Number 47

November 17, 2007: Volume 9, Number 46

November 10, 2007: Volume 9, Number 45

November 3, 2007: Volume 9, Number 44

The Windows Observer
Windows Anti-Piracy Program Gets Stronger, Weaker with Vista SP1

Exchange Server 2007 SP1 Goes RTM

SAP-Microsoft Mega-Merger Rumor Surfaces, Then Dies

Be My Guest

The Unix Guardian
Sine Nomine Shows Off Solaris on System z

Q&A with Jim Herring: The View from the Top

Sun to Release xVM Virtualization Under GPL v3 License

Be My Guest

Four Hundred Monitor
Four Hundred Monitor's
Full iSeries Events Calendar

THIS ISSUE SPONSORED BY:

Help/Systems
ProData Computer Services
DRV Technologies


Printer Friendly Version


TABLE OF CONTENTS
Use SQL User-Defined Functions to Avoid Data Repetition Problems

Stuff I Didn't Publish This Year

Admin Alert: Getting Started with Trial Capacity on Demand, Part 2

Four Hundred Guru

BACK ISSUES

From the IT Jungle Forums
Choose Logical File Format with SQL

IBM 6400 on LPT1 prints junk

Reallocate disk space from one LPAR to another

How to retrieve a workstation ID

Finding *OUTFILE Template Files





 
Subscription Information:
You can unsubscribe, change your email address, or sign up for any of IT Jungle's free e-newsletters through our Web site at http://www.itjungle.com/sub/subscribe.html.

Copyright © 1996-2008 Guild Companies, Inc. All Rights Reserved.
Guild Companies, Inc., 50 Park Terrace East, Suite 8F, New York, NY 10034

Privacy Statement