SQL Paging With Limit And Offset In DB2 For i
January 12, 2016 Michael Sansoterra
LIMIT and OFFSET are two new query options that will please every developer who builds user interfaces (UI) involving potentially large result sets. Handling UI queries that return a large number of rows can be annoying as they may take too long to process, create heavy network traffic, and require a web or desktop client to cache the result set. Further, it’s unlikely a user will review all the rows! But what if DB2 returns only what is needed by giving back one slice of the entire result set on demand?
This is where the new LIMIT and OFFSET features come in. LIMIT and OFFSET allow a DB2 developer to extract just a portion of a larger result set. IBM i 7.1 TR11 or IBM i 7.2 TR3 is required. If you’re on an older version of IBM i, you can still implement the somewhat uncomely alternative SQL paging methods illustrated in these prior tips: Enable Row Set Paging in a Client/Server Environment Using SQL and Paging Cursors And Position To.
LIMIT n is an alternative syntax to the ubiquitous and laborious FETCH FIRST n ROWS ONLY (a.k.a. fetch-first clause). It simply limits the number of rows that a query returns as illustrated here:
-- Only return first 10 rows with highest subtotals SELECT * FROM SalesOrderHeader SOH ORDER BY SubTotal DESC LIMIT 10 -- Same as FETCH FIRST 10 ROWS ONLY
When the above query is run, DB2 will limit the result set size to the first 10 rows. For the record, in a future version of DB2, I’d like to see LIMIT available for UPDATE and DELETE statements.
The OFFSET clause instructs DB2 where to start returning rows within the query result. For example, if a query returns 1000 rows, specifying OFFSET 100 instructs DB2 to skip the first 100 rows of the query results:
SELECT ROW_NUMBER() OVER(ORDER BY SUBTOTAL DESC) AS ROW_ID, SOH.* FROM SalesOrderHeader SOH ORDER BY SUBTOTAL DESC OFFSET 100 ROWS -- Ignore First 100 Rows in the Result
So for the above query, DB2 will build the complete result set internally but only return rows in the set beginning with row 101 (the internal row numbering DB2 uses is primarily determined by the ORDER BY clause).
Incidentally, OFFSET cannot be used except in the context of a prepared or pre-compiled statement. Some interactive tools, such as “Run SQL Scripts” will process a query containing OFFSET, whereas green screen tools such as STRSQL and STRQMQRY do not support it. Also, to ensure consistent results, OFFSET should only be used with an ORDER BY statement, otherwise DB2 may give conflicting results when the same statement is run multiple times. If an OFFSET is requested beyond the number of available rows in the result set, then DB2 will return an empty result set. Finally, to return rows right at the beginning of a result set, specify OFFSET 0.
LIMIT And OFFSET Together
By combining OFFSET and LIMIT in a query, you can direct DB2 to give back a specific subset of rows from a larger result set. Say you’re designing a web-based user interface that allows a customer to inquire on their order history. The maximum number of orders to be shown on a single page is 10, with the user having the ability to click a button to get the next 10 (or prior 10) orders.
Figure 1 below illustrates this common UI design as implemented by the standard Google web search navigation:
Let’s say a customer with 150,000 orders does an inquiry. Should the web UI cache all of these orders all at once, or should it only get 10 orders as needed? If the UI design opts for the latter “paged” approach, a query like the following can be used to retrieve a specific subset of the 150,000 rows:
SELECT SalesOrderId,OrderDate,DueDate,ShipDate, Status,CustomerId,SubTotal,TaxAmt FROM SalesOrderHeader SOH WHERE CustomerId=@CustomerId ORDER BY SalesOrderId DESC LIMIT @ROWS_PER_PAGE -- Variable = 10 OFFSET @PAGE_START_ROW -- Variable = 10 * Page Number
If variable @ROWS_PER_PAGE is 10, and variable @PAGE_START_ROW is 30 then DB2 will return 10 rows (rows 31-40) out of all the customer’s orders. In other words, it returns the user the UI results for “page 4.”
Putting the query in the context of a reusable stored procedure yields something like this:
CREATE OR REPLACE PROCEDURE SALES_BY_CUSTOMER (@CustomerId INT, @Rows_Per_Page INT, @Page_Start_Row INT) RESULT SETS 1 BEGIN DECLARE PAGING_CUSTOMER CURSOR WITH RETURN TO CLIENT FOR SELECT SalesOrderId,OrderDate,DueDate,ShipDate, Status,CustomerId,SubTotal,TaxAmt FROM SalesOrderHeader SOH WHERE CustomerId=@CustomerId ORDER BY SalesOrderId DESC LIMIT @ROWS_PER_PAGE OFFSET @PAGE_START_ROW ; OPEN PAGING_CUSTOMER; SET RESULT SETS CURSOR PAGING_CUSTOMER; END;
The procedure can be invoked as follows:
CALL SALES_BY_CUSTOMER ( @CustomerId=>29734, @Rows_Per_Page=>10, @Page_Start_Row=>10);
In this case, the result set will contain orders for customer ID 29734, up to 10 rows, starting with row 11 in the result set. Using this procedure design, it’s up to the client to supply the page size information and to calculate the starting row for the desired page.
The final thing to consider is how to retrieve the total number of rows in the result set. This information is useful if you want to display to the user how many pages are available like Google does as shown in Figure 1. If there are 50 results at 10 rows per page, then the UI would show the user there are five pages of results so the user can navigate directly to any of the five pages.
In order to determine how many pages there are, we need to know how many rows the query will return regardless of the subset imposed by LIMIT and OFFSET. This information can be supplied to the client by adding a column to the result set that contains the total number of rows. Incidentally, it would be splendid if DB2 would return the total rows for a query using OFFSET via GET DIAGNOSTICS but it doesn’t.
This revised SALES_BY_CUSTOMER procedure implements a new column called TOTAL_ROWS:
CREATE OR REPLACE PROCEDURE SALES_BY_CUSTOMER (@CustomerId INT, @Rows_Per_Page INT, @Page_Start_Row INT) RESULT SETS 1 BEGIN DECLARE PAGING_CUSTOMER CURSOR WITH RETURN TO CLIENT FOR WITH CTE_Customer_Order AS ( SELECT SalesOrderId,OrderDate,DueDate,ShipDate, Status,CustomerId,SubTotal,TaxAmt, ROW_NUMBER() OVER(ORDER BY SalesOrderId) AS Row_Id FROM SalesOrderHeader SOH WHERE CustomerId=@CustomerId ) SELECT CO.*, (SELECT MAX(Row_Id) FROM CTE_Customer_Order) AS Total_Rows FROM CTE_Customer_Order CO ORDER BY SalesOrderId DESC LIMIT @ROWS_PER_PAGE OFFSET @PAGE_START_ROW ; OPEN PAGING_CUSTOMER; SET RESULT SETS CURSOR PAGING_CUSTOMER; END;
This enhanced example uses the ROW_NUMBER function to assign an incremental row ID to each row in a common table expression (CTE) result set. The final query (based on the CTE) then uses a column expression to retrieve the highest (a.k.a. max) row ID and is aliased as Total_Rows. This Total_Rows column allows the client to calculate how many total pages there are in the result set and display the page count to the user accordingly.
By the way, IBM‘s Scott Forstie pointed out a DB2 bug using GET DIAGNOSTICS with DB2_NUMBER_ROWS that incorrectly returns the total number of rows (as opposed to the limited number of rows) from a query even when a OFFSET/LIMIT restriction has been imposed. However, the correct and expected behavior is that GET DIAGNOSTICS with DB2_NUMBER_ROWS returns the number of rows returned as restricted by LIMIT.
In other words, if the query normally returns 100 rows but LIMIT specifies to return only 10, GET DIAGNOSTICS should return 10, not 100. So, if you get the total number of rows from a query using GET DIAGNOSTICS, don’t count on that behavior to continue.