Wednesday, August 25, 2010

Pagination with Sql Query Result

When query database, you don't want to get all the result set of records, because of possible huge number. Users have to see them one page at a time anyways. What you do is providing page number and page size as parameters and get the right record set for viewing. In another case with what are called top-N, bottom-N and inner-N reporting, you also need winowing out certain range of records. To accomplish this, you can work out with different brand dbs by using slightly different clauses.

MySQL

With MySql, LIMIT clause provide the simplest solution. As shown in following example:

-- Get record from 501 to 600
SELECT * FROM EMPLOYEES WHERE id >10000 ORDER BY last_name LIMIT 100 OFFSET 500

SQL Server

With SQL Server, TOP is useful to only get certain records listed in the front:

SELECT TOP 20 * FROM EMPLOYEES WHERE ID>10000 ORDER BY LAST_NAME

However, it dosen't help if you want retrieve a page other than the first. What SQL Server has to offer for this problem is the ROW_NUMBER(), a system function for generating analytic data. Here is an example:

SELECT * FROM (
SELECT FIRST_NAME, EMAIL, ROW_NUMBER() OVER (ORDER BY LAST_NAME DESC) AS RN
FROM EMPLOYEES) WHERE RN BETWEEN 501 AND 600
) ORDER BY RN

Note, the inner query result is ordered by the order defined in ROW_NUMBER function. In case you have more than one ROW_NUMBER() functions in a query, as demonstrated in the following code snippet, the result is ordered based on the last function.

select * from (
select name, email, ROW_NUMBER() over (order by name desc) as rn,
ROW_NUMBER() over (order by name asc) as rn2
from contact) result where rn between 7 and 10
order by rn desc

I thought this solution is not a clean cut to pagination, because you end up getting a nested query and have to refactor the existing query to make it run. For example, you already have a query, now you just want to add pagination. Even SQL allows * and aggregation functions together as shown in following code snippet, that allows you to just add ROW_NUMBER() column without requiring to change much on the SELECT clause, you have to remove the ORDER BY clause from the original place and put it into ROW_NUMBER function's OVER clause or put into the outer query statement, because normally ORDER BY clause is not allowed in subquery.

select *, ROW_NUMBER() over (order by name desc) as rn from contact

Oracle

Don't mention that Oracle offers ROW_NUMBER() analytic function as with SQL SVR. But Oracle has a psuedo column ROWNUM that comes with any query. I believe it is most efficient way and can cover most cases of query pagination. Here is an example:

SELECT * FROM EMPLOYEES WHERE ROWNUM>50 AND ROWNUM<101 ORDER BY LAST_NAME

Most importantly, to add pagination to existing query you only need to add a condition on ROWNUM to the WHERE clause, no more further efforts.

More on ROW_NUMBER()

Surely with ROW_NUMBER() in SQL SVR and Oracle, you can do much more than pagination. Following example can give a heads up, it gets first five employees from all departments they have top smallest id in their departments:

SELECT * FROM
(
SELECT DEPARTMENT_NAME, LAST_NAME, EMPLOYEE_ID, ROW_NUMER()
OVER (PARTITION BY DEPARTMENT_NAME ORDER BY EMPLOYEE_ID) AS RN_ID
FROM EMPOYEES
)
WHERE RN_ID <6

Pagination with LINQ to SQL

To end this topic, I would like to have a look at LINQ to SQL. You must know it has skip(m) and take(n) feature (following code snippet). But, what it ends up translated into SQL?

var query = (from employee in Employees
select employee).Skip(10).Take(20);

Under the hood it is translated into something like this:

SELECT FIRST_NAME, LAST_NAME, EMAIL
FROM (
SELECT ROW_NUMBER() OVER (
ORDER BY FIRST_NAME, LAST_NAME, EMAIL) ROW_NUMBER,
FIRST_NAME,
LAST_NAME,
EMAIL
FROM EMPLOYEES
)
WHERE ROW_NUMBER BETWEEN @P0+1 AND @P0+@P1
ORDER BY ROW_NUMBER

Where @p0 is the value specified within Skip(), while @p1 is that within Take().

No comments: