Usages Of MSSQL ROW_NUMBER()
Posted by 5/13/2018 05:16:00 PM with No comments
on ROW_NUMBER is an analytic function. It assigns a unique number to each row to which it is applied (either each row in the partition or each row returned by the query), in the ordered sequence of rows specified in the order_by_clause, beginning with 1.
If you want to know more about the definition you can find many resources on the internet talking about ROW_NUMBER. The main focus of this article would be about the usages with providing some code samples.
All code samples for this article could be found on this GitHub repository.
Creating Test Table:
First create a test database and run the script below to create our test table.Inserting Data:
Run this script to insert some data in our test table.Basic Syntax:
The basic syntax of ROW_NUMBER is as follows.As you can see the syntax consists of "ROW_NUMBER()" followed by an "OVER" clause. In the "OVER" clause you should specify the ordering of your records which is essential before requesting from SQL server to number your rows. This is done as in the "OVER (ORDER BY Name, Subject)" clause. In this case, we are ordering the records by "Name" then "Subject" and both in ascending order.
Running this query we will get the results below.
This is not all. Also, we can use the "PARTITION BY" clause in addition to the "ORDER BY" clause as in the code below.
In this code sample, we added the "PARTITION BY" clause which divides/partitions/splits the records into partitions/groups according to the criteria defined before starting the numbering process. In this case, we decided to partition the records using the "Name" column. This means that the records will be divided into groups where the records in each group have the same value in the "Name" column. Then, the row numbering process starts which means that the numbering would be reset at the beginning of each group. In easy words we can say that the "PARTITION BY" clause divides the records into groups and then the whole process (ROW_NUMBER, SUM, ....) will be applied separately on each group.
So, running this query we will get the results below.
Deleting Duplicate Records:
As you can see we have some duplicate records in our test table and by duplicate we mean that some records have the same "Name, Subject" combination. For sure they have different "Id" but assume that according to our business requirements these records should be treated as duplicates and this is caused by a wrong implementation which we need to fix. So, now we need to delete all duplicates and leave only one of each.As you can see in the code sample, we are calculating the row number while partitioning with "Name, Subject" and ordering with "Name, Subject". This means that the numbering will be reset for each group of "Name, Subject" combination which means that if we have some student taking some subject and this record has duplicates, there would be row numbers greater than 1. Then, using this piece of info we can detect the duplicate records to be deleted.
Running this query we will get the results below.
Looping On Records (Without Using Cursors):
You can also make use of ROW_NUMBER in looping through records as in the code sample below.Running this query we will get the results below.
Paging:
You can also make use of ROW_NUMBER in paging records as in the code sample below.Running this query we will get the results below.
And if we set @pageSize to 1 and @pageIndex to 1, we will get the results below.
And if we set @pageSize to 5 and @pageIndex to 1, we will get the results below.
And if we set @pageSize to 5 and @pageIndex to 6, we will get the results below.
Paging By Group:
You can also make use of ROW_NUMBER in paging records by groups so that you can process each group separately. This is done as in the code sample below.Running this query we will get the results below.
I am now done and I encourage you to do your research on "ROW_NUMBER" as it has many usages. The ones I mentioned here are just some examples but I am sure there are a lot of other examples and usages.
Categories: Reusable Code, SQL