Ranking Functions in MS SQL Server 2012

Ranking functions return a ranking value for each row in the result set that may or may not specify a partition. The following are common characteristics of ranking functions:

  • Scope of Partition. If the PARTITION BY clause is specified, the ranking function is applied to cover rows within that partition, otherwise if unspecified, the partition is assumed to be the whole result set.
  • Sorting within Partition and Result Set. The row’s rank number is determined by the order_by_clause of the OVER clause. The ORDER BY clause of the SELECT statement orders the entire result set, including the rank order column. These ORDER BY clauses do not have to coincide.
  • Sorting Key for Result Set. If the PARTITION BY clause is specified, it is best to use this for the ORDER BY clause of the SELECT statement. If not specified, the ORDER BY clause of the OVER clause is probably best as the same as the ORDER BY clause of the SELECT statement.
  • Filtering Rows. To filter by the ranking function, you need to use either a) a subquery, b) a CTE or c) a table variable or temp table to generate the initial record set. As explained in this StackOver Q&A, “Windowing functions are not performed until the data is actually selected which is after the WHERE [and HAVING] clause [are dealt with]. So if you try to use a row_number in a WHERE clause the value is not yet assigned.”.

In MS SQL Server 2012, there are four ranking functions and their differences are tabulated below:

Syntax ROW_NUMBER ( ) OVER ( [ PARTITION BY value_expression , … [ n ] ] order_by_clause )
Notes
  • Number Sequence. Returns the sequential number of a row within the scope specified in the PARTITION BY clause, starting at 1 for the first row in each partition. There are no duplicates and skips in the order number.
  • Ties. ROW_NUMBER() order ignores ties and just assign them consecutively that is why there is no guarantee they will be the same unless order by columns and partitioned column are unique separately and in combination.
  • Usage. Generally used to generate a serial reference number for each row, like used in page processing.
Syntax RANK ( ) OVER ( [ partition_by_clause ] order_by_clause)
Notes
  • Number Sequence. Returns a non-sequential rank number of each row within the scope specified in the PARTITION BY clause, starting at 1 for the first row in each partition.
  • Ties. The rank number of rows with ties above it are counted separately (tied ranks above are not treated as one) but the rank number of rows it has ties with are counted collectively per rank (tied ranks are treated as one ), thus the RANK function may return several duplicate rank numbers and skips in the rank numbers.
Syntax DENSE_RANK ( ) OVER ( [ partition_by_clause ]  order_by_clause  )
Notes
  • Number Sequence. Returns the sequential rank number of each row within the scope specified in the PARTITION BY clause, starting at 1 for the first row in each partition.
  • Ties. The rank number of rows with ties are counted collectively per rank as one rank, thus the DENSE_RANK function may return several duplicate rank numbers but without skips in the rank numbers.
Syntax NTILE (integer_expression) OVER ( [ partition_by_clause ] order_by_clause )
Notes
  • Grouping. The rows in the scope are separated into “X” (the integer_expression) number of groups . If the rows are not evenly distributed, the first few groups will have an extra row than the last or succeeding rows.
  • Number Sequence. Returns the sequential order number of groups of rows within the scope specified in the PARTITION BY clause, starting at 1 for the first row in each partition. There may or may not have duplicates in ntile numbers and there are no skips.
  • Ties. Ties are ignored and not considered in the order/ntile numbers. Rows with identical values may fall on different groupings or ntile. See StackOverflow Q&A.

According to MS, all ranking functions are non-deterministic.