IS [NOT] DISTINCT FROM (The Distinct Predicate)

As you probably know comparing with NULL value is not possible. NULL is not a value. IS [NOT] DISTINCT compares the equality of two expressions and guarantees a true or false result, even if one or both operands are NULL.

IS [NOT] DISTINCT FROM is a predicate used in the search condition of WHERE clauses and HAVING clauses, the join conditions of FROM clauses, and other constructs where a Boolean value is required.

Syntax:

expression IS [NOT] DISTINCT FROM expression

  • expression – Any valid expression. The expression can be a column, a constant, a function, a variable, a scalar subquery, or any combination of column names, constants, and functions connected by an operator or operators, or a subquery.

Let's try it on an example.

We will use the database WideWorldImporters in which there is a table Sales.Orders in which, among other things, the column PickingCompletedWhen represents the time when the order was picked up. For unpicked orders, the value of the PickingCompletedWhen column is NULL.
In order to improve performance, we will create an index on that column.

CREATE INDEX pickingdateidx ON Sales.Orders (PickingCompletedWhen);

Let's retrieve all unpicked orders.

Example:

DECLARE @dt datetime2 = NULL
SELECT * FROM Sales.Orders WHERE PickingCompletedWhen = @dt;

Although we have unpicked orders in the database, the result of this query is an empty set. The reason for this is, as we mentioned before, that comparison with NULL is not possible.

To solve the problem the old way, we can introduce a comparison using ISNULL.

Example:

DECLARE @dt AS DATE = NULL;
SELECT * FROM Sales.Orders WHERE ISNULL(PickingCompletedWhen, '99991231') = ISNULL(@dt, '99991231');

Table 'Orders'. Scan count 1, logical reads 9626...
SQL Server Execution Times: CPU time = 16 ms

The result is now here, but even though we previously created an index on that column to speed up the query, the SQL server still did an index scan (which means it read the entire table). This is because the engine had to calculate the value of each row with the ISNULL function.

Another way is to use a combination with OR and IS NULL.

Example:

DECLARE @dt AS DATE = NULL;
SELECT * FROM Sales.Orders WHERE PickingCompletedWhen = @dt OR (PickingCompletedWhen IS NULL AND @dt IS NULL);
Table 'Orders'. Scan count 1, logical reads 692...
SQL Server Execution Times: CPU time = 0 ms

Logical reads are now smaller, query is faster, but in the execution plan we still can see the table scan.

Now, let's try with the new operator.

Example:

DECLARE @dt datetime2 = NULL;
SELECT * FROM Sales.Orders WHERE PickingCompletedWhen IS NOT DISTINCT FROM @dt;
Table 'Orders'. Scan count 1, logical reads 9465...
SQL Server Execution Times: CPU time = 0 ms


We have a result and guess what else we have? We have an index seek!

Not that it makes much sense, but in this way we can retrieve all the picked-up orders.

Example:

SELECT * FROM Sales.Orders WHERE PickingCompletedWhen IS DISTINCT FROM NULL;

If we want to retrieve orders that have a value different from some value in the pick-up date, unlike the standard <> operator, we will also get NULL values.

Example:

SELECT OrderID, PickingCompletedWhen FROM Sales.Orders
WHERE PickingCompletedWhen <> '2013-01-01 12:00:00.0000000' 
ORDER BY OrderID;

SELECT OrderID, PickingCompletedWhen FROM Sales.Orders
WHERE PickingCompletedWhen IS DISTINCT FROM '2013-01-01 12:00:00.0000000' 
ORDER BY OrderID;

And finally, by using IS NOT DISTINCT FROM, we can, for example, retrieve all orders on one date.

Example:

SELECT OrderID, PickingCompletedWhen FROM Sales.Orders
WHERE PickingCompletedWhen = '2013-01-01 12:00:00.0000000';

SELECT OrderID, PickingCompletedWhen FROM Sales.Orders
WHERE PickingCompletedWhen IS NOT DISTINCT FROM '2013-01-01 12:00:00.0000000';

*You can download the complete SQL Script with all examples from the post here: SQL Script.sql (1.73 kb)