MS SQL Advice - COUNT(*) vs EXISTS

 

If you want to check for data existence in a table (e.g. if there are invoices on a concrete date) you could use COUNT(*) or the EXISTS statement. I found various theories on the internet and even in some SQL books what is the best approach, so I decided to test this by myself (spoiler alert: the books are on the side of using EXISTS).

Let’s turn statistics on, execution plan and run a simple query to check if there are invoices on a declared date.

Example:

IF (SELECT COUNT(*) FROM [Sales].[Invoices] I WHERE I.[InvoiceDate] = '2016-03-14') > 0 PRINT ('There are invoices');
IF EXISTS(SELECT I.InvoiceID FROM [Sales].[Invoices] I WHERE I.[InvoiceDate] = '2016-03-14') PRINT ('There are invoices');
/*
1 - Table 'Invoices'. Scan count 1, logical reads 10612, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
CPU time = 47 ms,  elapsed time = 60 ms.
2 - Table 'Invoices'. Scan count 1, logical reads 10612, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
CPU time = 78 ms,  elapsed time = 85 ms.
*/

From my point of view, the two executions are almost the same (in small difference of execution times but with same execution plan). Let’s take a table with more rows and try to do something similar.

Example:

IF (SELECT COUNT(*) FROM [Sales].[InvoiceLines] I WHERE I.[InvoiceID] = 47) > 0 PRINT ('There are invoice lines');
IF EXISTS(SELECT I.InvoiceID FROM [Sales].[InvoiceLines] I WHERE I.[InvoiceID] = 47) PRINT ('There are invoice lines');

Again, the two executions are almost the same! Let's change a little bit the query and declare a variable to store the result of the count(*) operation.

Example:

DECLARE @numberOfRows INT = (SELECT COUNT(*) FROM [Sales].[InvoiceLines] I WHERE I.Quantity = 1); IF @numberOfRows > 0 PRINT ('There are invoice lines');
IF EXISTS(SELECT I.InvoiceID FROM [Sales].[InvoiceLines] I WHERE I.Quantity = 1) PRINT ('There are invoice lines');

You can see that the situation is now completely different! The statement using exists is way more efficient that the one with COUNT (*).

The conclusion could be as follows. If you are not sure what approach to use, use the one with EXISTS. There are less possibilities that you will run in slower execution plan. This also proved that is important the way you are writing your queries.

*MS SQL Server sample database "WideWorldImporters" can be downloaded at: https://github.com/Microsoft/sql-server-samples/releases/tag/wide-world-importers-v1.0

 *You can download the complete SQL Script with all examples from the post here: SQL Script.sql (1.02 kb)