Summary: in this tutorial, you will learn how to use the SQL DISTINCT operator to eliminate duplicate rows in the result set.
SQL DISTINCT clause overview
The result set of a SELECT statement may contain duplicate rows. To eliminate the duplicates, you use the DISTINCT
operator as follows:
SELECT DISTINCT select_list
FROM table_name;
Code language: SQL (Structured Query Language) (sql)
Notice you can use the DISTINCT
operator in the SELECT
statement only.
The SELECT
statement uses the values of the columns specified after the DISTINCT
operator for evaluating the uniqueness of the rows in the result set.
If you specify one column, the database uses the values in the column to evaluate the uniqueness.
If you specify multiple columns, the database engine evaluates the uniqueness of rows based on the combination of values in those columns.
SQL DISTINCT operator examples
Let’s take a look at some examples of using the DISTINCT
operator in the SELECT
statement.
1) Using SQL DISTINCT with one column example
The following query gets employee’s city in the employees
table:
SELECT city
FROM employees;
Code language: SQL (Structured Query Language) (sql)
Output:
+----------+
| city |
+----------+
| Seattle |
| Tacoma |
| Kirkland |
| Redmond |
| London |
| London |
| London |
| Seattle |
| London |
+----------+
9 rows in set (0.00 sec)
Code language: JavaScript (javascript)
The result set contains duplicate city i.e., London
appears four times, which indicates that some employees located in the same city.
To remove the duplicate cities, you can use the DISTINCT
operator as shown in the following query:
SELECT DISTINCT
city
FROM
employees;
Code language: SQL (Structured Query Language) (sql)
Output:
+----------+
| city |
+----------+
| Seattle |
| Tacoma |
| Kirkland |
| Redmond |
| London |
+----------+
5 rows in set (0.00 sec)
Code language: JavaScript (javascript)
2) Using the SQL DISTINCT with multiple columns example
To find a list of unique cities and countries from the employees
table, you can specify the city
and country
columns after the DISTINCT
operator as shown in the following query:
SELECT DISTINCT
city, country
FROM
employees;
Code language: SQL (Structured Query Language) (sql)
Output:
+----------+---------+
| city | country |
+----------+---------+
| Seattle | USA |
| Tacoma | USA |
| Kirkland | USA |
| Redmond | USA |
| London | UK |
+----------+---------+
5 rows in set (0.01 sec)
Code language: JavaScript (javascript)
In this example, the DISTINCT
uses the combination of values in the city and country columns to determine the uniqueness of rows in the result set.
Using the SQL DISTINCT with NULL
The DISTINCT
operator treats NULL
duplicate. It means that the two NULLs
are the same. Therefore, if the SELECT
statement returns NULL
s, the DISTINCT
returns only one NULL
.
The following SELECT
statement selects all the regions from the region
column of the employees
table:
SELECT region FROM employees;
Code language: SQL (Structured Query Language) (sql)
Output:
+--------+
| region |
+--------+
| WA |
| WA |
| WA |
| WA |
| NULL |
| NULL |
| NULL |
| WA |
| NULL |
+--------+
9 rows in set (0.00 sec)
Code language: PHP (php)
As you can see from the output, the query returns duplicate regions. For example, NULL appears four times.
The following SELECT statement uses the DISTINCT operator to select unique regions from the employees table:
SELECT DISTINCT region
FROM employees;
Code language: SQL (Structured Query Language) (sql)
Output:
+--------+
| region |
+--------+
| WA |
| NULL |
+--------+
2 rows in set (0.00 sec)
Code language: JavaScript (javascript)
The result set now has unique regions.
DISTINCT vs. ALL
It is worth to mention the ALL
operator. Unlike the DISTINCT
operator, the ALL
operator includes all rows that contain duplicate values. The following queries return the same result set.
SELECT ALL select_list
FROM table_name;
Code language: SQL (Structured Query Language) (sql)
SELECT select_list
FROM table_name;
Code language: SQL (Structured Query Language) (sql)
The SELECT
statement uses the ALL
operator by default. Therefore, you don’t have to specify it explicitly in the statement.
Summary
- Use the
DISTINCT
operator to select the unique values from one or more columns.