SQL GROUP BY

Summary: in this tutorial, you will learn how to the SQL GROUP BY clause to group rows into a set of summary rows by the values of columns or expressions.

Introduction to SQL GROUP BY clause

The  GROUP BY clause is used to group rows returned by SELECT statement into a set of summary rows or groups based on values of columns or expressions. You can apply an aggregate function such as SUM, AVG, MIN, MAX or COUNT to each group to output the summary information.

The  GROUP BY clause is very useful when you want to analyze data in analytical fashion e.g., products were purchased by a customer or sold by a sale person by quarter. Therefore you often find the  GROUP BY clause applied in the data warehouse and business intelligence (BI) systems to produce the analytical reports.

The typical syntax of  GROUP BY is as follows:

SELECT 
    column1, column2, aggregate_function(expression)
FROM
    table_name
WHERE
    condition
GROUP BY column1;

SQL GROUP BY examples

Let’s take a look at the products table:

products table

SQL GROUP BY with SUM function example

To get the total units in stock for each product category, you use the  GROUP BY clause with the SUM function as follows:

SELECT 
    categoryid, SUM(unitsinstock)
FROM
    products
GROUP BY categoryid;
SQL GROUP BY SUM

The database engine performs the following steps:

  • First, examines the  GROUP BY clause and divides the products into groups based on the product category categoryid.
  • Second, calculates the total of units in stock by using the SUM function for each group.

SQL GROUP BY with COUNT function example

The following query selects the number of products in each product category by using the GROUP BY clause with the COUNT function.

SELECT 
    categoryid, COUNT(productid)
FROM
    products
GROUP BY categoryid;
SQL GROUP BY COUNT

SQL GROUP BY with AVG function

You can check the average number of units in stock for each product category by using the  GROUP BY clause and AVG function as the following query:

SELECT 
    categoryid, FLOOR(AVG(unitsinstock))
FROM
    products
GROUP BY categoryid;
SQL GROUP BY AVG

The  FLOOR function is used to get the largest integer value that is not greater than the argument.

SQL GROUP BY with MIN and MAX functions

Apply the same technique, you can select minimum and maximum units in stock for each product category as follows:

SELECT 
    categoryid, MIN(unitsinstock), MAX(unitsinstock)
FROM
    products
GROUP BY categoryid;

SQL GROUP BY MIN and MAX

SQL GROUP BY with ORDER BY example

The GROUP BY clause is used in conjunction with the ORDER BY clause to sort the groups. For example, you can sort product categories by the number of products as the following query:

SELECT 
    categoryid, COUNT(productid)
FROM
    products
GROUP BY categoryid
ORDER BY COUNT(productid) DESC;
SQL GROUP BY with ORDER BY

SQL GROUP BY multiple columns

You can group the result set by not only one column but also multiple columns. For example, if you want to know how many sale orders that were ordered by a customer and sold by a sale person, you can group the result set based on both customer and sale person.

The database diagram of related tables is as follows:

SQL GROUP BY - Database Diagram of Related Tables

The following query illustrates the idea:

SELECT 
    b.customerid,
    b.CompanyName,
    COUNT(a.orderid) AS 'Orders',
    CONCAT(e.lastname, e.firstname) as 'Sale Person'
FROM
    orders a
        INNER JOIN customers b ON a.customerid = b.customerid
        INNER JOIN employees e ON e.employeeid = a.employeeid

GROUP BY b.customerid, 
         a.employeeid
ORDER BY b.customerid ASC, 
        'Number of orders' DESC;Code language: PHP (php)
SQL GROUP BY multiple columns

In this tutorial, you have learned how to use the SQL  GROUP BY clause to divide rows into groups and apply the aggregate function to each group to produce summary output.