Summary: in this tutorial, you’ll learn how to use PostgreSQL GROUP BY
to group rows into groups based on values of one or more columns.
Getting started with PostgreSQL GROUP BY clause #
The PostgreSQL GROUP BY
clause allows you to group rows into groups based on values of one or more columns.
Here’s the syntax of the GROUP BY
clause:
SELECT column1, column2
FROM table_name
GROUP BY column1, column2;
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
In this syntax:
- First, specify the table name from which you want to select data in the
FROM
clause. - Second, provide one or more columns where you want to group values in the
GROUP BY
clause. - Third, list the grouped columns in the
SELECT
clause.
It’s crucial to note that the columns in the SELECT
clause must appear in the GROUP BY
clause.
You’ll encounter an error if you specify a column in the SELECT
clause that does not appear in the GROUP BY
clause.
Basic PostgreSQL GROUP BY clause example #
Suppose we have the following inventories
table:
id | product_name | quantity | warehouse_id | brand_id |
---|---|---|---|---|
1 | iPhone 15 | 55 | 1 | 1 |
2 | iPhone 14 Pro Max | 15 | 2 | 1 |
3 | iPhone 13 | 35 | 2 | 1 |
4 | Galaxy S24 | 50 | 1 | 2 |
5 | Galaxy Note 23 | 40 | 2 | 2 |
6 | Galaxy Z Fold 6 | 25 | 2 | 2 |
7 | Galaxy Z Flip 6 | 15 | 2 | 2 |
8 | Pixel 9 | 50 | 1 | 3 |
9 | Pixel 8 Pro | 25 | 2 | 3 |
10 | Pixel Fold | 60 | 1 | 3 |
SQL script to create tables and populate sample data
CREATE TABLE warehouses(
warehouse_id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
warehouse_name VARCHAR(255) NOT NULL UNIQUE
);
CREATE TABLE brands(
brand_id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
brand_name VARCHAR(255) NOT NULL UNIQUE
);
CREATE TABLE inventories(
id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
product_name VARCHAR(255) NOT NULL,
quantity INT NOT NULL,
warehouse_id int not null,
brand_id int not null,
FOREIGN KEY (warehouse_id) REFERENCES warehouses(warehouse_id) ON DELETE CASCADE,
FOREIGN KEY (brand_id) REFERENCES brands(brand_id) ON DELETE CASCADE
);
INSERT INTO warehouses (warehouse_name)
VALUES ('San Jose'),
('San Francisco');
INSERT INTO brands (brand_name)
VALUES ('Apple'),
('Samsung'),
('Google');
INSERT INTO inventories (product_name, quantity, warehouse_id, brand_id)
VALUES
('iPhone 15', 55, 1, 1),
('iPhone 14 Pro Max', 15, 2, 1),
('iPhone 13', 35, 2, 1),
('Galaxy S24', 50, 1, 2),
('Galaxy Note 23', 40, 2, 2),
('Galaxy Z Fold 6', 25, 2, 2),
('Galaxy Z Flip 6', 15, 2, 2),
('Pixel 9', 50, 1, 3),
('Pixel 8 Pro', 25, 2, 3),
('Pixel Fold', 60, 1, 3);
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
The following statement uses the GROUP BY
clause to group rows in the products
table by the values in the warehouse_id
column:
SELECT
warehouse_id
FROM
inventories
GROUP BY
warehouse_id;
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
Output:
warehouse_id
--------------
2
1
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
How it works:
First, the FROM
clause starts examining all rows from the inventories
table.
Second, the GROUP BY
clause selects unique values in the warehouse_id
column. For each warehouse_id
value, the GROUP BY
gathers rows into a group. So we have two groups with warehouse_id
1
and 2
:
warehouse_id | product_name | quantity | brand_id |
---|---|---|---|
1 | iPhone 15 | 55 | 1 |
Galaxy S24 | 50 | 2 | |
Pixel 9 | 50 | 3 | |
Pixel Fold | 60 | 3 | |
2 | Galaxy Note 23 | 40 | 2 |
Galaxy Z Fold 6 | 25 | 2 | |
Galaxy Z Flip 6 | 15 | 2 | |
iPhone 14 Pro Max | 15 | 1 | |
iPhone 13 | 35 | 1 | |
Pixel 8 Pro | 25 | 3 |
Finally, the SELECT
clause selects the grouped column warehouse_id
:
warehouse_id |
---|
1 |
2 |
The GROUP BY
clause is more useful when used with aggregate functions.
Aggregate functions #
An aggregate function takes a group of values, performs a calculation, and returns a single value.
Here are the most commonly used aggregate functions:
MIN
function returns the minimum value in a set.MAX
function returns the maximum value in a set.SUM
function returns the total of a set.AVG
function returns the average value of a set.COUNT
function returns the number of items of a set.
MIN function #
The following statement uses the MIN()
statement to find the lowest quantity in the inventories
table:
SELECT
MIN(quantity)
FROM
inventories;
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
Output:
min
-----
15
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
How the query works:
- First, the
FROM
clause examines every row in theinventories
table. - Second, the
MIN
function returns the lowest quantity of all the rows.
MAX Function #
The following example uses the MAX()
function to find the highest quantity of a product in the inventory:
SELECT
MAX(quantity)
FROM
inventories;
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
Output:
max
-----
60
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
SUM Function #
The following SELECT
statement uses the SUM()
function to calculate the total quantity of all products in the inventories
table:
SELECT
SUM(quantity)
FROM
inventories;
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
Output:
sum
-----
370
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
AVG Function #
The following statement uses the AVG()
function to calculate the average quantity of all products in the inventories
table:
SELECT
AVG(quantity)
FROM
inventories;
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
Output:
avg
---------------------
37.0000000000000000
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
COUNT Function #
The following statement uses the COUNT()
function to return the number of rows in the inventories
table:
SELECT
COUNT(*)
FROM
inventories;
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
Output:
count
-------
10
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
Using PostgreSQL GROUP BY clause with aggregate functions #
Here’s the syntax for using a GROUP BY
clause with aggregate functions:
SELECT
column1,
aggregate_function (column2)
FROM
table_name
GROUP BY
column1;
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
In this syntax:
First, the GROUP BY
clause groups the rows from the table into groups based on the column1
.
Second, the aggregate function performs a calculation and returns a single value for each group:
Third, the SELECT
clause returns the grouped column (column1
) and an aggregated value for each group.
For example, the following shows how to calculate the total quantity for each warehouse:
SELECT
warehouse_id,
SUM(quantity)
FROM
inventories
GROUP BY
warehouse_id;
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
Output:
warehouse_id | sum
--------------+-----
2 | 155
1 | 215
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
In this example:
The FROM
clause starts examining all rows in the inventories
table.
The GROUP BY
clause divides all the rows in the inventories into two groups:
warehouse_id | sum | product_name | quantity | brand_id |
---|---|---|---|---|
1 | 215 | iPhone 15 | 55 | 1 |
Galaxy S24 | 50 | 2 | ||
Pixel 9 | 50 | 3 | ||
Pixel Fold | 60 | 3 | ||
2 | 155 | Galaxy Note 23 | 40 | 2 |
Galaxy Z Fold 6 | 25 | 2 | ||
Galaxy Z Flip 6 | 15 | 2 | ||
iPhone 14 Pro Max | 15 | 1 | ||
iPhone 13 | 35 | 1 | ||
Pixel 8 Pro | 25 | 2 |
The SUM
function then calculates the total of the quantity for each group.
The SELECT
clause returns the warehouse_id
and total quantity for each group:
Using PostgreSQL GROUP BY with INNER JOIN #
You can use a join to merge rows of multiple tables and use the GROUP BY
clause to group rows into groups.
For example, the following query returns the warehouse name and the total quantity of products for each warehouse:
SELECT
warehouse_name,
SUM(quantity)
FROM
inventories
INNER JOIN warehouses USING (warehouse_id)
GROUP BY
warehouse_name;
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
Output:
warehouse_name | sum
----------------+-----
San Jose | 215
San Francisco | 155
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
How it works.
First, the FROM
examines all rows in the inventories
table.
Next, the INNER JOIN
merge rows of the inventories
table with the rows in the warehouses
table based on the values in the warehouse_id
column:
id | product_name | quantity | warehouse_id | brand_id | warehouse_id | warehouse_name |
---|---|---|---|---|---|---|
1 | iPhone 15 | 55 | 1 | 1 | 1 | San Jose |
2 | iPhone 14 Pro Max | 15 | 2 | 1 | 2 | San Francisco |
3 | iPhone 13 | 35 | 2 | 1 | 2 | San Francisco |
4 | Galaxy S24 | 50 | 1 | 2 | 1 | San Jose |
5 | Galaxy Note 23 | 40 | 2 | 2 | 2 | San Francisco |
6 | Galaxy Z Fold 6 | 25 | 2 | 2 | 2 | San Francisco |
7 | Galaxy Z Flip 6 | 15 | 2 | 2 | 2 | San Francisco |
8 | Pixel 9 | 50 | 1 | 3 | 1 | San Jose |
9 | Pixel 8 Pro | 25 | 2 | 3 | 2 | San Francisco |
10 | Pixel Fold | 60 | 1 | 3 | 1 | San Jose |
Then, the GROUP BY
group merged rows by the values in the warehouse_name
:
warehouse_name | id | product_name | quantity | warehouse_id | brand_id |
---|---|---|---|---|---|
San Jose | 1 | iPhone 15 | 55 | 1 | 1 |
4 | Galaxy S24 | 50 | 1 | 2 | |
8 | Pixel 9 | 50 | 1 | 3 | |
10 | Pixel Fold | 60 | 1 | 3 | |
San Francisco | 2 | iPhone 14 Pro Max | 15 | 2 | 1 |
3 | iPhone 13 | 35 | 2 | 1 | |
5 | Galaxy Note 23 | 40 | 2 | 2 | |
6 | Galaxy Z Fold 6 | 25 | 2 | 2 | |
7 | Galaxy Z Flip 6 | 15 | 2 | 2 | |
9 | Pixel 8 Pro | 25 | 2 | 3 |
After that, the SUM
function calculates the total for each group:
warehouse_name | sum | id | product_name | quantity | warehouse_id | brand_id |
---|---|---|---|---|---|---|
San Jose | 215 | 1 | iPhone 15 | 55 | 1 | 1 |
4 | Galaxy S24 | 50 | 1 | 2 | ||
8 | Pixel 9 | 50 | 1 | 3 | ||
10 | Pixel Fold | 60 | 1 | 3 | ||
San Francisco | 155 | 2 | iPhone 14 Pro Max | 15 | 2 | 1 |
3 | iPhone 13 | 35 | 2 | 1 | ||
5 | Galaxy Note 23 | 40 | 2 | 2 | ||
6 | Galaxy Z Fold 6 | 25 | 2 | 2 | ||
7 | Galaxy Z Flip 6 | 15 | 2 | 2 | ||
9 | Pixel 8 Pro | 25 | 2 | 3 |
Finally, the SELECT
clause returns the warehouse names and total quantity of each:
warehouse_name | sum |
---|---|
San Jose | 215 |
San Francisco | 200 |
Group by multiple columns #
The following statement uses the GROUP BY
clause to group rows in the inventories
table by warehouse_id
and brand_id
:
SELECT
warehouse_id,
brand_id,
SUM(quantity)
FROM
inventories
GROUP BY
warehouse_id,
brand_id;
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
Output:
warehouse_id | brand_id | sum
--------------+----------+-----
1 | 1 | 55
2 | 3 | 25
1 | 3 | 110
2 | 2 | 80
1 | 2 | 50
2 | 1 | 50
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
In this example, the combination of values in the warehouse_id
and brand_id
column forms a group. The SUM()
function returns the total of product quantity for each group.
Summary #
- Use the
GROUP BY
clause to group rows in an SQL query into summary rows based on one or more columns. - Use aggregate functions with the
GROUP BY
clause to calculate a value for each group. - Use
JOIN
withGROUP BY
to merge rows from multiple table and group rows into groups.