PostgreSQL GROUP BY

Summary: in this tutorial, you’ll learn how to use PostgreSQL GROUP BY to group rows into groups based on values of one or more columns.

Getting started with PostgreSQL GROUP BY clause #

The PostgreSQL GROUP BY clause allows you to group rows into groups based on values of one or more columns.

Here’s the syntax of the GROUP BY clause:

SELECT column1, column2
FROM table_name
GROUP BY column1, column2;Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)

In this syntax:

First, specify the table name from which you want to select data in the FROM clause.
Second, provide one or more columns where you want to group values in the GROUP BY clause.
Third, list the grouped columns in the SELECT clause.

It’s crucial to note that the columns in the SELECT clause must appear in the GROUP BY clause.

You’ll encounter an error if you specify a column in the SELECT clause that does not appear in the GROUP BY clause.

Basic PostgreSQL GROUP BY clause example #

Suppose we have the following inventories table:

id	product_name	quantity	warehouse_id	brand_id
1	iPhone 15	55	1	1
2	iPhone 14 Pro Max	15	2	1
3	iPhone 13	35	2	1
4	Galaxy S24	50	1	2
5	Galaxy Note 23	40	2	2
6	Galaxy Z Fold 6	25	2	2
7	Galaxy Z Flip 6	15	2	2
8	Pixel 9	50	1	3
9	Pixel 8 Pro	25	2	3
10	Pixel Fold	60	1	3

SQL script to create tables and populate sample data

CREATE TABLE warehouses(
   warehouse_id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
   warehouse_name VARCHAR(255) NOT NULL UNIQUE 
);

CREATE TABLE brands(
   brand_id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
   brand_name VARCHAR(255) NOT NULL UNIQUE   
);


CREATE TABLE inventories(
   id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
   product_name VARCHAR(255) NOT NULL,
   quantity INT NOT NULL,
   warehouse_id int not null,
   brand_id int not null,
   FOREIGN KEY (warehouse_id) REFERENCES warehouses(warehouse_id) ON DELETE CASCADE,
   FOREIGN KEY (brand_id) REFERENCES brands(brand_id) ON DELETE CASCADE
);



INSERT INTO warehouses (warehouse_name) 
VALUES ('San Jose'),
       ('San Francisco');


INSERT INTO brands (brand_name) 
VALUES ('Apple'),
('Samsung'),
('Google');



INSERT INTO inventories (product_name, quantity, warehouse_id, brand_id) 
VALUES 
('iPhone 15', 55, 1, 1),
('iPhone 14 Pro Max', 15,  2, 1),
('iPhone 13', 35, 2, 1),
('Galaxy S24', 50,  1, 2),
('Galaxy Note 23', 40,  2, 2),
('Galaxy Z Fold 6', 25, 2, 2),
('Galaxy Z Flip 6', 15, 2, 2),
('Pixel 9', 50,  1, 3),
('Pixel 8 Pro', 25, 2, 3),
('Pixel Fold', 60, 1, 3);Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)

The following statement uses the GROUP BY clause to group rows in the products table by the values in the warehouse_id column:

SELECT
  warehouse_id
FROM
  inventories
GROUP BY
  warehouse_id;Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)

Try it

Output:

 warehouse_id
--------------
            2
            1Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)

How it works:

First, the FROM clause starts examining all rows from the inventories table.

Second, the GROUP BY clause selects unique values in the warehouse_id column. For each warehouse_id value, the GROUP BY gathers rows into a group. So we have two groups with warehouse_id 1 and 2:

warehouse_id	product_name	quantity	brand_id
1	iPhone 15	55	1
	Galaxy S24	50	2
	Pixel 9	50	3
	Pixel Fold	60	3
2	Galaxy Note 23	40	2
	Galaxy Z Fold 6	25	2
	Galaxy Z Flip 6	15	2
	iPhone 14 Pro Max	15	1
	iPhone 13	35	1
	Pixel 8 Pro	25	3

Finally, the SELECT clause selects the grouped column warehouse_id :

warehouse_id
1
2

The GROUP BY clause is more useful when used with aggregate functions.

Aggregate functions #

An aggregate function takes a group of values, performs a calculation, and returns a single value.

Here are the most commonly used aggregate functions:

MIN function returns the minimum value in a set.
MAX function returns the maximum value in a set.
SUM function returns the total of a set.
AVG function returns the average value of a set.
COUNT function returns the number of items of a set.

MIN function #

The following statement uses the MIN() statement to find the lowest quantity in the inventories table:

SELECT
  MIN(quantity)
FROM
  inventories;Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)

Try it

Output:

 min
-----
  15Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)

How the query works:

First, the FROM clause examines every row in the inventories table.
Second, the MIN function returns the lowest quantity of all the rows.

MAX Function #

The following example uses the MAX() function to find the highest quantity of a product in the inventory:

SELECT
  MAX(quantity)
FROM
  inventories;Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)

Try it

Output:

 max
-----
  60Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)

SUM Function #

The following SELECT statement uses the SUM() function to calculate the total quantity of all products in the inventories table:

SELECT
  SUM(quantity)
FROM
  inventories;Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)

Try it

Output:

 sum
-----
 370Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)

AVG Function #

The following statement uses the AVG() function to calculate the average quantity of all products in the inventories table:

SELECT
  AVG(quantity)
FROM
  inventories;Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)

Try it

Output:

         avg
---------------------
 37.0000000000000000Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)

COUNT Function #

The following statement uses the COUNT() function to return the number of rows in the inventories table:

SELECT
  COUNT(*)
FROM
  inventories;Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)

Try it

Output:

 count
-------
    10Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)

Using PostgreSQL GROUP BY clause with aggregate functions #

Here’s the syntax for using a GROUP BY clause with aggregate functions:

SELECT
  column1,
  aggregate_function (column2)
FROM
  table_name
GROUP BY
  column1;Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)

In this syntax:

First, the GROUP BY clause groups the rows from the table into groups based on the column1.

Second, the aggregate function performs a calculation and returns a single value for each group:

Third, the SELECT clause returns the grouped column (column1) and an aggregated value for each group.

For example, the following shows how to calculate the total quantity for each warehouse:

SELECT
  warehouse_id,
  SUM(quantity)
FROM
  inventories
GROUP BY
  warehouse_id;Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)

Try it

Output:

 warehouse_id | sum
--------------+-----
            2 | 155
            1 | 215Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)

In this example:

The FROM clause starts examining all rows in the inventories table.

The GROUP BY clause divides all the rows in the inventories into two groups:

warehouse_id	sum	product_name	quantity	brand_id
1	215	iPhone 15	55	1
		Galaxy S24	50	2
		Pixel 9	50	3
		Pixel Fold	60	3
2	155	Galaxy Note 23	40	2
		Galaxy Z Fold 6	25	2
		Galaxy Z Flip 6	15	2
		iPhone 14 Pro Max	15	1
		iPhone 13	35	1
		Pixel 8 Pro	25	2

The SUM function then calculates the total of the quantity for each group.

The SELECT clause returns the warehouse_id and total quantity for each group:

Using PostgreSQL GROUP BY with INNER JOIN #

You can use a join to merge rows of multiple tables and use the GROUP BY clause to group rows into groups.

For example, the following query returns the warehouse name and the total quantity of products for each warehouse:

SELECT
  warehouse_name,
  SUM(quantity)
FROM
  inventories
  INNER JOIN warehouses USING (warehouse_id)
GROUP BY
  warehouse_name;Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)

Try it

Output:

 warehouse_name | sum
----------------+-----
 San Jose       | 215
 San Francisco  | 155Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)

How it works.

First, the FROM examines all rows in the inventories table.

Next, the INNER JOIN merge rows of the inventories table with the rows in the warehouses table based on the values in the warehouse_id column:

id	product_name	quantity	warehouse_id	brand_id	warehouse_id	warehouse_name
1	iPhone 15	55	1	1	1	San Jose
2	iPhone 14 Pro Max	15	2	1	2	San Francisco
3	iPhone 13	35	2	1	2	San Francisco
4	Galaxy S24	50	1	2	1	San Jose
5	Galaxy Note 23	40	2	2	2	San Francisco
6	Galaxy Z Fold 6	25	2	2	2	San Francisco
7	Galaxy Z Flip 6	15	2	2	2	San Francisco
8	Pixel 9	50	1	3	1	San Jose
9	Pixel 8 Pro	25	2	3	2	San Francisco
10	Pixel Fold	60	1	3	1	San Jose

Then, the GROUP BY group merged rows by the values in the warehouse_name:

warehouse_name	id	product_name	quantity	warehouse_id	brand_id
San Jose	1	iPhone 15	55	1	1
	4	Galaxy S24	50	1	2
	8	Pixel 9	50	1	3
	10	Pixel Fold	60	1	3
San Francisco	2	iPhone 14 Pro Max	15	2	1
	3	iPhone 13	35	2	1
	5	Galaxy Note 23	40	2	2
	6	Galaxy Z Fold 6	25	2	2
	7	Galaxy Z Flip 6	15	2	2
	9	Pixel 8 Pro	25	2	3

After that, the SUM function calculates the total for each group:

warehouse_name	sum	id	product_name	quantity	warehouse_id	brand_id
San Jose	215	1	iPhone 15	55	1	1
		4	Galaxy S24	50	1	2
		8	Pixel 9	50	1	3
		10	Pixel Fold	60	1	3
San Francisco	155	2	iPhone 14 Pro Max	15	2	1
		3	iPhone 13	35	2	1
		5	Galaxy Note 23	40	2	2
		6	Galaxy Z Fold 6	25	2	2
		7	Galaxy Z Flip 6	15	2	2
		9	Pixel 8 Pro	25	2	3

Finally, the SELECT clause returns the warehouse names and total quantity of each:

warehouse_name	sum
San Jose	215
San Francisco	200

Group by multiple columns #

The following statement uses the GROUP BY clause to group rows in the inventories table by warehouse_id and brand_id:

SELECT
  warehouse_id,
  brand_id,
  SUM(quantity)
FROM
  inventories
GROUP BY
  warehouse_id,
  brand_id;Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)

Try it

Output:

 warehouse_id | brand_id | sum
--------------+----------+-----
            1 |        1 |  55
            2 |        3 |  25
            1 |        3 | 110
            2 |        2 |  80
            1 |        2 |  50
            2 |        1 |  50Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)

In this example, the combination of values in the warehouse_id and brand_id column forms a group. The SUM() function returns the total of product quantity for each group.

Summary #

Use the GROUP BY clause to group rows in an SQL query into summary rows based on one or more columns.
Use aggregate functions with the GROUP BY clause to calculate a value for each group.
Use JOIN with GROUP BY to merge rows from multiple table and group rows into groups.

Quiz #

Was this tutorial helpful ?