Summary: in this tutorial, you’ll learn how to use PostgreSQL self-join to compare rows within the same table.
Introduction to PostgreSQL Self-join #
PostgreSQL self-join is a powerful technique that allows you to join a table to itself using an inner join, left join, or right join.
Since a table is joined to itself, this kind of join technique is called self-join. The self-join technique can be helpful when you want to compare rows within the same table.
PostgreSQL does not allow you to use the same table multiple times within the same SQL statement. If you do so, PostgreSQL will issue a syntax error.
To use the same table multiple times within a statement, you must assign different table aliases to the table. In this case, PostgreSQL will treat these aliases as separate tables.
Here’s the syntax of a self-join:
SELECT
column1,
column2,
column3
FROM
table_name t1
INNER JOIN table_name t2 ON t1.column1 = t2.column2;
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
In this syntax:
- First, specify a table name in the
FROM
clause with the aliast1
. - Second, provide the same table name in the
INNER JOIN
clause with the aliast2
. - Third, use a condition to match rows within the table.
Besides the INNER JOIN
clause, you can use LEFT JOIN
or RIGHT JOIN
clause in the SELECT
statement.
PostgreSQL Self-Join examples #
Let’s take some examples of using PostgreSQL self-join.
Setting up a sample table #
First, create a new table called categories
:
CREATE TABLE categories (
category_id INT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
name VARCHAR(255) NOT NULL,
parent_id INT,
FOREIGN KEY (parent_id) REFERENCES categories (category_id) ON DELETE CASCADE
);
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
In the categories
table, the parent_id
is a foreign key column that references the primary key column category_id
.
The top category will have NULL
in the parent_id
column.
A subcategory will have a value in the parent_id
column that is the same as a value in the category_id
column of an existing row.
Second, insert data into the categories
table:
INSERT INTO
categories (category_id, name, parent_id)
VALUES
(1, 'Electronics', NULL),
(2, 'Mobile Devices', 1),
(3, 'Smartphones', 2),
(4, 'Tablets', 2),
(5, 'Accessories', 2),
(6, 'Wearables', 2),
(7, 'Home Entertainment', 1),
(8, 'Televisions', 7),
(9, 'Audio Systems', 7),
(10, 'Computers', 1),
(11, 'Laptops', 10),
(12, 'Desktops', 10)
RETURNING *;
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
Output:
category_id | name | parent_id
-------------+--------------------+-----------
1 | Electronics | NULL
2 | Mobile Devices | 1
3 | Smartphones | 2
4 | Tablets | 2
5 | Accessories | 2
6 | Wearables | 2
7 | Home Entertainment | 1
8 | Televisions | 7
9 | Audio Systems | 7
10 | Computers | 1
11 | Laptops | 10
12 | Desktops | 10
Code language: plaintext (plaintext)
PostgreSQL self-join using an inner join #
The following statement uses an INNER JOIN
to select categories and their parents:
SELECT
c.name category,
p.name parent
FROM
categories c
INNER JOIN categories p ON c.parent_id = p.category_id;
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
Output:
category | parent
--------------------+--------------------
Mobile Devices | Electronics
Smartphones | Mobile Devices
Tablets | Mobile Devices
Accessories | Mobile Devices
Wearables | Mobile Devices
Home Entertainment | Electronics
Televisions | Home Entertainment
Audio Systems | Home Entertainment
Computers | Electronics
Laptops | Computers
Desktops | Computers
Code language: plaintext (plaintext)
The output does not show the top level category which has the parent_id
NULL
. To include the top level category, you need to use a LEFT JOIN
to perform a self-join.
PostgreSQL self-join using a left join #
The following statement uses a LEFT JOIN
to select categories and their parents, including the top category:
SELECT
c.name category,
p.name parent
FROM
categories c
LEFT JOIN categories p ON c.parent_id = p.category_id;
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
Output:
category | parent
--------------------+--------------------
Electronics | NULL
Mobile Devices | Electronics
Smartphones | Mobile Devices
Tablets | Mobile Devices
Accessories | Mobile Devices
Wearables | Mobile Devices
Home Entertainment | Electronics
Televisions | Home Entertainment
Audio Systems | Home Entertainment
Computers | Electronics
Laptops | Computers
Desktops | Computers
Code language: plaintext (plaintext)
Summary #
- A self-join is a join where a table is joined to itself.
- Use a self-join when you want to compare rows within the same table.