PostgreSQL Self-join

Summary: in this tutorial, you’ll learn how to use PostgreSQL self-join to compare rows within the same table.

Introduction to PostgreSQL Self-join #

PostgreSQL self-join is a powerful technique that allows you to join a table to itself using an inner join, left join, or right join.

Since a table is joined to itself, this kind of join technique is called self-join. The self-join technique can be helpful when you want to compare rows within the same table.

PostgreSQL does not allow you to use the same table multiple times within the same SQL statement. If you do so, PostgreSQL will issue a syntax error.

To use the same table multiple times within a statement, you must assign different table aliases to the table. In this case, PostgreSQL will treat these aliases as separate tables.

Here’s the syntax of a self-join:

SELECT
  column1,
  column2,
  column3
FROM
  table_name t1
  INNER JOIN table_name t2 ON t1.column1 = t2.column2;Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)

In this syntax:

  • First, specify a table name in the FROM clause with the alias t1.
  • Second, provide the same table name in the INNER JOIN clause with the alias t2.
  • Third, use a condition to match rows within the table.

Besides the INNER JOIN clause, you can use LEFT JOIN or RIGHT JOIN clause in the SELECT statement.

PostgreSQL Self-Join examples #

Let’s take some examples of using PostgreSQL self-join.

Setting up a sample table #

First, create a new table called categories:

CREATE TABLE categories (
  category_id INT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
  name VARCHAR(255) NOT NULL,
  parent_id INT,
  FOREIGN KEY (parent_id) REFERENCES categories (category_id) ON DELETE CASCADE
);Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)

In the categories table, the parent_id is a foreign key column that references the primary key column category_id.

The top category will have NULL in the parent_id column.

A subcategory will have a value in the parent_id column that is the same as a value in the category_id column of an existing row.

Second, insert data into the categories table:

INSERT INTO
  categories (category_id, name, parent_id)
VALUES
  (1, 'Electronics', NULL),
  (2, 'Mobile Devices', 1),
  (3, 'Smartphones', 2),
  (4, 'Tablets', 2),
  (5, 'Accessories', 2),
  (6, 'Wearables', 2),
  (7, 'Home Entertainment', 1),
  (8, 'Televisions', 7),
  (9, 'Audio Systems', 7),
  (10, 'Computers', 1),
  (11, 'Laptops', 10),
  (12, 'Desktops', 10)
RETURNING *;Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)

Output:

 category_id |        name        | parent_id
-------------+--------------------+-----------
           1 | Electronics        |      NULL
           2 | Mobile Devices     |         1
           3 | Smartphones        |         2
           4 | Tablets            |         2
           5 | Accessories        |         2
           6 | Wearables          |         2
           7 | Home Entertainment |         1
           8 | Televisions        |         7
           9 | Audio Systems      |         7
          10 | Computers          |         1
          11 | Laptops            |        10
          12 | Desktops           |        10Code language: plaintext (plaintext)

PostgreSQL self-join using an inner join #

The following statement uses an INNER JOIN to select categories and their parents:

SELECT
  c.name category,
  p.name parent
FROM
  categories c
  INNER JOIN categories p ON c.parent_id = p.category_id;Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)

Try it

Output:

      category      |       parent
--------------------+--------------------
 Mobile Devices     | Electronics
 Smartphones        | Mobile Devices
 Tablets            | Mobile Devices
 Accessories        | Mobile Devices
 Wearables          | Mobile Devices
 Home Entertainment | Electronics
 Televisions        | Home Entertainment
 Audio Systems      | Home Entertainment
 Computers          | Electronics
 Laptops            | Computers
 Desktops           | ComputersCode language: plaintext (plaintext)

The output does not show the top level category which has the parent_id NULL. To include the top level category, you need to use a LEFT JOIN to perform a self-join.

PostgreSQL self-join using a left join #

The following statement uses a LEFT JOIN to select categories and their parents, including the top category:

SELECT
  c.name category,
  p.name parent
FROM
  categories c
  LEFT JOIN categories p ON c.parent_id = p.category_id;Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)

Try it

Output:

      category      |       parent
--------------------+--------------------
 Electronics        | NULL
 Mobile Devices     | Electronics
 Smartphones        | Mobile Devices
 Tablets            | Mobile Devices
 Accessories        | Mobile Devices
 Wearables          | Mobile Devices
 Home Entertainment | Electronics
 Televisions        | Home Entertainment
 Audio Systems      | Home Entertainment
 Computers          | Electronics
 Laptops            | Computers
 Desktops           | ComputersCode language: plaintext (plaintext)

Summary #

  • A self-join is a join where a table is joined to itself.
  • Use a self-join when you want to compare rows within the same table.

Quiz #

Was this tutorial helpful ?