PostgreSQL Covering Index

Summary: in this tutorial, you’ll learn how to use PostgreSQL covering index to boost query performance.

PostgreSQL Covering Index Overview #

When a query retrieves data based on indexed columns, PostgreSQL looks up the matching rows in the index and accesses the table to retrieve the actual rows. This is a 2-steps process. The second step may slow the query if the table has significant data.

A covering index is a database index that includes all the columns needed to satisfy a query. It allows PostgreSQL to retrieve the required data directly from the index without accessing the table.

To create a covering index, you can use the CREATE INDEX statement with an INCLUDE clause to specify the extra columns to be stored in the index:

CREATE INDEX [index_name]
ON table_name(indexed_columns)
INCLUDE(extra_columns);Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)

In this syntax:

  • First, specify the name of the new index.
  • Second, list the table with indexed columns.
  • Third, add extra columns in the INCLUDE clause.

Since PostgreSQL needs to store data of the extra columns in the index, the index’s size will increase. Therefore, you should use covering indexes strategically.

You should use covering indexes for frequent read-heavy queries that return small columns, e.g., queries with grouping operations or analytical queries.

However, if the table has heavy writes, frequent updates, or large columns, you should avoid using covering indexes.

PostgreSQL Covering Index Example #

First, create a products table to store product data:

CREATE TABLE products (
  id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  name VARCHAR(255) NOT NULL,
  price DEC(10, 2) NOT NULL
);Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)

Second, create a function to insert n rows into the products table:

CREATE OR REPLACE FUNCTION insert_products (n INT) 
RETURNS VOID 
AS 
$$
DECLARE
   i INT;
BEGIN
   FOR i IN 1..n LOOP
      INSERT INTO products (name, price)
      VALUES (
       'Product_' || i, 
       ROUND((RANDOM() * 100 + 1)::numeric, 2)
      );
   END LOOP;
END;
$$ 
LANGUAGE plpgsql;Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)

Third, insert 1000 rows into the products table:

SELECT insert_products (1000);Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)

Fourth, create an index for the name column of the products table and include the price column:

CREATE INDEX ON products(name)
INCLUDE (price);Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)

Fifth, find the products with the name 'Product_100':

EXPLAIN ANALYZE
SELECT name, price
FROM products 
WHERE name = 'Product_100';Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)

Output:

                                                               QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
 Index Only Scan using products_name_price_idx on products  (cost=0.28..4.29 rows=1 width=17) (actual time=0.141..0.143 rows=1 loops=1)
   Index Cond: (name = 'Product_100'::text)
   Heap Fetches: 0
 Planning Time: 0.137 ms
 Execution Time: 0.173 msCode language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)

The output indicates that the query uses an index only scan to retrieve entire data from the index. It does not fetch data from the table:

Heap Fetches: 0Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)

Summary #

  • Use covering indexes to boost query performance directly retrieving all data from indexes.

Quiz #

Was this tutorial helpful ?