Summary: in this tutorial, you’ll learn how to use PostgreSQL partial indexes to index only data that meets a specified condition.
Introduction to PostgreSQL partial indexes #
When you create an index that includes a column or a set of columns, PostgreSQL extracts all data from these columns for indexing.
PostgreSQL allows you to include only a subset of data in a table that meets a specified condition in an index. This index is called a partial index.
To create a partial index, you use the following CREATE INDEX
statement with a WHERE
clause:
CREATE INDEX [index_name]
ON table_name (column1, column2)
WHERE condition;
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
In this syntax:
- First, specify an index name after the
CREATE INDEX
keywords. The index name is optional. - Second, provide the table name followed by one or more columns to include in the index.
- Third, define a condition in the
WHERE
clause to specify which rows should be included in the index. Only the rows that meet the condition will be included.
When you query data from the table, PostgreSQL will utilize the partial index only for rows that are included in the index.
PostgreSQL partial indexes example #
First, create a products
table:
CREATE TABLE products (
id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
name VARCHAR(255) NOT NULL,
price DEC(10, 2) NOT NULL,
discontinued BOOL
);
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
Next, create a function to insert rows into the products
table:
CREATE OR REPLACE FUNCTION insert_products (n INT)
RETURNS VOID
AS
$$
DECLARE
i INT;
BEGIN
FOR i IN 1..n LOOP
INSERT INTO products (name, price, discontinued)
VALUES (
'Product_' || i, -- Generating a simple name
ROUND((RANDOM() * 100 + 1)::numeric, 2), -- Corrected casting for rounding
CASE WHEN RANDOM() < 0.2 THEN TRUE ELSE FALSE END -- 20% chance of being discontinued
);
END LOOP;
END;
$$
LANGUAGE plpgsql;
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
Then, call the insert_products
function to insert 1000
rows into the products
table:
SELECT insert_products(1000);
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
After that, create an index that includes only active products:
CREATE INDEX ON products (price)
WHERE
discontinued = FALSE;
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
This query creates a partial index products_price_idx
that includes only products with the discontinued false
.
Finally, retrieve the active products:
EXPLAIN SELECT
name,
price
FROM
products
WHERE
discontinued = FALSE
AND price < 100;
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
Output:
QUERY PLAN
-----------------------------------------------------------------------------------
Bitmap Heap Scan on products (cost=9.57..18.66 rows=167 width=532)
Recheck Cond: ((price < '100'::numeric) AND (NOT discontinued))
-> Bitmap Index Scan on products_price_idx (cost=0.00..9.53 rows=167 width=0)
Index Cond: (price < '100'::numeric)
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
The output indicates that the query uses the products_price_idx
index.
If you query the discontinued products, PostgreSQL will not utilize the index. For example:
EXPLAIN SELECT
name,
price
FROM
products
WHERE
discontinued = TRUE
AND price < 100;
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
Output:
QUERY PLAN
------------------------------------------------------------
Seq Scan on products (cost=0.00..11.75 rows=23 width=532)
Filter: (discontinued AND (price < '100'::numeric))
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
Benefits of PostgreSQL partial indexes #
Partial indexes have the following benefits:
- Smaller index size – Partial indexes take only targeted rows for indexing. Therefore, they take up less disk space.
- Faster index scans and better query performance – When you execute queries that match the index condition, PostgreSQL has to scan a smaller index, which helps improve query performance.
- Reduced Write Overhead and better query performance: When you insert, update, and delete, PostgreSQL has to update a subset of rows, leading to fewer index updates.
When to use partial indexes #
In practice, you can use partial indexes in the following scenarios:
- A large table with queries consistently including the conditions matched the partial index.
- The filter conditions significantly reduce the number of rows for indexing, e.g., active users, live products, etc.
Summary #
- Use partial indexes to optimize queries by indexing only a subset of table rows.