PostgreSQL CUME_DIST Window Function

Summary: In this tutorial, you’ll learn how to use the PostgreSQL CUME_DIST() window function to calculate the cumulative distribution of a value in a set of values.

Getting Started with PostgreSQL CUME_DIST Window Function #

The CUME_DIST function returns the cumulative distribution of a value within a set of values. It provides the relative position of a value in a set, helping to identify the percentile rank of that value.

A percentile rank is a statistical measure that shows the relative position of a value within a dataset.

For instance, if a price is in the 80th percentile, it indicates that 80% of product prices are equal to or lower than that price.

Syntax #

Here’s the syntax of the CUME_DIST() window function:

CUME_DIST() OVER (
  [PARTITION BY partition_expression]
  [ORDER BY sort_expression]
)Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
  • PARTITION BY: Divides the result set into partitions where the function applies. This clause is optional. If you omit it, the function will treat the whole result set as a single partition.
  • ORDER BY: Specifies the order of rows in each partition.

The CUME_DIST function returns a value in the range of (0, 1).

PostgreSQL CUME_DIST Window Function Examples #

Let’s take some examples of using the CUME_DIST function with the products table:

PostgreSQL CUME_DIST Window Function - Sample Table

Basic PostgreSQL CUME_DIST Window Function Example #

The following SELECT statement uses the CUME_DIST() function to calculate the price percentile for each product:

SELECT
  product_name,
  price,
  CUME_DIST() OVER (
    ORDER BY
      price DESC
  ) AS price_distribution
FROM
  products;Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)

Try it

Output:

        product_name        |  price  | price_distribution
----------------------------+---------+--------------------
 Samsung QN900C Neo QLED    | 2999.99 |               0.04
 LG G3 OLED                 | 2499.99 |               0.12
 Sony Bravia XR A95K        | 2499.99 |               0.12
 LG OLED TV C3              | 1999.99 |               0.16
 Samsung Galaxy Z Fold 5    | 1799.99 |                0.2
 Lenovo ThinkPad X1 Carbon  | 1599.99 |               0.24
 Dell XPS 15                | 1499.99 |               0.28
 HP Spectre x360            | 1399.99 |               0.32
 Microsoft Surface Laptop 5 | 1299.99 |               0.48
 Apple iPhone 15 Pro Max    | 1299.99 |               0.48
 Sony HT-A7000 Soundbar     | 1299.99 |               0.48
 Apple iMac 24"             | 1299.99 |               0.48
 Apple iPhone 15            | 1099.99 |               0.56
 Apple iPad Pro 12.9        | 1099.99 |               0.56
 Samsung Galaxy S24         |  999.99 |               0.64
 Dell Inspiron 27           |  999.99 |               0.64
 Sony Xperia 1 VI           |  949.99 |               0.68
 Huawei Mate 60             |  899.99 |               0.72
 Xiaomi Mi 14               |  799.99 |               0.76
 Samsung Galaxy Tab S9      |  699.99 |                0.8
 Apple Watch Series 9       |  399.99 |               0.88
 Bose SoundLink Max         |  399.99 |               0.88
 Samsung Galaxy Watch 6     |  349.99 |               0.92
 Apple AirPods Pro 3        |  249.99 |               0.96
 Samsung Galaxy Buds Pro 2  |  199.99 |                  1Code language: plaintext (plaintext)

The output shows that over 80% of products have prices greater than 699.99, and only 20% have prices at 1799.99.

Calculating the Price Percentile for Each category #

The following statement uses the CUME_DIST() function to calculate the price percentile for products in each category:

SELECT
  category_id,
  product_name,
  price,
  CUME_DIST() OVER (
    PARTITION BY
      category_id
    ORDER BY
      price DESC
  )::DEC(3, 2) AS price_distribution
FROM
  products;Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)

Try it

Output:

category_id |        product_name        |  price  | price_distribution
-------------+----------------------------+---------+--------------------
           3 | Samsung Galaxy Z Fold 5    | 1799.99 |               0.14
           3 | Apple iPhone 15 Pro Max    | 1299.99 |               0.29
           3 | Apple iPhone 15            | 1099.99 |               0.43
           3 | Samsung Galaxy S24         |  999.99 |               0.57
           3 | Sony Xperia 1 VI           |  949.99 |               0.71
           3 | Huawei Mate 60             |  899.99 |               0.86
           3 | Xiaomi Mi 14               |  799.99 |               1.00
           4 | Apple iPad Pro 12.9        | 1099.99 |               0.50
           4 | Samsung Galaxy Tab S9      |  699.99 |               1.00
           5 | Apple AirPods Pro 3        |  249.99 |               0.50
           5 | Samsung Galaxy Buds Pro 2  |  199.99 |               1.00
           6 | Apple Watch Series 9       |  399.99 |               0.50
           6 | Samsung Galaxy Watch 6     |  349.99 |               1.00
           8 | Samsung QN900C Neo QLED    | 2999.99 |               0.25
           8 | Sony Bravia XR A95K        | 2499.99 |               0.75
           8 | LG G3 OLED                 | 2499.99 |               0.75
           8 | LG OLED TV C3              | 1999.99 |               1.00
           9 | Sony HT-A7000 Soundbar     | 1299.99 |               0.50
           9 | Bose SoundLink Max         |  399.99 |               1.00
          11 | Lenovo ThinkPad X1 Carbon  | 1599.99 |               0.25
          11 | Dell XPS 15                | 1499.99 |               0.50
          11 | HP Spectre x360            | 1399.99 |               0.75
          11 | Microsoft Surface Laptop 5 | 1299.99 |               1.00
          12 | Apple iMac 24"             | 1299.99 |               0.50
          12 | Dell Inspiron 27           |  999.99 |               1.00Code language: plaintext (plaintext)

In this example:

  • The PARTITION BY category_id divides the products by category_id into partitions.
  • The ORDER BY price DESC sorts the products by prices in each partition.

To get the category name, you can join the products table with the categories table:

PostgreSQL CUME_DIST Window Function - Sample Tables
SELECT
  category_name,
  product_name,
  price,
  CUME_DIST() OVER (
    PARTITION BY
      category_name
    ORDER BY
      price DESC
  )::DEC(3, 2) AS price_distribution
FROM
  products
  JOIN categories USING (category_id);Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)

Try it

Output:

 category_name |        product_name        |  price  | price_distribution
---------------+----------------------------+---------+--------------------
 Accessories   | Apple AirPods Pro 3        |  249.99 |               0.50
 Accessories   | Samsung Galaxy Buds Pro 2  |  199.99 |               1.00
 Audio Systems | Sony HT-A7000 Soundbar     | 1299.99 |               0.50
 Audio Systems | Bose SoundLink Max         |  399.99 |               1.00
 Desktops      | Apple iMac 24"             | 1299.99 |               0.50
 Desktops      | Dell Inspiron 27           |  999.99 |               1.00
 Laptops       | Lenovo ThinkPad X1 Carbon  | 1599.99 |               0.25
 Laptops       | Dell XPS 15                | 1499.99 |               0.50
 Laptops       | HP Spectre x360            | 1399.99 |               0.75
 Laptops       | Microsoft Surface Laptop 5 | 1299.99 |               1.00
 Smartphones   | Samsung Galaxy Z Fold 5    | 1799.99 |               0.14
 Smartphones   | Apple iPhone 15 Pro Max    | 1299.99 |               0.29
 Smartphones   | Apple iPhone 15            | 1099.99 |               0.43
 Smartphones   | Samsung Galaxy S24         |  999.99 |               0.57
 Smartphones   | Sony Xperia 1 VI           |  949.99 |               0.71
 Smartphones   | Huawei Mate 60             |  899.99 |               0.86
 Smartphones   | Xiaomi Mi 14               |  799.99 |               1.00
 Tablets       | Apple iPad Pro 12.9        | 1099.99 |               0.50
 Tablets       | Samsung Galaxy Tab S9      |  699.99 |               1.00
 Televisions   | Samsung QN900C Neo QLED    | 2999.99 |               0.25
 Televisions   | Sony Bravia XR A95K        | 2499.99 |               0.75
 Televisions   | LG G3 OLED                 | 2499.99 |               0.75
 Televisions   | LG OLED TV C3              | 1999.99 |               1.00
 Wearables     | Apple Watch Series 9       |  399.99 |               0.50
 Wearables     | Samsung Galaxy Watch 6     |  349.99 |               1.00Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)

Summary #

  • Use the CUME_DIST window function to calculate the cumulative distribution of a value in a set of values.

Quiz #

Was this tutorial helpful ?