Summary: in this tutorial, you’ll learn how to use PostgreSQL CUBE
to generate all possible aggregations of the specified columns.
Introduction to PostgreSQL CUBE #
The GROUP BY
clause allows you to group rows and calculate an aggregation of a single grouping set.
To calculate aggregations of all possible combinations of a set of columns, you can use the GROUP BY
clause with the CUBE
option.
Here’s the syntax of the GROUP BY
with the CUBE
:
SELECT
column1,
column2,
aggregate_function (column3)
FROM
table_name
GROUP BY
CUBE (column1, column2);
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
In this syntax, CUBE
will generate the following groupings:
(column1, column2)
: Individual groups bycolumn1
andcolumn2
.(column1, NULL)
: Groups for values incolumn2
.(NULL, column2)
: Groups for values incolumn1
.(NULL, NULL)
: A group for all columns.
PostgreSQL CUBE Examples #
We’ll use the inventory_reports
table to demonstrate the GROUP BY CUBE
:
Here’s the script for generating inventory_reports table:
CREATE TABLE inventory_reports(
warehouse VARCHAR NOT NULL,
brand VARCHAR NOT NULL,
quantity INT NOT NULL
);
INSERT INTO inventory_reports(warehouse, brand, quantity)
VALUES
('San Jose', 'Apple', 100),
('San Francisco', 'Apple', 200),
('Texas', 'Apple', 300),
('San Jose', 'Samsung', 50),
('San Francisco', 'Samsung', 100),
('Texas', 'Samsung', 150)
RETURNING *;
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
Using PostgreSQL CUBE with One Column #
The following query uses a GROUP BY
clause to calculate the total quantity of each brand:
SELECT
brand,
SUM(quantity) total_quantity
FROM
inventory_reports
GROUP BY
brand;
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
Output:
brand | total_quantity
---------+----------------
Samsung | 300
Apple | 600
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
To include a row that calculates the total quantity for all brands, you can use the GROUP BY CUBE
as follows:
SELECT
brand,
SUM(quantity) total_quantity
FROM
inventory_reports
GROUP BY
CUBE (brand)
ORDER BY
brand NULLS LAST;
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
Output:
brand | total_quantity
---------+----------------
Apple | 600
Samsung | 300
NULL | 900 -> grand total
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
In this example, the CUBE
generates two groupings:
(brand)
: Calculate the sum of quantity by brand.()
: Calculate the sum of the quantity of all brands.
Using CUBE with Multiple Columns #
The following query uses the GROUP BY CUBE
to generate the subtotals and total:
SELECT
brand,
warehouse,
SUM(quantity) total_quantity
FROM
inventory_reports
GROUP BY
CUBE (brand, warehouse)
ORDER BY
brand NULLS LAST,
warehouse NULLS LAST;
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
Output:
brand | warehouse | total_quantity
---------+---------------+----------------
Apple | San Francisco | 200
Apple | San Jose | 100
Apple | Texas | 300
Apple | NULL | 600
Samsung | San Francisco | 100
Samsung | San Jose | 50
Samsung | Texas | 150
Samsung | NULL | 300
NULL | San Francisco | 300
NULL | San Jose | 150
NULL | Texas | 450
NULL | NULL | 900
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
In this example, the CUBE
generates four grouping sets:
(brand, warehouse)
: Calculate the total quantity by brand and warehouse.(brand, NULL)
: Calculate the subtotal per brand, summing all warehouses for that brand.NULL
in thewarehouse
column.(NULL, warehouse)
: Calculate the subtotal per warehouse, summing across all brands.NULL
in thebrand
column.(NULL, NULL)
: Calculate the grand total, summing all warehouses and brands together.
Differences Between GROUPING SETS, ROLLUP, and CUBE #
The following table explains the differences between GROUPING SETS
, ROLLUP
and CUBE
:
Option | Description |
---|---|
CUBE | Generates all possible combinations of specified columns. |
ROLLUP | Generates hierarchical summaries. |
GROUPING SETS | Generates specified grouping sets instead of generating all possibilities. |
Summary #
- Use PostgreSQL
CUBE
to create multi-dimensional summary by calculating all possible aggregations of the specified columns.