Summary: in this tutorial, you’ll learn how to use the PostgreSQL REGEXP_MATCH
function to extract the first match of a regular expression from a string.
PostgreSQL REGEXP_MATCH Function Overview #
In PostgreSQL, the REGEXP_MATCH
function allows you to extract the first match of a POSIX regular expression from a string.
The syntax of the REGEXP_MATCH
function is as follows:
REGEXP_MATCH(string, pattern [, flags])
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
string
: The input string to search.pattern
: The POSIX regular expression pattern to match.flags
(optional): Additional options to determine the behavior of the regular expression match, such as'i'
for case-insensitive matching.
The REGEXP_MATCH
function returns an array of text (TEXT[]
) containing the matched substring(s). It returns NULL
if the string
has no match.
PostgreSQL REGEXP_MATCH function example #
The following example uses the REGEXP_MATCH
function to search for the first substring that has one or more digits:
SELECT
REGEXP_MATCH('iPhone 15 was introduced in 2023', '\d+') matches;
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
The \d+
regular expression matches one or more digits. Therefore, the function returns the first match, which is 15
, not 2023
.
matches
---------
{15}
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
The result is an array that contains the matched substring.
Extracting multiple groups #
You can use capturing groups ()
in the regex pattern to extract specific parts of the match:
SELECT
REGEXP_MATCH(
'iPhone 15 was introduced on 09-12-2023',
'(\d{2})-(\d{2})-(\d{4})'
) matches;
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
Output:
matches
--------------
{09,12,2023}
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
In this example, the result array contains three digits corresponding to the three capturing groups in the regular expression.
To access the nth element of the result array, you use the square bracket notation with the index (1-based index).
For example, the following query returns the year from the result array:
SELECT
(REGEXP_MATCH(
'iPhone 15 was introduced on 09-12-2023',
'(\d{2})-(\d{2})-(\d{4})'
)) [3] released_year;
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
Output:
released_year
---------------
2023
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
Using REGEXP_MATCH with Table Data #
We’ll use the products
table from the inventory
database:

The following query uses the REGEXP_MATCH
function to extract the first number from product names in the products
table:
SELECT
product_name,
REGEXP_MATCH(product_name, '\d+') number
FROM
products;
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
Output:
product_name | number
----------------------------+--------
Samsung Galaxy S24 | {24}
Apple iPhone 15 | {15}
Huawei Mate 60 | {60}
Xiaomi Mi 14 | {14}
Sony Xperia 1 VI | {1}
...
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
REGEXP_MATCH vs. REGEXP_MATCHES #
The REGEXP_MATCH
function returns the first match, while the REGEXP_MATCHES
function returns all matches as a set of an array.
For example, this query uses the REGEXP_MATCHES
function to return all substrings that contain one or more digits as a set of arrays of text:
SELECT
REGEXP_MATCHES('iPhone 15 was introduced in 2023', '\d+', 'g') matches;
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
Output:
matches
---------
{15}
{2023}
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
Common Use Cases #
The REGEXP_MATCH
function can be helpful in the following cases:
- Data Extraction: Extracting data from text such as phone numbers, emails, and error codes.
- Data Validation: Validate data based on a specific pattern.
Performance Considerations #
Here are some performance considerations when using the REGEXP_MATCH
function:
- Regex Complexity: If you use complex regular expression patterns, the function will be slow, especially with a table with many rows. Ensure to optimize your regular expression patterns for performance.
- Index Utilization: The queries with the
REGEXP_MATCH
function generally cannot utilize indexes, which results in full table scans. If the performance is critical, you can consider using functional indexes. - Flags: The flags like
'i'
may add overhead so ensure using it only when necessary.
Summary #
- Use the
function to extract the first match of a substring based on a POSIX regular expression.REGEXP_MATCH