Summary: in this tutorial, you’ll learn how to use the PostgreSQL SUBSTRING
function to extract the substring from a string.
Introduction to the PostgreSQL SUBSTRING function #
The PostgreSQL SUBSTRING()
function allows you to extract a substring of a string starting at a position (start
) and stopping after n
characters.
Here’s the basic syntax of the substring function:
SUBSTRING (string [FROM start] [FOR count])
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
The SUBSTRING()
function accepts three parameters:
string
: The input string you want to extract a substring.start
: The beginning position in the input string to start extracting. Thestart
is optional and defaults to 1, the beginning of the input string.count
: The number of characters to extract. Thecount
is optional and defaults to the length of the input string.
The SUBSTRING()
function returns the substring extracted from the input string.
Basic PostgreSQL SUBSTRING function examples #
The following query uses the SUBSTRING()
function to extract the domain extension:
SELECT
SUBSTRING(
'pgtutorial.com'
FROM
12 FOR 3
) extension;
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
Output:
extension
-----------
com
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
To make the query more generic, you can combine the SUBSTRING
function with the POSITION
function:
SELECT
SUBSTRING(
'pgtutorial.com'
FROM
POSITION('.' IN 'pgtutorial.com') + 1 FOR 3
) extension;
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
Output:
extension
-----------
com
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
In this example:
- First, the
POSITION
function returns the character’s starting position (.) in the domain name. - Second, the
SUBSTRING
extracts three characters starting from the character’s position (.) plus one.
The following example extracts the extension of the domain pgtutorial.com without using the count argument:
SELECT
SUBSTRING(
'pgtutorial.com'
FROM
POSITION('.' IN 'pgtutorial.com') + 1
) extension;
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
Output:
In this example, the SUBSTRING()
function returns a substring starting from the character’s position (.
) to the end of the string.
The following query uses the SUBSTRING()
function to extract the domain name without extension:
SELECT
SUBSTRING(
'pgtutorial.com' FOR POSITION('.' IN 'pgtutorial.com') - 1
);
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
Output:
substring
------------
pgtutorial
Code language: plaintext (plaintext)
In this example:
- First, the
POSITION
returns the index of the character (.
) in the domain name. - Second, the
SUBSTRING
returns a substring from index1
(default) to the location before the character (.
).
Creating a user-defined function that uses the SUBSTRING function #
The following statement creates a user-defined function called extract_domain
that extracts domain and extension from a domain name using the SUBSTRING
function:
CREATE OR REPLACE FUNCTION extract_domain (domain_name TEXT)
RETURNS TABLE (
domain TEXT,
extension TEXT
) AS
$$
DECLARE
dot_index INT;
BEGIN
dot_index := POSITION('.' IN domain_name);
RETURN QUERY
SELECT
SUBSTRING(domain_name FROM 1 FOR dot_index - 1),
SUBSTRING(domain_name FROM dot_index + 1);
END;
$$ LANGUAGE plpgsql;
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
How it works:
First, find the index of the dot character (.
) using the POSITION
function:
dot_index := POSITION('.' IN domain_name);
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
Second, use the SUBSTRING()
function to extract the domain name and extension from the domain name based on the dot index.
The following statement calls the extract_domain
function to extract parts of the domain name pgtutorial.com
:
SELECT
domain,
extension
FROM
extract_domain ('pgtutorial.com');
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
Output:
domain | extension
------------+-----------
pgtutorial | com
Code language: plaintext (plaintext)
Extracting a substring based on a regular expression #
Here’s another form of the SUBSTRING()
function that extracts a substring from an input string based on a regular expression:
SUBSTRING( string FROM pattern )
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
This function returns the first substring in the string that matches the regular expression.
For example, the following query uses the SUBSTRING()
function to pull the domain name from the email "[email protected]"
:
SELECT
SUBSTRING(
'[email protected]'
FROM
'@(.*)'
) domain_name;
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
Output:
domain_name
----------------
pgtutorial.com
Code language: plaintext (plaintext)
Heres’ the breakdown of the pattern @(.*)
:
@
: A @ symbol.(.*)
: A capturing group that matches any number of characters(.*)
.
In short, the regular expression @(.*)
matches everything after the character (@
).
Extracting a substring similar to a regular expression #
The following SUBSTRING()
function extracts a substring that matches a regular expression, which is similar to the LIKE
operator but supports regular expression-like syntax:
SUBSTRING (string SIMILAR pattern ESCAPE escape_character)
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
In this syntax:
string
: The input text.pattern
: A pattern with regex-like syntax.escape_character
: An optional escape character for the regular expression.
For example, the following query uses the SUBSTRING()
function to retrieve the product with a name ending with two digits:
SELECT
SUBSTRING(product_name SIMILAR '%\d{2}' ESCAPE '\') AS name
FROM
products
WHERE
SUBSTRING(product_name SIMILAR '%\d{2}' ESCAPE '\') IS NOT NULL;
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
Output:
name
--------------------
Samsung Galaxy S24
Apple iPhone 15
Huawei Mate 60
Xiaomi Mi 14
Dell XPS 15
HP Spectre x360
Dell Inspiron 27
Code language: plaintext (plaintext)
In this example, the pattern %\d{2}
matches any string that ends with two digits.
Summary #
- Use PostgreSQL
SUBSTRING
functions to extract a substring from a string.