Summary: in this tutorial, you’ll learn how to use the PostgreSQL XML data type to store XML data in the database.
PostgreSQL XML Data Type Overview #
PostgreSQL supports built-in XML
data type that allows you to store well-formed XML documents and XML fragments.
To define a column of the XML
type, you use the following syntax:
column_name XML
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
The built-in XML data type has the following advantages over using TEXT
data type to store XML:
- Type safety: PostgreSQL validates XML, ensuring the XML data is valid.
- Built-in functions: PostgreSQL offers built-in functions to help you manipulate XML data effectively.
PostgreSQL XML Example #
First, create a new table called xproducts
that stores product data:
CREATE TABLE xproducts (
id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
data XML
);
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
In the xproducts
table, the data
column has a type of XML that stores the XML data.
Second, insert a new row into the xproducts
table:
INSERT INTO
xproducts (data)
VALUES
(
XMLPARSE(
DOCUMENT '<?xml version="1.0" encoding="UTF-8"?><product>
<name>Samsung Galaxy S24</name>
<price>999.99</price>
<safety_stock>10</safety_stock>
</product>'
)
),
(
XMLPARSE(
DOCUMENT '<?xml version="1.0" encoding="UTF-8"?><product>
<name>Apple iPhone 15</name>
<price>1099.99</price>
<safety_stock>20</safety_stock>
</product>'
)
),
(
XMLPARSE(
DOCUMENT '<?xml version="1.0" encoding="UTF-8"?><product>
<name>Huawei Mate 60</name>
<price>899.99</price>
<safety_stock>30</safety_stock>
</product>'
)
);
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
In this statement:
DOCUMENT
instructs PostgreSQL that the following string is a complete XML document.XMLPARSE
function converts the string into an XML document.- The
INSERT
statement inserts a new row into thexproducts
table.
Third, retrieve the product names from the XML data using the xpath
function:
SELECT
xpath('/product/name/text()', data) AS name
FROM
xproducts;
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
Output:
name
------------------------
{"Samsung Galaxy S24"}
{"Apple iPhone 15"}
{"Huawei Mate 60"}
Code language: plaintext (plaintext)
In this example:
- The XPath
'/product/name/text()'
returns the text value of the name node of the XML document. - The result set contains the rows where each is an array of XML values representing product names.
- Since each product has only one name, the array has only one element.
Fourth, retrieve the product names, and prices and cast them to the corresponding data types:
SELECT
(xpath('/product/name/text()', data)) [1]::TEXT name,
(xpath('/product/price/text()', data)) [1]::TEXT::DEC price,
(xpath('/product/safety_stock/text()', data)) [1]::TEXT::INT safety_stock
FROM
xproducts;
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
Output:
name | price | safety_stock
--------------------+---------+--------------
Samsung Galaxy S24 | 999.99 | 10
Apple iPhone 15 | 1099.99 | 20
Huawei Mate 60 | 899.99 | 30
Code language: plaintext (plaintext)
Let’s break down this expression:
(xpath('/product/name/text()', data)) [1]::TEXT
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
In this expression:
(xpath('/product/name/text()', data))
returns an array.[1]
returns the first element.::TEXT
casts anXML
value to a value ofTEXT
data type.
The following expression casts an XML
value to a TEXT
value first before casting the TEXT
value to a decimal:
(xpath('/product/price/text()', data)) [1]::TEXT::DEC
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
Similarly, The following expression casts an XML
value to a TEXT
value first before casting the text to an integer:
(xpath('/product/safety_stock/text()', data)) [1]::TEXT::INT
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
Performance #
When you query data from the xproducts
table, PostgreSQL has to scan the whole table.
If the performance is critical and the xproducts
table has many rows, you can create an index for the XML column to improve performance.
First, create a functional-index that extracts the names of products as an array of text:
CREATE INDEX product_name_idx
ON xproducts
USING BTREE (cast(xpath('/product/name', data) AS TEXT[]));
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
Second, create a function that insert rows into the xproducts
table:
CREATE FUNCTION insert_products (row_count INT)
RETURNS VOID
LANGUAGE SQL
AS
$$
INSERT INTO xproducts (data)
SELECT
XMLPARSE(DOCUMENT '<?xml version="1.0" encoding="UTF-8"?>
<product>
<name>' || 'Person' || generate_series || '</name>
<price>' || generate_series || '</price>
<safety_stock>' || generate_series || '</safety_stock>
</product>')
FROM generate_series(1, row_count);
$$;
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
Third, call the insert_products
function to insert 500 rows into the xproducts
table:
SELECT insert_products(500);
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
Finally, find the products with the name 'Product 160'
:
EXPLAIN ANALYZE
SELECT
*
FROM
xproducts
WHERE
cast(xpath('/product/name', data) AS TEXT[]) = '{<name>Product 160</name>}';
Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)
Output:
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
Index Scan using product_name_idx on xproducts (cost=0.27..8.28 rows=1 width=36) (actual time=0.048..0.048 rows=0 loops=1)
Index Cond: ((xpath('/product/name'::text, data, '{}'::text[]))::text[] = '{"<name>Product 160</name>"}'::text[])
Planning Time: 0.268 ms
Execution Time: 0.087 ms
Code language: plaintext (plaintext)
The output indicates that the query uses the index for filterintg rows in the xproducts
table.
Summary #
- Use the PostgreSQL
XML
data type to store XML documents or fragments in the database. - Use the
xpath()
function to retrieve a value from XML documents.