PostgreSQL XML Data Type

Summary: in this tutorial, you’ll learn how to use the PostgreSQL XML data type to store XML data in the database.

PostgreSQL XML Data Type Overview #

PostgreSQL supports built-in XML data type that allows you to store well-formed XML documents and XML fragments.

To define a column of the XML type, you use the following syntax:

column_name XMLCode language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)

The built-in XML data type has the following advantages over using TEXT data type to store XML:

  • Type safety: PostgreSQL validates XML, ensuring the XML data is valid.
  • Built-in functions: PostgreSQL offers built-in functions to help you manipulate XML data effectively.

PostgreSQL XML Example #

First, create a new table called xproducts that stores product data:

CREATE TABLE xproducts (
  id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  data XML
);Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)

Try it

In the xproducts table, the data column has a type of XML that stores the XML data.

Second, insert a new row into the xproducts table:

INSERT INTO
  xproducts (data)
VALUES
  (
    XMLPARSE(
      DOCUMENT '<?xml version="1.0" encoding="UTF-8"?><product>
    <name>Samsung Galaxy S24</name>
    <price>999.99</price>
    <safety_stock>10</safety_stock>
</product>'
    )
  ),
  (
    XMLPARSE(
      DOCUMENT '<?xml version="1.0" encoding="UTF-8"?><product>
    <name>Apple iPhone 15</name>
    <price>1099.99</price>
    <safety_stock>20</safety_stock>
</product>'
    )
  ),
  (
    XMLPARSE(
      DOCUMENT '<?xml version="1.0" encoding="UTF-8"?><product>
    <name>Huawei Mate 60</name>
   <price>899.99</price>
    <safety_stock>30</safety_stock>
</product>'
    )
  );Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)

In this statement:

  • DOCUMENT instructs PostgreSQL that the following string is a complete XML document.
  • XMLPARSE function converts the string into an XML document.
  • The INSERT statement inserts a new row into the xproducts table.

Third, retrieve the product names from the XML data using the xpath function:

SELECT
  xpath('/product/name/text()', data) AS name
FROM
  xproducts;Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)

Try it

Output:

          name
------------------------
 {"Samsung Galaxy S24"}
 {"Apple iPhone 15"}
 {"Huawei Mate 60"}Code language: plaintext (plaintext)

In this example:

  • The XPath '/product/name/text()' returns the text value of the name node of the XML document.
  • The result set contains the rows where each is an array of XML values representing product names.
  • Since each product has only one name, the array has only one element.

Fourth, retrieve the product names, and prices and cast them to the corresponding data types:

SELECT
  (xpath('/product/name/text()', data)) [1]::TEXT name,
  (xpath('/product/price/text()', data)) [1]::TEXT::DEC price,
  (xpath('/product/safety_stock/text()', data)) [1]::TEXT::INT safety_stock
FROM
  xproducts;Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)

Output:

        name        |  price  | safety_stock
--------------------+---------+--------------
 Samsung Galaxy S24 |  999.99 |           10
 Apple iPhone 15    | 1099.99 |           20
 Huawei Mate 60     |  899.99 |           30Code language: plaintext (plaintext)

Let’s break down this expression:

(xpath('/product/name/text()', data)) [1]::TEXTCode language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)

In this expression:

  • (xpath('/product/name/text()', data)) returns an array.
  • [1] returns the first element.
  • ::TEXT casts an XML value to a value of TEXT data type.

The following expression casts an XML value to a TEXT value first before casting the TEXT value to a decimal:

(xpath('/product/price/text()', data)) [1]::TEXT::DECCode language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)

Similarly, The following expression casts an XML value to a TEXT value first before casting the text to an integer:

(xpath('/product/safety_stock/text()', data)) [1]::TEXT::INTCode language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)

Performance #

When you query data from the xproducts table, PostgreSQL has to scan the whole table.

If the performance is critical and the xproducts table has many rows, you can create an index for the XML column to improve performance.

First, create a functional-index that extracts the names of products as an array of text:

CREATE INDEX product_name_idx 
ON xproducts 
USING BTREE (cast(xpath('/product/name', data) AS TEXT[]));Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)

Second, create a function that insert rows into the xproducts table:

CREATE FUNCTION insert_products (row_count INT) 
RETURNS VOID 
LANGUAGE SQL 
AS 
$$
    INSERT INTO xproducts (data)
    SELECT
        XMLPARSE(DOCUMENT '<?xml version="1.0" encoding="UTF-8"?>
        <product>
            <name>' || 'Person' || generate_series || '</name>
            <price>' || generate_series || '</price>
            <safety_stock>' || generate_series || '</safety_stock>
        </product>')
    FROM generate_series(1, row_count);
$$;Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)

Third, call the insert_products function to insert 500 rows into the xproducts table:

SELECT insert_products(500);Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)

Finally, find the products with the name 'Product 160':

EXPLAIN ANALYZE
SELECT
  *
FROM
  xproducts
WHERE
  cast(xpath('/product/name', data) AS TEXT[]) = '{<name>Product 160</name>}';Code language: PostgreSQL SQL dialect and PL/pgSQL (pgsql)

Output:

                                                         QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------
 Index Scan using product_name_idx on xproducts  (cost=0.27..8.28 rows=1 width=36) (actual time=0.048..0.048 rows=0 loops=1)
   Index Cond: ((xpath('/product/name'::text, data, '{}'::text[]))::text[] = '{"<name>Product 160</name>"}'::text[])
 Planning Time: 0.268 ms
 Execution Time: 0.087 msCode language: plaintext (plaintext)

The output indicates that the query uses the index for filterintg rows in the xproducts table.

Summary #

  • Use the PostgreSQL XML data type to store XML documents or fragments in the database.
  • Use the xpath() function to retrieve a value from XML documents.
Was this tutorial helpful ?