Backup
======

.. raw:: html

   <!-- SPDX-License-Identifier: CC-BY-4.0 -->

..  Note::
   This section covers **physical backups** in PostgreSQL. While PostgreSQL also supports logical backups using the `pg_dump`  utility, these are **not suitable for business continuity** and are **not managed** by CloudNativePG. If you still wish to use `pg_dump` , refer to the :ref:`*Troubleshooting / Emergency backup* section <Backup>` 

for guidance. 　

..  Note::
   Starting with version 1.26, native backup and recovery capabilities are being **progressively phased out** of the core operator and moved to official CNPG-I plugins. This transition aligns with CloudNativePG's shift towards a **backup-agnostic architecture**, enabled by its extensible interface—**CNPG-I**—which standardizes the management of **WAL archiving**, **physical base backups**, and corresponding **recovery processes**. 　

CloudNativePG currently supports **physical backups of PostgreSQL
clusters** in two main ways:

- **Via `CNPG-I <https://github.com/cloudnative-pg/cnpg-i/>`_  plugins**: the CloudNativePG Community officially
  supports the `**Barman Cloud Plugin** <https://cloudnative-pg.io/plugin-barman-cloud/>`_ 

for integration with object storage services.

- **Natively**, with support for:

  - :ref:`Object storage via Barman Cloud <Appendix B - Backup on object stores>` 

  *(although deprecated from 1.26 in favor of the Barman Cloud Plugin)*

  - :ref:`Kubernetes Volume Snapshots <Appendix A - Backup on volume snapshots>`  , if supported by the underlying storage class

Before selecting a backup strategy with CloudNativePG, it’s important to
familiarize yourself with the foundational concepts covered in the
:ref:`Main Concepts <Main Concepts>` 

section. These include WAL archiving, hot and cold backups, performing
backups from a standby, and more.

Main Concepts
-------------

PostgreSQL natively provides first class backup and recovery
capabilities based on file system level (physical) copy. These have been
successfully used for more than 15 years in mission critical production
databases, helping organizations all over the world achieve their
disaster recovery goals with Postgres.

In CloudNativePG, the backup infrastructure for each PostgreSQL cluster
is made up of the following resources:

- **WAL archive**: a location containing the WAL files (transactional
  logs) that are continuously written by Postgres and archived for data
  durability

- **Physical base backups**: a copy of all the files that PostgreSQL
  uses to store the data in the database (primarily the ``PGDATA`` and
  any tablespace)

CNPG-I provides a generic and extensible interface for managing WAL
archiving (both archive and restore operations), as well as the base
backup and corresponding restore processes.

WAL archive
^^^^^^^^^^^

The WAL archive in PostgreSQL is at the heart of **continuous backup**,
and it is fundamental for the following reasons:

- **Hot backups**: the possibility to take physical base backups from
  any instance in the Postgres cluster (either primary or standby)
  without shutting down the server; they are also known as online
  backups

- **Point in Time recovery** (PITR): the possibility to recover at any
  point in time from the first available base backup in your system

..  Warning::
   WAL archive alone is useless. Without a physical base backup, you cannot restore a PostgreSQL cluster. 　

In general, the presence of a WAL archive enhances the resilience of a
PostgreSQL cluster, allowing each instance to fetch any required WAL
file from the archive if needed (normally the WAL archive has higher
retention periods than any Postgres instance that normally recycles
those files).

This use case can also be extended to :ref:`Replica Clusters <Replica Clusters>`  , as they can simply
rely on the WAL archive to synchronize across long distances, extending
disaster recovery goals across different regions.

When you :ref:`configure a WAL archive <WAL archiving>`  , CloudNativePG provides out-of-the-box an
:ref:`PgBouncerPoolMode <PgBouncerPoolMode>`  ≤ 5 minutes for disaster recovery, even across regions.

..  Note::
   Our recommendation is to always setup the WAL archive in production. There are known use cases — normally involving staging and development environments — where none of the above benefits are needed and the WAL archive is not necessary. RPO in this case can be any value, such as 24 hours (daily backups) or infinite (no backup at all). 　

Cold and Hot backups
^^^^^^^^^^^^^^^^^^^^

Hot backups have already been defined in the previous section. They
require the presence of a WAL archive, and they are the norm in any
modern database management system.

**Cold backups**, also known as offline backups, are instead physical
base backups taken when the PostgreSQL instance (standby or primary) is
shut down. They are consistent per definition, and they represent a
snapshot of the database at the time it was shut down.

As a result, PostgreSQL instances can be restarted from a cold backup
without the need of a WAL archive, even though they can take advantage
of it, if available (with all the benefits on the recovery side
highlighted in the previous section).

In those situations with a higher RPO (for example, 1 hour or 24 hours),
and shorter retention periods, cold backups represent a viable option to
be considered for your disaster recovery plans.

Comparing Available Backup Options: Object Stores vs Volume Snapshots
---------------------------------------------------------------------

CloudNativePG currently supports two main approaches for physical
backups:

- **Object store–based backups**, via the `**Barman Cloud  Plugin** <https://cloudnative-pg.io/plugin-barman-cloud/>`_  or the
  :ref:`**deprecated native integration** <Appendix B - Backup on object stores>` 

- :ref:`**Volume Snapshots** <Appendix A - Backup on volume snapshots>`  , using the Kubernetes CSI interface and supported
  storage classes

..  Note::
   CNPG-I is designed to enable third parties to build and integrate their own backup plugins. Over time, we expect the ecosystem of supported backup solutions to grow. 　

Object Store–Based Backups
^^^^^^^^^^^^^^^^^^^^^^^^^^

Backups to an object store (e.g. AWS S3, Azure Blob, GCS):

- Always require WAL archiving

- Support hot backups only

- Do not support incremental or differential copies

- Support retention policies

Volume Snapshots
^^^^^^^^^^^^^^^^

Native volume snapshots:

- Do not require WAL archiving, though its use is still strongly
  recommended in production

- Support incremental and differential copies, depending on the
  capabilities of the underlying storage class

- Support both hot and cold backups

- Do not support retention policies

Choosing Between the Two
^^^^^^^^^^^^^^^^^^^^^^^^

The best approach depends on your environment and operational
requirements. Consider the following factors:

- **Object store availability**: Ensure your Kubernetes cluster can
  access a reliable object storage solution, including a stable
  networking layer.

- **Storage class capabilities**: Confirm that your storage class
  supports CSI volume snapshots with incremental/differential features.

- **Database size**: For very large databases (VLDBs), **volume
  snapshots are generally preferred** as they enable faster recovery due
  to copy-on-write technology—this significantly improves your
  :ref:`Recovery Time Objective (RTO) <Before you start>`  .

- **Data mobility**: Object store–based backups may offer greater
  flexibility for replicating or storing backups across regions or
  environments.

- **Operational familiarity**: Choose the method that aligns best with
  your team’s experience and confidence in managing storage.

Comparison Summary
^^^^^^^^^^^^^^^^^^

.. csv-table::
  :header: Feature,Object Store,Volume Snapshots
  :widths: 12,25,15
  :align: left
  :class: longtable

  **WAL archiving** ,Required,Recommended^1^
  **Cold backup** ,NO,OK
  **Hot backup** ,OK,OK
  **Incremental copy** ,NO,OK^2^
  **Differential copy** ,NO,OK^2^
  **Backup from a standby** ,OK,OK
  **Snapshot recovery** ,NO^3^,OK
  **Retention policies** ,OK,NO
  **Point-in-Time Recovery (PITR)** ,OK,Requires WAL archive
  **Underlying technology** ,Barman Cloud,Kubernetes API

--------------

   **Notes:** > > 1. WAL archiving must currently use an object store
   through a plugin (or the > deprecated native one). > 2. Availability
   of incremental and differential copies depends on the > capabilities
   of the storage class used for PostgreSQL volumes. > 3. Snapshot
   recovery can be emulated by using the >
   ``bootstrap.recovery.recoveryTarget.targetImmediate`` option.

Scheduled Backups
-----------------

Scheduled backups are the recommended way to implement a reliable backup
strategy in CloudNativePG. They are defined using the
``ScheduledBackup`` custom resource.

..  Note::
   For a complete list of configuration options, refer to the :ref:`ScheduledBackupSpec <ScheduledBackupSpec>` 

in the API reference. 　

Cron Schedule
^^^^^^^^^^^^^

The ``schedule`` field defines **when** the backup should occur, using a
*six-field cron expression* that includes seconds. This format follows
the `Go `cron` package specification <https://pkg.go.dev/github.com/robfig/cron#hdr-CRON_Expression_Format>`_  .

..  Warning::
   This format differs from the traditional Unix/Linux `crontab` —it includes a **seconds** field as the first entry. 　

Example of a daily scheduled backup:

.. code:: yaml

   apiVersion: postgresql.cnpg.io/v1
   kind: ScheduledBackup
   metadata:
     name: backup-example
   spec:
     schedule: "0 0 0 * * *"  # At midnight every day
     backupOwnerReference: self
     cluster:
       name: pg-backup
     # method: plugin, volumeSnapshot, or barmanObjectStore (default)

The schedule ``"0 0 0 * * *"`` triggers a backup every day at midnight
(00:00:00). In Kubernetes CronJobs, the equivalent expression would be
``0 0 * * *`` , since seconds are not supported.

Backup Frequency and RTO
^^^^^^^^^^^^^^^^^^^^^^^^

..  tip::
   The frequency of your backups directly impacts your **Recovery Time Objective** ( :ref:`Backup Frequency and RTO <Backup Frequency and RTO>`  ). 　

To optimize your disaster recovery strategy based on continuous backup:

- Regularly test restoring from your backups.

- Measure the time required for a full recovery.

- Account for the size of base backups and the number of WAL files that
  must be retrieved and replayed.

In most cases, a **weekly base backup** is sufficient. It is rare to
schedule full backups more frequently than once per day.

Immediate Backup
^^^^^^^^^^^^^^^^

To trigger a backup immediately when the ``ScheduledBackup`` is created:

.. code:: yaml

   spec:
     immediate: true

 Pause Scheduled Backups
^^^^^^^^^^^^^^^^^^^^^^^^

To temporarily stop scheduled backups from running:

.. code:: yaml

   spec:
     suspend: true

 Backup Owner Reference (``.spec.backupOwnerReference`` )
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Controls which Kubernetes object is set as the owner of the backup
resource:

- ``none`` : No owner reference (legacy behavior)

- ``self`` : The ``ScheduledBackup`` object becomes the owner

- ``cluster`` : The PostgreSQL cluster becomes the owner

On-Demand Backups
-----------------

On-demand backups allow you to manually trigger a backup operation at
any time by creating a ``Backup`` resource.

..  Note::
   For a full list of available options, see the :ref:`BackupSpec <BackupSpec>`  in the API reference. 　

Example: Requesting an On-Demand Backup
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To start an on-demand backup, apply a ``Backup`` request custom resource
like the following:

.. code:: yaml

   apiVersion: postgresql.cnpg.io/v1
   kind: Backup
   metadata:
     name: backup-example
   spec:
     method: barmanObjectStore
     cluster:
       name: pg-backup

In this example, the operator will orchestrate the backup process using
the ``barman-cloud-backup`` tool and store the backup in the configured
object store.

Monitoring Backup Progress
^^^^^^^^^^^^^^^^^^^^^^^^^^

You can check the status of the backup using:

.. code:: bash

   kubectl describe backup backup-example

While the backup is in progress, you’ll see output similar to:

.. code:: text

   Name:         backup-example
   Namespace:    default
   ...
   Spec:
     Cluster:
       Name:  pg-backup
   Status:
     Phase:       running
     Started At:  2020-10-26T13:57:40Z
   Events:        <none>

Once the backup has successfully completed, the ``phase`` will be set to
``completed`` , and the output will include additional metadata:

.. code:: text

   Name:         backup-example
   Namespace:    default
   ...
   Status:
     Backup Id:         20201026T135740
     Destination Path:  s3://backups/
     Endpoint URL:      http://minio:9000
     Phase:             completed
     S3 Credentials:
       Access Key Id:
         Name:  minio
         Key:   ACCESS_KEY_ID
       Secret Access Key:
         Name:  minio
         Key:   ACCESS_SECRET_KEY
     Server Name:       pg-backup
     Started At:        2020-10-26T13:57:40Z
     Stopped At:        2020-10-26T13:57:44Z

--------------

..  Note::
   On-demand backups do **not** include Kubernetes secrets for the PostgreSQL superuser or application user. You should ensure these secrets are included in your broader Kubernetes cluster backup strategy. 　

Backup Methods
--------------

CloudNativePG currently supports the following backup methods for
scheduled and on-demand backups:

- ``plugin`` – Uses a CNPG-I plugin (requires
  ``.spec.pluginConfiguration`` )

- ``volumeSnapshot`` – Uses native :ref:`Kubernetes volume snapshots <Appendix A - Backup on volume snapshots>` 

- ``barmanObjectStore`` – Uses :ref:`Barman Cloud for object storage <Appendix B - Backup on object stores>` 

*(deprecated starting with v1.26 in favor of the `Barman Cloud Plugin <https://cloudnative-pg.io/plugin-barman-cloud/>`_  , but
still the default for backward compatibility)*

Specify the method using the ``.spec.method`` field (defaults to
``barmanObjectStore`` ).

If your cluster is configured to support volume snapshots, you can
enable scheduled snapshot backups like this:

.. code:: yaml

   spec:
     method: volumeSnapshot

To use the Barman Cloud Plugin as the backup method, set
``method: plugin`` and configure the plugin accordingly. You can find an
example in the 
`Performing a Base Backup <https://cloudnative-pg.io/plugin-barman-cloud/docs/usage/#performing-a-base-backup>`_ 

Backup from a Standby
---------------------

Taking a base backup involves reading the entire on-disk data set of a
PostgreSQL instance, which can introduce I/O contention and impact the
performance of the active workload.

To reduce this impact, **CloudNativePG supports taking backups from a
standby instance**, leveraging PostgreSQL’s built-in capability to
perform backups from read-only replicas.

By default, backups are performed on the **most up-to-date replica** in
the cluster. If no replicas are available, the backup will fall back to
the **primary instance**.

..  Note::
   The examples in this section are focused on backup target selection and do not take the backup method (`spec.method` ) into account, as it is not relevant to the scope being discussed. 　

How It Works
^^^^^^^^^^^^

When ``prefer-standby`` is the target (the default behavior),
CloudNativePG will attempt to:

1. Identify the most synchronized standby node.

2. Run the backup process on that standby.

3. Fall back to the primary if no standbys are available.

This strategy minimizes interference with the primary’s workload.

..  Warning::
   Although the standby might not always be up to date with the primary, in the time continuum from the first available backup to the last archived WAL this is normally irrelevant. The base backup indeed represents the starting point from which to begin a recovery operation, including PITR. Similarly to what happens with `pg_basebackup <https://www.postgresql.org/docs/current/app-pgbasebackup.html>`_  , when backing up from an online standby we do not force a switch of the WAL on the primary. This might produce unexpected results in the short term (before `archive_timeout`  kicks in) in deployments with low write activity. 　

Forcing Backup on the Primary
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To always run backups on the primary instance, explicitly set the backup
target to ``primary`` in the cluster configuration:

.. code:: yaml

   apiVersion: postgresql.cnpg.io/v1
   kind: Cluster
   metadata:
     [...]
   spec:
     backup:
       target: "primary"

..  Warning::
   Be cautious when using `primary`  as the target for **cold backups using volume snapshots**, as this will require shutting down the primary instance temporarily—interrupting all write operations. The same caution applies to single-instance clusters, even if you haven't explicitly set the target. 　

Overriding the Cluster-Wide Target
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

You can override the cluster-level target on a per-backup basis, using
either ``Backup`` or ``ScheduledBackup`` resources. Here’s an example of
an on-demand backup:

.. code:: yaml

   apiVersion: postgresql.cnpg.io/v1
   kind: Backup
   metadata:
     [...]
   spec:
     cluster:
       name: [...]
     target: "primary"

In this example, even if the cluster’s default target is
``prefer-standby`` , the backup will be taken from the primary instance.

Retention Policies
------------------

CloudNativePG is evolving toward a **backup-agnostic architecture**,
where backup responsibilities are delegated to external **CNPG-I
plugins**. These plugins are expected to offer advanced and customizable
data protection features, including sophisticated retention management,
that go beyond the built-in capabilities and scope of CloudNativePG.

As part of this transition, the ``spec.backup.retentionPolicy`` field in
the ``Cluster`` resource is **deprecated** and will be removed in a
future release.

For more details on available retention features, refer to your chosen
plugin’s documentation. For example: `Retention Policies <https://cloudnative-pg.io/plugin-barman-cloud/docs/retention/>`_  .

..  Note::
   Users are encouraged to rely on the retention mechanisms provided by the backup plugin they are using. This ensures better flexibility and consistency with the backup method in use.