Frequently Asked Questions (FAQ)
================================
Running PostgreSQL in Kubernetes
--------------------------------
**Everyone knows that stateful workloads like PostgreSQL cannot run in
Kubernetes. Why do you say the contrary?**
An `*independent research survey commissioned by the Data on Kubernetes Community* `_ in September 2021 revealed that half of the respondents
run most of their production workloads on Kubernetes. 90% of them
believe that Kubernetes is ready for stateful workloads, and 70% of them
run databases, such as Postgres, in production. However, according
to them, significant challenges remain, such as the knowledge gap
(Kubernetes and Cloud Native, in general, have a steep learning curve)
and the quality of Kubernetes operators. The latter is why we
believe that an operator like CloudNativePG contributes significantly to the
success of your project.
For database fanatics like us, a real game-changer was the
introduction of support for local persistent volumes in
`*Kubernetes 1.14 in April 2019* `_ .
**CloudNativePG is built on immutable application containers. What does
it mean?**
According to the microservice architectural pattern, a container is
designed to run a single application or process. As a result, such
container images are built to run the main application as the single
entry point (the so-called PID 1 process).
In Kubernetes terms, the application is referred to as a workload.
Workloads can be stateless like a web application server or stateful
like a database. Mapping this concept to PostgreSQL, an immutable
application container is a single “postgres” process that is running and
tied to a single and specific version - the one in the immutable
container image.
No other processes, such as SSH, systemd, or syslog, are allowed.
Immutable Application Containers are in contrast with Mutable System
Containers, which are still a very common way to interpret and use
containers.
Immutable means that a container won’t be modified during its life: no
updates, no patches, no configuration changes. If you must update the
application code or apply a patch, you build a new image and redeploy
it. Immutability makes deployments safer and more repeatable.
For more information, please refer to `*"Why EDB chose immutable application containers"* `_ .
**What does Cloud Native mean?**
The Cloud Native Computing Foundation defines the term “ `*Cloud Native* `_
”. However, since the start of the Cloud Native PostgreSQL/CloudNativePG
operator at 2ndQuadrant, the development team has been interpreting
Cloud Native as three main concepts:
1. An existing, healthy, genuine, and prosperous DevOps culture, founded
on people, as well as principles and processes, which enables teams
and organizations (as teams of teams) to continuously change so as to
innovate, accelerate the delivery of outcomes, and produce value
for the business in safer, more efficient, and more engaging ways
2. A microservice architecture that is based on Immutable Application
Containers
3. A way to manage and orchestrate these containers, such as Kubernetes
Currently, the de facto standard for container orchestration is
Kubernetes, which automates the deployment, administration, and
scalability of Cloud Native Applications.
Another definition of Cloud Native that resonates with us is the one
defined by Ibryam and Huß in `*"Kubernetes Patterns", published by O'Reilly* `_ :
Principles, Patterns, Tools to automate containerized microservices
at scale
**Can I run CloudNativePG on bare metal Kubernetes?**
Yes, definitely. You can run Kubernetes on bare metal. And you can
dedicate one or more physical worker nodes with locally attached storage
to PostgreSQL workloads for maximum and predictable I/O performance.
The actual Cloud Native PostgreSQL project, from which CloudNativePG
originated, was born after a pilot project in 2019 that benchmarked
storage and PostgreSQL on the same bare metal server, first directly in
Linux, and then inside Kubernetes. As expected, the experiment showed
that running PostgreSQL in a Kubernetes container on local persistent
volumes introduced only a negligible performance impact, allowing the
Cloud Native initiative to continue.
**Why should I use PostgreSQL replication instead of file system
replication?**
Please read the :ref:`Architecture: Synchronizing the state `
section.
**Why should I use an operator instead of running PostgreSQL as a
container?**
The most basic approach to running PostgreSQL in Kubernetes is to have a
pod, which is the smallest unit of deployment in Kubernetes, running a
Postgres container with no replica. The volume hosting the Postgres data
directory is mounted on the pod, and it usually resides on network
storage. In this case, Kubernetes restarts the pod in case of a problem
or moves it to another Kubernetes node.
The most sophisticated approach is to run PostgreSQL using an operator.
An operator is an extension of the Kubernetes controller and defines how
a complex application works in business continuity contexts. The
operator pattern is currently the state of the art in Kubernetes for this
purpose. An operator simulates the work of a human operator in an
automated and programmatic way.
Postgres is a complex application, and an operator needs not only to
deploy a cluster (the first step), but also to react properly to
unexpected events. The typical example is a failover.
An operator relies on Kubernetes for capabilities like self-healing,
scalability, replication, high availability, backup, recovery, updates,
access, resource control, storage management, and so on. It also
facilitates the integration of a PostgreSQL cluster in the log
management and monitoring infrastructure.
CloudNativePG enables the definition of the desired state of a
PostgreSQL cluster via declarative configuration. Kubernetes
continuously makes sure that the current state of the infrastructure
matches the desired one through reconciliation loops initiated by the
Kubernetes controller. If the desired state and the actual state don’t
match, reconciliation loops trigger self-healing procedures. That’s
where an operator like CloudNativePG comes into play.
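
The reconciliation idea described above can be illustrated with a minimal Python sketch. It is a toy model under stated assumptions, not CloudNativePG's implementation: the ``reconcile`` function, the integer instance serials, and the action strings are hypothetical names chosen for illustration. Each pass compares the desired number of instances with the actual set and returns the actions needed to close the gap.

```python
def reconcile(desired_instances: int, actual: set) -> list:
    """Run one reconciliation pass over a toy cluster (hypothetical model).

    `actual` is the set of instance serial numbers that currently exist.
    It is modified in place; the list of actions taken is returned.
    """
    actions = []
    serial = 1
    # Self-healing: create instances until the actual count matches the desired one,
    # reusing the lowest free serial numbers first.
    while len(actual) < desired_instances:
        if serial not in actual:
            actual.add(serial)
            actions.append(f"create instance {serial}")
        serial += 1
    # Scale down: remove the highest serials while there are too many instances.
    while len(actual) > desired_instances:
        surplus = max(actual)
        actual.remove(surplus)
        actions.append(f"delete instance {surplus}")
    return actions


if __name__ == "__main__":
    cluster = {1, 3}               # instance 2 was lost
    print(reconcile(3, cluster))   # one action restores the missing instance
    print(reconcile(3, cluster))   # nothing to do once states match
```

Once the actual state matches the desired one, a further pass returns no actions. This is why a controller can safely run the loop continuously.
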
**Are there any other operators for Postgres out there?**
Yes, of course. And our advice is that you look at all of them and
compare them with CloudNativePG before making your decision. You will
see that most of these operators use an external failover management
tool (Patroni or similar) and rely on StatefulSets.
Here is a non-exhaustive list, in chronological order of their
publication on GitHub:
- `Crunchy Data Postgres Operator `_ (2017)
- `Zalando Postgres Operator `_ (2017)
- `Stackgres `_ (2020)
- `Percona Operator for PostgreSQL `_ (2021)
- `Kubegres `_ (2021)
.. figure:: /images/star_history.png
   :width: 70%
   :alt: Star History Chart

   Star History Chart

Feel free to report any relevant missing entry as a PR.
.. Note::
   The `Data on Kubernetes Community `_
   (which includes some of our maintainers) is working on an independent
   and vendor neutral project to list the operators called `Operator Feature Matrix `_ .

**You say that CloudNativePG is a fully declarative operator. What do
you mean by that?**
The easiest way to explain declarative configuration is through an
example that highlights the differences from imperative configuration.
In an imperative context, the state is defined as a series of tasks to
be executed in sequence. So, we can get a three-node PostgreSQL cluster
by creating the first instance, configuring the replication, cloning a
second instance, and then a third one.
In a declarative approach, the state of a system is defined using
configuration, namely: there’s a PostgreSQL 13 cluster with two
replicas. This approach greatly simplifies change management operations,
and when the configuration is stored in source control systems like Git,
it enables the Infrastructure as Code capability. And Kubernetes takes it
further than deployment, as it makes sure that our request is fulfilled
at any time.
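
The contrast can be sketched in a few lines of Python. This is an illustration only: ``imperative_setup``, ``DESIRED``, and ``diff`` are hypothetical names, and the dictionary keys are not the fields of any real CloudNativePG resource. The imperative style lists the tasks to run in order, while the declarative style states the end result and lets a generic function work out what still differs.

```python
def imperative_setup() -> list:
    """Imperative style: an ordered list of tasks an administrator executes."""
    return [
        "create the first instance",
        "configure replication",
        "clone the second instance",
        "clone the third instance",
    ]


# Declarative style: describe the desired result, not the steps to reach it.
# Keys are illustrative assumptions, not a real resource schema.
DESIRED = {"postgres_version": 13, "replicas": 2}


def diff(desired: dict, current: dict) -> dict:
    """Map each drifted key to a (current value, desired value) pair."""
    return {
        key: (current.get(key), value)
        for key, value in desired.items()
        if current.get(key) != value
    }


if __name__ == "__main__":
    # A cluster that has lost one replica drifts from the declared state.
    print(diff(DESIRED, {"postgres_version": 13, "replicas": 1}))
```

Because the declared state is plain data, it can be versioned in Git and reviewed like any other change, while the comparison logic stays the same for every cluster.
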
**What are the required skills to run PostgreSQL on Kubernetes?**
Running PostgreSQL on Kubernetes requires both PostgreSQL and Kubernetes
skills in your DevOps team. The best experience is when database
administrators familiarize themselves with Kubernetes core concepts and
are able to interact with Kubernetes administrators.
We advise everyone who wants to fully exploit Cloud Native
PostgreSQL to acquire the “Certified Kubernetes Administrator (CKA)”
status from the CNCF certification program.
**Why isn’t CloudNativePG using StatefulSets?**
CloudNativePG does not rely on ``StatefulSet`` resources, and instead
manages the underlying PVCs directly by leveraging the selected storage
class for dynamic provisioning. Please refer to the :ref:`Custom Pod Controller `
section for details and reasons behind this decision.
High availability
-----------------
**What happens to the PostgreSQL clusters when the operator pod dies or
is unavailable for a certain amount of time?**
The CloudNativePG operator, among other things, is responsible for
self-healing capabilities. As such, they might not be available during
an outage of the operator.
However, assuming that the outage does not affect the nodes where
PostgreSQL clusters are running, the database will continue to serve
normal operations, through the relevant Kubernetes services. Moreover,
the instance manager, which runs inside each PostgreSQL pod (see :ref:`In-place updates of the instance manager `), will still
work, keeping the database server up along with accessory
services such as logging, export of metrics, and continuous archiving of
WAL files.
To summarize:
an outage of the operator does not necessarily imply a PostgreSQL
database outage; it’s like running a database without a DBA or system
administrator.
**What are the reasons behind CloudNativePG not relying on a failover
management tool like Patroni, repmgr, or Stolon?**
Although part of the team that develops CloudNativePG has been heavily
involved in repmgr in the past, we decided to take a different approach
and directly extend the Kubernetes controller and rely on the Kubernetes
API server to hold the status of a Postgres cluster, and use it as the
only source of truth to:
- control High Availability of a Postgres cluster primarily via
automated failover and switchover, coordinating itself with the
instance manager (see :ref:`In-place updates of the instance manager `)
- control the Kubernetes services, that is, the entry points for your
applications
**Should I manually resync a former primary with the new one following a
failover?**
No. The operator does that automatically for you, and relies on
``pg_rewind`` to synchronize the former primary with the new one.