Frequently Asked Questions (FAQ)
================================

Running PostgreSQL in Kubernetes
--------------------------------

**Everyone knows that stateful workloads like PostgreSQL cannot run in Kubernetes. Why do you say the contrary?**

An `*independent research survey commissioned by the Data on Kubernetes Community* `_ in September 2021 revealed that half of the respondents run most of their production workloads on Kubernetes. 90% of them believe that Kubernetes is ready for stateful workloads, and 70% of them run databases, such as Postgres, in production. However, according to them, significant challenges remain, such as the knowledge gap (Kubernetes and Cloud Native, in general, have a steep learning curve) and the quality of Kubernetes operators. The latter is the reason why we believe that an operator like CloudNativePG contributes significantly to the success of your project.

For database fanatics like us, a real game-changer has been the introduction of support for local persistent volumes in `*Kubernetes 1.14 in April 2019* `_ .

**CloudNativePG is built on immutable application containers. What does that mean?**

According to the microservice architectural pattern, a container is designed to run a single application or process. As a result, such container images are built to run the main application as the single entry point (the so-called PID 1 process).

In Kubernetes terms, the application is referred to as a workload. Workloads can be stateless, like a web application server, or stateful, like a database. Mapping this concept to PostgreSQL, an immutable application container is a single ``postgres`` process that is running and tied to a single, specific version: the one in the immutable container image.

No other processes, such as SSH, systemd, or syslog, are allowed.

Immutable Application Containers are in contrast with Mutable System Containers, which are still a very common way to interpret and use containers.

Immutable means that a container won’t be modified during its life: no updates, no patches, no configuration changes. If you must update the application code or apply a patch, you build a new image and redeploy it. Immutability makes deployments safer and more repeatable.

For more information, please refer to `*"Why EDB chose immutable application containers"* `_ .

**What does Cloud Native mean?**

The Cloud Native Computing Foundation defines the term "`*Cloud Native* `_". However, since the start of the Cloud Native PostgreSQL/CloudNativePG operator at 2ndQuadrant, the development team has been interpreting Cloud Native as three main concepts:

1. An existing, healthy, genuine, and prosperous DevOps culture, founded on people, as well as principles and processes, which enables teams and organizations (as teams of teams) to continuously change so as to innovate, accelerate the delivery of outcomes, and produce value for the business in safer, more efficient, and more engaging ways
2. A microservice architecture that is based on Immutable Application Containers
3. A way to manage and orchestrate these containers, such as Kubernetes

Currently, the de facto standard for container orchestration is Kubernetes, which automates the deployment, administration, and scalability of Cloud Native Applications.

Another definition of Cloud Native that resonates with us is the one given by Ibryam and Huß in `*"Kubernetes Patterns", published by O'Reilly* `_ :

    Principles, Patterns, Tools to automate containerized microservices at scale

**Can I run CloudNativePG on bare metal Kubernetes?**

Yes, definitely. You can run Kubernetes on bare metal, and you can dedicate one or more physical worker nodes with locally attached storage to PostgreSQL workloads for maximum and predictable I/O performance.

The Cloud Native PostgreSQL project, from which CloudNativePG originated, was born after a pilot project in 2019 that benchmarked storage and PostgreSQL on the same bare metal server, first directly in Linux, and then inside Kubernetes. As expected, the experiment showed that running the container in Kubernetes through local persistent volumes introduced only a negligible performance impact, allowing the Cloud Native initiative to continue.

**Why should I use PostgreSQL replication instead of file system replication?**

Please read the :ref:`Architecture: Synchronizing the state ` section.

**Why should I use an operator instead of running PostgreSQL as a container?**

The most basic approach to running PostgreSQL in Kubernetes is to have a pod, which is the smallest unit of deployment in Kubernetes, running a Postgres container with no replica. The volume hosting the Postgres data directory is mounted on the pod, and it usually resides on network storage. In this case, Kubernetes restarts the pod if a problem occurs, or moves it to another Kubernetes node.

The most sophisticated approach is to run PostgreSQL using an operator. An operator is an extension of the Kubernetes controller and defines how a complex application works in business continuity contexts. The operator pattern is currently the state of the art in Kubernetes for this purpose. An operator simulates the work of a human operator in an automated and programmatic way.

Postgres is a complex application, and an operator not only needs to deploy a cluster (the first step), but also needs to react properly to unexpected events. The typical example is a failover.

An operator relies on Kubernetes for capabilities like self-healing, scalability, replication, high availability, backup, recovery, updates, access, resource control, storage management, and so on. It also facilitates the integration of a PostgreSQL cluster into the log management and monitoring infrastructure.

CloudNativePG enables the definition of the desired state of a PostgreSQL cluster via declarative configuration. Kubernetes continuously makes sure that the current state of the infrastructure matches the desired one through reconciliation loops initiated by the Kubernetes controller. If the desired state and the actual state don’t match, reconciliation loops trigger self-healing procedures. That’s where an operator like CloudNativePG comes into play.

**Are there any other operators for Postgres out there?**

Yes, of course. And our advice is that you look at all of them and compare them with CloudNativePG before making your decision. You will see that most of these operators use an external failover management tool (Patroni or similar) and rely on StatefulSets.

Here is a non-exhaustive list, in chronological order of their publication on GitHub:

- `Crunchy Data Postgres Operator `_ (2017)
- `Zalando Postgres Operator `_ (2017)
- `Stackgres `_ (2020)
- `Percona Operator for PostgreSQL `_ (2021)
- `Kubegres `_ (2021)

.. figure:: /images/star_history.png
   :width: 70%
   :alt: Star History Chart

   Star History Chart

Feel free to report any relevant missing entry as a PR.

.. Note::
   The `Data on Kubernetes Community `_ (which includes some of our maintainers) is working on an independent and vendor-neutral project, called `Operator Feature Matrix `_ , to list the operators.

**You say that CloudNativePG is a fully declarative operator. What do you mean by that?**

The easiest way to explain declarative configuration is through an example that highlights the differences from imperative configuration. In an imperative context, the state is defined as a series of tasks to be executed in sequence. So, we can get a three-node PostgreSQL cluster by creating the first instance, configuring the replication, cloning a second instance, and then a third one.

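
As a minimal sketch of the contrast, the same three-instance cluster could be requested from CloudNativePG with a single manifest instead of a sequence of tasks. The name ``cluster-example`` and the ``1Gi`` storage size are illustrative assumptions, not values taken from this FAQ. The PostgreSQL version is selected through the container image, which is omitted here so that the operator default applies:

.. code-block:: yaml

   # Illustrative sketch only: name and storage size are assumptions.
   apiVersion: postgresql.cnpg.io/v1
   kind: Cluster
   metadata:
     name: cluster-example
   spec:
     # One primary plus two replicas
     instances: 3
     storage:
       size: 1Gi

Here ``instances: 3`` describes the desired end state (one primary and two replicas) rather than the steps needed to reach it; creating, cloning, and wiring up replication are left to the operator.
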
In a declarative approach, the state of a system is defined using configuration, namely: there’s a PostgreSQL 13 cluster with two replicas. This approach greatly simplifies change management operations, and when the configuration is stored in a source control system like Git, it enables the Infrastructure as Code capability. And Kubernetes takes it further than deployment, as it makes sure that our request is fulfilled at any time.

**What are the required skills to run PostgreSQL on Kubernetes?**

Running PostgreSQL on Kubernetes requires both PostgreSQL and Kubernetes skills in your DevOps team. The best experience is when database administrators familiarize themselves with Kubernetes core concepts and are able to interact with Kubernetes administrators.

We advise everyone who wants to fully exploit Cloud Native PostgreSQL to acquire the “Certified Kubernetes Administrator (CKA)” status from the CNCF certification program.

**Why isn’t CloudNativePG using StatefulSets?**

CloudNativePG does not rely on ``StatefulSet`` resources, and instead manages the underlying PVCs directly by leveraging the selected storage class for dynamic provisioning. Please refer to the :ref:`Custom Pod Controller ` section for details and the reasons behind this decision.

High availability
-----------------

**What happens to the PostgreSQL clusters when the operator pod dies or is not available for a certain amount of time?**

The CloudNativePG operator, among other things, is responsible for self-healing capabilities. As such, those capabilities might not be available during an outage of the operator.

However, assuming that the outage does not affect the nodes where PostgreSQL clusters are running, the database will continue to serve normal operations through the relevant Kubernetes services.

Moreover, the :ref:`In-place updates of the instance manager ` , which runs inside each PostgreSQL pod, will still work, making sure that the database server is up, including accessory services like logging, export of metrics, continuous archiving of WAL files, etc.

To summarize: an outage of the operator does not necessarily imply a PostgreSQL database outage; it’s like running a database without a DBA or system administrator.

**What are the reasons behind CloudNativePG not relying on a failover management tool like Patroni, repmgr, or Stolon?**

Although part of the team that develops CloudNativePG has been heavily involved in repmgr in the past, we decided to take a different approach: directly extend the Kubernetes controller, rely on the Kubernetes API server to hold the status of a Postgres cluster, and use it as the only source of truth to:

- control High Availability of a Postgres cluster primarily via automated failover and switchover, coordinating itself with the :ref:`In-place updates of the instance manager `
- control the Kubernetes services, that is, the entry points for your applications

**Should I manually resync a former primary with the new one following a failover?**

No. The operator does that automatically for you, and relies on ``pg_rewind`` to synchronize the former primary with the new one.