Best Practice
In this section you will learn how to configure cert-manager to comply with popular security standards such as the CIS Kubernetes Benchmark, the NSA Kubernetes Hardening Guide, or the BSI Kubernetes Security Recommendations.
And you will learn about best practices for deploying cert-manager in production, such as those enforced by tools like Datree and its built-in rules, and those documented by the likes of Learnk8s in their "Kubernetes production best practices" checklist.
Overview
The default cert-manager resources in the Helm chart or YAML manifests (Deployment, Pod, ServiceAccount, etc.) are designed for backwards compatibility rather than for best practice or maximum security. You may find that the default resources do not comply with the security policy on your Kubernetes cluster; in that case, you can modify the installation configuration using Helm chart values to override the defaults.
Network Requirements and Network Policy
The network requirements of each cert-manager Pod are summarized below. Some network requirements depend on specific Issuer / ClusterIssuer configurations and / or specific configuration options.
When you have understood the network requirements of your cert-manager installation, you should consider implementing a "least privilege" network policy, using a Kubernetes Network (CNI) Plugin such as Calico.
The network policy should prevent untrusted clients from connecting to the cert-manager Pods and it should prevent cert-manager from connecting to untrusted servers.
An example of this recommendation is found in the Calico Documentation:
We recommend creating an implicit default deny policy for your Kubernetes pods, regardless of whether you use Calico or Kubernetes network policy. This ensures that unwanted traffic is denied by default.
You can use the Kubernetes built-in NetworkPolicy resource, which is portable because it is recognized by all Kubernetes Network (CNI) Plugins. Or you may prefer to use the custom resources provided by your CNI software.
📖 Learn about the Kubernetes built-in NetworkPolicy API and see some example policies.
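For example, a minimal default-deny policy for the namespace where cert-manager is installed might look like this. This is a sketch which assumes the `cert-manager` namespace; you would then add allow rules for each of the requirements listed in the next section.

```yaml
# Deny all ingress and egress traffic for Pods in the cert-manager namespace,
# unless it is explicitly allowed by another NetworkPolicy.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
  namespace: cert-manager
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
```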
Network Requirements
Here is an overview of the network requirements:
- UDP / TCP: cert-manager (all) -> Kubernetes DNS: All cert-manager components perform UDP DNS queries for both cluster and external domain names. Some DNS queries may use TCP.
- TCP: Kubernetes (API server) -> cert-manager (webhook): The Kubernetes API server establishes HTTPS connections to the cert-manager webhook component. Read the cert-manager webhook troubleshooting guide to understand the webhook networking requirements.
- TCP: cert-manager (webhook, controller, cainjector, startupapicheck) -> Kubernetes API server: The cert-manager webhook, controller, cainjector and startupapicheck establish HTTPS connections to the Kubernetes API server, to interact with cert-manager custom resources and Kubernetes resources. The cert-manager webhook is a special case; it connects to the Kubernetes API server to use the `SubjectAccessReview` API, to verify clients attempting to modify the `Approved` or `Denied` conditions of `CertificateRequest` resources.
- TCP: cert-manager (controller) -> HashiCorp Vault (authentication and resource API endpoints): The cert-manager controller may establish HTTPS connections to one or more Vault API endpoints, if you are using the Vault Issuer. The target host and port of the Vault endpoints are configured in Issuer or ClusterIssuer resources.
- TCP: cert-manager (controller) -> Venafi TLS Protect (authentication and resource API endpoints): The cert-manager controller may establish HTTPS connections to one or more Venafi API endpoints, if you are using the Venafi Issuer. The target host and port of the Venafi API endpoints are configured in Issuer or ClusterIssuer resources.
- TCP: cert-manager (controller) -> DNS API endpoints (for ACME DNS01): The cert-manager controller may establish HTTPS connections to DNS API endpoints such as Amazon Route53, and to any associated authentication endpoints, if you are using the ACME Issuer with DNS01 solvers.
- UDP / TCP: cert-manager (controller) -> External DNS: If you use the ACME Issuer, the cert-manager controller may send DNS queries to recursive DNS servers, as part of the ACME challenge self-check process. It does this to ensure that the DNS01 or HTTP01 challenge is resolvable, before asking the ACME server to perform its checks.
  In the case of DNS01 it may also perform a series of DNS queries to authoritative DNS servers, to compute the DNS zone in which to add the DNS01 challenge record. In the case of DNS01, cert-manager also supports DNS over HTTPS.
  You can choose the host and port of the DNS servers, using the following controller flags: `--acme-http01-solver-nameservers`, `--dns01-recursive-nameservers`, and `--dns01-recursive-nameservers-only`.
- TCP: ACME (Let's Encrypt) -> cert-manager (acmesolver): If you use an ACME Issuer configured for HTTP01, cert-manager will deploy an `acmesolver` Pod, a Service and an Ingress (or Gateway API) resource in the namespace of the Issuer, or in the cert-manager namespace if it is a ClusterIssuer. The ACME implementation will establish an HTTP connection to this Pod via your chosen ingress load balancer, so your network policy must allow this.
  ℹ️ The acmesolver Pod does not require access to the Kubernetes API server.
- TCP: Metrics Server -> cert-manager (controller): The cert-manager controller has a metrics server which listens for HTTP connections on TCP port 9402. Create a network policy which allows access to this service from your chosen metrics collector (an example policy is sketched after this list).
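For example, the following NetworkPolicy allows a metrics collector running in a `monitoring` namespace to scrape the controller metrics port. This is a sketch: the `monitoring` namespace and the Pod label used in the selector are assumptions which you should adapt to match your cluster and your cert-manager installation.

```yaml
# Allow ingress to the cert-manager controller metrics port (9402)
# from Pods in the (assumed) monitoring namespace.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-metrics-scrape
  namespace: cert-manager
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/component: controller  # assumed label on the controller Pod
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: monitoring
      ports:
        - protocol: TCP
          port: 9402
```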
Isolate cert-manager on dedicated node pools
cert-manager is a cluster-scoped operator and you should treat it as part of your platform's control plane. The cert-manager controller creates and modifies Kubernetes Secret resources and the controller and cainjector both cache TLS Secret resources in memory. These are two reasons why you should consider isolating the cert-manager components from other less privileged workloads. For example, if an untrusted or malicious workload runs on the same Node as the cert-manager controller, and somehow gains root access to the underlying node, it may be able to read the private keys in Secrets that the controller has cached in memory.
You can mitigate this risk by running cert-manager on nodes that are reserved for trusted platform operators.
The Helm chart for cert-manager has parameters to configure the Pod `tolerations` and `nodeSelector` for each component. The exact values of these parameters will depend on your particular cluster.
📖 Read Assigning Pods to Nodes in the Kubernetes documentation.
📖 Read about Taints and Tolerations in the Kubernetes documentation.
Example
This example demonstrates how to use:
- taints to repel non-platform Pods from Nodes which you have reserved for your platform's control-plane,
- tolerations to allow cert-manager Pods to run on those Nodes, and
- nodeSelector to place the cert-manager Pods on those Nodes.
Label the Nodes:
```bash
kubectl label node ... node-restriction.kubernetes.io/reserved-for=platform
```
Taint the Nodes:
```bash
kubectl taint node ... node-restriction.kubernetes.io/reserved-for=platform:NoExecute
```
Then install cert-manager using the following Helm chart values:
```yaml
nodeSelector:
  kubernetes.io/os: linux
  node-restriction.kubernetes.io/reserved-for: platform
tolerations:
  - key: node-restriction.kubernetes.io/reserved-for
    operator: Equal
    value: platform

webhook:
  nodeSelector:
    kubernetes.io/os: linux
    node-restriction.kubernetes.io/reserved-for: platform
  tolerations:
    - key: node-restriction.kubernetes.io/reserved-for
      operator: Equal
      value: platform

cainjector:
  nodeSelector:
    kubernetes.io/os: linux
    node-restriction.kubernetes.io/reserved-for: platform
  tolerations:
    - key: node-restriction.kubernetes.io/reserved-for
      operator: Equal
      value: platform

startupapicheck:
  nodeSelector:
    kubernetes.io/os: linux
    node-restriction.kubernetes.io/reserved-for: platform
  tolerations:
    - key: node-restriction.kubernetes.io/reserved-for
      operator: Equal
      value: platform
```
ℹ️ This example uses `nodeSelector` to place the Pods but you could also use `affinity.nodeAffinity`. `nodeSelector` is chosen here because it has a simpler syntax.

ℹ️ The default `nodeSelector` value `kubernetes.io/os: linux` avoids placing cert-manager Pods on Windows nodes in a mixed OS cluster, so that must be explicitly included here too.

📖 Read the Guide to isolating tenant workloads to specific nodes in the EKS Best Practice Guides, for an in-depth explanation of these techniques.
📖 Learn how to Isolate your workloads in dedicated node pools on Google Kubernetes Engine.
📖 Learn about Placing pods on specific nodes using node selectors, with RedHat OpenShift.
📖 Read more about the `node-restriction.kubernetes.io/` prefix and the `NodeRestriction` admission plugin.

ℹ️ On a multi-tenant cluster, consider enabling the `PodTolerationRestriction` plugin to limit which tolerations tenants may add to their Pods. You may also use that plugin to add default tolerations to the `cert-manager` namespace, which obviates the need to explicitly set the tolerations in the Helm chart.

ℹ️ Alternatively, you could use Kyverno to limit which tolerations Pods are allowed to use. Read Restrict control plane scheduling as a starting point.
High Availability
cert-manager has three long-running components: controller, cainjector, and webhook.
Each of these components has a Deployment and by default each Deployment has 1 replica, but this does not provide high availability. The Helm chart for cert-manager has parameters to configure the `replicaCount` for each Deployment. In production we recommend the following `replicaCount` parameters:
```yaml
replicaCount: 2

webhook:
  replicaCount: 3

cainjector:
  replicaCount: 2
```
controller and cainjector
The controller and cainjector components use leader election to ensure that only one replica is active. This prevents conflicts which would arise if multiple replicas were reconciling the same API resources. So in these components you can use multiple replicas to achieve high availability but not for horizontal scaling.
Using two replicas ensures that there is a standby Pod scheduled to a Node, ready to take leadership should the current leader encounter a disruption; for example, when the leader Pod is drained from its Node, or if the leader Pod encounters an unexpected deadlock.
There is little justification for using more than 2 replicas of these components.
webhook
By default the cert-manager webhook Deployment has 1 replica, but in production you should use 3 or more. If the cert-manager webhook is unavailable, all API operations on cert-manager custom resources will fail, disrupting any software that creates, updates or deletes cert-manager custom resources (including cert-manager itself), and it may cause other disruptions to your cluster. So it is especially important to keep multiple replicas of the cert-manager webhook running at all times.
ℹ️ By contrast, if there is only a single replica of the cert-manager controller, there is less risk of disruption. For example, if the Node hosting the single cert-manager controller manager Pod is drained, there will be a delay while a new Pod is started on another Node, and any cert-manager resources that are created or changed during that time will not be reconciled until the new Pod starts up. But the controller manager works asynchronously anyway, so any applications which depend on the cert-manager custom resources will be designed to tolerate this situation. That being said, the best practice is to run 2 or more replicas of each controller if the cluster has sufficient resources.
📖 Read Ensure control plane stability when using webhooks in the Google Kubernetes Engine (GKE) documentation, for examples of how webhook disruptions might disrupt your cluster.
📖 Read The dark side of Kubernetes admission webhooks on the Cisco Tech Blog, to learn more about potential issues caused by webhooks and how you can avoid them.
Topology Spread Constraints
Consider using Topology Spread Constraints, to ensure that a disruption of a node or data center does not degrade the operation of cert-manager.
For high availability you do not want the replica Pods to be scheduled on the same Node, because if that node fails, both the active and standby Pods will exit, and there will be no further reconciliation of the resources by that controller, until there is another Node with sufficient free resources to run a new Pod, and until that Pod has become Ready.
It is also desirable for the Pods to be running in separate data centers (availability zones), if the cluster has nodes distributed between zones. Then, in the event of a failure at the data center hosting the active Pod, the standby Pod will immediately be available to take leadership.
Fortunately you may not need to do anything to achieve these goals because Kubernetes >= 1.24 has Built-in default constraints which should mean that the high availability scheduling described above will happen implicitly.
ℹ️ In case your cluster does not use the Built-in default constraints, you can add Topology Spread Constraints to each of the cert-manager components using Helm chart values.
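For example, the following Helm chart values spread the controller replicas across availability zones. This is a sketch: the `topologySpreadConstraints` parameter and the Pod labels in the selector are assumptions based on recent chart versions, and an equivalent block can be repeated under `webhook` and `cainjector`.

```yaml
# Spread the controller replicas across zones; prefer, but do not require,
# one replica per zone.
topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: ScheduleAnyway
    labelSelector:
      matchLabels:
        app.kubernetes.io/instance: cert-manager   # assumed Helm release name
        app.kubernetes.io/component: controller
```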
PodDisruptionBudget
For high availability you should also deploy a `PodDisruptionBudget` resource with `minAvailable=1`.
This ensures that a voluntary disruption, such as the draining of a Node, cannot proceed until at least one other replica has been successfully scheduled and started on another Node. The Helm chart has parameters to enable and configure a PodDisruptionBudget for each of the long-running cert-manager components. We recommend the following parameters:
```yaml
podDisruptionBudget:
  enabled: true
  minAvailable: 1

webhook:
  podDisruptionBudget:
    enabled: true
    minAvailable: 1

cainjector:
  podDisruptionBudget:
    enabled: true
    minAvailable: 1
```
📖 Read about Specifying a Disruption Budget for your Application in the Kubernetes documentation.
⚠️ These PodDisruptionBudget settings are only suitable for high availability deployments. You must increase the `replicaCount` of each Deployment to more than the `minAvailable` value, otherwise the PodDisruptionBudget will prevent you from draining cert-manager Pods.
Priority Class Name
The reason for setting a priority class is summarized as follows in the Kubernetes blog Protect Your Mission-Critical Pods From Eviction With `PriorityClass`:
Pod priority and preemption help to make sure that mission-critical pods are up in the event of a resource crunch by deciding order of scheduling and eviction.
If cert-manager is mission-critical to your platform, then set a `priorityClassName` on the cert-manager Pods to protect them from preemption, in situations where a Kubernetes node becomes starved of resources. Without a `priorityClassName` the cert-manager Pods may be evicted to free up resources for other Pods, and this may cause disruption to any applications that rely on cert-manager.
Most Kubernetes clusters will come with two built-in priority class names: `system-cluster-critical` and `system-node-critical`, which are used for Kubernetes core components. These can also be used for critical add-ons, such as cert-manager.

We recommend using the following Helm chart values to set `priorityClassName: system-cluster-critical` for all cert-manager Pods:
```yaml
global:
  priorityClassName: system-cluster-critical
```
On some clusters the `ResourceQuota` admission controller may be configured to limit the use of certain priority classes to certain namespaces. For example, Google Kubernetes Engine (GKE) will only allow `priorityClassName: system-cluster-critical` for Pods in the `kube-system` namespace, by default.

📖 Read Kubernetes PR #93121 to see how and why this was implemented.

In such cases you will need to create a `ResourceQuota` in the `cert-manager` namespace:
```yaml
# cert-manager-resourcequota.yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: cert-manager-critical-pods
  namespace: cert-manager
spec:
  hard:
    pods: 1G
  scopeSelector:
    matchExpressions:
      - operator: In
        scopeName: PriorityClass
        values:
          - system-node-critical
          - system-cluster-critical
```
```bash
kubectl apply -f cert-manager-resourcequota.yaml
```
📖 Read Protect Your Mission-Critical Pods From Eviction With `PriorityClass`, a Kubernetes blog post about how Pod priority and preemption help to make sure that mission-critical pods are up in the event of a resource crunch by deciding order of scheduling and eviction.

📖 Read Guaranteed Scheduling For Critical Add-On Pods to learn why `system-cluster-critical` should be used for add-ons that are critical to a fully functional cluster.

📖 Read Limit Priority Class consumption by default, to learn why platform administrators might restrict usage of certain high priority classes to a limited number of namespaces.

📖 Some examples of other critical add-ons that use the `system-cluster-critical` priority class name: NVIDIA GPU Operator, OPA Gatekeeper, Cilium.
Scalability
cert-manager has three long-running components: controller, cainjector, and webhook. The Helm chart does not include resource requests and limits for any of these, so you should supply resource requests and limits which are appropriate for your cluster.
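For example, you might start with modest requests and adjust them based on observed usage. The numbers below are illustrative assumptions, not recommendations for every cluster; the Memory and CPU sections below explain what drives the real requirements.

```yaml
# Illustrative starting points only; measure and tune for your cluster.
resources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    memory: 256Mi

webhook:
  resources:
    requests:
      cpu: 50m
      memory: 64Mi
    limits:
      memory: 128Mi

cainjector:
  resources:
    requests:
      cpu: 100m
      memory: 128Mi
    limits:
      memory: 256Mi
```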
controller and cainjector
The controller and cainjector components use leader election to ensure that only one replica is active. This prevents conflicts which would arise if multiple replicas were reconciling the same API resources. You cannot use horizontal scaling for these components. Use vertical scaling instead.
Memory
Use vertical scaling to assign sufficient memory resources to these components. The memory requirements will be higher on clusters with very many API resources or with large API resources. This is because each of the components reconciles one or more Kubernetes API resources, and each component will cache the metadata and sometimes the entire resource in memory, so as to reduce the load on the Kubernetes API server.
If your cluster contains a high volume of `CertificateRequest` resources, such as when many ephemeral or short-lived certificates are rotated frequently, you will need to increase the memory limit of the controller Pod.

You can also reduce the memory consumption of `cainjector` by configuring it to only watch resources in the `cert-manager` namespace, and by configuring it to not watch `Certificate` resources.
Here's how to configure the cainjector command line flags using Helm chart values:
```yaml
cainjector:
  extraArgs:
    - --namespace=cert-manager
    - --enable-certificates-data-source=false
```
⚠️ This optimization is only appropriate if `cainjector` is being used exclusively for the cert-manager webhook. It is not appropriate if `cainjector` is also being used to manage the TLS certificates for webhooks of other software. For example, some Kubebuilder-derived projects may depend on `cainjector` to inject TLS certificates for their webhooks.
CPU
Use vertical scaling to assign sufficient CPU resources to these components. The CPU requirements will be higher on clusters where there are very frequent updates to the resources which are reconciled by these components. Whenever a resource changes, it will be queued to be re-reconciled by the component. Higher CPU resources allow the component to process the queue faster.
webhook
The cert-manager webhook does not use leader election, so you can scale it horizontally by increasing the number of replicas. When the Kubernetes API server connects to the cert-manager webhook it does so via a Service which load balances the connections between all the Ready replicas. For this reason, there is a clear benefit to increasing the number of webhook replicas to 3 or more, on clusters where there is a high frequency of cert-manager custom resource interactions. Furthermore, the webhook has modest memory requirements because it does not use a cache. For this reason, the resource cost of scaling out the webhook is relatively low.
Use Liveness Probes
An example of this recommendation is found in the Datree Documentation: Ensure each container has a configured liveness probe:
Liveness probes allow Kubernetes to determine when a pod should be replaced. They are fundamental in configuring a resilient cluster architecture.
The cert-manager webhook and controller Pods do have liveness probes. The cainjector Pod does not have a liveness probe, yet. More information below.
webhook
The cert-manager webhook has a liveness probe which is enabled by default and the timings and thresholds can be configured using Helm values.
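For example, the timings can be tuned with Helm chart values similar to the following. This is a sketch which assumes your chart version exposes a `webhook.livenessProbe` block; the numbers shown are illustrative.

```yaml
webhook:
  livenessProbe:
    initialDelaySeconds: 60   # wait before the first probe
    periodSeconds: 10         # probe interval
    timeoutSeconds: 5         # per-probe timeout
    successThreshold: 1
    failureThreshold: 3       # restart after 3 consecutive failures
```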
controller
📢 The cert-manager controller liveness probe was introduced in cert-manager release `1.12` and enabled by default in release `1.14`. If it causes problems in the field, please get in touch.
The liveness probe for the cert-manager controller is an HTTP probe which connects to the `/livez` endpoint of a healthz server which listens on port 9443 and runs in its own thread. The `/livez` endpoint currently reports the combined status of the following sub-systems, and each sub-system has its own `/livez` endpoint. These are:
- `/livez/leaderElection`: Returns an error if the leader election record has not been renewed or if the leader election thread has exited without also crashing the parent process.
- `/livez/clockHealth`: Returns an error if a clock skew is detected between the system clock and the monotonic clock used by Go to schedule timers.
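For reference, the probe on the controller Deployment looks roughly like the following. This is illustrative only, based on the description above; timings are omitted because they vary by release and configuration.

```yaml
# Sketch of the controller's liveness probe: an HTTP GET against the
# healthz server's /livez endpoint on port 9443.
livenessProbe:
  httpGet:
    path: /livez
    port: 9443
```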
ℹ️ In future, more sub-systems could be checked by the `/livez` endpoint, similar to how Kubernetes ensures logging is not blocked and has health checks for each controller.

📖 Read about how to access individual health checks and verbose status information (cert-manager uses the same healthz endpoint multiplexer as Kubernetes).
cainjector
The cainjector Pod does not have a liveness probe or a `/livez` healthz endpoint, but there is justification for it in the GitHub issue: cainjector in a zombie state after attempting to shut down. Please add your remarks to that issue if you have also experienced this specific problem, and add your remarks to Helm: Allow configuration of readiness, liveness and startup probes for all created Pods if you have a general request for a liveness probe in cainjector.
Background Information
The cert-manager controller process and the cainjector process both use the Kubernetes leader election library, to ensure that only one replica of each process can be active at any one time. The Kubernetes control-plane components also use this library.
The leader election code runs in a loop in a separate thread (go routine). If it initially wins the leader election race and if it later fails to renew its leader election lease, it exits. If the leader election thread exits, all the other threads are gracefully shutdown and then the process exits. Similarly, if any of the other main threads exit unexpectedly, that will trigger the orderly shutdown of the remaining threads and the process will exit.
This adheres to the principle that Containers should crash when there's a fatal error. Kubernetes will restart the crashed container, and if it crashes repeatedly, there will be increasing time delays between successive restarts.
For this reason, the liveness probe should only be needed if there is a bug in this orderly shutdown process, or if there is a bug in one of the other threads which causes the process to deadlock and not shutdown.
📖 Read Configure Liveness, Readiness and Startup Probes in the Kubernetes documentation, paying particular attention to the notes and cautions in that document.
📖 Read Shooting Yourself in the Foot with Liveness Probes for more cautionary information about liveness probes.
Restrict Auto-Mount of Service Account Tokens
This recommendation is described in the Kyverno Policy Catalogue as follows:
Kubernetes automatically mounts ServiceAccount credentials in each Pod. The ServiceAccount may be assigned roles allowing Pods to access API resources. Blocking this ability is an extension of the least privilege best practice and should be followed if Pods do not need to speak to the API server to function. This policy ensures that mounting of these ServiceAccount tokens is blocked
The cert-manager components do need to speak to the API server, but we still recommend setting `automountServiceAccountToken: false` for the following reasons:
- Setting `automountServiceAccountToken: false` will allow cert-manager to be installed on clusters where Kyverno (or some other policy system) is configured to deny Pods that have this field set to `true`. The Kubernetes default value is `true`.
- With `automountServiceAccountToken: true`, all the containers in the Pod will mount the ServiceAccount token, including side-car and init containers that might have been injected into the cert-manager Pod resources by Kubernetes admission controllers. The principle of least privilege suggests that it is better to explicitly mount the ServiceAccount token into the cert-manager containers.
So it is recommended to set `automountServiceAccountToken: false` and manually add a projected `Volume` to each of the cert-manager Deployment resources, containing the ServiceAccount token, CA certificate and namespace files that would normally be added automatically by the Kubernetes ServiceAccount controller, and to explicitly add a read-only `VolumeMount` to each of the cert-manager containers.
An example of this configuration is included in the Helm Chart Values file below.
Best Practice Helm Chart Values
Download the following Helm chart values file and supply it to `helm install`, `helm upgrade`, or `helm template` using the `--values` flag:
```yaml
# Helm chart values which make cert-manager comply with CIS, BSI and NSA
# security benchmarks and other best practices for deploying cert-manager in
# production.
#
# Read the rationale for these values in:
# * https://cert-manager.io/docs/installation/best-practice/

global:
  priorityClassName: system-cluster-critical

replicaCount: 2
podDisruptionBudget:
  enabled: true
  minAvailable: 1
automountServiceAccountToken: false
serviceAccount:
  automountServiceAccountToken: false
volumes:
  - name: serviceaccount-token
    projected:
      defaultMode: 0444
      sources:
        - serviceAccountToken:
            expirationSeconds: 3607
            path: token
        - configMap:
            name: kube-root-ca.crt
            items:
              - key: ca.crt
                path: ca.crt
        - downwardAPI:
            items:
              - path: namespace
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.namespace
volumeMounts:
  - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
    name: serviceaccount-token
    readOnly: true

webhook:
  replicaCount: 3
  podDisruptionBudget:
    enabled: true
    minAvailable: 1
  automountServiceAccountToken: false
  serviceAccount:
    automountServiceAccountToken: false
  volumes:
    - name: serviceaccount-token
      projected:
        defaultMode: 0444
        sources:
          - serviceAccountToken:
              expirationSeconds: 3607
              path: token
          - configMap:
              name: kube-root-ca.crt
              items:
                - key: ca.crt
                  path: ca.crt
          - downwardAPI:
              items:
                - path: namespace
                  fieldRef:
                    apiVersion: v1
                    fieldPath: metadata.namespace
  volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: serviceaccount-token
      readOnly: true

cainjector:
  extraArgs:
    - --namespace=cert-manager
    - --enable-certificates-data-source=false
  replicaCount: 2
  podDisruptionBudget:
    enabled: true
    minAvailable: 1
  automountServiceAccountToken: false
  serviceAccount:
    automountServiceAccountToken: false
  volumes:
    - name: serviceaccount-token
      projected:
        defaultMode: 0444
        sources:
          - serviceAccountToken:
              expirationSeconds: 3607
              path: token
          - configMap:
              name: kube-root-ca.crt
              items:
                - key: ca.crt
                  path: ca.crt
          - downwardAPI:
              items:
                - path: namespace
                  fieldRef:
                    apiVersion: v1
                    fieldPath: metadata.namespace
  volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: serviceaccount-token
      readOnly: true

startupapicheck:
  automountServiceAccountToken: false
  serviceAccount:
    automountServiceAccountToken: false
  volumes:
    - name: serviceaccount-token
      projected:
        defaultMode: 0444
        sources:
          - serviceAccountToken:
              expirationSeconds: 3607
              path: token
          - configMap:
              name: kube-root-ca.crt
              items:
                - key: ca.crt
                  path: ca.crt
          - downwardAPI:
              items:
                - path: namespace
                  fieldRef:
                    apiVersion: v1
                    fieldPath: metadata.namespace
  volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: serviceaccount-token
      readOnly: true
```
Other
This list of recommendations is a work-in-progress. If you have other best practice recommendations please contribute to this page.