
Release 1.16

cert-manager 1.16 includes various improvements to the metrics in the cert-manager components.

Breaking changes

  1. Helm schema validation may reject your existing Helm values files if they contain typos or unrecognized fields. For more details, refer to the Helm section below.
  2. Venafi Issuer may fail to renew certificates if the requested duration conflicts with the CA’s minimum or maximum policy settings in Venafi. For more details, refer to the Venafi Issuer section below.
  3. Venafi Issuer may fail to renew Certificates if the issuer has been configured for TPP with username-password authentication. For more details, refer to the Venafi Issuer section below.

Themes

Helm

The Helm chart now includes a JSON schema which will validate the values that you supply when installing the chart. This will help you to get your Helm values right first time. It will alert you to typos and unrecognized fields in your existing Helm values files. And it will make it easier for the cert-manager maintainers to maintain the Helm chart, avoiding typos and mistakes in the default values file.

⚠️ Helm schema validation may reject your existing Helm values files if they contain typos or unrecognized fields. You can use helm template to test your Helm values before you upgrade:

helm template cert-manager \
  --repo https://charts.jetstack.io \
  --version v1.16.0-beta.0 \
  --values values.cert-manager.yaml

Here's an example of an error that would be caught by the schema validation:

# values.cert-manager.yaml
global:
  logLevel: debug # ❗ Should be an integer.

Error: values don't meet the specifications of the schema(s) in the following chart(s):
cert-manager:
- global.logLevel: Invalid type. Expected: number, given: string

ℹ️ The schema files are generated by helm-tool, a utility which generates Helm docs and schema files, and performs linting.

📖 Read Helm: Charts: Schema Files to learn more.

Extended Metrics

The webhook and cainjector components now have metrics servers, so that platform teams can monitor the performance of all the cert-manager components and gain more information about the underlying Go runtime in the event of a problem. Read the Prometheus Metrics page to learn more.

Venafi Issuer

We've made some important improvements to the Venafi Issuer.

If you use the Venafi Issuer with a TPP server with username-password authentication, cert-manager 1.16 now uses OAuth authentication instead of the deprecated API Key authentication. This is a potentially breaking change, because you may need to reconfigure your TPP server to enable OAuth authentication, and you may need to reconfigure the cert-manager service accounts in TPP to work with OAuth.

The desired certificate.spec.duration value will now be sent to the Venafi API server. The default value for certificate.spec.duration is 90 days, but you may have changed this in your Certificate resources. Your Venafi issuing template may be configured to ignore the requested From and To times, in which case nothing will change. Your Venafi issuing template may be configured with a maximum or a minimum duration, in which case your certificate requests may fail after you upgrade to cert-manager 1.16. Consider this carefully when upgrading to cert-manager 1.16.
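To illustrate why a request might start failing after the upgrade, here is a minimal sketch of the kind of bounds check an issuing template could apply. It is not cert-manager or Venafi code: the function name duration_within_policy and the minimum and maximum values are hypothetical, and the real check happens on the Venafi side.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the requested duration (in hours) lies
# within the policy's minimum and maximum (also in hours).
duration_within_policy() {
  requested_hours=$1
  min_hours=$2
  max_hours=$3
  [ "$requested_hours" -ge "$min_hours" ] && [ "$requested_hours" -le "$max_hours" ]
}

# The default certificate.spec.duration of 90 days is 2160 hours.
default_hours=$((90 * 24))

# Assumed example policy: minimum 1 day (24h), maximum 30 days (720h).
if duration_within_policy "$default_hours" 24 720; then
  echo "request accepted"
else
  echo "request rejected: ${default_hours}h is outside the policy bounds"
fi
```

With these assumed bounds the default 90-day duration exceeds the 30-day maximum, so the sketch reports a rejection. Before upgrading, you may want to compare the durations in your Certificate resources against the limits configured in your own issuing template.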

When connecting to Venafi TPP, cert-manager can now load the CA certificate from a Secret resource. This allows you to manage the CA with familiar tools such as trust-manager.

Read the Venafi Issuer page to learn more.

Route53 DNS01 Solver Cleanup

The Route53 DNS01 solver code had become over-complicated due to its age and due to the variety of authentication methods that have been added over the years. When we upgraded to AWS SDK for Go V2 in the last release, we did not have a good understanding of the new SDK and we were not able to test it thoroughly with all authentication methods. In this release we have started putting that right.

In this release we have tidied up the code and added more logging so that it is easier to debug problems in the field. We have improved the documentation of the Route53 API fields, particularly the region field, where we have tried to describe where and how cert-manager uses that value.

We have relaxed the API validation so that the region field is now optional. cert-manager will now fall back to using the AWS_REGION environment variable of the controller Pod, regardless of which authentication mechanism is used.

Users of IAM Roles for Service Accounts (IRSA) or Pod Identity need not specify the region. If your Issuer or ClusterIssuer does include a region (for example, to satisfy the old API validation), that region is ignored when the AWS_REGION environment variable is set.

cert-manager will now use regional STS endpoints, when using AssumeRole or when using a dedicated (non-mounted) Kubernetes ServiceAccount. The regional endpoint will be computed based on the Issuer region field, or the AWS_REGION environment variable.

ℹ️ This change only affects the AssumeRole configuration, which is used for cross-account authentication, and the AssumeRoleWithWebIdentity configuration, where the user supplies the name of a Kubernetes ServiceAccount. It does not affect you if you have configured the cert-manager ServiceAccount for IRSA, where the ServiceAccount token is mounted into the cert-manager controller Pod. Regional STS endpoints were already being used in that case.
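As a rough illustration of what a regional STS endpoint looks like, the sketch below builds the endpoint URL from a region name. It is not the cert-manager implementation (the AWS SDK resolves endpoints internally). The function name sts_regional_endpoint is hypothetical, and the URL pattern assumes the standard "aws" partition; other partitions use a different DNS suffix.

```shell
#!/bin/sh
# Hypothetical helper: print the regional STS endpoint for a region.
# Assumes the standard "aws" partition.
sts_regional_endpoint() {
  region=$1
  if [ -z "$region" ]; then
    echo "no region supplied" >&2
    return 1
  fi
  echo "https://sts.${region}.amazonaws.com"
}

# Example: the region would come from the Issuer region field
# or from the AWS_REGION environment variable.
sts_regional_endpoint "eu-west-1"
```

Compare the result with the global endpoint https://sts.amazonaws.com: the regional form keeps STS traffic within the chosen region.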

ℹ️ There are good reasons to use regional STS endpoints, summarized as follows on the Amazon AWS blog:

Although the global (legacy) AWS STS endpoint https://sts.amazonaws.com is highly available, it’s hosted in a single AWS Region — US East (N. Virginia) — and like other endpoints, it doesn’t provide automatic fail-over to endpoints in other Regions.

📖 Read Manage AWS STS in an AWS Region to learn about which regions support STS.

📖 Read AWS STS Regional endpoints to learn how to configure the use of regional STS endpoints using environment variables.

Read the ACME Issuer Route53 page to learn more.

Memory Optimizations

We have continued our effort to reduce the memory footprint of cert-manager.

The cainjector no longer caches Secret data; instead it only caches the metadata of Secret resources. This significantly reduces its memory usage. It also reduces the load on the Kubernetes API server, when cainjector starts up, because it no longer needs to send all the data of all the Secret resources over the network.

cert-manager now uses client-go v0.31.0, which supports a new WatchListClient feature. This enables cert-manager to make use of the Streaming Lists feature of the Kubernetes API server. This reduces the load on the Kubernetes API server, because cert-manager components will no longer request complete unpaged lists of all API resources when they start up. It also reduces the peak memory use of the cert-manager components when they start up, because they no longer have to hold a duplicate unpaged list of resources in memory while they add them to the client-side cache.

To use this feature, you first need to enable the WatchList feature in the Kubernetes API server, which has been available since Kubernetes 1.27. Second, you need to enable the client-go WatchListClient feature in the cert-manager components. If you installed cert-manager using Helm, you can use the following Helm values:

# values.cert-manager.yaml
extraEnv:
  - name: KUBE_FEATURE_WatchListClient
    value: "true"
cainjector:
  extraEnv:
    - name: KUBE_FEATURE_WatchListClient
      value: "true"
webhook:
  extraEnv:
    - name: KUBE_FEATURE_WatchListClient
      value: "true"

You will see log messages reporting the state of the client-go feature gates, when cert-manager starts up. And if you increase the logging verbosity, you will see sendInitialEvents=true and resourceVersionMatch=NotOlderThan among the requests. For example:

Feature gate updated state [caller=features/envvar.go:169 enabled=true feature=WatchListClient]
GET https://10.96.0.1:443/api/v1/secrets?allowWatchBookmarks=true&labelSelector=%21controller.cert-manager.io%2Ffao&resourceVersionMatch=NotOlderThan&sendInitialEvents=true&timeout=6m49s&timeoutSeconds=409&watch=true 200 OK in 2 milliseconds [caller=transport/round_trippers.go:553]

Read Kubernetes API Concepts: Streaming Lists, to learn more. Read Introducing Feature Gates to Client-Go: Enhancing Flexibility and Control, to learn about enabling and disabling client-go features.

Logging

We have improved the signal-to-noise ratio in the logs.

The controller has a new feature gate: UseDomainQualifiedFinalizer. This changes the finalizer added to ACME Challenge resources, from finalizer.acme.cert-manager.io to acme.cert-manager.io/finalizer. The new finalizer name is compliant with Kubernetes standards, and will resolve warnings in cert-manager-controller pods of the form:

W0910 20:07:22.491920 1 warnings.go:70] metadata.finalizers: "finalizer.acme.cert-manager.io": prefer a domain-qualified finalizer name to avoid accidental conflicts with other finalizer writers

Read cert-manager component configuration: Feature gates to learn more.
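To make the difference between the two finalizer names concrete, the sketch below tests whether a name has the domain/name form. It is only an approximation: it assumes that "domain-qualified" simply means the name contains a "/" separator, and the function name is_domain_qualified is hypothetical. It does not reproduce the exact Kubernetes check.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the finalizer name has the
# <domain>/<name> form, that is, it contains a "/" separator.
is_domain_qualified() {
  case "$1" in
    */*) return 0 ;;
    *) return 1 ;;
  esac
}

# Compare the old and the new finalizer names.
for name in "finalizer.acme.cert-manager.io" "acme.cert-manager.io/finalizer"; do
  if is_domain_qualified "$name"; then
    echo "$name: domain-qualified"
  else
    echo "$name: not domain-qualified"
  fi
done
```

Under this assumption the old name is reported as not domain-qualified, which is what triggers the warning, while the new name passes.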

cert-manager now uses client-go v0.31.0, which removes a lot of noisy errors from logs, of the form:

reflector.go: unable to sync list result: internal error: cannot cast object DeletedFinalStateUnknown

Read cert-manager issue 6753 to learn more.

Community

Thanks again to all open-source contributors with commits in this release, including: TODO

Thanks also to the following cert-manager maintainers for their contributions during this release: TODO

Equally thanks to everyone who provided feedback, helped users and raised issues on GitHub and Slack and joined our meetings!

Thanks also to the CNCF, which provides resources and support, and to the AWS open source team for being good community members and for their maintenance of the PrivateCA Issuer.

In addition, massive thanks to Venafi for contributing developer time and resources towards the continued maintenance of cert-manager projects.

Changes since v1.15.0

Feature

  • Add SecretRef support for Venafi TPP issuer CA Bundle (#7036, @sankalp-at-gh)
  • Add renewBeforePercentage alternative to renewBefore (#6987, @cbroglie)
  • Add a metrics server to the cainjector (#7194, @wallrj)
  • Add a metrics server to the webhook (#7182, @wallrj)
  • Add client certificate auth method for Vault issuer (#4330, @joshmue)
  • Add process and go runtime metrics for controller (#6966, @mindw)
  • Added app.kubernetes.io/managed-by: cert-manager label to the cert-manager-webhook-ca Secret (#7154, @jrcichra)
  • Allow the user to specify a Pod template when using GatewayAPI HTTP01 solver, this mirrors the behavior when using the Ingress HTTP01 solver. (#7211, @ThatsMrTalbot)
  • Create token request RBAC for the cert-manager ServiceAccount by default (#7213, @Jasper-Ben)
  • Feature: Append cert-manager user-agent string to all AWS API requests, including IMDS and STS requests. (#7295, @wallrj)
  • Feature: Log AWS SDK warnings and API requests at cert-manager debug level to help debug AWS Route53 problems in the field. (#7292, @wallrj)
  • Feature: The Route53 DNS solver of the ACME Issuer will now use regional STS endpoints computed from the region that is supplied in the Issuer spec or in the AWS_REGION environment variable. Feature: The Route53 DNS solver of the ACME Issuer now uses the "ambient" region (AWS_REGION or AWS_DEFAULT_REGION) if issuer.spec.acme.solvers.dns01.route53.region is empty; regardless of the flags --issuer-ambient-credentials and --cluster-issuer-ambient-credentials. (#7299, @wallrj)
  • Helm: adds JSON schema validation for the Helm values. (#7069, @inteon)
  • If the --controllers flag only specifies disabled controllers, the default controllers are now enabled implicitly. Added disableAutoApproval and approveSignerNames Helm chart options. (#7049, @inteon)
  • Make it easier to configure cert-manager using Helm by defaulting config.apiVersion and config.kind within the Helm chart. (#7126, @ThatsMrTalbot)
  • Now passes down specified duration to Venafi client instead of using the CA default only. (#7104, @Guitarkalle)
  • Reduce the memory usage of cainjector, by only caching the metadata of Secret resources. Reduce the load on the K8S API server when cainjector starts up, by only listing the metadata of Secret resources. (#7161, @wallrj)
  • The Route53 DNS01 solver of the ACME Issuer can now detect the AWS region from the AWS_REGION and AWS_DEFAULT_REGION environment variables, which are set by the IAM Roles for Service Accounts (IRSA) webhook and by the Pod Identity webhook. The issuer.spec.acme.solvers.dns01.route53.region field is now optional. The API documentation of the region field has been updated to explain when and how the region value is used. (#7287, @wallrj)
  • Venafi TPP issuer can now be used with a username & password combination with OAuth. Fixes #4653. Breaking: cert-manager will no longer use the API Key authentication method which was deprecated in 20.2 and since removed in 24.1 of TPP. (#7084, @hawksight)
  • You can now configure the pod security context of HTTP-01 solver pods. (#5373, @aidy)

Bug or Regression

  • Adds support (behind a feature gate) for using a domain-qualified finalizer. If the feature is enabled (it is disabled by default), it should prevent Kubernetes from reporting: metadata.finalizers: "finalizer.acme.cert-manager.io": prefer a domain-qualified finalizer name to avoid accidental conflicts with other finalizer writers (#7273, @jsoref)
  • BUGFIX Route53: explicitly set the aws-global STS region which is now required by the github.com/aws/aws-sdk-go-v2 library. (#7108, @inteon)
  • BUGFIX: fix issue that caused Vault issuer to not retry signing when an error was encountered. (#7105, @inteon)
  • BUGFIX: the dynamic certificate source used by the webhook TLS server failed to detect a root CA approaching expiration, due to a calculation error. This will cause the webhook TLS server to fail renewing its CA certificate. Please upgrade before the expiration of this CA certificate is reached. (#7230, @inteon)
  • Bugfix: Prevent aggressive Route53 retries caused by IRSA authentication failures by removing the Amazon Request ID from errors wrapped by the default credential cache. (#7291, @wallrj)
  • Bugfix: Prevent aggressive Route53 retries caused by STS authentication failures by removing the Amazon Request ID from STS errors. (#7259, @wallrj)
  • Bump grpc-go to fix GHSA-xr7q-jx4m-x55m (#7164, @SgtCoDFish)
  • Bump the go-retryablehttp dependency to fix CVE-2024-6104 (#7125, @SgtCoDFish)
  • Fix Azure DNS causing panics whenever authentication error happens (#7177, @eplightning)
  • Fix incorrect indentation of endpointAdditionalProperties in the PodMonitor template of the Helm chart (#7190, @wallrj)
  • Fixes ACME HTTP01 challenge behavior when using Gateway API to prevent unbounded creation of HTTPRoute resources (#7178, @miguelvr)
  • Handle errors arising from challenges missing from the ACME server (#7202, @bdols)
  • Helm BUGFIX: the cainjector ConfigMap was not mounted in the cainjector deployment. (#7052, @inteon)
  • Improve the startupapicheck: validate that the validating and mutating webhooks are doing their job. (#7057, @inteon)
  • The KeyUsages X.509 extension is no longer added when there are no key usages set (in accordance with RFC 5280 Section 4.2.1.3) (#7250, @inteon)
  • Update github.com/Azure/azure-sdk-for-go/sdk/azidentity to address CVE-2024-35255 (#7087, @dependabot[bot])

Other (Cleanup or Flake)

  • Old API versions were removed from the codebase. Removed: (acme.)cert-manager.io/v1alpha2, (acme.)cert-manager.io/v1alpha3, (acme.)cert-manager.io/v1beta1 (#7278, @inteon)
  • Upgrading to client-go v0.31.0 removes a lot of noisy reflector.go: unable to sync list result: internal error: cannot cast object DeletedFinalStateUnknown errors from logs. (#7237, @inteon)