Redeploying Certificates

Overview

OKD uses certificates to provide secure connections for the following components:

masters (API server and controllers)
etcd
nodes
registry
router

You can use Ansible playbooks provided with the installer to automate checking expiration dates for cluster certificates. Playbooks are also provided to automate backing up and redeploying these certificates, which can fix common certificate errors.

Possible use cases for redeploying certificates include:

The installer detected the wrong host names and the issue was identified too late.
The certificates are expired and you need to update them.
You have a new CA and want to create certificates using it instead.

Checking Certificate Expirations

You can use the installer to warn you about any certificates expiring within a configurable window of days and notify you about any certificates that have already expired. Certificate expiry playbooks use the Ansible role openshift_certificate_expiry.

Certificates examined by the role include:

Master and node service certificates
Router and registry service certificates from etcd secrets
Master, node, router, registry, and kubeconfig files for cluster-admin users
etcd certificates (including embedded)

Learn how to list all OpenShift TLS certificate expiration dates.

Role Variables

The openshift_certificate_expiry role uses the following variables:

Table 1. Core Variables
Variable Name	Default Value	Description
`openshift_certificate_expiry_config_base`	`/etc/origin`	Base OKD configuration directory.
`openshift_certificate_expiry_warning_days`	`365`	Flag certificates that will expire in this many days from now.
`openshift_certificate_expiry_show_all`	`no`	Include healthy (non-expired and non-warning) certificates in results.

Table 2. Optional Variables
Variable Name	Default Value	Description
`openshift_certificate_expiry_generate_html_report`	`no`	Generate an HTML report of the expiry check results.
`openshift_certificate_expiry_html_report_path`	`$HOME/cert-expiry-report.yyyymmddTHHMMSS.html`	The full path for saving the HTML report. The default is a file in the home directory whose name carries a timestamp suffix.
`openshift_certificate_expiry_save_json_results`	`no`	Save expiry check results as a JSON file.
`openshift_certificate_expiry_json_results_path`	`$HOME/cert-expiry-report.yyyymmddTHHMMSS.json`	The full path for saving the JSON report. The default is a file in the home directory whose name carries a timestamp suffix.

Running Certificate Expiration Playbooks

The OKD installer provides a set of example certificate expiration playbooks, using different sets of configuration for the openshift_certificate_expiry role.

These playbooks must be used with an inventory file that is representative of the cluster. For best results, run ansible-playbook with the -v option.

Using the easy-mode.yaml example playbook, you can try the role out before tweaking it to your specifications as needed. This playbook:

Produces JSON and stylized HTML reports in the $HOME directory.
Sets the warning window very large, so you will almost always get results back.
Includes all certificates (healthy or not) in the results.

Change to the playbook directory and run the easy-mode.yaml playbook:

$ cd /usr/share/ansible/openshift-ansible
$ ansible-playbook -v -i <inventory_file> \
    playbooks/openshift-checks/certificate_expiry/easy-mode.yaml

Other Example Playbooks

The other example playbooks can also be run directly from the /usr/share/ansible/openshift-ansible/playbooks/openshift-checks/certificate_expiry/ directory.

Table 3. Other Example Playbooks
File Name	Usage
*default.yaml*	Produces the default behavior of the `openshift_certificate_expiry` role.
*html_and_json_default_paths.yaml*	Generates HTML and JSON artifacts in their default paths.
*longer_warning_period.yaml*	Changes the expiration warning window to 1500 days.
*longer-warning-period-json-results.yaml*	Changes the expiration warning window to 1500 days and saves the results as a JSON file.

To run any of these example playbooks:

$ cd /usr/share/ansible/openshift-ansible
$ ansible-playbook -v -i <inventory_file> \
    playbooks/openshift-checks/certificate_expiry/<playbook>
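The example playbooks fix the role variables in advance. For a warning window that none of them covers, one option is to override a role variable on the command line with the Ansible -e (extra variables) flag, which takes precedence over values set in a playbook. The following is a minimal sketch, not part of the installer: run_expiry_check and the ANSIBLE_PLAYBOOK override are hypothetical names, and it assumes the role accepts the value as passed by -e (key=value extra variables arrive as strings).

```shell
#!/bin/sh
# Hypothetical wrapper around the default.yaml example playbook.
# Usage: run_expiry_check <inventory_file> <warning_days>
run_expiry_check() {
    inventory=$1
    warning_days=$2
    # ANSIBLE_PLAYBOOK can be set to "echo" for a dry run that only
    # prints the arguments instead of running the playbook.
    "${ANSIBLE_PLAYBOOK:-ansible-playbook}" -v -i "$inventory" \
        -e "openshift_certificate_expiry_warning_days=$warning_days" \
        playbooks/openshift-checks/certificate_expiry/default.yaml
}
```

Run it from /usr/share/ansible/openshift-ansible, for example as run_expiry_check <inventory_file> 60 for a 60-day window.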

Output Formats

As noted above, the check report can be produced in two formats: JSON for machine parsing, or a stylized HTML page for easy skimming.

HTML Report

An example of an HTML report is provided with the installer. You can open the following file in your browser to view it:

/usr/share/ansible/openshift-ansible/roles/openshift_certificate_expiry/examples/cert-expiry-report.html

JSON Report

There are two top-level keys in the saved JSON results: data and summary.

The data key is a hash where the keys are the names of each host examined and the values are the check results for the certificates identified on each respective host.

The summary key is a hash that summarizes the total number of certificates:

examined on the entire cluster
that are OK
expiring within the configured warning window
already expired

For an example of the full JSON report, see /usr/share/ansible/openshift-ansible/roles/openshift_certificate_expiry/examples/cert-expiry-report.json.

The summary from the JSON data can be easily checked for warnings or expirations using a variety of command-line tools. For example, using grep you can look for the word summary and print out the two lines after the match (-A2):

$ grep -A2 summary $HOME/cert-expiry-report.yyyymmddTHHMMSS.json
    "summary": {
        "warning": 16,
        "expired": 0

If available, the jq tool can also be used to pick out specific values. The first two examples below show how to select just one value, either warning or expired. The third example shows how to select both values at once:

$ jq '.summary.warning' $HOME/cert-expiry-report.yyyymmddTHHMMSS.json
16

$ jq '.summary.expired' $HOME/cert-expiry-report.yyyymmddTHHMMSS.json
0

$ jq '.summary.warning,.summary.expired' $HOME/cert-expiry-report.yyyymmddTHHMMSS.json
16
0
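Where jq is not installed, the same two counts can be read with standard tools and turned into an exit status for use in a scheduled job. This is a rough sketch only: cert_summary_count and certs_healthy are hypothetical helper names, and the awk parsing assumes the pretty-printed layout shown in the grep output above, with one key per line inside the summary object.

```shell
#!/bin/sh
# Print the numeric value of one key inside the "summary" object.
# Usage: cert_summary_count <report.json> <warning|expired>
cert_summary_count() {
    awk -v key="\"$2\"" '
        /"summary"/                  { in_summary = 1; next }
        in_summary && index($0, "}") { exit }   # end of the summary object
        in_summary && index($0, key) {
            gsub(/[^0-9]/, "")       # keep only the digits of the value
            print
            exit
        }
    ' "$1"
}

# Print both counts and return 0 only when both are zero.
# Usage: certs_healthy <report.json>
certs_healthy() {
    warning=$(cert_summary_count "$1" warning)
    expired=$(cert_summary_count "$1" expired)
    echo "warning=${warning:-unknown} expired=${expired:-unknown}"
    [ "$warning" = 0 ] && [ "$expired" = 0 ]
}
```

With the report shown above, certs_healthy prints warning=16 expired=0 and returns a non-zero status because of the 16 warnings.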

Redeploying Certificates

Redeployment playbooks restart control plane services and might cause cluster downtime. An error in one service can cause a playbook to fail and affect cluster health. If a playbook fails, you might need to resolve problems manually and restart the playbook. A playbook must finish all tasks sequentially to succeed.

Use the following playbooks to redeploy master, etcd, node, registry, and router certificates on all relevant hosts. You can redeploy all of them at once using the current CA, redeploy certificates for specific components only, or redeploy a newly generated or custom CA on its own.

Just like the certificate expiry playbooks, these playbooks must be run with an inventory file that is representative of the cluster.

In particular, the inventory must specify or override all host names and IP addresses set via the following variables such that they match the current cluster configuration:

openshift_public_hostname
openshift_public_ip
openshift_master_cluster_hostname
openshift_master_cluster_public_hostname
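Before running a redeployment playbook, it can help to see which of these variables the inventory file never sets. The sketch below is a plain text search, not an Ansible feature: missing_inventory_vars is a hypothetical helper, and it cannot see variables defined in group_vars or host_vars files.

```shell
#!/bin/sh
# List which of the four host name and IP variables an inventory file
# never assigns. Commented-out lines are ignored.
# Usage: missing_inventory_vars <inventory_file>
missing_inventory_vars() {
    for var in openshift_public_hostname \
               openshift_public_ip \
               openshift_master_cluster_hostname \
               openshift_master_cluster_public_hostname; do
        # Match "<var>=" at the start of a line or after whitespace,
        # which covers both [group:vars] entries and per-host variables.
        grep -v '^[[:space:]]*#' "$1" \
            | grep -Eq "(^|[[:space:]])${var}=" || echo "$var"
    done
}
```

An empty result means all four variables appear in the file. It does not confirm that their values match the running cluster.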

The playbooks you need are provided by:

# yum install openshift-ansible

The validity (length in days until they expire) for any certificates auto-generated while redeploying can be configured via Ansible as well. See Configuring Certificate Validity.

OKD CA and etcd certificates expire after five years. Signed OKD certificates expire after two years.
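To see how much of that lifetime remains on a particular certificate file, openssl can be queried directly. The following sketch assumes openssl is installed on the host; cert_expires_within is a hypothetical helper name.

```shell
#!/bin/sh
# Report whether a PEM certificate expires within a number of days.
# Usage: cert_expires_within <cert_file> <days>
# Returns 0 if the certificate expires within <days> or cannot be read,
# and 1 if it is still valid after that many days.
cert_expires_within() {
    seconds=$(( $2 * 86400 ))
    # "openssl x509 -checkend N" exits 0 when the certificate is still
    # valid N seconds from now.
    if openssl x509 -checkend "$seconds" -noout -in "$1" >/dev/null 2>&1; then
        return 1
    fi
    return 0
}
```

For example, cert_expires_within /etc/origin/master/ca.crt 90 succeeds when the CA certificate has fewer than 90 days left. The exact end date can be printed with openssl x509 -enddate -noout -in <cert_file>.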

Redeploying All Certificates Using the Current OKD and etcd CA

The redeploy-certificates.yml playbook does not regenerate the OKD CA certificate. It creates new master, etcd, node, registry, and router certificates and signs them with the current CA certificate.

This also includes serial restarts of:

etcd
master services
node services

To redeploy master, etcd, and node certificates using the current OKD CA, change to the playbook directory and run this playbook, specifying your inventory file:

$ cd /usr/share/ansible/openshift-ansible
$ ansible-playbook -i <inventory_file> \
    playbooks/redeploy-certificates.yml

If the OKD CA was redeployed with the openshift-master/redeploy-openshift-ca.yml playbook, you must add -e openshift_redeploy_openshift_ca=true to this command.

Redeploying a New or Custom OKD CA

The openshift-master/redeploy-openshift-ca.yml playbook redeploys the OKD CA certificate by generating a new CA certificate and distributing an updated bundle to all components including client kubeconfig files and the node’s database of trusted CAs (the CA-trust).

This also includes serial restarts of:

master services
node services
docker

Additionally, you can specify a custom CA certificate when redeploying certificates instead of relying on a CA generated by OKD.

When the master services are restarted, the registry and routers can continue to communicate with the master without being redeployed because the master’s serving certificate is the same, and the CA that the registry and routers have is still valid.

To redeploy a newly generated or custom CA:

If you want to use a custom CA, set the following variable in your inventory file. To use the current CA, skip this step.

# Configure custom ca certificate
# NOTE: CA certificate will not be replaced with existing clusters.
# This option may only be specified when creating a new cluster or
# when redeploying cluster certificates with the redeploy-certificates
# playbook.
openshift_master_ca_certificate={'certfile': '</path/to/ca.crt>', 'keyfile': '</path/to/ca.key>'}

If the CA certificate is issued by an intermediate CA, the bundled certificate must contain the full chain (the intermediate and root certificates) for the CA in order to validate child certificates.

For example:

$ cat intermediate/certs/intermediate.cert.pem \
      certs/ca.cert.pem >> intermediate/certs/ca-chain.cert.pem
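Before pointing openshift_master_ca_certificate at a custom certificate and key, it may be worth confirming that the two belong together. One common way is to compare the public key derived from each file. This is a sketch under assumptions: key_matches_cert is a hypothetical helper, the key is unencrypted, and for a chain file openssl x509 reads only the first certificate, so the signing certificate must come first in the file.

```shell
#!/bin/sh
# Check that a private key belongs to a certificate by comparing the
# public key extracted from each file.
# Usage: key_matches_cert <cert_file> <key_file>
key_matches_cert() {
    cert_pub=$(openssl x509 -in "$1" -noout -pubkey 2>/dev/null) || return 1
    key_pub=$(openssl pkey -in "$2" -pubout 2>/dev/null) || return 1
    # Succeed only when both public keys were read and are identical.
    [ -n "$cert_pub" ] && [ "$cert_pub" = "$key_pub" ]
}
```

A non-zero return means either that the files do not match or that one of them could not be read.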

Change to the playbook directory and run the openshift-master/redeploy-openshift-ca.yml playbook, specifying your inventory file:
```
$ cd /usr/share/ansible/openshift-ansible
$ ansible-playbook -i <inventory_file> \
    playbooks/openshift-master/redeploy-openshift-ca.yml
```
With the new OKD CA in place, use the redeploy-certificates.yml playbook whenever you want to redeploy certificates that are signed by the new CA on all components.

When using the redeploy-certificates.yml playbook after the new OKD CA is in place, you must add -e openshift_redeploy_openshift_ca=true to the playbook command.
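The two playbook runs described in this section can be chained so that the second runs only if the first succeeds. The sketch below is illustrative only: redeploy_ca_then_certs and the ANSIBLE_PLAYBOOK override are hypothetical names, and it must be run from /usr/share/ansible/openshift-ansible.

```shell
#!/bin/sh
# Redeploy the OKD CA, then redeploy the certificates that it signs.
# Stops if the CA playbook fails.
# Usage: redeploy_ca_then_certs <inventory_file>
redeploy_ca_then_certs() {
    inventory=$1
    # Set ANSIBLE_PLAYBOOK=echo for a dry run that prints the arguments.
    run=${ANSIBLE_PLAYBOOK:-ansible-playbook}
    "$run" -i "$inventory" \
        playbooks/openshift-master/redeploy-openshift-ca.yml || return 1
    "$run" -i "$inventory" -e openshift_redeploy_openshift_ca=true \
        playbooks/redeploy-certificates.yml
}
```

Because both playbooks restart services, a dry run first makes it easy to confirm the inventory path and flags before anything changes on the cluster.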

Redeploying a New etcd CA

The openshift-etcd/redeploy-ca.yml playbook redeploys the etcd CA certificate by generating a new CA certificate and distributing an updated bundle to all etcd peers and master clients.

This also includes serial restarts of:

etcd
master services

To redeploy a newly generated etcd CA:

Run the openshift-etcd/redeploy-ca.yml playbook, specifying your inventory file:

$ cd /usr/share/ansible/openshift-ansible
$ ansible-playbook -i <inventory_file> \
    playbooks/openshift-etcd/redeploy-ca.yml

After you run the playbooks/openshift-etcd/redeploy-ca.yml playbook for the first time, a compressed bundle containing the CA signers is persisted to /etc/etcd/etcd_ca.tgz. Because the CA signers are required for the generation of new etcd certificates, it is important that they are backed up.

If the playbook is run again, as a precaution it does not overwrite this bundle on disk. To run the playbook again, back up and move the bundle from this path and then run the playbook.
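Backing up and moving the bundle can be scripted so that neither step is forgotten. This is a minimal sketch: set_aside_etcd_ca_bundle is a hypothetical helper, and the default backup directory is an arbitrary example location, not one used by the installer.

```shell
#!/bin/sh
# Copy the etcd CA bundle to a timestamped backup, then move the
# original aside so the playbook can write a new bundle.
# Usage: set_aside_etcd_ca_bundle [bundle_path] [backup_dir]
set_aside_etcd_ca_bundle() {
    bundle=${1:-/etc/etcd/etcd_ca.tgz}
    backup_dir=${2:-/root/etcd-ca-backups}   # hypothetical location
    [ -f "$bundle" ] || { echo "no bundle at $bundle" >&2; return 1; }
    stamp=$(date +%Y%m%dT%H%M%S)
    mkdir -p "$backup_dir" || return 1
    # Copy first so that a failed copy leaves the original in place.
    cp -p "$bundle" "$backup_dir/etcd_ca.$stamp.tgz" || return 1
    mv "$bundle" "$bundle.$stamp.bak"
}
```

Run it on the host that holds /etc/etcd/etcd_ca.tgz before running the playbook again. Keeping a further copy off the host is advisable, because the CA signers are needed to generate new etcd certificates.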

With the new etcd CA in place, you can then use the openshift-etcd/redeploy-certificates.yml playbook at your discretion whenever you want to redeploy certificates signed by the new etcd CA on etcd peers and master clients. Alternatively, you can use the redeploy-certificates.yml playbook to redeploy certificates for OKD components in addition to etcd peers and master clients.

The etcd certificate redeployment can result in copying the serial to all master hosts.

Redeploying Master and Web Console Certificates

The openshift-master/redeploy-certificates.yml playbook redeploys master certificates and web console certificates. This also includes serial restarts of master services.

To redeploy master certificates and web console certificates, change to the playbook directory and run this playbook, specifying your inventory file:

$ cd /usr/share/ansible/openshift-ansible
$ ansible-playbook -i <inventory_file> \
    playbooks/openshift-master/redeploy-certificates.yml

If you use named certificates, you must update the named certificate parameters in the master-config.yaml file on each master node. If necessary, concatenate all of the required files that form your certificate chain for the certificate file that is provided to the certFile parameter.

Then, restart the OKD master services to apply the changes.

After running this playbook, you must regenerate any service signing certificate or key pairs by deleting existing secrets that contain service serving certificates or removing and re-adding annotations to appropriate services.

You can set the openshift_redeploy_service_signer=false parameter in the inventory file to skip the redeployment of the service signer certificate, if required. If you set openshift_redeploy_openshift_ca=true and openshift_redeploy_service_signer=true in the inventory file, the service signing certificate is redeployed when you redeploy the master certificates. If you set openshift_redeploy_openshift_ca=false or omit the parameter, the service signer certificate is never redeployed.

Redeploying Only Named Certificates

The openshift-master/redeploy-named-certificates.yml playbook redeploys only named certificates. Running this playbook also completes serial restarts of master services.

To redeploy named certificates only, change to the directory that contains the playbooks, and run this playbook.

$ cd /usr/share/ansible/openshift-ansible
$ ansible-playbook -i <inventory_file> \
    playbooks/openshift-master/redeploy-named-certificates.yml

The openshift_master_named_certificates parameter in the Ansible inventory file must contain certificates with the same names as in the master-config.yaml file. If the names of certfile and keyfile are changed, you must update the named certificate parameters in the master-config.yaml file on each master node and restart the API and controllers services. The cafile with the full CA chain is added to /etc/origin/master/ca-bundle.crt.

Redeploying etcd Certificates Only

The openshift-etcd/redeploy-certificates.yml playbook only redeploys etcd certificates including master client certificates.

This also includes serial restarts of:

etcd
master services

To redeploy etcd certificates, change to the playbook directory and run this playbook, specifying your inventory file:

$ cd /usr/share/ansible/openshift-ansible
$ ansible-playbook -i <inventory_file> \
    playbooks/openshift-etcd/redeploy-certificates.yml

Redeploying Node Certificates

By default, node certificates are valid for one year. OKD automatically rotates node certificates when they get close to expiring. If automatic approval is not configured, you must manually approve the certificate signing requests (CSRs).

If you need to redeploy certificates because the CA certificate was changed, you can use the playbooks/redeploy-certificates.yml playbook with the -e openshift_redeploy_openshift_ca=true flag. See Redeploying All Certificates Using the Current OKD and etcd CA for details. When running this playbook, the CSRs are automatically approved.

Redeploying Registry or Router Certificates Only

The openshift-hosted/redeploy-registry-certificates.yml and openshift-hosted/redeploy-router-certificates.yml playbooks replace installer-created certificates for the registry and router. If custom certificates are in use for these components, see Redeploying Custom Registry or Router Certificates to replace them manually.

Redeploying Registry Certificates Only

To redeploy registry certificates, change to the playbook directory and run the following playbook, specifying your inventory file:

$ cd /usr/share/ansible/openshift-ansible
$ ansible-playbook -i <inventory_file> \
    playbooks/openshift-hosted/redeploy-registry-certificates.yml

Redeploying Router Certificates Only

To redeploy router certificates, change to the playbook directory and run the following playbook, specifying your inventory file:

$ cd /usr/share/ansible/openshift-ansible
$ ansible-playbook -i <inventory_file> \
    playbooks/openshift-hosted/redeploy-router-certificates.yml

Redeploying Custom Registry or Router Certificates

When nodes are evacuated due to a redeployed CA, registry and router pods are restarted. If the registry and router certificates were not also redeployed with the new CA, this can cause outages because they cannot reach the masters using their old certificates.

Redeploying Registry Certificates Manually

To redeploy registry certificates manually, you must add new registry certificates to a secret named registry-certificates, then redeploy the registry:

Switch to the default project for the remainder of these steps:
```
$ oc project default
```

If your registry was initially created on OKD 3.1 or earlier, it may still be using environment variables to store certificates (which has been deprecated in favor of using secrets).

Run the following and look for the OPENSHIFT_CA_DATA, OPENSHIFT_CERT_DATA, OPENSHIFT_KEY_DATA environment variables:
```
$ oc set env dc/docker-registry --list
```

If they do not exist, skip this step. If they do, create the following ClusterRoleBinding:

$ cat <<EOF |
apiVersion: v1
groupNames: null
kind: ClusterRoleBinding
metadata:
  creationTimestamp: null
  name: registry-registry-role
roleRef:
  kind: ClusterRole
  name: system:registry
subjects:
- kind: ServiceAccount
  name: registry
  namespace: default
userNames:
- system:serviceaccount:default:registry
EOF
oc create -f -

Then, run the following to remove the environment variables:

$ oc set env dc/docker-registry OPENSHIFT_CA_DATA- OPENSHIFT_CERT_DATA- OPENSHIFT_KEY_DATA- OPENSHIFT_MASTER-

Set the following environment variables locally to make later commands less complex:

$ REGISTRY_IP=`oc get service docker-registry -o jsonpath='{.spec.clusterIP}'`
$ REGISTRY_HOSTNAME=`oc get route/docker-registry -o jsonpath='{.spec.host}'`

Create new registry certificates:

$ oc adm ca create-server-cert \
    --signer-cert=/etc/origin/master/ca.crt \
    --signer-key=/etc/origin/master/ca.key \
    --hostnames=$REGISTRY_IP,docker-registry.default.svc,docker-registry.default.svc.cluster.local,$REGISTRY_HOSTNAME \
    --cert=/etc/origin/master/registry.crt \
    --key=/etc/origin/master/registry.key \
    --signer-serial=/etc/origin/master/ca.serial.txt

Run oc adm commands only from the first master listed in the Ansible host inventory file, by default /etc/ansible/hosts.

Update the registry-certificates secret with the new registry certificates:

$ oc create secret generic registry-certificates \
    --from-file=/etc/origin/master/registry.crt,/etc/origin/master/registry.key \
    -o json --dry-run | oc replace -f -

Redeploy the registry:
```
$ oc rollout latest dc/docker-registry
```

Redeploying Router Certificates Manually

To redeploy router certificates manually, you must add new router certificates to a secret named router-certs, then redeploy the router:

Switch to the default project for the remainder of these steps:
```
$ oc project default
```
If your router was initially created on OKD 3.1 or earlier, it might still use environment variables to store certificates, which has been deprecated in favor of using a service serving certificate secret.
1. Run the following command and look for the OPENSHIFT_CA_DATA, OPENSHIFT_CERT_DATA, OPENSHIFT_KEY_DATA environment variables:
  $ oc set env dc/router --list
2. If those variables exist, create the following ClusterRoleBinding:

$ cat <<EOF |
apiVersion: v1
groupNames: null
kind: ClusterRoleBinding
metadata:
  creationTimestamp: null
  name: router-router-role
roleRef:
  kind: ClusterRole
  name: system:router
subjects:
- kind: ServiceAccount
  name: router
  namespace: default
userNames:
- system:serviceaccount:default:router
EOF
oc create -f -

3. If those variables exist, run the following command to remove them:
  $ oc set env dc/router OPENSHIFT_CA_DATA- OPENSHIFT_CERT_DATA- OPENSHIFT_KEY_DATA- OPENSHIFT_MASTER-
Obtain a certificate.
- If you use an external Certificate Authority (CA) to sign your certificates, create a new certificate and provide it to OKD by following your internal processes.
- If you use the internal OKD CA to sign certificates, run the following commands:
  
  The following commands generate a certificate that is internally signed. It is trusted only by clients that trust the OKD CA.
  $ cd /root
  $ mkdir cert ; cd cert
  $ oc adm ca create-server-cert \
      --signer-cert=/etc/origin/master/ca.crt \
      --signer-key=/etc/origin/master/ca.key \
      --signer-serial=/etc/origin/master/ca.serial.txt \
      --hostnames='*.hostnames.for.the.certificate' \
      --cert=router.crt \
      --key=router.key
  These commands generate the following files:
  - A new certificate named router.crt.
  - A copy of the signing CA certificate chain, /etc/origin/master/ca.crt. This chain can contain more than one certificate if you use intermediate CAs.
  - A corresponding private key named router.key.

Create a new file that concatenates the generated certificates:

$ cat router.crt /etc/origin/master/ca.crt router.key > router.pem

This step is only valid if you are using a certificate signed by the OpenShift CA. If a custom certificate is used, a file with the correct CA chain should be used instead of /etc/origin/master/ca.crt.
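A quick look at the combined file can catch a missing piece before it goes into the secret. The sketch below only counts PEM block headers and does not validate the certificates; pem_block_summary is a hypothetical helper name.

```shell
#!/bin/sh
# Count the certificate and private key blocks in a combined PEM file
# such as router.pem.
# Usage: pem_block_summary <pem_file>
pem_block_summary() {
    # grep -c exits non-zero when the count is 0, hence "|| true".
    certs=$(grep -c -e '-----BEGIN CERTIFICATE-----' "$1" || true)
    keys=$(grep -c -e '-----BEGIN .*PRIVATE KEY-----' "$1" || true)
    echo "certificates=$certs private_keys=$keys"
}
```

For a router.pem built as shown above, the expected result is at least two certificates (the router certificate plus one or more CA certificates) and exactly one private key.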

Before you generate a new secret, back up the current one:

$ oc get -o yaml --export secret router-certs > ~/old-router-certs-secret.yaml

Create a new secret to hold the new certificate and key, and replace the contents of the existing secret:
```
$ oc create secret tls router-certs --cert=router.pem \
    --key=router.key -o json --dry-run | \
    oc replace -f -
```
Here, router.pem is the file that contains the concatenation of the certificates that you generated.

Redeploy the router:
```
$ oc rollout latest dc/router
```
When routers are initially deployed, an annotation is added to the router’s service that automatically creates a service serving certificate secret named router-metrics-tls.

To redeploy the router-metrics-tls certificate manually, trigger re-creation of the service serving certificate: remove the annotations from the router service, delete the existing secret, and then re-add the annotation so that the router-metrics-tls secret is generated again:

Remove the following annotations from the router service:

$ oc annotate service router \
    service.alpha.openshift.io/serving-cert-secret-name- \
    service.alpha.openshift.io/serving-cert-signed-by-

Remove the existing router-metrics-tls secret.
```
$ oc delete secret router-metrics-tls
```

Re-add the annotations:

$ oc annotate service router \
    service.alpha.openshift.io/serving-cert-secret-name=router-metrics-tls

Managing Certificate Signing Requests

Cluster administrators can review certificate signing requests (CSRs) and approve or deny them.

Reviewing Certificate Signing Requests

You can review the list of certificate signing requests (CSRs).

Get the list of current CSRs:
```
$ oc get csr
```
View the details of a CSR to verify that it is valid:
```
$ oc describe csr <csr_name>
```
Here, <csr_name> is the name of a CSR from the list of current CSRs.

Approving Certificate Signing Requests

You can manually approve certificate signing requests (CSRs) by using the oc adm certificate approve command.

Approve a CSR:
```
$ oc adm certificate approve <csr_name>
```
Here, <csr_name> is the name of a CSR from the list of current CSRs.

Approve all pending CSRs:

$ oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs oc adm certificate approve
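The go-template above selects every CSR that has no status yet and approves it in one step. To review the pending names first, the table printed by oc get csr can be filtered instead. This sketch assumes the CONDITION column is the last column of that table; pending_csr_names is a hypothetical helper name.

```shell
#!/bin/sh
# Print the names of CSRs whose last column reads "Pending".
# Reads "oc get csr" table output on standard input and skips the
# header row.
pending_csr_names() {
    awk 'NR > 1 && $NF == "Pending" { print $1 }'
}
```

For example, oc get csr | pending_csr_names lists the candidates. After checking each with oc describe csr, the same list can be passed to xargs oc adm certificate approve.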

Denying Certificate Signing Requests

You can manually deny certificate signing requests (CSRs) by using the oc adm certificate deny command.

Deny a CSR:
```
$ oc adm certificate deny <csr_name>
```
Here, <csr_name> is the name of a CSR from the list of current CSRs.

Configuring Automatic Approval of Certificate Signing Requests

You can configure automatic approval of node certificate signing requests (CSRs) by adding the following parameter to your Ansible inventory file when installing your cluster:

openshift_master_bootstrap_auto_approve=true

Adding this parameter allows all CSRs generated by using the bootstrap credential or from a previously authenticated node with the same host name to be approved without any administrator intervention.

For more information, see Configuring Cluster Variables.