Follow these steps to perform the Control Plane Only cluster update and monitor the update through to completion.
Control Plane Only updates were previously known as EUS-to-EUS updates. Control Plane Only updates are only viable between even-numbered minor versions of OKD.
When you update to any version from 4.11 onward, you must manually acknowledge that the update can continue.
Before you acknowledge the update, verify that there are no Kubernetes APIs in use that are removed in the version you are updating to. For example, in OKD 4.17, there are no API removals. See "Kubernetes API removals" for more information.
Run the following command:
$ oc -n openshift-config patch cm admin-acks --patch '{"data":{"ack-<update_version_from>-kube-<kube_api_version>-api-removals-in-<update_version_to>":"true"}}' --type=merge
where:
<update_version_from> is the cluster version you are moving from, for example, 4.14.
<kube_api_version> is the kube API version, for example, 1.28.
<update_version_to> is the cluster version you are moving to, for example, 4.15.
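For example, for an update from 4.14 to 4.15, the filled-in command matches the acknowledgement key shown in the verification output below:
$ oc -n openshift-config patch cm admin-acks --patch '{"data":{"ack-4.14-kube-1.28-api-removals-in-4.15":"true"}}' --type=merge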
Verify the acknowledgement. Run the following command:
$ oc get configmap admin-acks -n openshift-config -o json | jq .data
{
"ack-4.14-kube-1.28-api-removals-in-4.15": "true",
"ack-4.15-kube-1.29-api-removals-in-4.16": "true"
}
In this example, the cluster is updated from version 4.14 to 4.15, and then from 4.15 to 4.16 in a Control Plane Only update.
When updating from one y-stream release to the next, you must ensure that the intermediate z-stream releases are also compatible.
You can verify that you are updating to a viable release by running the oc adm upgrade command.
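For example, to list only the recommended targets from the command output, you can filter it as follows; the full output is shown later in this procedure:
$ oc adm upgrade | grep -A 5 'Recommended updates'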
Start the update:
$ oc adm upgrade --to=4.15.33
Requested update to 4.15.33
The Requested update value changes depending on your particular update.
Check the cluster health often during the update: watch the node status, the cluster Operator status, and check for failed pods.
Monitor the cluster update. For example, to monitor the cluster update from version 4.14 to 4.15, run the following command:
$ watch "oc get clusterversion; echo; oc get co | head -1; oc get co | grep 4.14; oc get co | grep 4.15; echo; oc get no; echo; oc get po -A | grep -E -iv 'running|complete'"
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.14.34 True True 4m6s Working towards 4.15.33: 111 of 873 done (12% complete), waiting on kube-apiserver
NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
authentication 4.14.34 True False False 4d22h
baremetal 4.14.34 True False False 4d23h
cloud-controller-manager 4.14.34 True False False 4d23h
cloud-credential 4.14.34 True False False 4d23h
cluster-autoscaler 4.14.34 True False False 4d23h
console 4.14.34 True False False 4d22h
...
storage 4.14.34 True False False 4d23h
config-operator 4.15.33 True False False 4d23h
etcd 4.15.33 True False False 4d23h
NAME STATUS ROLES AGE VERSION
ctrl-plane-0 Ready control-plane,master 4d23h v1.27.15+6147456
ctrl-plane-1 Ready control-plane,master 4d23h v1.27.15+6147456
ctrl-plane-2 Ready control-plane,master 4d23h v1.27.15+6147456
worker-0 Ready mcp-1,worker 4d23h v1.27.15+6147456
worker-1 Ready mcp-2,worker 4d23h v1.27.15+6147456
NAMESPACE NAME READY STATUS RESTARTS AGE
openshift-marketplace redhat-marketplace-rf86t 0/1 ContainerCreating 0 0s
During the update, the watch command cycles through one or more cluster Operators at a time, providing a status of the Operator update in the MESSAGE column.
When the cluster Operators update process is complete, each control plane node is rebooted, one at a time.
During this part of the update, messages are reported that state that cluster Operators are being updated again or are in a degraded state. This is expected because a control plane node is offline while it reboots.
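If you want to follow the reboots directly, the standard watch flag streams node status changes as each control plane node goes offline and returns:
$ oc get nodes -w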
As soon as the last control plane node reboot is complete, the cluster version is displayed as updated.
When the control plane update is complete, a message such as the following is displayed. This example shows an update completed to the intermediate y-stream release.
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.15.33 True False 28m Cluster version is 4.15.33
NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
authentication 4.15.33 True False False 5d
baremetal 4.15.33 True False False 5d
cloud-controller-manager 4.15.33 True False False 5d1h
cloud-credential 4.15.33 True False False 5d1h
cluster-autoscaler 4.15.33 True False False 5d
config-operator 4.15.33 True False False 5d
console 4.15.33 True False False 5d
...
service-ca 4.15.33 True False False 5d
storage 4.15.33 True False False 5d
NAME STATUS ROLES AGE VERSION
ctrl-plane-0 Ready control-plane,master 5d v1.28.13+2ca1a23
ctrl-plane-1 Ready control-plane,master 5d v1.28.13+2ca1a23
ctrl-plane-2 Ready control-plane,master 5d v1.28.13+2ca1a23
worker-0 Ready mcp-1,worker 5d v1.28.13+2ca1a23
worker-1 Ready mcp-2,worker 5d v1.28.13+2ca1a23
In telco environments, software needs to be vetted before it is loaded onto a production cluster. Production clusters are also configured in a disconnected network, which means that they are not always directly connected to the internet. Because the clusters are in a disconnected network, the OpenShift Operators are configured for manual update during installation so that new versions can be managed on a cluster-by-cluster basis. Perform the following procedure to move the Operators to the newer versions.
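For reference, the approval policy is recorded on each Operator's Subscription resource; a quick check against the metallb-system namespace used in the examples below:
$ oc get subscription -n metallb-system -o jsonpath='{.items[0].spec.installPlanApproval}{"\n"}'
Manual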
Check to see which Operators need to be updated:
$ oc get installplan -A | grep -E 'APPROVED|false'
NAMESPACE NAME CSV APPROVAL APPROVED
metallb-system install-nwjnh metallb-operator.v4.16.0-202409202304 Manual false
openshift-nmstate install-5r7wr kubernetes-nmstate-operator.4.16.0-202409251605 Manual false
Patch the InstallPlan resources for those Operators:
$ oc patch installplan -n metallb-system install-nwjnh --type merge --patch \
'{"spec":{"approved":true}}'
installplan.operators.coreos.com/install-nwjnh patched
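If several Operators are pending, a loop such as the following approves them all in one pass. This is a sketch; review each install plan before approving it on a production cluster:
$ oc get installplan -A -o json | jq -r '.items[] | select(.spec.approved==false) | [.metadata.namespace, .metadata.name] | @tsv' | \
  while read -r ns name; do oc patch installplan -n "$ns" "$name" --type merge --patch '{"spec":{"approved":true}}'; done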
Monitor the namespace by running the following command:
$ oc get all -n metallb-system
NAME READY STATUS RESTARTS AGE
pod/metallb-operator-controller-manager-69b5f884c-8bp22 0/1 ContainerCreating 0 4s
pod/metallb-operator-controller-manager-77895bdb46-bqjdx 1/1 Running 0 4m1s
pod/metallb-operator-webhook-server-5d9b968896-vnbhk 0/1 ContainerCreating 0 4s
pod/metallb-operator-webhook-server-d76f9c6c8-57r4w 1/1 Running 0 4m1s
...
NAME DESIRED CURRENT READY AGE
replicaset.apps/metallb-operator-controller-manager-69b5f884c 1 1 0 4s
replicaset.apps/metallb-operator-controller-manager-77895bdb46 1 1 1 4m1s
replicaset.apps/metallb-operator-controller-manager-99b76f88 0 0 0 4m40s
replicaset.apps/metallb-operator-webhook-server-5d9b968896 1 1 0 4s
replicaset.apps/metallb-operator-webhook-server-6f7dbfdb88 0 0 0 4m40s
replicaset.apps/metallb-operator-webhook-server-d76f9c6c8 1 1 1 4m1s
When the update is complete, the required pods should be in a Running state, and the required ReplicaSet resources should be ready:
NAME READY STATUS RESTARTS AGE
pod/metallb-operator-controller-manager-69b5f884c-8bp22 1/1 Running 0 25s
pod/metallb-operator-webhook-server-5d9b968896-vnbhk 1/1 Running 0 25s
...
NAME DESIRED CURRENT READY AGE
replicaset.apps/metallb-operator-controller-manager-69b5f884c 1 1 1 25s
replicaset.apps/metallb-operator-controller-manager-77895bdb46 0 0 0 4m22s
replicaset.apps/metallb-operator-webhook-server-5d9b968896 1 1 1 25s
replicaset.apps/metallb-operator-webhook-server-d76f9c6c8 0 0 0 4m22s
Verify that the Operators do not need to be updated for a second time:
$ oc get installplan -A | grep -E 'APPROVED|false'
There should be no output returned.
Sometimes you have to approve an update twice because some Operators have interim z-stream release versions that need to be installed before the final version.
After completing the first y-stream update, you must update the y-stream control plane version to the new EUS version.
Verify that the <4.y.z> release that you selected is still listed as a viable release to move to:
$ oc adm upgrade
Cluster version is 4.15.33
upgradeable=False
Reason: AdminAckRequired
Message: Kubernetes 1.29 and therefore OpenShift 4.16 remove several APIs which require admin consideration. Please see the knowledge article https://access.redhat.com/articles/7031404 for details and instructions.
Upstream is unset, so the cluster will use an appropriate default.
Channel: eus-4.16 (available channels: candidate-4.15, candidate-4.16, eus-4.16, fast-4.15, fast-4.16, stable-4.15, stable-4.16)
Recommended updates:
VERSION IMAGE
4.16.14 quay.io/openshift-release-dev/ocp-release@sha256:0521a0f1acd2d1b77f76259cb9bae9c743c60c37d9903806a3372c1414253658
4.16.13 quay.io/openshift-release-dev/ocp-release@sha256:6078cb4ae197b5b0c526910363b8aff540343bfac62ecb1ead9e068d541da27b
4.15.34 quay.io/openshift-release-dev/ocp-release@sha256:f2e0c593f6ed81250c11d0bac94dbaf63656223477b7e8693a652f933056af6e
If you update soon after the initial GA of a new y-stream release, you might not see new y-stream releases available when you run the oc adm upgrade command.
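If the cluster is not yet subscribed to the target channel, you can switch it first; a sketch assuming the eus-4.16 channel from the output above (the channel subcommand is available in recent oc releases):
$ oc adm upgrade channel eus-4.16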
Optional: View the potential update releases that are not recommended. Run the following command:
$ oc adm upgrade --include-not-recommended
Cluster version is 4.15.33
upgradeable=False
Reason: AdminAckRequired
Message: Kubernetes 1.29 and therefore OpenShift 4.16 remove several APIs which require admin consideration. Please see the knowledge article https://access.redhat.com/articles/7031404 for details and instructions.
Upstream is unset, so the cluster will use an appropriate default.Channel: eus-4.16 (available channels: candidate-4.15, candidate-4.16, eus-4.16, fast-4.15, fast-4.16, stable-4.15, stable-4.16)
Recommended updates:
VERSION IMAGE
4.16.14 quay.io/openshift-release-dev/ocp-release@sha256:0521a0f1acd2d1b77f76259cb9bae9c743c60c37d9903806a3372c1414253658
4.16.13 quay.io/openshift-release-dev/ocp-release@sha256:6078cb4ae197b5b0c526910363b8aff540343bfac62ecb1ead9e068d541da27b
4.15.34 quay.io/openshift-release-dev/ocp-release@sha256:f2e0c593f6ed81250c11d0bac94dbaf63656223477b7e8693a652f933056af6e
Supported but not recommended updates:
Version: 4.16.15
Image: quay.io/openshift-release-dev/ocp-release@sha256:671bc35e
Recommended: Unknown
Reason: EvaluationFailed
Message: Exposure to AzureRegistryImagePreservation is unknown due to an evaluation failure: invalid PromQL result length must be one, but is 0
In Azure clusters, the in-cluster image registry may fail to preserve images on update. https://issues.redhat.com/browse/IR-461
The example shows a potential error that can affect clusters hosted in Microsoft Azure. It does not show risks for bare-metal clusters.
When moving between y-stream releases, you must run a patch command to explicitly acknowledge the update.
In the output of the oc adm upgrade command, a URL is provided that shows the specific command to run.
Before you acknowledge the update, verify that there are no Kubernetes APIs in use that are removed in the version you are updating to. For example, in OKD 4.17, there are no API removals. See "Kubernetes API removals" for more information.
Acknowledge the y-stream release upgrade by patching the admin-acks config map in the openshift-config namespace.
For example, run the following command:
$ oc -n openshift-config patch cm admin-acks --patch '{"data":{"ack-4.15-kube-1.29-api-removals-in-4.16":"true"}}' --type=merge
configmap/admin-acks patched
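You can confirm the acknowledgement in the same way as for the first update:
$ oc get configmap admin-acks -n openshift-config -o json | jq .data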
After you have determined the full new release that you are moving to, you can run the oc adm upgrade --to=x.y.z command.
Start the y-stream control plane update. For example, run the following command:
$ oc adm upgrade --to=4.16.14
Requested update to 4.16.14
You might move to a z-stream release that has potential issues with platforms other than the one you are running on. The following example shows a potential problem for cluster updates on Microsoft Azure:
$ oc adm upgrade --to=4.16.15
error: the update 4.16.15 is not one of the recommended updates, but is available as a conditional update. To accept the Recommended=Unknown risk and to proceed with update use --allow-not-recommended.
Reason: EvaluationFailed
Message: Exposure to AzureRegistryImagePreservation is unknown due to an evaluation failure: invalid PromQL result length must be one, but is 0
In Azure clusters, the in-cluster image registry may fail to preserve images on update. https://issues.redhat.com/browse/IR-461
The example shows a potential error that can affect clusters hosted in Microsoft Azure. It does not show risks for bare-metal clusters.
$ oc adm upgrade --to=4.16.15 --allow-not-recommended
warning: with --allow-not-recommended you have accepted the risks with 4.16.15 and bypassed Recommended=Unknown EvaluationFailed: Exposure to AzureRegistryImagePreservation is unknown due to an evaluation failure: invalid PromQL result length must be one, but is 0
In Azure clusters, the in-cluster image registry may fail to preserve images on update. https://issues.redhat.com/browse/IR-461
Requested update to 4.16.15
Monitor the second part of the cluster update to the <y+1> version.
Monitor the progress of the second part of the <y+1> update. For example, to monitor the update from 4.15 to 4.16, run the following command:
$ watch "oc get clusterversion; echo; oc get co | head -1; oc get co | grep 4.15; oc get co | grep 4.16; echo; oc get no; echo; oc get po -A | grep -E -iv 'running|complete'"
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.15.33 True True 10m Working towards 4.16.14: 132 of 903 done (14% complete), waiting on kube-controller-manager, kube-scheduler
NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
authentication 4.15.33 True False False 5d3h
baremetal 4.15.33 True False False 5d4h
cloud-controller-manager 4.15.33 True False False 5d4h
cloud-credential 4.15.33 True False False 5d4h
cluster-autoscaler 4.15.33 True False False 5d4h
console 4.15.33 True False False 5d3h
...
config-operator 4.16.14 True False False 5d4h
etcd 4.16.14 True False False 5d4h
kube-apiserver 4.16.14 True True False 5d4h NodeInstallerProgressing: 1 node is at revision 15; 2 nodes are at revision 17
NAME STATUS ROLES AGE VERSION
ctrl-plane-0 Ready control-plane,master 5d4h v1.28.13+2ca1a23
ctrl-plane-1 Ready control-plane,master 5d4h v1.28.13+2ca1a23
ctrl-plane-2 Ready control-plane,master 5d4h v1.28.13+2ca1a23
worker-0 Ready mcp-1,worker 5d4h v1.27.15+6147456
worker-1 Ready mcp-2,worker 5d4h v1.27.15+6147456
NAMESPACE NAME READY STATUS RESTARTS AGE
openshift-kube-apiserver kube-apiserver-ctrl-plane-0 0/5 Pending 0 <invalid>
As soon as the update of the last control plane node is complete, the cluster version is updated to the new EUS release. For example:
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.16.14 True False 123m Cluster version is 4.16.14
NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
authentication 4.16.14 True False False 5d6h
baremetal 4.16.14 True False False 5d7h
cloud-controller-manager 4.16.14 True False False 5d7h
cloud-credential 4.16.14 True False False 5d7h
cluster-autoscaler 4.16.14 True False False 5d7h
config-operator 4.16.14 True False False 5d7h
console 4.16.14 True False False 5d6h
...
operator-lifecycle-manager-packageserver 4.16.14 True False False 5d7h
service-ca 4.16.14 True False False 5d7h
storage 4.16.14 True False False 5d7h
NAME STATUS ROLES AGE VERSION
ctrl-plane-0 Ready control-plane,master 5d7h v1.29.8+f10c92d
ctrl-plane-1 Ready control-plane,master 5d7h v1.29.8+f10c92d
ctrl-plane-2 Ready control-plane,master 5d7h v1.29.8+f10c92d
worker-0 Ready mcp-1,worker 5d7h v1.27.15+6147456
worker-1 Ready mcp-2,worker 5d7h v1.27.15+6147456
In the second phase of a multi-version upgrade, you must approve the install plans for all of the Operators, and additionally approve the install plans for any other Operators that you want to upgrade.
Follow the same procedure as outlined in "Updating the OLM Operators". Ensure that you also update any non-OLM Operators as required.
Monitor the cluster update. For example, to monitor the cluster update from version 4.14 to 4.15, run the following command:
$ watch "oc get clusterversion; echo; oc get co | head -1; oc get co | grep 4.14; oc get co | grep 4.15; echo; oc get no; echo; oc get po -A | grep -E -iv 'running|complete'"
Check to see which Operators need to be updated:
$ oc get installplan -A | grep -E 'APPROVED|false'
Patch the InstallPlan resources for those Operators:
$ oc patch installplan -n metallb-system install-nwjnh --type merge --patch \
'{"spec":{"approved":true}}'
Monitor the namespace by running the following command:
$ oc get all -n metallb-system
When the update is complete, the required pods should be in a Running state, and the required ReplicaSet resources should be ready.
During the update, the watch command cycles through one or more cluster Operators at a time, providing a status of the Operator update in the MESSAGE column.
When the cluster Operators update process is complete, each control plane node is rebooted, one at a time.
During this part of the update, messages are reported that state that cluster Operators are being updated again or are in a degraded state. This is expected because a control plane node is offline while it reboots.
After you have updated the control plane, you upgrade the worker nodes by unpausing the relevant mcp groups that you created.
Unpausing the mcp group starts the upgrade process for the worker nodes in that group.
Each worker node in the cluster reboots to upgrade to the new EUS, y-stream, or z-stream version, as required.
In the case of Control Plane Only upgrades, note that an updated worker node requires only one reboot and jumps <y+2> release versions. This feature was added to decrease the amount of time that it takes to upgrade large bare-metal clusters.
This is a potential holding point. You can have a cluster version that is fully supported to run in production, with the control plane updated to a new EUS release, while the worker nodes remain at a <y-2> release. This allows large clusters to upgrade in steps across several maintenance windows.
You can check how many nodes are managed in an mcp group.
Run the following command to get the list of mcp groups:
$ oc get mcp
NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE
master rendered-master-c9a52144456dbff9c9af9c5a37d1b614 True False False 3 3 3 0 36d
mcp-1 rendered-mcp-1-07fe50b9ad51fae43ed212e84e1dcc8e False False False 1 0 0 0 47h
mcp-2 rendered-mcp-2-07fe50b9ad51fae43ed212e84e1dcc8e False False False 1 0 0 0 47h
worker rendered-worker-f1ab7b9a768e1b0ac9290a18817f60f0 True False False 0 0 0 0 36d
You decide how many mcp groups to unpause at a time. This controls how many worker nodes are taken out of service and updated at once.
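For reference, pausing a group is the inverse of the unpause patch used in the following steps; an illustrative example using the mcp-1 pool from the output above:
$ oc patch mcp/mcp-1 --type merge --patch '{"spec":{"paused":true}}'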
Get the list of nodes in the cluster:
$ oc get nodes
NAME STATUS ROLES AGE VERSION
ctrl-plane-0 Ready control-plane,master 5d8h v1.29.8+f10c92d
ctrl-plane-1 Ready control-plane,master 5d8h v1.29.8+f10c92d
ctrl-plane-2 Ready control-plane,master 5d8h v1.29.8+f10c92d
worker-0 Ready mcp-1,worker 5d8h v1.27.15+6147456
worker-1 Ready mcp-2,worker 5d8h v1.27.15+6147456
Confirm the MachineConfigPool groups that are paused:
$ oc get mcp -o json | jq -r '["MCP","Paused"], ["---","------"], (.items[] | [(.metadata.name), (.spec.paused)]) | @tsv' | grep -v worker
MCP Paused
--- ------
master false
mcp-1 true
mcp-2 true
Each mcp group can be unpaused as required and in any order.
Unpause the required mcp group to begin the upgrade:
$ oc patch mcp/mcp-1 --type merge --patch '{"spec":{"paused":false}}'
machineconfigpool.machineconfiguration.openshift.io/mcp-1 patched
Confirm that the required mcp group is unpaused:
$ oc get mcp -o json | jq -r '["MCP","Paused"], ["---","------"], (.items[] | [(.metadata.name), (.spec.paused)]) | @tsv' | grep -v worker
MCP Paused
--- ------
master false
mcp-1 false
mcp-2 true
As each mcp group is upgraded, continue to unpause and upgrade the remaining nodes.
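For example, to continue with the mcp-2 group from the earlier output:
$ oc patch mcp/mcp-2 --type merge --patch '{"spec":{"paused":false}}'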
$ oc get nodes
NAME STATUS ROLES AGE VERSION
ctrl-plane-0 Ready control-plane,master 5d8h v1.29.8+f10c92d
ctrl-plane-1 Ready control-plane,master 5d8h v1.29.8+f10c92d
ctrl-plane-2 Ready control-plane,master 5d8h v1.29.8+f10c92d
worker-0 Ready mcp-1,worker 5d8h v1.29.8+f10c92d
worker-1 NotReady,SchedulingDisabled mcp-2,worker 5d8h v1.27.15+6147456
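You can track pool progress while the nodes update; the UPDATED and UPDATING columns are the same as in the earlier oc get mcp output:
$ watch "oc get mcp"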
Run the following commands after updating the cluster to verify that the cluster is back up and running.
Check the cluster version by running the following command:
$ oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.16.14 True False 4h38m Cluster version is 4.16.14
This should return the new cluster version, and the PROGRESSING column should show False.
Check that all nodes are ready:
$ oc get nodes
NAME STATUS ROLES AGE VERSION
ctrl-plane-0 Ready control-plane,master 5d9h v1.29.8+f10c92d
ctrl-plane-1 Ready control-plane,master 5d9h v1.29.8+f10c92d
ctrl-plane-2 Ready control-plane,master 5d9h v1.29.8+f10c92d
worker-0 Ready mcp-1,worker 5d9h v1.29.8+f10c92d
worker-1 Ready mcp-2,worker 5d9h v1.29.8+f10c92d
All nodes in the cluster should be in a Ready status and running the same version.
Check that there are no paused mcp resources in the cluster:
$ oc get mcp -o json | jq -r '["MCP","Paused"], ["---","------"], (.items[] | [(.metadata.name), (.spec.paused)]) | @tsv' | grep -v worker
MCP Paused
--- ------
master false
mcp-1 false
mcp-2 false
Check that all cluster Operators are available:
$ oc get co
NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
authentication 4.16.14 True False False 5d9h
baremetal 4.16.14 True False False 5d9h
cloud-controller-manager 4.16.14 True False False 5d10h
cloud-credential 4.16.14 True False False 5d10h
cluster-autoscaler 4.16.14 True False False 5d9h
config-operator 4.16.14 True False False 5d9h
console 4.16.14 True False False 5d9h
control-plane-machine-set 4.16.14 True False False 5d9h
csi-snapshot-controller 4.16.14 True False False 5d9h
dns 4.16.14 True False False 5d9h
etcd 4.16.14 True False False 5d9h
image-registry 4.16.14 True False False 85m
ingress 4.16.14 True False False 5d9h
insights 4.16.14 True False False 5d9h
kube-apiserver 4.16.14 True False False 5d9h
kube-controller-manager 4.16.14 True False False 5d9h
kube-scheduler 4.16.14 True False False 5d9h
kube-storage-version-migrator 4.16.14 True False False 4h48m
machine-api 4.16.14 True False False 5d9h
machine-approver 4.16.14 True False False 5d9h
machine-config 4.16.14 True False False 5d9h
marketplace 4.16.14 True False False 5d9h
monitoring 4.16.14 True False False 5d9h
network 4.16.14 True False False 5d9h
node-tuning 4.16.14 True False False 5d7h
openshift-apiserver 4.16.14 True False False 5d9h
openshift-controller-manager 4.16.14 True False False 5d9h
openshift-samples 4.16.14 True False False 5h24m
operator-lifecycle-manager 4.16.14 True False False 5d9h
operator-lifecycle-manager-catalog 4.16.14 True False False 5d9h
operator-lifecycle-manager-packageserver 4.16.14 True False False 5d9h
service-ca 4.16.14 True False False 5d9h
storage 4.16.14 True False False 5d9h
All cluster Operators should report True in the AVAILABLE column.
Check that all pods are healthy:
$ oc get po -A | grep -E -iv 'complete|running'
This should not return any pods.
You might see a few pods that are still transitioning after the update. Watch them for a while to make sure all pods clear.
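A simple way to watch for stragglers is to reuse the filter from the previous step:
$ watch "oc get po -A | grep -E -iv 'complete|running'"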