$ oc adm upgrade --to=4.15.33
|
Requested update to 4.15.33 (1)
1 | The Requested update value changes depending on your particular update. |
Follow these steps to perform the z-stream cluster update and monitor the update through to completion. Completing a z-stream update is more straightforward than a Control Plane Only or y-stream update.
When updating from one y-stream release to the next, you must ensure that the intermediate z-stream releases are also compatible.
You can verify that you are updating to a viable release by running the |
Start the update:
$ oc adm upgrade --to=4.15.33
|
Requested update to 4.15.33 (1)
1 | The Requested update value changes depending on your particular update. |
You upgrade the worker nodes after you have updated the control plane by unpausing the relevant mcp
groups you created.
Unpausing the mcp
group starts the upgrade process for the worker nodes in that group.
Each of the worker nodes in the cluster reboot to upgrade to the new EUS, y-stream or z-stream version as required.
In the case of Control Plane Only upgrades note that when a worker node is updated it will only require one reboot and will jump <y+2>-release versions. This is a feature that was added to decrease the amount of time that it takes to upgrade large bare-metal clusters.
This is a potential holding point. You can have a cluster version that is fully supported to run in production with the control plane that is updated to a new EUS release while the worker nodes are at a <y-2>-release. This allows large clusters to upgrade in steps across several maintenance windows. |
You can check how many nodes are managed in an mcp
group.
Run the following command to get the list of mcp
groups:
$ oc get mcp
NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE
master rendered-master-c9a52144456dbff9c9af9c5a37d1b614 True False False 3 3 3 0 36d
mcp-1 rendered-mcp-1-07fe50b9ad51fae43ed212e84e1dcc8e False False False 1 0 0 0 47h
mcp-2 rendered-mcp-2-07fe50b9ad51fae43ed212e84e1dcc8e False False False 1 0 0 0 47h
worker rendered-worker-f1ab7b9a768e1b0ac9290a18817f60f0 True False False 0 0 0 0 36d
You decide how many |
Get the list of nodes in the cluster:
$ oc get nodes
NAME STATUS ROLES AGE VERSION
ctrl-plane-0 Ready control-plane,master 5d8h v1.29.8+f10c92d
ctrl-plane-1 Ready control-plane,master 5d8h v1.29.8+f10c92d
ctrl-plane-2 Ready control-plane,master 5d8h v1.29.8+f10c92d
worker-0 Ready mcp-1,worker 5d8h v1.27.15+6147456
worker-1 Ready mcp-2,worker 5d8h v1.27.15+6147456
Confirm the MachineConfigPool
groups that are paused:
$ oc get mcp -o json | jq -r '["MCP","Paused"], ["---","------"], (.items[] | [(.metadata.name), (.spec.paused)]) | @tsv' | grep -v worker
MCP Paused
--- ------
master false
mcp-1 true
mcp-2 true
Each |
Unpause the required mcp
group to begin the upgrade:
$ oc patch mcp/mcp-1 --type merge --patch '{"spec":{"paused":false}}'
machineconfigpool.machineconfiguration.openshift.io/mcp-1 patched
Confirm that the required mcp
group is unpaused:
$ oc get mcp -o json | jq -r '["MCP","Paused"], ["---","------"], (.items[] | [(.metadata.name), (.spec.paused)]) | @tsv' | grep -v worker
MCP Paused
--- ------
master false
mcp-1 false
mcp-2 true
As each mcp
group is upgraded, continue to unpause and upgrade the remaining nodes.
$ oc get nodes
NAME STATUS ROLES AGE VERSION
ctrl-plane-0 Ready control-plane,master 5d8h v1.29.8+f10c92d
ctrl-plane-1 Ready control-plane,master 5d8h v1.29.8+f10c92d
ctrl-plane-2 Ready control-plane,master 5d8h v1.29.8+f10c92d
worker-0 Ready mcp-1,worker 5d8h v1.29.8+f10c92d
worker-1 NotReady,SchedulingDisabled mcp-2,worker 5d8h v1.27.15+6147456
Run the following commands after updating the cluster to verify that the cluster is back up and running.
Check the cluster version by running the following command:
$ oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.16.14 True False 4h38m Cluster version is 4.16.14
This should return the new cluster version and the PROGRESSING
column should return False
.
Check that all nodes are ready:
$ oc get nodes
NAME STATUS ROLES AGE VERSION
ctrl-plane-0 Ready control-plane,master 5d9h v1.29.8+f10c92d
ctrl-plane-1 Ready control-plane,master 5d9h v1.29.8+f10c92d
ctrl-plane-2 Ready control-plane,master 5d9h v1.29.8+f10c92d
worker-0 Ready mcp-1,worker 5d9h v1.29.8+f10c92d
worker-1 Ready mcp-2,worker 5d9h v1.29.8+f10c92d
All nodes in the cluster should be in a Ready
status and running the same version.
Check that there are no paused mcp
resources in the cluster:
$ oc get mcp -o json | jq -r '["MCP","Paused"], ["---","------"], (.items[] | [(.metadata.name), (.spec.paused)]) | @tsv' | grep -v worker
MCP Paused
--- ------
master false
mcp-1 false
mcp-2 false
Check that all cluster Operators are available:
$ oc get co
NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
authentication 4.16.14 True False False 5d9h
baremetal 4.16.14 True False False 5d9h
cloud-controller-manager 4.16.14 True False False 5d10h
cloud-credential 4.16.14 True False False 5d10h
cluster-autoscaler 4.16.14 True False False 5d9h
config-operator 4.16.14 True False False 5d9h
console 4.16.14 True False False 5d9h
control-plane-machine-set 4.16.14 True False False 5d9h
csi-snapshot-controller 4.16.14 True False False 5d9h
dns 4.16.14 True False False 5d9h
etcd 4.16.14 True False False 5d9h
image-registry 4.16.14 True False False 85m
ingress 4.16.14 True False False 5d9h
insights 4.16.14 True False False 5d9h
kube-apiserver 4.16.14 True False False 5d9h
kube-controller-manager 4.16.14 True False False 5d9h
kube-scheduler 4.16.14 True False False 5d9h
kube-storage-version-migrator 4.16.14 True False False 4h48m
machine-api 4.16.14 True False False 5d9h
machine-approver 4.16.14 True False False 5d9h
machine-config 4.16.14 True False False 5d9h
marketplace 4.16.14 True False False 5d9h
monitoring 4.16.14 True False False 5d9h
network 4.16.14 True False False 5d9h
node-tuning 4.16.14 True False False 5d7h
openshift-apiserver 4.16.14 True False False 5d9h
openshift-controller-manager 4.16.14 True False False 5d9h
openshift-samples 4.16.14 True False False 5h24m
operator-lifecycle-manager 4.16.14 True False False 5d9h
operator-lifecycle-manager-catalog 4.16.14 True False False 5d9h
operator-lifecycle-manager-packageserver 4.16.14 True False False 5d9h
service-ca 4.16.14 True False False 5d9h
storage 4.16.14 True False False 5d9h
All cluster Operators should report True
in the AVAILABLE
column.
Check that all pods are healthy:
$ oc get po -A | grep -E -iv 'complete|running'
This should not return any pods.
You might see a few pods still moving after the update. Watch this for a while to make sure all pods are cleared. |