You can update, or upgrade, an OpenShift Container Platform cluster. If your cluster contains Red Hat Enterprise Linux (RHEL) machines, you must perform more steps to update those machines.
Have access to the cluster as a user with admin
privileges.
See Using RBAC to define and apply permissions.
Have a recent etcd backup in case your upgrade fails and you must restore your cluster to a previous state.
If you are upgrading to this release from OpenShift Container Platform 4.3.3 or earlier, you must restart all Pods after the upgrade is complete. This is because the service CA is automatically rotated as of OpenShift Container Platform 4.3.5. The service CA is rotated during the upgrade and a restart is required afterward to ensure that all services are using the new service CA before the previous service CA expires. After this one-time manual restart, subsequent upgrades and rotations will ensure restart before the service CA expires without requiring manual intervention.
The OpenShift Container Platform update service is the hosted service that provides over-the-air updates to both OpenShift Container Platform and Red Hat Enterprise Linux CoreOS (RHCOS). It provides a graph, or diagram, that contains vertices and the edges that connect them, of component Operators. The edges in the graph show which versions you can safely update to, and the vertices are update payloads that specify the intended state of the managed cluster components.
The Cluster Version Operator (CVO) in your cluster checks with the OpenShift Container Platform update service to see the valid updates and update paths based on current component versions and information in the graph. When you request an update, the OpenShift Container Platform CVO uses the release image for that update to upgrade your cluster. The release artifacts are hosted in Quay as container images.
To allow the OpenShift Container Platform update service to provide only compatible updates, a release verification pipeline exists to drive automation. Each release artifact is verified for compatibility with supported cloud platforms and system architectures as well as other component packages. After the pipeline confirms the suitability of a release, the OpenShift Container Platform update service notifies you that it is available.
Because the update service displays all valid updates, you must not force an update to a version that the update service does not display.
During continuous update mode, two controllers run. One continuously updates the payload manifests, applies them to the cluster, and outputs the status of the controlled rollout of the Operators, whether they are available, upgrading, or failed. The second controller polls the OpenShift Container Platform update service to determine if updates are available.
Reverting your cluster to a previous version, or a rollback, is not supported. Only upgrading to a newer version is supported.
During the upgrade process, the Machine Config Operator (MCO) applies the new configuration to your cluster machines. It cordons the number of nodes that is specified by the maxUnavailable
field on the machine configuration pool and marks them as unavailable. By default, this value is set to 1
. It then applies the new configuration and reboots the machine. If you use Red Hat Enterprise Linux (RHEL) machines as workers, the MCO does not update the kubelet on these machines because you must update the OpenShift API on them first. Because the specification for the new version is applied to the old kubelet, the RHEL machine cannot return to the Ready
state. You cannot complete the update until the machines are available. However, the maximum number of nodes that are unavailable is set to ensure that normal cluster operations are likely to continue with that number of machines out of service.
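The unavailability budget described above is controlled per machine config pool through the maxUnavailable field. As an illustrative sketch only (the field and pool name are real, but the value of 2 is a hypothetical tuning choice, not a recommendation), a worker pool that tolerates two simultaneously cordoned nodes would look like this:

```yaml
# Fragment of a MachineConfigPool definition. Raising maxUnavailable lets the
# MCO cordon and update more machines in parallel; the default is 1.
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  name: worker
spec:
  maxUnavailable: 2   # illustrative value, not a recommendation
```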
In OpenShift Container Platform 4.1, Red Hat introduced the concept of channels for
recommending the appropriate release versions for cluster upgrade. By controlling
the pace of upgrades, these upgrade channels allow you to choose an upgrade
strategy. Upgrade channels are tied to a minor version of
OpenShift Container Platform. For instance, OpenShift Container Platform 4.4
upgrade channels will never include an upgrade to a 4.5 release. This strategy ensures that
administrators explicitly decide to upgrade to the next minor version of
OpenShift Container Platform. Upgrade channels control only release selection and do not impact the version of the cluster that you install; the openshift-install
binary file for a specific version of OpenShift Container Platform always installs that version.
OpenShift Container Platform 4.4 offers the following upgrade channels:
candidate-4.4
fast-4.4
stable-4.4
The candidate-4.4
channel contains candidate builds for a z-stream
(4.4.z) release.
Release candidates contain all the features of the product but are not supported. Use release candidate versions to test feature acceptance and assist in qualifying the next version
of OpenShift Container Platform.
A release candidate is any build that is available in the candidate channel, including ones that do not contain -rc
in their names.
After a version is available in the candidate channel, it goes through more quality checks. If
it meets the quality standard, it is promoted to the fast-4.4
or stable-4.4
channels.
Because of this strategy, if a specific release is available in both the candidate-4.4
channel and in the fast-4.4
or stable-4.4
channels, it is a Red Hat supported version.
The candidate-4.4
channel can include release versions from which there are no recommended updates in any channel.
You can use the candidate-4.4
channel to upgrade from a previous minor version of
OpenShift Container Platform.
Release candidates differ from the nightly builds. Nightly builds are available for early access to features, but updating to or from nightly builds is neither recommended nor supported. Nightly builds are not available in any upgrade channel. You can reference the OpenShift Container Platform release statuses for more build information.
The fast-4.4
channel is updated with new 4.4
versions as soon as Red Hat declares the given version as a general availability
release. As such, these releases are fully supported, are production quality, and have
performed well while available as a release candidate in the candidate-4.4
channel from where they were promoted. Some time after a release appears in the
fast-4.4
channel, it is added to the stable-4.4
channel. Releases
never appear in the stable-4.4
channel before they appear in the fast-4.4
channel.
You can use the fast-4.4
channel to upgrade from a previous minor version of
OpenShift Container Platform.
While the fast-4.4
channel contains releases as soon
as their errata are published, releases are added to the stable-4.4
channel
after a delay. During this delay, data is collected from Red Hat SRE teams, Red Hat support services,
and pre-production and production environments that participate in connected customer program about the stability of the release.
You can use the stable-4.4
channel to upgrade from a previous minor version of
OpenShift Container Platform.
OpenShift Container Platform maintains an upgrade recommendation service that understands the
version of OpenShift Container Platform you have installed as well as the path to take within
the channel you choose to get you to the next release. You can imagine seeing the
following in the fast-4.4
channel:
4.4.0
4.4.1
4.4.3
4.4.4
The service recommends only upgrades that have been tested and have no serious issues. If your cluster is on 4.4.1 and OpenShift Container Platform suggests 4.4.4, then it is safe for you to update from 4.4.1 to 4.4.4. Do not rely on consecutive patch numbers. In this example, 4.4.2 is not, and never was, available in the channel. The update service will not suggest updating to a version of OpenShift Container Platform that contains known vulnerabilities.
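The channel contents above behave like edges in the update graph described earlier. The following offline sketch models that behavior with a hand-written miniature graph (the versions and edges here are illustrative data, not output from the real update service); a recommended update is simply any edge that starts at the current version:

```shell
# Hand-written miniature update graph for illustration only.
# Each line is "from->to"; note that no edge involves 4.4.2, mirroring
# a version that was never available in the channel.
edges="4.4.0->4.4.1
4.4.0->4.4.3
4.4.1->4.4.3
4.4.1->4.4.4
4.4.3->4.4.4"

current="4.4.1"
# Recommended targets are the destinations of edges leaving the current version.
recommended=$(printf '%s\n' "$edges" | awk -F'->' -v cur="$current" '$1 == cur { print $2 }')
echo "Recommended updates from $current:"
echo "$recommended"
```

Note that 4.4.3 and 4.4.4 are offered while 4.4.2 is not, even though it is numerically in between; this is why you should follow the recommendations rather than consecutive patch numbers.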
Update stability depends on your channel. The presence of an update recommendation in
the candidate-4.4
channel does not imply that the update is supported.
It means that no serious issues have been found with the update yet, but there might
not be significant traffic through the update to suggest stability. The presence of
an update recommendation in the fast-4.4
or stable-4.4
channels is a declaration that the update is fully supported while it is in the
channel. While releases will never be removed from a channel, update recommendations
that exhibit serious issues will be removed from all channels. Updates initiated
after the update recommendation has been removed might not be supported.
Red Hat will eventually provide supported update paths from any supported release in the fast-4.4
or stable-4.4
channels to the latest release in 4.4.z,
although there can be delays while safe paths away from troubled releases are
constructed and verified.
The fast-4.4
and stable-4.4
channels present a choice between receiving
general availability releases as soon as they are available or allowing Red Hat to
control the rollout of those updates. If issues are detected during rollout or at a
later time, upgrades to that version might be blocked in both the fast-4.4
and
stable-4.4
channels, and a new version might be introduced that becomes the new
preferred upgrade target.
Customers can improve this process by configuring pre-production systems on the
fast-4.4
channel, configuring production systems on the stable-4.4
channel,
and participating in Red Hat’s connected customer program. Red
Hat uses this program to observe the impact of updates on your specific hardware and software
configurations. Future releases might improve or alter the pace at which updates move
from the fast-4.4
to the stable-4.4
channel.
If you manage the container images for your OpenShift Container Platform clusters yourself, you must consult the Red Hat errata that is associated with product releases and note any comments that impact upgrades. During upgrade, the user interface might warn you about switching between these versions, so you must ensure that you selected an appropriate version before you bypass those warnings.
Your cluster is still supported if you change from the stable-4.4
channel to
the fast-4.4
channel. Although you can switch to the
candidate-4.4
channel at any time, some releases in that channel might be unsupported release candidates.
You can switch from the candidate-4.4
channel to the fast-4.4
channel
if your current release is a general availability release. You can always
switch from the fast-4.4
channel to the stable-4.4
channel, although if the current release was recently promoted to
fast-4.4
there can
be a delay of up to a day for the release to be promoted to
stable-4.4
. If you change to a channel that does not include your
current release, an alert displays and no updates can be recommended, but you can
safely change back to your original channel at any point.
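The channel is a field on the ClusterVersion resource, so switching channels can be done with oc patch. The following offline sketch only validates a target channel name against the naming convention described above and prints the command it would run; nothing here touches a cluster, and the target channel value is illustrative:

```shell
# Validate a target channel name against the <channel>-<minor> naming
# convention, then print (not run) the oc patch command that would switch
# the cluster to it. The target value is illustrative.
target="fast-4.4"

if printf '%s\n' "$target" | grep -Eq '^(candidate|fast|stable)-4\.[0-9]+$'; then
  cmd="oc patch clusterversion version --type merge -p '{\"spec\":{\"channel\":\"$target\"}}'"
  echo "$cmd"
else
  echo "unrecognized channel name: $target" >&2
  exit 1
fi
```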
If updates are available, you can update your cluster from the web console.
You can find information about available OpenShift Container Platform advisories and updates in the errata section of the Customer Portal.
Have access to the web console as a user with admin
privileges.
From the web console, click Administration > Cluster Settings and review the contents of the Overview tab.
For production clusters, ensure that the CHANNEL is set to the correct channel for
your current minor version,
such as stable-4.4
.
For production clusters, you must subscribe to a stable-* or fast-* channel.
If the UPDATE STATUS is not Updates Available, you cannot upgrade your cluster.
The DESIRED VERSION indicates the cluster version that your cluster is running or is updating to.
Click Updates Available, select
the highest available version
and click Update.
The UPDATE STATUS changes to Updating
, and you can review the progress of
the Operator upgrades on the Cluster Operators tab.
If you are upgrading to this release from OpenShift Container Platform 4.3.3 or earlier, you must restart all Pods after the upgrade is complete. You can do this using the following command, which requires the OpenShift CLI (oc
):
$ for I in $(oc get ns -o jsonpath='{range .items[*]} {.metadata.name}{"\n"} {end}'); \
      do oc delete pods --all -n $I; \
      sleep 1; \
      done
Restarting all Pods is required because the service CA is automatically rotated as of OpenShift Container Platform 4.3.5. The service CA is rotated during the upgrade and a restart is required afterward to ensure that all services are using the new service CA before the previous service CA expires. After this one-time manual restart, subsequent upgrades and rotations will ensure restart before the service CA expires without requiring manual intervention.
After the update completes and the Cluster Version Operator refreshes the available updates, check if more updates are available in your current channel.
If updates are available, continue to perform updates in the current channel until you can no longer update.
If no updates are available, change the CHANNEL to the stable-* or fast-* channel for the next minor version, and update to the version that you want in that channel.
You might need to perform several intermediate updates until you reach the version that you want.
When you update a cluster that contains Red Hat Enterprise Linux (RHEL) worker machines, those workers temporarily become unavailable during the update process. You must run the upgrade playbook against each RHEL machine as it enters the NotReady state for the cluster to finish updating.
You can use hooks to run Ansible tasks on the RHEL compute machines during the OpenShift Container Platform update.
When you update OpenShift Container Platform, you can run custom tasks on your Red Hat Enterprise Linux (RHEL) nodes during specific operations by using hooks. Hooks allow you to provide files that define tasks to run before or after specific update tasks. You can use hooks to validate or modify custom infrastructure when you update the RHEL compute nodes in your OpenShift Container Platform cluster.
Because the update operation fails when a hook fails, you must design hooks that are idempotent, meaning they can run multiple times and provide the same results.
Hooks have the following important limitations:
- Hooks do not have a defined or versioned interface. They can use internal
openshift-ansible
variables, but it is possible that the variables will be
modified or removed in future OpenShift Container Platform releases.
- Hooks do not have error handling, so an error in a hook halts the update
process. If you get an error, you must address the problem and then start the
upgrade again.
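A hook written with these limitations in mind favors idempotent Ansible modules. As an illustrative sketch only (the marker directory and file are hypothetical, not part of openshift-ansible), a task file that records the start of a node update could look like this; both modules report a change only when something actually differs, so re-running the hook after a failed update is safe:

```yaml
---
# Hypothetical idempotent hook tasks: safe to run more than once.
- name: ensure the maintenance marker directory exists
  file:
    path: /var/lib/update-markers        # hypothetical path
    state: directory
    mode: '0755'

- name: record which node entered an update window
  copy:
    dest: /var/lib/update-markers/last-update-host   # hypothetical file
    content: "{{ inventory_hostname }}\n"
```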
You define the hooks to use when you update the Red Hat Enterprise Linux (RHEL)
compute machines, which are also known as worker machines, in the hosts
inventory file under the all:vars
section.
You have access to the machine that you used to add the RHEL compute machines
to the cluster. You must have access to the hosts
Ansible inventory file that defines
your RHEL machines.
After you design the hook, create a YAML file that defines the Ansible tasks for it. This file must be a set of tasks and cannot be a playbook, as shown in the following example:
---
# Trivial example forcing an operator to acknowledge the start of an upgrade
# file=/home/user/openshift-ansible/hooks/pre_compute.yml
- name: note the start of a compute machine update
debug:
msg: "Compute machine upgrade of {{ inventory_hostname }} is about to start"
- name: require the user agree to start an upgrade
pause:
prompt: "Press Enter to start the compute machine update"
Modify the hosts
Ansible inventory file to specify the hook files. The
hook files are specified as parameter values in the [all:vars]
section,
as shown:
[all:vars]
openshift_node_pre_upgrade_hook=/home/user/openshift-ansible/hooks/pre_node.yml
openshift_node_post_upgrade_hook=/home/user/openshift-ansible/hooks/post_node.yml
To avoid ambiguity in the paths to the hooks, use absolute paths instead of relative paths in their definitions.
You can use the following hooks when you update the Red Hat Enterprise Linux (RHEL) compute machines in your OpenShift Container Platform cluster.
Hook name | Description
---|---
openshift_node_pre_cordon_hook | Runs before each node is cordoned. This hook runs against each node in serial.
openshift_node_pre_upgrade_hook | Runs after each node is cordoned but before it is updated. This hook runs against each node in serial.
openshift_node_pre_uncordon_hook | Runs after each node is updated but before it is uncordoned. This hook runs against each node in serial.
openshift_node_post_upgrade_hook | Runs after each node is uncordoned and is the last node update action. This hook runs against each node in serial.
After you update your cluster, you must update the Red Hat Enterprise Linux (RHEL) compute machines in your cluster.
You updated your cluster.
Because the RHEL machines require assets that are generated by the cluster to complete the update process, you must update the cluster before you update the RHEL compute machines in it.
You have access to the machine that you used to add the RHEL compute machines
to the cluster. You must have access to the hosts
Ansible inventory file that defines
your RHEL machines and the upgrade
playbook.
Stop and disable firewalld on the host:
# systemctl disable --now firewalld.service
You must not enable firewalld later. If you do, you cannot access OpenShift Container Platform logs on the worker.
Enable the repositories that are required for OpenShift Container Platform 4.4:
On the machine that you run the Ansible playbooks, update the required repositories:
# subscription-manager repos --disable=rhel-7-server-ansible-2.8-rpms \
    --disable=rhel-7-server-ose-4.3-rpms \
    --enable=rhel-7-server-ansible-2.9-rpms \
    --enable=rhel-7-server-ose-4.4-rpms
On the machine that you run the Ansible playbooks, update the required packages, including openshift-ansible
:
# yum update openshift-ansible openshift-clients
On each RHEL compute node, update the required repositories:
# subscription-manager repos --disable=rhel-7-server-ose-4.3-rpms \
    --enable=rhel-7-server-ose-4.4-rpms
Update a RHEL worker machine:
Review the current node status to determine which RHEL worker to update:
# oc get node
NAME                        STATUS                        ROLES    AGE    VERSION
mycluster-control-plane-0   Ready                         master   145m   v1.17.1
mycluster-control-plane-1   Ready                         master   145m   v1.17.1
mycluster-control-plane-2   Ready                         master   145m   v1.17.1
mycluster-rhel7-0           NotReady,SchedulingDisabled   worker   98m    v1.14.6+97c81d00e
mycluster-rhel7-1           Ready                         worker   98m    v1.14.6+97c81d00e
mycluster-rhel7-2           Ready                         worker   98m    v1.14.6+97c81d00e
mycluster-rhel7-3           Ready                         worker   98m    v1.14.6+97c81d00e
Note which machine has the NotReady,SchedulingDisabled
status.
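Picking that machine out of the listing can be scripted. The following offline sketch filters a saved copy of the oc get node output; the sample text is hard-coded here so the filter can be shown without a cluster, and with a live cluster you would pipe oc get node into the same awk filter:

```shell
# Sample 'oc get node' output, hard-coded for illustration.
nodes='NAME                        STATUS                        ROLES    AGE   VERSION
mycluster-control-plane-0   Ready                         master   145m  v1.17.1
mycluster-rhel7-0           NotReady,SchedulingDisabled   worker   98m   v1.14.6+97c81d00e
mycluster-rhel7-1           Ready                         worker   98m   v1.14.6+97c81d00e'

# Print the names of nodes whose STATUS column contains NotReady.
pending=$(printf '%s\n' "$nodes" | awk '$2 ~ /NotReady/ { print $1 }')
echo "$pending"
```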
Review your Ansible inventory file at /<path>/inventory/hosts
and update its contents so that only the machine with the NotReady,SchedulingDisabled
status is listed in the [workers]
section, as shown in the following example:
[all:vars]
ansible_user=root
#ansible_become=True
openshift_kubeconfig_path="~/.kube/config"

[workers]
mycluster-rhel7-0.example.com
Change to the openshift-ansible
directory and run the upgrade
playbook:
$ cd /usr/share/ansible/openshift-ansible
$ ansible-playbook -i /<path>/inventory/hosts playbooks/upgrade.yml (1)
1 | For <path>, specify the path to the Ansible inventory file that you created.
Follow the process in the previous step to update each RHEL worker machine in your cluster.
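The one-machine-at-a-time cycle can be sketched as a loop. This dry run only prints the playbook invocation for each worker (the hostnames are illustrative); remember that the real procedure also requires editing the [workers] section of the inventory before each run so that only the current machine is listed:

```shell
# Dry run: print, for each RHEL worker, the playbook invocation that the
# procedure above runs once the inventory lists only that machine.
# Hostnames are illustrative; nothing is executed against any host.
workers="mycluster-rhel7-0.example.com
mycluster-rhel7-1.example.com"

plan=""
for host in $workers; do
  plan="$plan== would update: $host ==
ansible-playbook -i /<path>/inventory/hosts playbooks/upgrade.yml
"
done
printf '%s' "$plan"
```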
After you update all of the workers, confirm that all of your cluster nodes have updated to the new version:
# oc get node
NAME                        STATUS   ROLES    AGE    VERSION
mycluster-control-plane-0   Ready    master   145m   v1.17.1
mycluster-control-plane-1   Ready    master   145m   v1.17.1
mycluster-control-plane-2   Ready    master   145m   v1.17.1
mycluster-rhel7-0           Ready    worker   98m    v1.17.1
mycluster-rhel7-1           Ready    worker   98m    v1.17.1
mycluster-rhel7-2           Ready    worker   98m    v1.17.1
mycluster-rhel7-3           Ready    worker   98m    v1.17.1
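This final check can also be automated. The following offline sketch counts the distinct values in the VERSION column of a saved listing (the sample text is hard-coded for illustration; with a live cluster you would pipe oc get node into the same filter):

```shell
# Sample post-update listing, hard-coded for illustration.
nodes='NAME                        STATUS   ROLES    AGE   VERSION
mycluster-control-plane-0   Ready    master   145m  v1.17.1
mycluster-rhel7-0           Ready    worker   98m   v1.17.1
mycluster-rhel7-1           Ready    worker   98m   v1.17.1'

# Skip the header row, collect the VERSION column, count distinct values.
distinct=$(printf '%s\n' "$nodes" | awk 'NR > 1 { print $5 }' | sort -u | wc -l)
if [ "$distinct" -eq 1 ]; then
  echo "all nodes on the same version"
else
  echo "version skew detected" >&2
fi
```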