Rolling back to the OpenShift SDN network plugin

As a cluster administrator, you can roll back to the OpenShift SDN network plugin from the OVN-Kubernetes network plugin by using either the offline migration method or the limited live migration method. You can perform a rollback only after the migration to the OVN-Kubernetes network plugin has successfully completed.

  • If you used the offline migration method to migrate to the OVN-Kubernetes network plugin from the OpenShift SDN network plugin, you should use the offline migration rollback method.

  • If you used the limited live migration method to migrate to the OVN-Kubernetes network plugin from the OpenShift SDN network plugin, you should use the limited live migration rollback method.

OpenShift SDN CNI is deprecated as of OKD 4.14. As of OKD 4.15, the network plugin is not an option for new installations. In a future release, the OpenShift SDN network plugin is planned to be removed and no longer supported. Red Hat will provide bug fixes and support for this feature until it is removed, but this feature will no longer receive enhancements. As an alternative to OpenShift SDN CNI, you can use OVN-Kubernetes CNI instead.

Using the offline migration method to roll back to the OpenShift SDN network plugin

Cluster administrators can roll back to the OpenShift SDN Container Network Interface (CNI) network plugin by using the offline migration method. During the migration you must manually reboot every node in your cluster. With the offline migration method, there is some downtime, during which your cluster is unreachable.

You must wait until the migration from OpenShift SDN to the OVN-Kubernetes network plugin has completed successfully before initiating a rollback.

If a rollback to OpenShift SDN is required, the following table describes the process.

Table 1. Performing a rollback to OpenShift SDN
User-initiated step: Suspend the MCO to ensure that it does not interrupt the migration.

Migration activity: The MCO stops.

User-initiated step: Set the migration field of the Network.operator.openshift.io custom resource (CR) named cluster to OpenShiftSDN. Make sure the migration field is null before setting it to a value.

Migration activity: The CNO updates the status of the Network.config.openshift.io CR named cluster accordingly.

User-initiated step: Update the networkType field.

Migration activity: The CNO performs the following actions:

  • Destroys the OVN-Kubernetes control plane pods.

  • Deploys the OpenShift SDN control plane pods.

  • Updates the Multus objects to reflect the new network plugin.

User-initiated step: Reboot each node in the cluster.

Migration activity: As nodes reboot, the cluster assigns IP addresses to pods on the OpenShift SDN network.

User-initiated step: Enable the MCO after all nodes in the cluster reboot.

Migration activity: The MCO rolls out an update to the systemd configuration necessary for OpenShift SDN. By default, the MCO updates a single machine per pool at a time, so the total time that the migration takes increases with the size of the cluster.

Prerequisites
  • The OpenShift CLI (oc) is installed.

  • Access to the cluster as a user with the cluster-admin role is available.

  • The cluster is installed on infrastructure configured with the OVN-Kubernetes network plugin.

  • A recent backup of the etcd database is available.

  • A manual reboot can be triggered for each node.

  • The cluster is in a known good state, without any errors.

Procedure
  1. Stop all of the machine configuration pools managed by the Machine Config Operator (MCO):

    • Stop the master configuration pool by entering the following command in your CLI:

      $ oc patch MachineConfigPool master --type='merge' --patch \
        '{ "spec": { "paused": true } }'
    • Stop the worker machine configuration pool by entering the following command in your CLI:

      $ oc patch MachineConfigPool worker --type='merge' --patch \
        '{ "spec":{ "paused": true } }'
  2. To prepare for the migration, set the migration field to null by entering the following command in your CLI:

    $ oc patch Network.operator.openshift.io cluster --type='merge' \
      --patch '{ "spec": { "migration": null } }'
  3. Check that the migration status is empty for the Network.config.openshift.io object by entering the following command in your CLI. Empty command output indicates that the object is not in a migration operation.

    $ oc get Network.config cluster -o jsonpath='{.status.migration}'
  4. Apply the patch to the Network.operator.openshift.io object to set the network plugin back to OpenShift SDN by entering the following command in your CLI:

    $ oc patch Network.operator.openshift.io cluster --type='merge' \
      --patch '{ "spec": { "migration": { "networkType": "OpenShiftSDN" } } }'

    If you applied the patch to the Network.config.openshift.io object before the patch operation finalized on the Network.operator.openshift.io object, the Cluster Network Operator (CNO) enters a degraded state, which causes a slight delay until the CNO recovers.
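
    To check whether the CNO has entered a degraded state, you can inspect the Degraded condition of the network cluster Operator. This is an optional check that uses the standard clusteroperator API, not a step from the documented procedure:

    $ oc get clusteroperator network -o jsonpath='{.status.conditions[?(@.type=="Degraded")].status}{"\n"}'

    The output is False while the CNO is healthy.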

  5. Confirm that the migration status of the network plugin for the Network.config.openshift.io cluster object is OpenShiftSDN by entering the following command in your CLI:

    $ oc get Network.config cluster -o jsonpath='{.status.migration.networkType}'
  6. Apply the patch to the Network.config.openshift.io object to set the network plugin back to OpenShift SDN by entering the following command in your CLI:

    $ oc patch Network.config.openshift.io cluster --type='merge' \
      --patch '{ "spec": { "networkType": "OpenShiftSDN" } }'
  7. Optional: Disable automatic migration of several OVN-Kubernetes capabilities to the OpenShift SDN equivalents:

    • Egress IPs

    • Egress firewall

    • Multicast

    To disable automatic migration of the configuration for any of the previously noted OpenShift SDN features, specify the following keys:

    $ oc patch Network.operator.openshift.io cluster --type='merge' \
      --patch '{
        "spec": {
          "migration": {
            "networkType": "OpenShiftSDN",
            "features": {
              "egressIP": <bool>,
              "egressFirewall": <bool>,
              "multicast": <bool>
            }
          }
        }
      }'

    where:

    bool: Specifies whether to enable migration of the feature. The default is true.
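
    For example, the following patch disables automatic migration of the egress IP configuration only, while the egress firewall and multicast configurations are still converted. The boolean values shown are illustrative:

    $ oc patch Network.operator.openshift.io cluster --type='merge' \
      --patch '{
        "spec": {
          "migration": {
            "networkType": "OpenShiftSDN",
            "features": {
              "egressIP": false,
              "egressFirewall": true,
              "multicast": true
            }
          }
        }
      }'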

  8. Optional: You can customize the following settings for OpenShift SDN to meet your network infrastructure requirements:

    • Maximum transmission unit (MTU)

    • VXLAN port

    To customize either or both of the previously noted settings, customize and enter the following command in your CLI. If you do not need to change the default value, omit the key from the patch.

    $ oc patch Network.operator.openshift.io cluster --type=merge \
      --patch '{
        "spec":{
          "defaultNetwork":{
            "openshiftSDNConfig":{
              "mtu":<mtu>,
              "vxlanPort":<port>
        }}}}'
    mtu

    The MTU for the VXLAN overlay network. This value is normally configured automatically, but if the nodes in your cluster do not all use the same MTU, then you must set this explicitly to 50 less than the smallest node MTU value.

    port

    The UDP port for the VXLAN overlay network. If a value is not specified, the default is 4789. The port cannot be the same as the Geneve port that is used by OVN-Kubernetes. The default value for the Geneve port is 6081.

    Example patch command
    $ oc patch Network.operator.openshift.io cluster --type=merge \
      --patch '{
        "spec":{
          "defaultNetwork":{
            "openshiftSDNConfig":{
              "mtu":1200
        }}}}'
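
    If you also need to change the VXLAN port, include the vxlanPort key in the same patch. The port value below is illustrative; choose any UDP port that does not conflict with the Geneve port:

    $ oc patch Network.operator.openshift.io cluster --type=merge \
      --patch '{
        "spec":{
          "defaultNetwork":{
            "openshiftSDNConfig":{
              "mtu":1200,
              "vxlanPort":4790
        }}}}'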
  9. Reboot each node in your cluster. You can reboot the nodes in your cluster with either of the following approaches:

    • With the oc rsh command, you can use a bash script similar to the following:

      #!/bin/bash
      # List each machine-config-daemon pod together with the node it runs on.
      readarray -t POD_NODES <<< "$(oc get pod -n openshift-machine-config-operator -o wide | grep daemon | awk '{print $1" "$7}')"
      
      for i in "${POD_NODES[@]}"
      do
        read -r POD NODE <<< "$i"
        # Schedule a reboot of the node in one minute through its machine-config-daemon pod.
        until oc rsh -n openshift-machine-config-operator "$POD" chroot /rootfs shutdown -r +1
          do
            echo "cannot reboot node $NODE, retry" && sleep 3
          done
      done
    • With the ssh command, you can use a bash script similar to the following. The script assumes that you have configured sudo to not prompt for a password.

      #!/bin/bash
      
      # Trigger a reboot on every node by connecting over SSH to its internal IP address.
      for ip in $(oc get nodes -o jsonpath='{.items[*].status.addresses[?(@.type=="InternalIP")].address}')
      do
         echo "reboot node $ip"
         ssh -o StrictHostKeyChecking=no core@$ip sudo shutdown -r -t 3
      done
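
    After triggering the reboots with either script, you can watch the nodes cycle through the NotReady state and return to Ready. This is an optional check:

    $ oc get nodes -w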
  10. Wait until the Multus daemon set rollout completes. Run the following command to see your rollout status:

    $ oc -n openshift-multus rollout status daemonset/multus

    The names of the Multus pods are in the form multus-<xxxxx>, where <xxxxx> is a random sequence of letters. It might take several moments for the pods to restart.

    Example output
    Waiting for daemon set "multus" rollout to finish: 1 out of 6 new pods have been updated...
    ...
    Waiting for daemon set "multus" rollout to finish: 5 of 6 updated pods are available...
    daemon set "multus" successfully rolled out
  11. After the nodes in your cluster have rebooted and the Multus pods have rolled out, start all of the machine configuration pools by running the following commands:

    • Start the master configuration pool:

      $ oc patch MachineConfigPool master --type='merge' --patch \
        '{ "spec": { "paused": false } }'
    • Start the worker configuration pool:

      $ oc patch MachineConfigPool worker --type='merge' --patch \
        '{ "spec": { "paused": false } }'

    As the MCO updates machines in each config pool, it reboots each node.

    By default the MCO updates a single machine per pool at a time, so the time that the migration requires to complete grows with the size of the cluster.
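
    To follow the rollout across the pools, you can watch the machine config pool status until the UPDATED column reports True for every pool. This is an optional check:

    $ watch oc get MachineConfigPool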

  12. Confirm the status of the new machine configuration on the hosts:

    1. To list the machine configuration state and the name of the applied machine configuration, enter the following command in your CLI:

      $ oc describe node | egrep "hostname|machineconfig"
      Example output
      kubernetes.io/hostname=master-0
      machineconfiguration.openshift.io/currentConfig: rendered-master-c53e221d9d24e1c8bb6ee89dd3d8ad7b
      machineconfiguration.openshift.io/desiredConfig: rendered-master-c53e221d9d24e1c8bb6ee89dd3d8ad7b
      machineconfiguration.openshift.io/reason:
      machineconfiguration.openshift.io/state: Done

      Verify that the following statements are true:

      • The value of the machineconfiguration.openshift.io/state field is Done.

      • The value of the machineconfiguration.openshift.io/currentConfig field is equal to the value of the machineconfiguration.openshift.io/desiredConfig field.

    2. To confirm that the machine config is correct, enter the following command in your CLI:

      $ oc get machineconfig <config_name> -o yaml

      where <config_name> is the name of the machine config from the machineconfiguration.openshift.io/currentConfig field.

  13. Confirm that the migration succeeded:

    1. To confirm that the network plugin is OpenShift SDN, enter the following command in your CLI. The value of status.networkType must be OpenShiftSDN.

      $ oc get Network.config/cluster -o jsonpath='{.status.networkType}{"\n"}'
    2. To confirm that the cluster nodes are in the Ready state, enter the following command in your CLI:

      $ oc get nodes
    3. If a node is stuck in the NotReady state, investigate the machine config daemon pod logs and resolve any errors.

      1. To list the pods, enter the following command in your CLI:

        $ oc get pod -n openshift-machine-config-operator
        Example output
        NAME                                         READY   STATUS    RESTARTS   AGE
        machine-config-controller-75f756f89d-sjp8b   1/1     Running   0          37m
        machine-config-daemon-5cf4b                  2/2     Running   0          43h
        machine-config-daemon-7wzcd                  2/2     Running   0          43h
        machine-config-daemon-fc946                  2/2     Running   0          43h
        machine-config-daemon-g2v28                  2/2     Running   0          43h
        machine-config-daemon-gcl4f                  2/2     Running   0          43h
        machine-config-daemon-l5tnv                  2/2     Running   0          43h
        machine-config-operator-79d9c55d5-hth92      1/1     Running   0          37m
        machine-config-server-bsc8h                  1/1     Running   0          43h
        machine-config-server-hklrm                  1/1     Running   0          43h
        machine-config-server-k9rtx                  1/1     Running   0          43h

        The names for the config daemon pods are in the following format: machine-config-daemon-<seq>. The <seq> value is a random five-character alphanumeric sequence.

      2. To display the pod log for each machine config daemon pod shown in the previous output, enter the following command in your CLI:

        $ oc logs <pod> -n openshift-machine-config-operator

        where <pod> is the name of a machine config daemon pod.
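
        To collect the logs from every machine config daemon pod in one pass, you can use a loop such as the following. This is a convenience sketch; it assumes that the daemon pods carry the k8s-app=machine-config-daemon label and that the main container is named machine-config-daemon:

        #!/bin/bash
        # Print the logs of each machine config daemon pod, prefixed with the pod name.
        for pod in $(oc get pods -n openshift-machine-config-operator \
            -l k8s-app=machine-config-daemon -o name)
        do
          echo "=== ${pod} ==="
          oc logs -n openshift-machine-config-operator -c machine-config-daemon "${pod}"
        done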

      3. Resolve any errors in the logs shown by the output from the previous command.

    4. To confirm that your pods are not in an error state, enter the following command in your CLI:

      $ oc get pods --all-namespaces -o wide --sort-by='{.spec.nodeName}'

      If pods on a node are in an error state, reboot that node.
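
      To narrow the output to pods that are not running or completed, you can apply a field selector. This is an optional convenience, not part of the documented procedure:

      $ oc get pods --all-namespaces --field-selector=status.phase!=Running,status.phase!=Succeeded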

  14. Complete the following steps only if the migration succeeds and your cluster is in a good state:

    1. To remove the migration configuration from the Cluster Network Operator configuration object, enter the following command in your CLI:

      $ oc patch Network.operator.openshift.io cluster --type='merge' \
        --patch '{ "spec": { "migration": null } }'
    2. To remove the OVN-Kubernetes configuration, enter the following command in your CLI:

      $ oc patch Network.operator.openshift.io cluster --type='merge' \
        --patch '{ "spec": { "defaultNetwork": { "ovnKubernetesConfig":null } } }'
    3. To remove the OVN-Kubernetes network provider namespace, enter the following command in your CLI:

      $ oc delete namespace openshift-ovn-kubernetes

Using the limited live migration method to roll back to the OpenShift SDN network plugin

As a cluster administrator, you can roll back to the OpenShift SDN Container Network Interface (CNI) network plugin by using the limited live migration method. During the migration with this method, nodes are automatically rebooted and service to the cluster is not interrupted.

You must wait until the migration from OpenShift SDN to the OVN-Kubernetes network plugin has completed successfully before initiating a rollback.

If a rollback to OpenShift SDN is required, the following table describes the process.

Table 2. Performing a rollback to OpenShift SDN
User-initiated step: Patch the cluster-level networking configuration by changing the networkType from OVNKubernetes to OpenShiftSDN.

Migration activity: The Cluster Network Operator (CNO) performs the following actions:

  • Sets migration-related fields in the network.operator custom resource (CR) and waits for routable MTUs to be applied to all nodes by the Machine Config Operator (MCO).

  • Patches the network.operator CR to set the migration mode to Live for OpenShiftSDN and deploys the OpenShift SDN network plugin in migration mode.

  • Deploys OVN-Kubernetes with hybrid overlay enabled.

  • Waits for both CNI plugins to be deployed and updates the conditions in the status of the network.config CR.

  • Triggers the MCO to apply the new machine config to each machine config pool, which includes node cordoning, draining, and rebooting.

  • Removes migration-related fields from the network.operator CR and performs cleanup actions, such as deleting OVN-Kubernetes resources and redeploying OpenShift SDN in normal mode with the necessary configurations.

  • Waits for the OpenShift SDN redeployment and updates the status conditions in the network.config CR to indicate migration completion.

Prerequisites
  • The OpenShift CLI (oc) is installed.

  • Access to the cluster as a user with the cluster-admin role is available.

  • The cluster is installed on infrastructure configured with the OVN-Kubernetes network plugin.

  • A recent backup of the etcd database is available.

  • A manual reboot can be triggered for each node.

  • The cluster is in a known good state, without any errors.

Procedure
  1. To initiate the rollback to OpenShift SDN, enter the following command:

    $ oc patch Network.config.openshift.io cluster --type='merge' --patch '{"metadata":{"annotations":{"network.openshift.io/network-type-migration":""}},"spec":{"networkType":"OpenShiftSDN"}}'
  2. To watch the progress of your migration, enter the following command:

    $ watch -n1 'oc get network.config/cluster -o json | jq ".status.conditions[]|\"\\(.type) \\(.status) \\(.reason) \\(.message)\""  -r | column --table --table-columns NAME,STATUS,REASON,MESSAGE --table-columns-limit 4; echo; oc get mcp -o wide; echo; oc get node -o "custom-columns=NAME:metadata.name,STATE:metadata.annotations.machineconfiguration\\.openshift\\.io/state,DESIRED:metadata.annotations.machineconfiguration\\.openshift\\.io/desiredConfig,CURRENT:metadata.annotations.machineconfiguration\\.openshift\\.io/currentConfig,REASON:metadata.annotations.machineconfiguration\\.openshift\\.io/reason"'

    The command prints the following information every second:

    • The conditions on the status of the network.config.openshift.io/cluster object, reporting the progress of the migration.

    • The status of different nodes with respect to the machine-config-operator resource, including whether they are upgrading or have been upgraded, as well as their current and desired configurations.
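
    If you prefer a one-off snapshot instead of a continuously refreshing view, you can print just the condition types and statuses. This is an optional alternative:

    $ oc get network.config cluster -o jsonpath='{range .status.conditions[*]}{.type}{"\t"}{.status}{"\n"}{end}'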

  3. Complete the following steps only if the migration succeeds and your cluster is in a good state:

    1. Remove the network.openshift.io/network-type-migration annotation from the network.config custom resource by entering the following command:

      $ oc annotate network.config cluster network.openshift.io/network-type-migration-
    2. Remove the OVN-Kubernetes network provider namespace by entering the following command:

      $ oc delete namespace openshift-ovn-kubernetes
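
      As a final check, you can confirm that the cluster reports OpenShift SDN as the active network plugin, in the same way as for the offline migration method:

      $ oc get Network.config/cluster -o jsonpath='{.status.networkType}{"\n"}'

      The value must be OpenShiftSDN.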