This is a cache of https://docs.okd.io/latest/hosted_control_planes/hcp_high_availability/hcp-backup-restore-virt.html. It is a snapshot of the page at 2024-11-26T18:58:55.783+0000.
Backing up and restoring a hosted cluster on OpenShift Virtualization - High availability for hosted control planes | Hosted control planes | OKD 4
×

You can back up and restore a hosted cluster on OKD Virtualization to fix failures.

Backing up a hosted cluster on OKD Virtualization

When you back up a hosted cluster on OKD Virtualization, the hosted cluster can remain running. The backup contains the hosted control plane components and the etcd for the hosted cluster.

When the hosted cluster is not running compute nodes on external infrastructure, hosted cluster workload data that is stored in persistent volume claims (PVCs) that are provisioned by KubeVirt CSI are also backed up. The backup does not contain any KubeVirt virtual machines (VMs) that are used as compute nodes. Those VMs are automatically re-created after the restore process is completed.

Procedure
  1. Create a Velero backup resource by creating a YAML file that is similar to the following example:

    apiVersion: velero.io/v1
    kind: Backup
    metadata:
      name: hc-clusters-hosted-backup
      namespace: openshift-adp
      labels:
        velero.io/storage-location: default
    spec:
      includedNamespaces: (1)
      - clusters
      - clusters-hosted
      includedResources:
      - sa
      - role
      - rolebinding
      - deployment
      - statefulset
      - pv
      - pvc
      - bmh
      - configmap
      - infraenv
      - priorityclasses
      - pdb
      - hostedcluster
      - nodepool
      - secrets
      - hostedcontrolplane
      - cluster
      - datavolume
      - service
      - route
      excludedResources: [ ]
      labelSelector: (2)
        matchExpressions:
        - key: 'hypershift.openshift.io/is-kubevirt-rhcos'
          operator: 'DoesNotExist'
      storageLocation: default
      preserveNodePorts: true
      ttl: 4h0m0s
      snapshotMoveData: true (3)
      datamover: "velero" (4)
      defaultVolumesToFsBackup: false (5)
    1 This field selects the namespaces from the objects to back up. Include namespaces from both the hosted cluster and the hosted control plane. In this example, clusters is a namespace from the hosted cluster and clusters-hosted is a namespace from the hosted control plane. By default, the HostedControlPlane namespace is clusters-<hosted_cluster_name>.
    2 The boot image of the VMs that are used as the hosted cluster nodes are stored in large PVCs. To reduce backup time and storage size, you can filter those PVCs out of the backup by adding this label selector.
    3 This field and the datamover field enable automatically uploading the CSI VolumeSnapshots to remote cloud storage.
    4 This field and the snapshotMoveData field enable automatically uploading the CSI VolumeSnapshots to remote cloud storage.
    5 This field indicates whether pod volume file system backup is used for all volumes by default. Set this value to false to back up the PVCs that you want.
  2. Apply the changes to the YAML file by entering the following command:

    $ oc apply -f <backup_file_name>.yaml

    Replace <backup_file_name> with the name of your file.

  3. Monitor the backup process in the backup object status and in the Velero logs.

    • To monitor the backup object status, enter the following command:

      $ watch "oc get backup -n openshift-adp <backup_file_name> -o jsonpath='{.status}' | jq"
    • To monitor the Velero logs, enter the following command:

      $ oc logs -n openshift-adp -ldeploy=velero -f
Verification
  • When the status.phase field is Completed, the backup process is considered complete.

Restoring a hosted cluster on OKD Virtualization

After you back up a hosted cluster on OKD Virtualization, you can restore the backup.

The restore process can be completed only on the same management cluster where you created the backup.

Procedure
  1. Ensure that no pods or persistent volume claims (PVCs) are running in the HostedControlPlane namespace.

  2. Delete the following objects from the management cluster:

    • HostedCluster

    • NodePool

    • PVCs

  3. Create a restoration manifest YAML file that is similar to the following example:

    apiVersion: velero.io/v1
    kind: Restore
    metadata:
      name: hc-clusters-hosted-restore
      namespace: openshift-adp
    spec:
      backupName: hc-clusters-hosted-backup
      restorePVs: true (1)
      existingResourcePolicy: update (2)
      excludedResources:
      - nodes
      - events
      - events.events.k8s.io
      - backups.velero.io
      - restores.velero.io
      - resticrepositories.velero.io
    1 This field starts the recovery of pods with the included persistent volumes.
    2 Setting existingResourcePolicy to update ensures that any existing objects are overwritten with backup content. This action can cause issues with objects that contain immutable fields, which is why you deleted the HostedCluster, node pools, and PVCs. If you do not set this policy, the Velero engine skips the restoration of objects that already exist.
  4. Apply the changes to the YAML file by entering the following command:

    $ oc apply -f <restore_resource_file_name>.yaml

    Replace <restore_resource_file_name> with the name of your file.

  5. Monitor the restore process by checking the restore status field and the Velero logs.

    • To check the restore status field, enter the following command:

      $ watch "oc get restore -n openshift-adp <backup_file_name> -o jsonpath='{.status}' | jq"
    • To check the Velero logs, enter the following command:

      $ oc logs -n openshift-adp -ldeploy=velero -f
Verification
  • When the status.phase field is Completed, the restore process is considered complete.

Next steps
  • After some time, the KubeVirt VMs are created and join the hosted cluster as compute nodes. Make sure that the hosted cluster workloads are running again as expected.