OADP 1.4 release notes - OADP Application backup and restore | Backup and restore

The release notes for OpenShift API for Data Protection (OADP) describe new features and enhancements, deprecated features, product recommendations, known issues, and resolved issues.

For additional information about OADP, see OpenShift API for Data Protection (OADP) FAQs

OADP 1.4.9 release notes

The OpenShift API for Data Protection (OADP) 1.4.9 release notes list resolved issues.

Resolved issues

File system backups no longer create PodVolumeBackup CRs for excluded PVCs

Before this update, when performing a file system (FS) backup with defaultVolumesToFsBackup: true and explicitly excluding persistentvolumeclaims (PVCs) using includedResources, PodVolumeBackup resources were still created for these excluded PVCs. With this release, the backup logic was updated to correctly identify and skip the creation of PodVolumeBackup resources for persistentvolumeclaims types when they are explicitly excluded from the backup configuration. As a result, PodVolumeBackup CRs are no longer created for excluded PVCs.

OADP-3009

S3 storage uses proxy values with insecureSkipTLSVerify: "true"

Before this update, when running image registry backups to S3 storage in a proxy-required environment, setting insecureSkipTLSVerify: "true" caused the system to ignore the configured proxy, leading to backups hanging or failing. With this release, the backup logic has been updated to properly respect proxy settings. As a result, backups using backupImages: true now complete successfully regardless of whether insecureSkipTLSVerify is set to "true" or "false".

OADP-3143

DPA reconciliation fails with a clear error when a VSL references a missing credential key

Before this update, the DataProtectionApplication (DPA) did not validate that the VolumeSnapshotLocation (VSL) credential.key existed in the referenced secret, so a DPA could reconcile successfully with a wrong key name. This allowed misconfigured VSL credentials to pass reconciliation but later caused backups to fail validation with an error indicating that the secret was missing data for the specified key. With this release, DPA reconciliation checks that the credential.key exists in the referenced secret and fails with a clear error message when it does not exist.

OADP-4833

Restricted permissions for Velero cloud credentials

Before this update, the Velero /credentials/cloud secret was mounted with incorrect permissions, making it world-readable. As a consequence, any process or user with access to the container file system could read sensitive cloud credential data. With this release, the Velero secret default permissions were changed to 0640. As a result, access to the credentials file is limited to the intended owner or group.

OADP-5072

Improved error when imagestream backups run without oadp-<bsl_name>-<bsl_provider>-registry-secret

Before this update, when the backup and the Backup Storage Location (BSL) are managed outside the scope of the Data Protection Application (DPA), the DPA did not create the relevant oadp-<bsl_name>-<bsl_provider>-registry-secret. As a result, the OpenShift Velero plugin panicked on the imagestream backup with the following panic error:

024-02-27T10:46:50.028951744Z time="2024-02-27T10:46:50Z" level=error msg="Error backing up item"
backup=openshift-adp/<backup name> error="error executing custom action (groupResource=imagestreams.image.openshift.io,
namespace=<BSL Name>, name=postgres): rpc error: code = Aborted desc = plugin panicked:
runtime error: index out of range with length 1, stack trace: goroutine 94…

With this release, if the required secret is missing, the plugin returns an error message indicating missing oadp-<bsl_name>-<bsl_provider>-registry-secret.

OADP-5076

Single-node OpenShift clusters no longer crash due to premature CRD sync before API initialization

Before this update, the controller crashed during image-based upgrade (IBU) due to missing OpenShift Container Platform custom resource definitions (CRDs) before they were fully initialized. As a consequence, this failure delayed DataProtectionApplication (DPA) reconciliation during IBU upgrade by 8 minutes. This release resolves this issue by requiring the controller to wait for OpenShift Container Platform CRDs to load before starting in the IBU environment on single-node OpenShift, while also disabling leader election. This change shortens the DPA reconciliation window and improves the overall upgrade duration for single-node OpenShift clusters.

OADP-7419

OADP 1.4.9 fixes the following CVEs

OADP 1.4.8 release notes

OpenShift API for Data Protection (OADP) 1.4.8 is a Container Grade Only (CGO) release, which is released to refresh the health grades of the containers. No code was changed in the product itself compared to that of OADP 1.4.7. OADP 1.4.8 fixes several Common Vulnerabilities and Exposures (CVEs).

Resolved issues

OADP 1.4.8 fixes the following CVEs

OADP 1.4.7 release notes

OpenShift API for Data Protection (OADP) 1.4.7 is a Container Grade Only (CGO) release, which is released to refresh the health grades of the containers. No code was changed in the product itself compared to that of OADP 1.4.6.

OADP 1.4.6 release notes

OpenShift API for Data Protection (OADP) 1.4.6 is a Container Grade Only (CGO) release, which is released to refresh the health grades of the containers. No code was changed in the product itself compared to that of OADP 1.4.5.

OADP 1.4.5 release notes

The OpenShift API for Data Protection (OADP) 1.4.5 release notes lists new features and resolved issues.

New features

Collecting logs with the must-gather tool has been improved with a Markdown summary

You can collect logs and information about OpenShift API for Data Protection (OADP) custom resources by using the must-gather tool. The must-gather data must be attached to all customer cases. This tool generates a Markdown output file with the collected information, which is located in the clusters directory of the must-gather logs. (OADP-5904)

Resolved issues

OADP 1.4.5 fixes the following CVEs

OADP 1.4.4 release notes

OpenShift API for Data Protection (OADP) 1.4.4 is a Container Grade Only (CGO) release, which is released to refresh the health grades of the containers. No code was changed in the product itself compared to that of OADP 1.4.3.

Known issues

Issue with restoring stateful applications

When you restore a stateful application that uses the azurefile-csi storage class, the restore operation remains in the Finalizing phase. (OADP-5508)

OADP 1.4.3 release notes

The OpenShift API for Data Protection (OADP) 1.4.3 release notes lists the following new feature.

New features

Notable changes in the kubevirt velero plugin in version 0.7.1

With this release, the kubevirt velero plugin has been updated to version 0.7.1. Notable improvements include the following bug fix and new features:

Virtual machine instances (VMIs) are no longer ignored from backup when the owner VM is excluded.
Object graphs now include all extra objects during backup and restore operations.
Optionally generated labels are now added to new firmware Universally Unique Identifiers (UUIDs) during restore operations.
Switching VM run strategies during restore operations is now possible.
Clearing a MAC address by label is now supported.
The restore-specific checks during the backup operation are now skipped.
The VirtualMachineClusterInstancetype and VirtualMachineClusterPreference custom resource definitions (CRDs) are now supported.

OADP 1.4.2 release notes

The OpenShift API for Data Protection (OADP) 1.4.2 release notes lists new features, resolved issues and bugs, and known issues.

New features

Backing up different volumes in the same namespace by using the VolumePolicy feature is now possible

With this release, Velero provides resource policies to back up different volumes in the same namespace by using the VolumePolicy feature. The supported VolumePolicy feature to back up different volumes includes skip, snapshot, and fs-backup actions. OADP-1071

File system backup and data mover can now use short-term credentials

File system backup and data mover can now use short-term credentials such as AWS Security Token Service (STS) and Google Cloud WIF. With this support, backup is successfully completed without any PartiallyFailed status. OADP-5095

Resolved issues

DPA now reports errors if VSL contains an incorrect provider value

Previously, if the provider of a Volume Snapshot Location (VSL) spec was incorrect, the Data Protection Application (DPA) reconciled successfully. With this update, DPA reports errors and requests for a valid provider value. OADP-5044

Data Mover restore is successful irrespective of using different OADP namespaces for backup and restore

Previously, when backup operation was executed by using OADP installed in one namespace but was restored by using OADP installed in a different namespace, the Data Mover restore failed. With this update, Data Mover restore is now successful. OADP-5460

SSE-C backup works with the calculated MD5 of the secret key

Previously, backup failed with the following error:

Requests specifying Server Side Encryption with Customer provided keys must provide the client calculated MD5 of the secret key.

With this update, missing Server-Side Encryption with Customer-Provided Keys (SSE-C) base64 and MD5 hash are now fixed. As a result, SSE-C backup works with the calculated MD5 of the secret key. In addition, incorrect errorhandling for the customerKey size is also fixed. OADP-5388

For a complete list of all issues resolved in this release, see the list of OADP 1.4.2 resolved issues in Jira.

Known issues

The nodeSelector spec is not supported for the Data Mover restore action

When a Data Protection Application (DPA) is created with the nodeSelector field set in the nodeAgent parameter, Data Mover restore partially fails instead of completing the restore operation. OADP-5260

The S3 storage does not use proxy environment when TLS skip verify is specified

In the image registry backup, the S3 storage does not use the proxy environment when the insecureSkipTLSVerify parameter is set to true. OADP-3143

Kopia does not delete artifacts after backup expiration

Even after you delete a backup, Kopia does not delete the volume artifacts from the ${bucket_name}/kopia/${namespace} on the S3 location after backup expired. For more information, see "About Kopia repository maintenance". OADP-5131

Additional resources

About Kopia repository maintenance

OADP 1.4.1 release notes

The OpenShift API for Data Protection (OADP) 1.4.1 release notes lists new features, resolved issues and bugs, and known issues.

New features

New DPA fields to update client qps and burst

You can now change Velero Server Kubernetes API queries per second and burst values by using the new Data Protection Application (DPA) fields. The new DPA fields are spec.configuration.velero.client-qps and spec.configuration.velero.client-burst, which both default to 100. OADP-4076

Enabling non-default algorithms with Kopia

With this update, you can now configure the hash, encryption, and splitter algorithms in Kopia to select non-default options to optimize performance for different backup workloads.

To configure these algorithms, set the env variable of a velero pod in the podConfig section of the DataProtectionApplication (DPA) configuration. If this variable is not set, or an unsupported algorithm is chosen, Kopia will default to its standard algorithms. OADP-4640

Resolved issues

Restoring a backup without pods is now successful

Previously, restoring a backup without pods and having StorageClass VolumeBindingMode set as WaitForFirstConsumer, resulted in the PartiallyFailed status with an error: fail to patch dynamic PV, err: context deadline exceeded. With this update, patching dynamic PV is skipped and restoring a backup is successful without any PartiallyFailed status. OADP-4231

PodVolumeBackup CR now displays correct message

Previously, the PodVolumeBackup custom resource (CR) generated an incorrect message, which was: get a podvolumebackup with status "InProgress" during the server starting, mark it as "Failed". With this update, the message produced is now:

found a podvolumebackup with status "InProgress" during the server starting,
mark it as "Failed".

OADP-4224

Overriding imagePullPolicy is now possible with DPA

Previously, OADP set the imagePullPolicy parameter to Always for all images. With this update, OADP checks if each image contains sha256 or sha512 digest, then it sets imagePullPolicy to IfNotPresent; otherwise imagePullPolicy is set to Always. You can now override this policy by using the new spec.containerImagePullPolicy DPA field. OADP-4172

OADP Velero can now retry updating the restore status if initial update fails

Previously, OADP Velero failed to update the restored CR status. This left the status at InProgress indefinitely. Components which relied on the backup and restore CR status to determine the completion would fail. With this update, the restore CR status for a restore correctly proceeds to the Completed or Failed status. OADP-3227

Restoring BuildConfig Build from a different cluster is successful without any errors

Previously, when performing a restore of the BuildConfig Build resource from a different cluster, the application generated an error on TLS verification to the internal image registry. The resulting error was failed to verify certificate: x509: certificate signed by unknown authority error. With this update, the restore of the BuildConfig build resources to a different cluster can proceed successfully without generating the failed to verify certificate error. OADP-4692

Restoring an empty PVC is successful

Previously, downloading data failed while restoring an empty persistent volume claim (PVC). It failed with the following error:

data path restore failed: Failed to run kopia restore: Unable to load
    snapshot : snapshot not found

With this update, the downloading of data proceeds to correct conclusion when restoring an empty PVC and the error message is not generated. OADP-3106

There is no Velero memory leak in CSI and DataMover plugins

Previously, a Velero memory leak was caused by using the CSI and DataMover plugins. When the backup ended, the Velero plugin instance was not deleted and the memory leak consumed memory until an Out of Memory (OOM) condition was generated in the Velero pod. With this update, there is no resulting Velero memory leak when using the CSI and DataMover plugins. OADP-4448

Post-hook operation does not start before the related PVs are released

Previously, due to the asynchronous nature of the Data Mover operation, a post-hook might be attempted before the Data Mover persistent volume claim (PVC) releases the persistent volumes (PVs) of the related pods. This problem would cause the backup to fail with a PartiallyFailed status. With this update, the post-hook operation is not started until the related PVs are released by the Data Mover PVC, eliminating the PartiallyFailed backup status. OADP-3140

Deploying a DPA works as expected in namespaces with more than 37 characters

When you install the OADP Operator in a namespace with more than 37 characters to create a new DPA, labeling the "cloud-credentials" Secret fails and the DPA reports the following error:

The generated label name is too long.

With this update, creating a DPA does not fail in namespaces with more than 37 characters in the name. OADP-3960

Restore is successfully completed by overriding the timeout error

Previously, in a large scale environment, the restore operation would result in a Partiallyfailed status with the error: fail to patch dynamic PV, err: context deadline exceeded. With this update, the resourceTimeout Velero server argument is used to override this timeout error resulting in a successful restore. OADP-4344

For a complete list of all issues resolved in this release, see the list of OADP 1.4.1 resolved issues in Jira.

Known issues

Cassandra application pods enter into the CrashLoopBackoff status after restoring OADP

After OADP restores, the Cassandra application pods might enter CrashLoopBackoff status. To work around this problem, delete the StatefulSet pods that are returning the error CrashLoopBackoff state after restoring OADP. The StatefulSet controller then recreates these pods and it runs normally. OADP-4407

Deployment referencing ImageStream is not restored properly leading to corrupted pod and volume contents

During a File System Backup (FSB) restore operation, a Deployment resource referencing an ImageStream is not restored properly. The restored pod that runs the FSB, and the postHook is terminated prematurely.

During the restore operation, the OpenShift Container Platform controller updates the spec.template.spec.containers[0].image field in the Deployment resource with an updated ImageStreamTag hash. The update triggers the rollout of a new pod, terminating the pod on which velero runs the FSB along with the post-hook. For more information about image stream trigger, see Triggering updates on image stream changes.

The workaround for this behavior is a two-step restore process:

Perform a restore excluding the Deployment resources, for example:

$ velero restore create <RESTORE_NAME> \
  --from-backup <BACKUP_NAME> \
  --exclude-resources=deployment.apps

Once the first restore is successful, perform a second restore by including these resources, for example:

$ velero restore create <RESTORE_NAME> \
  --from-backup <BACKUP_NAME> \
  --include-resources=deployment.apps

OADP-3954

OADP 1.4.0 release notes

The OpenShift API for Data Protection (OADP) 1.4.0 release notes lists resolved issues and known issues.

Resolved issues

Restore works correctly in OKD 4.16

Previously, while restoring the deleted application namespace, the restore operation partially failed with the resource name may not be empty error in OKD 4.16. With this update, restore works as expected in OKD 4.16. OADP-4075

Data Mover backups work properly in the OKD 4.16 cluster

Previously, Velero was using the earlier version of SDK where the Spec.SourceVolumeMode field did not exist. As a consequence, Data Mover backups failed in the OKD 4.16 cluster on the external snapshotter with version 4.2. With this update, external snapshotter is upgraded to version 7.0 and later. As a result, backups do not fail in the OKD 4.16 cluster. OADP-3922

For a complete list of all issues resolved in this release, see the list of OADP 1.4.0 resolved issues in Jira.

Known issues

Backup fails when checksumAlgorithm is not set for MCG

While performing a backup of any application with Noobaa as the backup location, if the checksumAlgorithm configuration parameter is not set, backup fails. To fix this problem, if you do not provide a value for checksumAlgorithm in the Backup Storage Location (BSL) configuration, an empty value is added. The empty value is only added for BSLs that are created using Data Protection Application (DPA) custom resource (CR), and this value is not added if BSLs are created using any other method. OADP-4274

For a complete list of all known issues in this release, see the list of OADP 1.4.0 known issues in Jira.

Upgrade notes

Always upgrade to the next minor version. Do not skip versions. To update to a later version, upgrade only one channel at a time. For example, to upgrade from OpenShift API for Data Protection (OADP) 1.1 to 1.3, upgrade first to 1.2, and then to 1.3.

Changes from OADP 1.3 to 1.4

The Velero server has been updated from version 1.12 to 1.14. Note that there are no changes in the Data Protection Application (DPA).

This changes the following:

The velero-plugin-for-csi code is now available in the Velero code, which means an init container is no longer required for the plugin.
Velero changed client Burst and QPS defaults from 30 and 20 to 100 and 100, respectively.
The velero-plugin-for-aws plugin updated default value of the spec.config.checksumAlgorithm field in BackupStorageLocation objects (BSLs) from "" (no checksum calculation) to the CRC32 algorithm. The checksum algorithm types are known to work only with AWS. Several S3 providers require the md5sum to be disabled by setting the checksum algorithm to "". Confirm md5sum algorithm support and configuration with your storage provider.

In OADP 1.4, the default value for BSLs created within DPA for this configuration is "". This default value means that the md5sum is not checked, which is consistent with OADP 1.3. For BSLs created within DPA, update it by using the spec.backupLocations[].velero.config.checksumAlgorithm field in the DPA. If your BSLs are created outside DPA, you can update this configuration by using spec.config.checksumAlgorithm in the BSLs.

Backing up the DPA configuration

You must back up your current DataProtectionApplication (DPA) configuration.

Procedure

Save your current DPA configuration by running the following command:
Example command
```
$ oc get dpa -n openshift-adp -o yaml > dpa.orig.backup
```

Upgrading the OADP Operator

Use the following procedure when upgrading the OpenShift API for Data Protection (OADP) Operator.

Procedure

Change your subscription channel for the OADP Operator from stable-1.3 to stable-1.4.
Wait for the Operator and containers to update and restart.

Additional resources

Updating installed Operators

Converting DPA to the new version

To upgrade from OADP 1.3 to 1.4, no Data Protection Application (DPA) changes are required.

Verifying the upgrade

Use the following procedure to verify the upgrade.

Procedure

Verify the installation by viewing the OpenShift API for Data Protection (OADP) resources by running the following command:

$ oc get all -n openshift-adp

Example output

NAME                                                     READY   STATUS    RESTARTS   AGE
pod/oadp-operator-controller-manager-67d9494d47-6l8z8    2/2     Running   0          2m8s
pod/restic-9cq4q                                         1/1     Running   0          94s
pod/restic-m4lts                                         1/1     Running   0          94s
pod/restic-pv4kr                                         1/1     Running   0          95s
pod/velero-588db7f655-n842v                              1/1     Running   0          95s

NAME                                                       TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
service/oadp-operator-controller-manager-metrics-service   ClusterIP   172.30.70.140    <none>        8443/TCP   2m8s

NAME                    DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/restic   3         3         3       3            3           <none>          96s

NAME                                                READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/oadp-operator-controller-manager    1/1     1            1           2m9s
deployment.apps/velero                              1/1     1            1           96s

NAME                                                           DESIRED   CURRENT   READY   AGE
replicaset.apps/oadp-operator-controller-manager-67d9494d47    1         1         1       2m9s
replicaset.apps/velero-588db7f655                              1         1         1       96s

Verify that the DataProtectionApplication (DPA) is reconciled by running the following command:

$ oc get dpa dpa-sample -n openshift-adp -o jsonpath='{.status}'

Example output

{"conditions":[{"lastTransitionTime":"2023-10-27T01:23:57Z","message":"Reconcile complete","reason":"Complete","status":"True","type":"Reconciled"}]}

Verify the type is set to Reconciled.

Verify the backup storage location and confirm that the PHASE is Available by running the following command:

$ oc get backupstoragelocations.velero.io -n openshift-adp

Example output

NAME           PHASE       LAST VALIDATED   AGE     DEFAULT
dpa-sample-1   Available   1s               3d16h   true