The release notes for OpenShift API for Data Protection (OADP) describe new features and enhancements, deprecated features, product recommendations, known issues, and resolved issues.
For additional information about OADP, see OpenShift API for Data Protection (OADP) FAQs.
The OpenShift API for Data Protection (OADP) 1.5.1 release notes list new features, resolved issues, known issues, and deprecated features.
CloudStorage API is fully supported
The CloudStorage API feature, available as a Technology Preview before this update, is fully supported from OADP 1.5.1. The CloudStorage API automates the creation of a bucket for object storage.
New DataProtectionTest custom resource is available
The DataProtectionTest (DPT) is a custom resource (CR) that provides a framework to validate your OADP configuration. The DPT CR checks and reports information for the following parameters:
The upload performance of the backups to the object storage.
The Container Storage Interface (CSI) snapshot readiness for persistent volume claims.
The storage bucket configuration, such as encryption and versioning.
Using this information in the DPT CR, you can ensure that your data protection environment is properly configured and performing according to the set configuration.
Note that you must configure STORAGE_ACCOUNT_ID when using DPT with OADP on Azure.
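The following DataProtectionTest sketch shows how such a check might be defined. The backup location, PVC, and snapshot class names are illustrative assumptions; verify the field names against the DPT CRD installed in your cluster:
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionTest
metadata:
  name: dpt-sample
  namespace: openshift-adp
spec:
  backupLocationName: dpa-sample-1
  uploadSpeedTestConfig:
    fileSize: 50MB
    timeout: 120s
  csiVolumeSnapshotTestConfigs:
    - snapshotClassName: csi-snapclass
      timeout: 120s
      volumeSnapshotSource:
        persistentVolumeClaimName: mysql-pvc
        persistentVolumeClaimNamespace: mysql-persistent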
New node agent load affinity configurations are available
Node agent load affinity: You can schedule the node agent pods on specific nodes by using the spec.podConfig.nodeSelector object of the DataProtectionApplication (DPA) custom resource (CR). You can add more restrictions on node agent pod scheduling by using the nodeAgent.loadAffinity object in the DPA spec.
Repository maintenance job affinity configurations: You can use the repository maintenance job affinity configurations in the DataProtectionApplication (DPA) custom resource (CR) only if you use Kopia as the backup repository. You can configure the load affinity at the global level, affecting all repositories, or for each repository. You can also use a combination of global and per-repository configuration.
Velero load affinity: You can use the podConfig.nodeSelector object to assign the Velero pod to specific nodes. You can also configure the velero.loadAffinity object for pod-level affinity and anti-affinity.
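The following DPA fragment is a minimal sketch of these affinity options; the node labels and host names are illustrative assumptions:
spec:
  configuration:
    nodeAgent:
      enable: true
      uploaderType: kopia
      podConfig:
        nodeSelector:
          node-role.kubernetes.io/worker: ""
      loadAffinity:
        - nodeSelector:
            matchExpressions:
              - key: kubernetes.io/hostname
                operator: In
                values:
                  - worker-1
                  - worker-2
    velero:
      podConfig:
        nodeSelector:
          node-role.kubernetes.io/worker: ""
      loadAffinity:
        - nodeSelector:
            matchLabels:
              kubernetes.io/os: linux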
Node agent load concurrency is available
With this update, you can control the maximum number of node agent operations that can run simultaneously on each node within the cluster. This enables better resource management and optimizes backup and restore workflows for improved performance.
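As a sketch, the concurrency limits can be set globally or per node in the nodeAgent section of the DPA; the structure mirrors the Velero load concurrency configuration, and the values and node label are illustrative assumptions:
spec:
  configuration:
    nodeAgent:
      enable: true
      uploaderType: kopia
      loadConcurrency:
        globalConfig: 2
        perNodeConfig:
          - nodeSelector:
              matchLabels:
                kubernetes.io/hostname: worker-1
            number: 4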
DataProtectionApplicationSpec overflowed annotation limit, causing potential misconfiguration in deployments
Before this update, the DataProtectionApplicationSpec used the deprecated PodAnnotations field, which led to an annotation limit overflow and caused potential misconfigurations in deployments. In this release, PodConfig is used for annotations in pods deployed by the Operator, ensuring consistent annotations and improved manageability for end users. As a result, deployments are more reliable and easier to manage.
Root file system for OADP controller manager is now read-only
Before this update, the manager container of the openshift-adp-controller-manager-* pod was configured to run with a writable root file system. As a consequence, this could allow for tampering with the container’s file system or the writing of foreign executables. With this release, the container’s security context has been updated to set the root file system to read-only while ensuring necessary functions that require write access, such as the Kopia cache, continue to operate correctly. As a result, the container is hardened against potential threats.
nonAdmin.enable: false in multiple DPAs no longer causes reconcile issues
Before this update, when a user attempted to create a second non-admin DataProtectionApplication (DPA) on a cluster where one already existed, the new DPA failed to reconcile. With this release, the restriction of one Non-Admin Controller installation per cluster has been removed. As a result, users can install multiple Non-Admin Controllers across the cluster without encountering errors.
OADP supports self-signed certificates
Before this update, using a self-signed certificate for backup images with a storage provider such as MinIO resulted in an x509: certificate signed by unknown authority error during the backup process. With this release, certificate validation has been updated to support self-signed certificates in OADP, ensuring successful backups.
velero describe includes defaultVolumesToFsBackup
Before this update, the velero describe command output omitted the defaultVolumesToFsBackup flag, which affected the visibility of backup configuration details. With this release, the velero describe output includes the defaultVolumesToFsBackup flag information, improving the visibility of backup settings.
DPT CR no longer fails when s3Url is secured
Before this update, the DataProtectionTest (DPT) CR failed to run when s3Url was secured, because the DPT CR lacked the ability to skip certificate verification or to add a caCert in the spec field. As a consequence, data uploads failed due to an unverified certificate. With this release, the DPT CR accepts a caCert or skips certificate verification in the spec field, resolving SSL verification errors. As a result, DPT no longer fails when using a secured s3Url.
Adding a backupLocation to DPA with an existing backupLocation name is now rejected
Before this update, adding a second backupLocation with the same name in the DataProtectionApplication (DPA) caused OADP to enter an invalid state because Velero could not read the Secret credentials. As a consequence, Backup and Restore operations failed. With this release, duplicate backupLocation names in the DPA are no longer allowed and are rejected, preventing Backup and Restore failures and ensuring seamless data protection.
The restore fails for backups created on OpenStack using the Cinder CSI driver
When you start a restore operation for a backup that was created on an OpenStack platform using the Cinder Container Storage Interface (CSI) driver, the initial backup only succeeds after the source application is manually scaled down. The restore job fails, preventing you from successfully recovering your application’s data and state from the backup. No known workaround exists.
Datamover pods scheduled on unexpected nodes during backup if the nodeAgent.loadAffinity parameter has many elements
Due to an issue in Velero 1.14 and later, the OADP node-agent only processes the first nodeSelector element within the loadAffinity array. As a consequence, if you define multiple nodeSelector objects, all objects except the first are ignored, potentially causing datamover pods to be scheduled on unexpected nodes during a backup.
To work around this problem, consolidate all required matchExpressions from multiple nodeSelector objects into the first nodeSelector object. As a result, all node affinity rules are correctly applied, ensuring datamover pods are scheduled to the appropriate nodes.
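For example, instead of defining two nodeSelector entries, merge their matchExpressions into the first entry; the keys shown here are illustrative assumptions:
spec:
  configuration:
    nodeAgent:
      loadAffinity:
        - nodeSelector:
            matchExpressions:
              - key: kubernetes.io/hostname
                operator: In
                values:
                  - worker-1
              - key: node-role.kubernetes.io/infra
                operator: DoesNotExist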
OADP Backup fails when using CA certificates with aliased command
The CA certificate is not stored as a file on the running Velero container. As a consequence, the user experience is degraded due to the missing caCert in the Velero container, which requires manual setup and downloads.
To work around this problem, manually add the certificate to the Velero deployment. For instructions, see Using cacert with velero command aliased via velero deployment.
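The general approach, sketched here with an assumed certificate path inside the Velero pod, is to extract the caCert from the DPA, copy it into the Velero pod, and pass it to the velero CLI:
$ CA_CERT=$(oc get dataprotectionapplications.oadp.openshift.io <dpa_name> -n openshift-adp -o jsonpath='{.spec.backupLocations[0].velero.objectStorage.caCert}')
$ echo "$CA_CERT" | base64 -d | oc exec -n openshift-adp -i deploy/velero -- sh -c 'cat > /tmp/cacert.txt'
$ oc exec -n openshift-adp deploy/velero -- ./velero describe backup <backup_name> --details --cacert /tmp/cacert.txt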
The nodeSelector spec is not supported for the Data Mover restore action
When a Data Protection Application (DPA) is created with the nodeSelector field set in the nodeAgent parameter, the Data Mover restore partially fails instead of completing the restore operation. No known workaround exists.
Image stream backups partially fail when the DPA is configured with caCert
An unverified certificate in the S3 connection during backups with caCert set in the DataProtectionApplication (DPA) causes the ocp-django application’s backup to partially fail, resulting in data loss. No known workaround exists.
Kopia does not delete cache on worker node
When the ephemeral-storage parameter is configured and a file system restore is running, the cache is not automatically deleted from the worker node. As a consequence, the /var partition overflows during backup restore, causing increased storage usage and potential resource exhaustion. To work around this problem, restart the node agent pod, which clears the cache.
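For example, assuming the node agent pods carry the default name=node-agent label, delete the pods so that the daemon set re-creates them with an empty cache:
$ oc delete pods -n openshift-adp -l name=node-agent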
GCP VSL backups fail with Workload Identity because of invalid project configuration
When performing a volumeSnapshotLocation (VSL) backup on GCP Workload Identity, the Velero GCP plugin creates an invalid API request if the GCP project is also specified in the snapshotLocations configuration of the DataProtectionApplication (DPA). As a consequence, the GCP API returns a RESOURCE_PROJECT_INVALID error, and the backup job finishes with a PartiallyFailed status. No known workaround exists.
VSL backups fail for CloudStorage API on AWS with STS
The volumeSnapshotLocation (VSL) backup fails because the AZURE_RESOURCE_GROUP parameter is missing from the credentials file, even if AZURE_RESOURCE_GROUP is already mentioned in the DataProtectionApplication (DPA) configuration for the VSL. No known workaround exists.
Backups of applications with ImageStreams fail on Azure with STS
When backing up applications that include image stream resources on an Azure cluster using STS, the OADP plugin incorrectly attempts to locate a secret-based credential for the container registry. As a consequence, the required secret is not found in the STS environment, causing the ImageStream custom backup action to fail. This results in the overall backup status being marked as PartiallyFailed. No known workaround exists.
DPA reconciliation fails for CloudStorageRef configuration
When a user creates a bucket and uses the backupLocations.bucket.cloudStorageRef configuration, bucket credentials are not present in the DataProtectionApplication (DPA) custom resource (CR). As a result, the DPA reconciliation fails even if bucket credentials are present in the CloudStorage CR. To work around this problem, add the same credentials to the backupLocations section of the DPA CR, as shown in the following example.
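A sketch of the workaround, assuming a CloudStorage CR referenced as <cloud_storage_cr_name> and a credentials secret named cloud-credentials:
spec:
  backupLocations:
    - name: default
      bucket:
        cloudStorageRef:
          name: <cloud_storage_cr_name>
        credential:
          name: cloud-credentials
          key: cloud
        default: true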
The configuration.restic specification field has been deprecated
With OADP 1.5.0, the configuration.restic specification field has been deprecated. Use the nodeAgent section with the uploaderType field to select kopia or restic as the uploaderType. Note that Restic is deprecated in OADP 1.5.0.
The OpenShift API for Data Protection (OADP) 1.5.0 release notes list resolved issues and known issues.
OADP 1.5.0 introduces a new feature named OADP Self-Service, enabling namespace admin users to back up and restore applications on OKD. In the earlier versions of OADP, you needed the cluster-admin role to perform OADP operations such as backing up and restoring an application, creating a backup storage location, and so on.
From OADP 1.5.0 onward, you do not need the cluster-admin role to perform the backup and restore operations. You can use OADP with the namespace admin role. The namespace admin role has administrator access only to the namespace the user is assigned to. You can use the Self-Service feature only after the cluster administrator installs the OADP Operator and provides the necessary permissions.
must-gather tool has been improved with a Markdown summary
You can collect logs and information about OpenShift API for Data Protection (OADP) custom resources by using the must-gather tool. The must-gather data must be attached to all customer cases.
This tool generates a Markdown output file with the collected information, which is located in the must-gather logs clusters directory.
dataMoverPrepareTimeout and resourceTimeout parameters are now added to nodeAgent within the DPA
The nodeAgent field in the Data Protection Application (DPA) now includes the following parameters (a configuration sketch follows the list):
dataMoverPrepareTimeout: Defines the duration the DataUpload or DataDownload process will wait. The default value is 30 minutes.
resourceTimeout: Sets the timeout for resource processes that are not addressed by other specific timeout parameters. The default value is 10 minutes.
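A minimal DPA sketch that sets both timeouts; the duration values are illustrative and assume the usual Go-style duration format:
spec:
  configuration:
    nodeAgent:
      enable: true
      uploaderType: kopia
      dataMoverPrepareTimeout: 45m
      resourceTimeout: 15m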
spec.configuration.nodeAgent parameter in DPA for configuring the nodeAgent daemon set
Velero no longer uses the node-agent-config config map for configuring the nodeAgent daemon set. With this update, you must use the new spec.configuration.nodeAgent parameter in a Data Protection Application (DPA) to configure the nodeAgent daemon set.
With Velero 1.15 and later, you can now configure the total size of a cache per repository. This prevents pods from being removed due to running out of ephemeral storage. See the following new parameters added to the NodeAgentConfig field in the DPA (a configuration sketch follows the list):
cacheLimitMB: Sets the local data cache size limit in megabytes.
fullMaintenanceInterval: Controls the removal rate of deleted Velero backups from the Kopia repository. The default value is 24 hours. Use the following override options:
normalGC: 24 hours
fastGC: 12 hours
eagerGC: 6 hours
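The following sketch sets both cache parameters in the DPA; the values are illustrative assumptions:
spec:
  configuration:
    nodeAgent:
      enable: true
      uploaderType: kopia
      cacheLimitMB: 2048
      fullMaintenanceInterval: fastGC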
With this update, the following changes are added:
A new disableFsBackup configuration option is now added to the velero field in the DPA.
The default value of the disableFsBackup parameter is false or not set. In that case, the following options are added to the SecurityContext field:
Privileged: true
AllowPrivilegeEscalation: true
If you set the disableFsBackup parameter to true, it removes the following mounts from the node-agent and applies the additional changes listed below (a configuration sketch follows these lists):
host-pods
host-plugins
Ensures that the node-agent always runs as a non-root user.
Changes the root file system to read-only.
Updates the following mount points with write access:
/home/velero
/tmp/credentials
Uses the SeccompProfileTypeRuntimeDefault option for the SeccompProfile parameter.
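Enabling this hardened mode is a single DPA field, as in the following sketch:
spec:
  configuration:
    velero:
      disableFsBackup: true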
By default, only one thread processes an item block. Velero 1.16 supports a parallel item backup, where multiple items within a backup can be processed in parallel.
You can use the optional Velero server parameter --item-block-worker-count to run additional worker threads to process items in parallel. To enable this in OADP, set the dpa.Spec.Configuration.Velero.ItemBlockWorkerCount parameter to an integer value greater than zero.
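In DPA YAML this parameter sits under the velero section; the following sketch assumes the lower-camel-case field form and an illustrative worker count:
spec:
  configuration:
    velero:
      itemBlockWorkerCount: 4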
Running multiple full backups in parallel is not yet supported.
With the release of OADP 1.5.0, logs are now available in JSON format. This helps to have pre-parsed data in an Elastic logs management system.
oc get dpa command now displays RECONCILED status
With this release, the oc get dpa command displays the RECONCILED status instead of displaying only NAME and AGE, to improve the user experience. For example:
$ oc get dpa -n openshift-adp
NAME RECONCILED AGE
velero-sample True 2m51s
FallbackToLogsOnError for terminationMessagePolicy
With this release, the terminationMessagePolicy field can now set the FallbackToLogsOnError value for the OpenShift API for Data Protection (OADP) Operator containers such as operator-manager, velero, node-agent, and non-admin-controller.
This change ensures that if a container exits with an error and the termination message file is empty, OpenShift uses the last portion of the container logs output as the termination message.
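For reference, this is the standard Kubernetes container field that these containers now set, shown as a minimal sketch with an illustrative image:
spec:
  containers:
    - name: velero
      image: velero/velero
      terminationMessagePolicy: FallbackToLogsOnError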
Previously, the namespace admin could not run commands in an application after the restore operation, and the following errors occurred:
exec operation is not allowed because the pod’s security context exceeds your permissions
unable to validate against any security context constraint
not usable by user or serviceaccount, provider restricted-v2
With this update, this issue is now resolved and the namespace admin can access the application successfully after the restore.
Previously, status restoration was only configured at the resource type level by using the restoreStatus field in the Restore custom resource (CR).
With this release, you can now specify the status restoration at the individual resource instance level using the following annotation:
metadata:
annotations:
velero.io/restore-status: "true"
excludedClusterScopedResources
Previously, on performing the backup of an application with the excludedClusterScopedResources field set to the storageclasses and Namespace parameters, the backup was successful but the restore partially failed.
With this update, the restore is successful.
Backup completes if restarted during the waitingForPluginOperations phase
Previously, a backup was marked as failed with the following error message:
failureReason: found a backup with status "InProgress" during the server starting, mark it as "Failed"
With this update, the backup is completed if it gets restarted during the waitingForPluginOperations phase.
Improved error messages when disableFsBackup is set to true in DPA
Previously, when the spec.configuration.velero.disableFsBackup field in a Data Protection Application (DPA) was set to true, the backup partially failed with an error that was not informative.
This update makes error messages more useful for troubleshooting. For example, error messages now indicate when disableFsBackup: true is the issue in a DPA, or when a non-administrator user does not have access to a DPA.
Previously, AWS credentials using STS authentication were not properly validated.
With this update, the parseAWSSecret function detects STS-specific fields, and the ensureSecretDataExists function is updated to handle STS profiles correctly.
repositoryMaintenance job affinity config is available to configure
Previously, the configuration for repository maintenance job pod affinity was missing from the DPA specification.
With this update, the repositoryMaintenance job affinity config is available to map a BackupRepository identifier to its configuration.
ValidationErrors field is cleared once the CR specification is correct
Previously, when a schedule CR was created with a wrong spec.schedule value and was later patched with a correct value, the ValidationErrors field still existed. Consequently, the ValidationErrors field displayed incorrect information even though the spec was correct.
With this update, the ValidationErrors field is cleared once the CR specification is correct.
volumeSnapshotContents custom resources are restored when the includedNamespaces field is used in restoreSpec
Previously, when a restore operation was triggered with the includedNamespaces field in a restore specification, the restore operation completed successfully but no volumeSnapshotContents custom resources (CRs) were created and the PVCs were stuck in a Pending status.
With this update, volumeSnapshotContents CRs are restored even when the includedNamespaces field is used in restoreSpec. As a result, the application pod is in a Running state after restore.
Previously, the container was configured with the readOnlyRootFilesystem: true setting for security, but the code attempted to create temporary files in the /tmp directory by using the os.CreateTemp() function. Consequently, while using AWS STS authentication with the Cloud Credential Operator (CCO) flow, OADP failed to create temporary files that were required for AWS credential handling with the following error:
ERROR unable to determine if bucket exists. {"error": "open /tmp/aws-shared-credentials1211864681: read-only file system"}
With this update, the following changes are added to address this issue:
A new emptyDir volume named tmp-dir is added to the controller pod specification.
A volume mount is added to the container, which mounts this volume to the /tmp directory.
For security best practices, the readOnlyRootFilesystem: true setting is maintained.
The deprecated ioutil.TempFile() function is replaced with the recommended os.CreateTemp() function.
The unnecessary io/ioutil import is removed because it is no longer needed.
For a complete list of all issues resolved in this release, see the list of OADP 1.5.0 resolved issues in Jira.
Even after deleting a backup, Kopia does not delete the volume artifacts from ${bucket_name}/kopia/${namespace} on the S3 location after the backup expires. Information related to the expired and removed data files remains in the metadata.
To ensure that OpenShift API for Data Protection (OADP) functions properly, this data is not deleted, and it exists in the /kopia/ directory. For example:
kopia.repository: Main repository format information such as encryption, version, and other details.
kopia.blobcfg: Configuration for how data blobs are named.
kopia.maintenance: Tracks maintenance owner, schedule, and last successful build.
log: Log blobs.
For a complete list of all known issues in this release, see the list of OADP 1.5.0 known issues in Jira.
configuration.restic specification field has been deprecated
With OpenShift API for Data Protection (OADP) 1.5.0, the configuration.restic specification field has been deprecated. Use the nodeAgent section with the uploaderType field to select kopia or restic as the uploaderType. Note that Restic is deprecated in OpenShift API for Data Protection (OADP) 1.5.0.
OADP can support and facilitate application migrations within HyperShift hosted OpenShift clusters as a Technology Preview. It ensures a seamless backup and restore operation for applications in hosted clusters.
For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.
Always upgrade to the next minor version. Do not skip versions. To update to a later version, upgrade only one channel at a time. For example, to upgrade from OADP 1.1 to 1.3, upgrade first to 1.2, and then to 1.3.
The Velero server has been updated from version 1.14 to 1.16.
This update includes the following changes:
OpenShift API for Data Protection implements a streamlined version support policy. Red Hat supports only one version of OpenShift API for Data Protection (OADP) on one OpenShift version to ensure better stability and maintainability. OADP 1.5.0 is supported only on OpenShift 4.19.
backupPVC and restorePVC configurations
A backupPVC resource is an intermediate persistent volume claim (PVC) to access data during the data movement backup operation. You create a readonly backup PVC by using the nodeAgent.backupPVC section of the DataProtectionApplication (DPA) custom resource.
A restorePVC resource is an intermediate PVC that is used to write data during the Data Mover restore operation. You can configure restorePVC in the DPA by using the ignoreDelayBinding field, as shown in the following example.
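A combined sketch of both configurations; the storage class name is an illustrative assumption, and the backupPVC map is keyed by the storage class of the source PVC:
spec:
  configuration:
    nodeAgent:
      enable: true
      uploaderType: kopia
      backupPVC:
        gp3-csi:
          storageClass: gp3-csi
          readOnly: true
      restorePVC:
        ignoreDelayBinding: true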
You must back up your current DataProtectionApplication (DPA) configuration.
Save your current DPA configuration by running the following command:
$ oc get dpa -n openshift-adp -o yaml > dpa.orig.backup
You can upgrade the OpenShift API for Data Protection (OADP) Operator using the following procedure.
Do not install OADP 1.5.0 on an OpenShift 4.18 cluster.
You have installed the latest OADP 1.4.5.
You have backed up your data.
Upgrade OpenShift 4.18 to OpenShift 4.19.
OpenShift API for Data Protection (OADP) 1.4 is not supported on OpenShift 4.19.
Change your subscription channel for the OADP Operator from stable-1.4 to stable.
Wait for the Operator and containers to update and restart.
The OpenShift API for Data Protection (OADP) 1.4 is not supported on OpenShift 4.19. You can convert the Data Protection Application (DPA) to the new OADP 1.5 version by using the new spec.configuration.nodeAgent field and its sub-fields.
To configure the nodeAgent daemon set, use the spec.configuration.nodeAgent parameter in the DPA. See the following example:
Example DataProtectionApplication configuration
...
spec:
  configuration:
    nodeAgent:
      enable: true
      uploaderType: kopia
...
If you previously configured the nodeAgent daemon set by using the config map resource named node-agent-config, move that configuration to the spec.configuration.nodeAgent parameter, as shown in the following example:
...
spec:
  configuration:
    nodeAgent:
      backupPVC:
        ...
      loadConcurrency:
        ...
      podResources:
        ...
      restorePVC:
        ...
...
You can verify the OpenShift API for Data Protection (OADP) upgrade by using the following procedure.
Verify that the DataProtectionApplication (DPA) has been reconciled successfully:
$ oc get dpa dpa-sample -n openshift-adp
NAME         RECONCILED   AGE
dpa-sample   True         2m51s
Verify that the installation finished by viewing the OADP resources by running the following command:
$ oc get all -n openshift-adp
NAME                                                    READY   STATUS    RESTARTS   AGE
pod/node-agent-9pjz9                                    1/1     Running   0          3d17h
pod/node-agent-fmn84                                    1/1     Running   0          3d17h
pod/node-agent-xw2dg                                    1/1     Running   0          3d17h
pod/openshift-adp-controller-manager-76b8bc8d7b-kgkcw   1/1     Running   0          3d17h
pod/velero-64475b8c5b-nh2qc                             1/1     Running   0          3d17h

NAME                                                        TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
service/openshift-adp-controller-manager-metrics-service   ClusterIP   172.30.194.192   <none>        8443/TCP   3d17h
service/openshift-adp-velero-metrics-svc                    ClusterIP   172.30.190.174   <none>        8085/TCP   3d17h

NAME                        DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/node-agent   3         3         3       3            3           <none>          3d17h

NAME                                               READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/openshift-adp-controller-manager   1/1     1            1           3d17h
deployment.apps/velero                             1/1     1            1           3d17h

NAME                                                          DESIRED   CURRENT   READY   AGE
replicaset.apps/openshift-adp-controller-manager-76b8bc8d7b   1         1         1       3d17h
replicaset.apps/openshift-adp-controller-manager-85fff975b8   0         0         0       3d17h
replicaset.apps/velero-64475b8c5b                             1         1         1       3d17h
replicaset.apps/velero-8b5bc54fd                              0         0         0       3d17h
replicaset.apps/velero-f5c9ffb66                              0         0         0       3d17h
Verify the backup storage location and confirm that the PHASE is Available by running the following command:
$ oc get backupstoragelocations.velero.io -n openshift-adp
NAME PHASE LAST VALIDATED AGE DEFAULT
dpa-sample-1 Available 1s 3d16h true