About disaster recovery - Disaster recovery | Backup and restore

This is a cache of https://docs.openshift.com/container-platform/4.4/backup_and_restore/disaster_recovery/about-disaster-recovery.html. It is a snapshot of the page at 2024-11-30T01:10:39.458+0000.

The disaster recovery documentation explains how administrators can recover from several disaster situations that might occur with an OpenShift Container Platform cluster. As an administrator, you might need to follow one or more of the following procedures to return your cluster to a working state.

Restoring to a previous cluster state

This solution handles situations where you want to restore your cluster to a previous state, for example, if an administrator deletes something critical. It also covers situations where you have lost the majority of your master hosts, which causes etcd quorum loss and takes the cluster offline. As long as you have taken an etcd backup, you can follow this procedure to restore your cluster to a previous state.

If applicable, you might also need to recover from expired control plane certificates.

If a majority of your masters are still available and you have an etcd quorum, follow the procedure to replace a single unhealthy etcd member instead.
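
The choice between restoring the cluster and replacing an etcd member comes down to whether etcd still has quorum, that is, a strict majority of its members. The following is a minimal Python sketch of that decision, assuming one etcd member per master host. The function names and the returned strings are hypothetical and are not part of any OpenShift tool. The sketch covers only the loss of members, not the case where you restore because something critical was deleted.

```python
def has_quorum(total_members: int, healthy_members: int) -> bool:
    """Return True if a strict majority of etcd members is healthy."""
    if total_members <= 0 or not 0 <= healthy_members <= total_members:
        raise ValueError("member counts are inconsistent")
    # A strict majority of n members is floor(n / 2) + 1.
    return healthy_members >= total_members // 2 + 1


def choose_procedure(total_members: int, healthy_members: int) -> str:
    """Map member health to one of the procedures named in this document.

    Assumption: one etcd member runs on each master host.
    """
    if healthy_members == total_members:
        return "no recovery needed"
    if has_quorum(total_members, healthy_members):
        return "replace the unhealthy etcd member"
    return "restore to a previous cluster state"
```

For a cluster with three masters, `choose_procedure(3, 2)` selects the member replacement procedure, while `choose_procedure(3, 1)` selects the restore procedure because quorum is lost.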

Recovering from expired control plane certificates

This solution handles situations where your control plane certificates have expired. For example, if you shut down your cluster before the first certificate rotation, which occurs 24 hours after installation, your certificates will not be rotated and will expire. You can follow this procedure to recover from expired control plane certificates.
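
The timing condition above can be expressed as a short check. This is a minimal Python sketch that assumes you know when the cluster was installed and when it was shut down; the function name and the constant are hypothetical. It reports only whether the first rotation was missed. It does not say when the certificates actually expire, because that validity period is not stated here.

```python
from datetime import datetime, timedelta

# From the text above: the first rotation occurs 24 hours after installation.
FIRST_ROTATION_AFTER = timedelta(hours=24)


def missed_first_rotation(installed_at: datetime, shut_down_at: datetime) -> bool:
    """Return True if the cluster was shut down before the first rotation was due."""
    if shut_down_at < installed_at:
        raise ValueError("shutdown cannot precede installation")
    return shut_down_at - installed_at < FIRST_ROTATION_AFTER
```

For example, a cluster shut down six hours after installation returns `True`, which suggests that you should plan for the expired certificate recovery procedure.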