This is a cache of https://docs.okd.io/4.10/scalability_and_performance/optimizing-storage.html. It is a snapshot of the page at 2024-11-21T22:24:15.581+0000.
Optimizing storage | Scalability and performance | OKD 4.10
×

Optimizing storage helps to minimize storage use across all resources. By optimizing storage, administrators help ensure that existing storage resources are working in an efficient manner.

Available persistent storage options

Understand your persistent storage options so that you can optimize your OKD environment.

Table 1. Available storage options
Storage type Description Examples

Block

  • Presented to the operating system (OS) as a block device

  • Suitable for applications that need full control of storage and operate at a low level on files bypassing the file system

  • Also referred to as a Storage Area Network (SAN)

  • Non-shareable, which means that only one client at a time can mount an endpoint of this type

AWS EBS and VMware vSphere support dynamic persistent volume (PV) provisioning natively in OKD.

File

  • Presented to the OS as a file system export to be mounted

  • Also referred to as Network Attached Storage (NAS)

  • Concurrency, latency, file locking mechanisms, and other capabilities vary widely between protocols, implementations, vendors, and scales.

RHEL NFS, NetApp NFS [1], and Vendor NFS

Object

  • Accessible through a REST API endpoint

  • Configurable for use in the OpenShift image registry

  • Applications must build their drivers into the application and/or container.

AWS S3

  1. NetApp NFS supports dynamic PV provisioning when using the Trident plugin.

Currently, CNS is not supported in OKD 4.10.

The following table summarizes the recommended and configurable storage technologies for the given OKD cluster application.

Table 2. Recommended and configurable storage technology
Storage type ROX1 RWX2 Registry Scaled registry Metrics3 Logging Apps

Block

Yes4

No

Configurable

Not configurable

Recommended

Recommended

Recommended

File

Yes4

Yes

Configurable

Configurable

Configurable5

Configurable6

Recommended

Object

Yes

Yes

Recommended

Recommended

Not configurable

Not configurable

Not configurable7

1 ReadOnlyMany

2 ReadWriteMany

3 Prometheus is the underlying technology used for metrics.

4 This does not apply to physical disk, VM physical disk, VMDK, loopback over NFS, AWS EBS, and Azure Disk.

5 For metrics, using file storage with the ReadWriteMany (RWX) access mode is unreliable. If you use file storage, do not configure the RWX access mode on any persistent volume claims (PVCs) that are configured for use with metrics.

6 For logging, using any shared storage would be an anti-pattern. One volume per elasticsearch is required.

7 Object storage is not consumed through OKD’s PVs or PVCs. Apps must integrate with the object storage REST API.

A scaled registry is an OpenShift image registry where two or more pod replicas are running.

Specific application storage recommendations

Testing shows issues with using the NFS server on Fedora as storage backend for core services. This includes the OpenShift Container Registry and Quay, Prometheus for monitoring storage, and Elasticsearch for logging storage. Therefore, using Fedora NFS to back PVs used by core services is not recommended.

Other NFS implementations on the marketplace might not have these issues. Contact the individual NFS implementation vendor for more information on any testing that was possibly completed against these OKD core components.

Registry

In a non-scaled/high-availability (HA) OpenShift image registry cluster deployment:

  • The storage technology does not have to support RWX access mode.

  • The storage technology must ensure read-after-write consistency.

  • The preferred storage technology is object storage followed by block storage.

  • File storage is not recommended for OpenShift image registry cluster deployment with production workloads.

Scaled registry

In a scaled/HA OpenShift image registry cluster deployment:

  • The storage technology must support RWX access mode.

  • The storage technology must ensure read-after-write consistency.

  • The preferred storage technology is object storage.

  • Red Hat OpenShift Data Foundation (ODF), Amazon Simple Storage Service (Amazon S3), Google Cloud Storage (GCS), Microsoft Azure Blob Storage, and OpenStack Swift are supported.

  • Object storage should be S3 or Swift compliant.

  • For non-cloud platforms, such as vSphere and bare metal installations, the only configurable technology is file storage.

  • Block storage is not configurable.

Metrics

In an OKD hosted metrics cluster deployment:

  • The preferred storage technology is block storage.

  • Object storage is not configurable.

It is not recommended to use file storage for a hosted metrics cluster deployment with production workloads.

Logging

In an OKD hosted logging cluster deployment:

  • The preferred storage technology is block storage.

  • Object storage is not configurable.

Applications

Application use cases vary from application to application, as described in the following examples:

  • Storage technologies that support dynamic PV provisioning have low mount time latencies, and are not tied to nodes to support a healthy cluster.

  • Application developers are responsible for knowing and understanding the storage requirements for their application, and how it works with the provided storage to ensure that issues do not occur when an application scales or interacts with the storage layer.

Other specific application storage recommendations

It is not recommended to use RAID configurations on Write intensive workloads, such as etcd. If you are running etcd with a RAID configuration, you might be at risk of encountering performance issues with your workloads.

  • OpenStack Cinder: OpenStack Cinder tends to be adept in ROX access mode use cases.

  • Databases: Databases (RDBMSs, NoSQL DBs, etc.) tend to perform best with dedicated block storage.

  • The etcd database must have enough storage and adequate performance capacity to enable a large cluster. Information about monitoring and benchmarking tools to establish ample storage and a high-performance environment is described in Recommended etcd practices.

Data storage management

The following table summarizes the main directories that OKD components write data to.

Table 3. Main directories for storing OKD data
Directory Notes Sizing Expected growth

/var/lib/etcd

Used for etcd storage when storing the database.

Less than 20 GB.

Database can grow up to 8 GB.

Will grow slowly with the environment. Only storing metadata.

Additional 20-25 GB for every additional 8 GB of memory.

/var/lib/containers

This is the mount point for the CRI-O runtime. Storage used for active container runtimes, including pods, and storage of local images. Not used for registry storage.

50 GB for a node with 16 GB memory. Note that this sizing should not be used to determine minimum cluster requirements.

Additional 20-25 GB for every additional 8 GB of memory.

Growth is limited by capacity for running containers.

/var/lib/kubelet

Ephemeral volume storage for pods. This includes anything external that is mounted into a container at runtime. Includes environment variables, kube secrets, and data volumes not backed by persistent volumes.

Varies

Minimal if pods requiring storage are using persistent volumes. If using ephemeral storage, this can grow quickly.

/var/log

Log files for all components.

10 to 30 GB.

Log files can grow quickly; size can be managed by growing disks or by using log rotate.

Optimizing storage performance for Microsoft Azure

OKD and Kubernetes are sensitive to disk performance, and faster storage is recommended, particularly for etcd on the control plane nodes.

For production Azure clusters and clusters with intensive workloads, the virtual machine operating system disk for control plane machines should be able to sustain a tested and recommended minimum throughput of 5000 IOPS / 200MBps. This throughput can be provided by having a minimum of 1 TiB Premium SSD (P30). In Azure and Azure Stack Hub, disk performance is directly dependent on SSD disk sizes. To achieve the throughput supported by a Standard_D8s_v3 virtual machine, or other similar machine types, and the target of 5000 IOPS, at least a P30 disk is required.

Host caching must be set to ReadOnly for low latency and high IOPS and throughput when reading data. Reading data from the cache, which is present either in the VM memory or in the local SSD disk, is much faster than reading from the disk, which is in the blob storage.