Planning your environment according to object maximums | Scalability and performance

Consider the following tested object maximums when you plan your OpenShift Container Platform cluster.

These guidelines are based on the largest possible cluster. For smaller clusters, the maximums are lower. Many factors influence the stated thresholds, including the etcd version and the storage data format.

In most cases, exceeding these numbers results in lower overall performance. It does not necessarily mean that the cluster will fail.

OpenShift Container Platform tested cluster maximums for major releases

Tested cloud platforms for OpenShift Container Platform 3.x: Red Hat OpenStack Platform (RHOSP), Amazon Web Services, and Microsoft Azure. Tested cloud platforms for OpenShift Container Platform 4.x: Amazon Web Services, Microsoft Azure, and Google Cloud Platform.

Maximum type	3.x tested maximum	4.x tested maximum
Number of Nodes	2,000	2,000
Number of Pods ^[1]	150,000	150,000
Number of Pods per node	250	500 ^[2]
Number of Pods per core	There is no default value.	There is no default value.
Number of Namespaces ^[3]	10,000	10,000
Number of Builds	10,000 (Default pod RAM 512 Mi) - Pipeline Strategy	10,000 (Default pod RAM 512 Mi) - Source-to-Image (s2i) build strategy
Number of Pods per namespace ^[4]	25,000	25,000
Number of Services ^[5]	10,000	10,000
Number of Services per Namespace	5,000	5,000
Number of Back-ends per Service	5,000	5,000
Number of Deployments per Namespace ^[4]	2,000	2,000

OpenShift Container Platform tested cluster maximums

Limit type	3.10 tested maximum	3.11 tested maximum	4.1 tested maximum	4.2 tested maximum
Number of Nodes	2,000	2,000	2,000	2,000
Number of Pods ^[6]	150,000	150,000	150,000	150,000
Number of Pods per node	250	250	250	250
Number of Pods per core	There is no default value.	There is no default value.	There is no default value.	There is no default value.
Number of Namespaces ^[7]	10,000	10,000	10,000	10,000
Number of builds	10,000 (Default pod RAM 512 Mi)	10,000 (Default pod RAM 512 Mi)	10,000 (Default pod RAM 512 Mi)	10,000 (Default pod RAM 512 Mi)
Number of Pods per Namespace ^[8]	3,000	25,000	25,000	25,000
Number of Services ^[9]	10,000	10,000	10,000	10,000
Number of Services per Namespace	5,000	5,000	5,000	5,000
Number of Back-ends per Service	5,000	5,000	5,000	5,000
Number of Deployments per Namespace ^[8]	2,000	2,000	2,000	2,000

In OpenShift Container Platform 4.2, the system reserves half of a CPU core (500 millicores), a change from OpenShift Container Platform 3.11 and previous versions.

OpenShift Container Platform environment and configuration on which the cluster maximums are tested

Google Cloud Platform:

Node	Flavor	vCPU	RAM (GiB)	Disk type	Disk size (GiB) / IOPS	Count	Region
Master/Etcd	n1-highmem-16	16	104	Regional/Zonal SSD	220	3	us-east4
Infra ^[10]	n1-standard-64	64	240	Regional/Zonal SSD	100	3	us-east4
Workload ^[11]	n1-standard-16	16	60	Regional/Zonal SSD	500 ^[12]	1	us-east4
Worker	n1-standard-8	8	30	Regional/Zonal SSD	100	3/25/250 ^[13]	us-east4

AWS cloud platform:

Node	Flavor	vCPU	RAM (GiB)	Disk type	Disk size (GiB) / IOPS	Count	Region
Master/Etcd ^[14]	r5.4xlarge	16	128	io1	220 / 3000	3	us-west-2
Infra ^[15]	m5.12xlarge	48	192	gp2	100	3	us-west-2
Workload ^[16]	m5.4xlarge	16	64	gp2	500 ^[17]	1	us-west-2
Worker	m5.large	2	8	gp2	100	2000	us-west-2

How to plan your environment according to tested cluster maximums

Oversubscribing the physical resources on a node affects resource guarantees the Kubernetes scheduler makes during pod placement. Learn what measures you can take to avoid memory swapping.

Some of the tested maximums are stretched only in a single dimension. They will vary when many objects are running on the cluster.

The numbers noted in this documentation are based on Red Hat’s test methodology, setup, configuration, and tunings. These numbers can vary based on your own individual setup and environments.

While planning your environment, determine how many pods are expected to fit per node:

Required Pods per Cluster / Pods per Node = Total Number of Nodes Needed

The current maximum number of pods per node is 250. However, the number of pods that fit on a node is dependent on the application itself. Consider the application’s memory, CPU, and storage requirements, as described in How to plan your environment according to application requirements.

Example scenario

If you want to scope your cluster for 2200 pods, assuming the maximum of 250 pods per node, you need at least nine nodes, because the result is rounded up to the next whole node:

2200 / 250 = 8.8

If you increase the number of nodes to 20, then the pod distribution changes to 110 pods per node:

2200 / 20 = 110

Where:

Required Pods per Cluster / Total Number of Nodes = Expected Pods per Node
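The two formulas in this scenario can be sketched in Python. This is a minimal illustration rather than part of the tested methodology, and the function names `nodes_needed` and `expected_pods_per_node` are hypothetical. The node count is rounded up because a fractional node cannot be provisioned.

```python
import math


def nodes_needed(required_pods: int, max_pods_per_node: int) -> int:
    """Smallest whole number of nodes that can hold required_pods."""
    if required_pods < 0 or max_pods_per_node <= 0:
        raise ValueError("pod counts must be positive")
    # 2200 / 250 = 8.8, which rounds up to 9 nodes.
    return math.ceil(required_pods / max_pods_per_node)


def expected_pods_per_node(required_pods: int, node_count: int) -> float:
    """Average pods per node if pods are spread evenly across node_count nodes."""
    if node_count <= 0:
        raise ValueError("node_count must be positive")
    return required_pods / node_count


print(nodes_needed(2200, 250))           # 9
print(expected_pods_per_node(2200, 20))  # 110.0
```

The result is a lower bound on node count only. It does not account for the memory, CPU, and storage that each pod actually needs.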

How to plan your environment according to application requirements

Consider an example application environment:

Pod type	Pod quantity	Max memory	CPU cores	Persistent storage
apache	100	500 MB	0.5	1 GB
node.js	200	1 GB	1	1 GB
postgresql	100	1 GB	2	10 GB
JBoss EAP	100	1 GB	1	1 GB

Extrapolated requirements: 550 CPU cores, 450 GB RAM, and 1.4 TB storage.
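The extrapolated totals follow from multiplying each pod type's quantity by its per-pod requirements and summing the results. The sketch below reproduces that arithmetic. The `PodType` class and `total_requirements` function are hypothetical names, and the code assumes decimal units (500 MB = 0.5 GB, 1000 GB = 1 TB).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PodType:
    name: str
    quantity: int
    memory_gb: float   # max memory per pod, in GB (assumes 500 MB = 0.5 GB)
    cpu_cores: float   # CPU cores per pod
    storage_gb: float  # persistent storage per pod, in GB


# Values copied from the example application environment table.
WORKLOAD = [
    PodType("apache", 100, 0.5, 0.5, 1),
    PodType("node.js", 200, 1, 1, 1),
    PodType("postgresql", 100, 1, 2, 10),
    PodType("JBoss EAP", 100, 1, 1, 1),
]


def total_requirements(pods):
    """Return (cpu_cores, memory_gb, storage_gb) summed over all pods."""
    cpu = sum(p.quantity * p.cpu_cores for p in pods)
    memory = sum(p.quantity * p.memory_gb for p in pods)
    storage = sum(p.quantity * p.storage_gb for p in pods)
    return cpu, memory, storage


cpu, memory, storage = total_requirements(WORKLOAD)
# Assumes 1 TB = 1000 GB when converting storage.
print(f"{cpu:g} CPU cores, {memory:g} GB RAM, {storage / 1000:g} TB storage")
```

Running the sketch prints `550 CPU cores, 450 GB RAM, 1.4 TB storage`, which matches the figures stated above.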

You can adjust the instance size for nodes up or down, depending on your preference. Nodes are often resource overcommitted. In this deployment scenario, you can choose to run more, smaller nodes or fewer, larger nodes to provide the same amount of resources. Consider factors such as operational agility and cost per instance.

Node type	Quantity	CPUs	RAM (GB)
Nodes (option 1)	100	4	16
Nodes (option 2)	50	8	32
Nodes (option 3)	25	16	64

Some applications lend themselves well to overcommitted environments, and some do not. Most Java applications and applications that use huge pages are examples of applications that do not allow for overcommitment, because the memory they reserve cannot be used for other applications. In the example above, the environment would be roughly 30 percent overcommitted, a common ratio.
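As a rough check, the sketch below totals the capacity of each node option and compares it with the 550 CPU cores that the example workload requests. The source does not state how the overcommitment figure is derived. The calculation here is one possible reading, and the names `capacity` and `cpu_overcommit` are hypothetical.

```python
REQUIRED_CPU_CORES = 550  # from the extrapolated requirements

# (node count, CPUs per node, RAM in GB per node), from the node options table.
NODE_OPTIONS = {
    "option 1": (100, 4, 16),
    "option 2": (50, 8, 32),
    "option 3": (25, 16, 64),
}


def capacity(count, cpus_per_node, ram_gb_per_node):
    """Total (CPUs, RAM in GB) provided by a node option."""
    return count * cpus_per_node, count * ram_gb_per_node


def cpu_overcommit(required, provisioned):
    """Fraction by which requested CPU exceeds provisioned CPU."""
    return (required - provisioned) / provisioned


for name, spec in NODE_OPTIONS.items():
    total_cpus, total_ram = capacity(*spec)
    ratio = cpu_overcommit(REQUIRED_CPU_CORES, total_cpus)
    print(f"{name}: {total_cpus} CPUs, {total_ram} GB RAM, "
          f"CPU overcommit {ratio:.1%}")
```

All three options provide the same 400 CPUs and 1600 GB of RAM. Under this reading, requested CPU exceeds provisioned CPU by 37.5 percent. Measured against demand instead, the shortfall is 150 of 550 cores, or about 27 percent. Both values are in the neighborhood of the roughly 30 percent quoted above.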

1. The Pod count displayed here is the number of test Pods. The actual number of Pods depends on the application’s memory, CPU, and storage requirements.

2. This was tested on a cluster with 100 worker nodes and 500 Pods per worker node. The default maxPods is still 250. To get to 500 maxPods, the cluster must be created with a hostPrefix of 22 in the install-config.yaml file, and maxPods must be set to 500 using a custom KubeletConfig. The maximum number of Pods with attached Persistent Volume Claims (PVCs) depends on the storage backend from which the PVCs are allocated. In our tests, only OpenShift Container Storage v4 (OCS v4) was able to satisfy the number of Pods per node discussed in this document.

3. When there are a large number of active projects, etcd might suffer from poor performance if the keyspace grows excessively large and exceeds the space quota. Periodic maintenance of etcd, including defragmentation, is highly recommended to free etcd storage.

4. There are a number of control loops in the system that must iterate over all objects in a given namespace as a reaction to some changes in state. Having a large number of objects of a given type in a single namespace can make those loops expensive and slow down the processing of those state changes. The limit assumes that the system has enough CPU, memory, and disk to satisfy the application requirements.

5. Each Service port and each Service back-end has a corresponding entry in iptables. The number of back-ends of a given Service affects the size of the endpoints objects, which in turn affects the size of the data that is sent throughout the system.

6. The Pod count displayed here is the number of test Pods. The actual number of Pods depends on the application’s memory, CPU, and storage requirements.

7. When there are a large number of active projects, etcd might suffer from poor performance if the keyspace grows excessively large and exceeds the space quota. Periodic maintenance of etcd, including defragmentation, is highly recommended to free etcd storage.

8. There are a number of control loops in the system that must iterate over all objects in a given namespace as a reaction to some changes in state. Having a large number of objects of a given type in a single namespace can make those loops expensive and slow down the processing of those state changes. The limit assumes that the system has enough CPU, memory, and disk to satisfy the application requirements.

9. Each Service port and each Service back-end has a corresponding entry in iptables. The number of back-ends of a given Service affects the size of the endpoints objects, which in turn affects the size of the data that is sent throughout the system.

10. Infra nodes are used to host Monitoring, Ingress and Registry components to make sure they have enough resources to run at large scale.

11. The workload node is dedicated to running performance and scalability workload generators.

12. Larger disk size is used to have enough space to store large amounts of data collected during the performance and scalability test run.

13. The cluster is scaled in iterations, and performance and scalability tests are executed at the specified node counts.

14. io1 disk with 3000 IOPS is used for master/etcd nodes as etcd is I/O intensive and latency sensitive.

15. Infra nodes are used to host Monitoring, Ingress and Registry components to make sure they have enough resources to run at large scale.

16. The workload node is dedicated to running performance and scalability workload generators.

17. Larger disk size is used to have enough space to store large amounts of data collected during the performance and scalability test run.