Optimizing Compute Resources | Scaling and Performance Guide

Overcommitting
Image Considerations
- Using a Pre-deployed Image to Improve Efficiency
- Pre-pulling Images
Debugging Using the RHEL Tools Container Image
Debugging Using Ansible-based Health Checks

Overcommitting

You can use overcommit procedures so that resources such as CPU and memory are more accessible to the parts of your cluster that need them.

To avoid erratic cluster behavior due to scheduling collisions between the hypervisor and Kubernetes, do not overcommit at the hypervisor level.

Note that when you overcommit, there is a risk that another application may not have access to the resources it requires when it needs them, which will result in reduced performance. However, this may be an acceptable trade-off in favor of increased density and reduced costs. For example, development, quality assurance (QA), or test environments may be overcommitted, whereas production might not be.

OKD implements resource management through the compute resource model and quota system. See the documentation for more information about the OpenShift resource model.

For more information and strategies for overcommitting, see the Overcommitting documentation in the Cluster Administration Guide.

Image Considerations

Using a Pre-deployed Image to Improve Efficiency

You can create a base OKD image with a number of tasks built in to improve efficiency, maintain configuration consistency across all node hosts, and reduce repetitive work. This is known as a pre-deployed image.

For example, because every node requires the ose-pod image in order to run pods, each node has to periodically connect to the container image registry in order to pull the latest image. This can become problematic when you have 100 nodes attempting this at the same time, and can lead to resource contention on the image registry, waste of network bandwidth, and increased pod launch times.

To build a pre-deployed image:

- Create an instance of the type and size required.
- Ensure a dedicated storage device is available for CRI-O or Docker local image or container storage, separate from any persistent volumes for containers.
- Fully update the system, and ensure CRI-O or Docker is installed.
- Ensure the host has access to all yum repositories.
- Set up thin-provisioned LVM storage.
- Pre-seed your commonly used images (such as the rhel7 base image), as well as OKD infrastructure container images (ose-pod, ose-deployer, etc.) into your pre-deployed image.
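The pre-seeding step can be sketched as a small shell function that pulls a list of images into local storage. This is a minimal sketch, not the documented procedure: the function name preseed_images and the PULL_CMD override are hypothetical, and the image names in the usage comment are placeholders taken from the text above, so substitute the full registry paths and tags for your release.

```shell
#!/bin/sh
# Pull each image named on the command line into local container storage.
# PULL_CMD is a hypothetical override hook; it defaults to "docker pull".
# On a CRI-O host, something like PULL_CMD="crictl pull" could be used instead.
preseed_images() {
    pull_cmd="${PULL_CMD:-docker pull}"
    failed=0
    for image in "$@"; do
        # $pull_cmd is deliberately unquoted so "docker pull" splits into two words.
        if ! $pull_cmd "$image"; then
            echo "failed to pull: $image" >&2
            failed=1
        fi
    done
    # Non-zero exit status if any pull failed.
    return "$failed"
}

# Example (placeholder image names):
# preseed_images rhel7 ose-pod ose-deployer
```

The function keeps going after a failed pull and reports the failure in its exit status, so one missing image does not hide problems with the rest.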

Ensure that the pre-deployed image is configured for every environment in which the cluster will run, such as OpenStack or AWS.

Pre-pulling Images

To reduce pod startup time, you can pre-pull any necessary container images to all node hosts. The image then does not have to be pulled when a pod first starts, which saves time, particularly over slow connections and for large images such as s2i, metrics, and logging.

This is also useful for machines that cannot access the registry for security reasons.

Alternatively, you can use a local image instead of pulling from the specified registry, which is the default behavior. To do this:

- Pull from local images by setting the imagePullPolicy parameter of a pod configuration to IfNotPresent or Never.
- Ensure that all nodes in the cluster have the same images saved locally.
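As an illustration of the imagePullPolicy setting, the following sketch prints a minimal pod manifest that uses a locally stored image. The function name local_image_pod, the pod name, and the image reference are hypothetical placeholders. The manifest fields themselves (apiVersion, kind, spec.containers, imagePullPolicy) are standard Kubernetes pod fields.

```shell
#!/bin/sh
# Print a pod manifest that runs a container from a locally stored image.
# Arguments: pod name, image reference, pull policy (defaults to IfNotPresent).
local_image_pod() {
    name="$1"
    image="$2"
    policy="${3:-IfNotPresent}"
    # Only the two policies that allow use of a local image are accepted here.
    case "$policy" in
        IfNotPresent|Never) ;;
        *) echo "not a local-image pull policy: $policy" >&2; return 1 ;;
    esac
    cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: $name
spec:
  containers:
  - name: $name
    image: $image
    imagePullPolicy: $policy
EOF
}

# Example (placeholder names); pipe the manifest to the cluster:
# local_image_pod tools-pod registry.example.com/tools:1.0 Never | oc create -f -
```

With Never, the pod fails to start on any node that lacks the image, which is why every node needs the same images saved locally. IfNotPresent falls back to the registry when the image is missing.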

Pulling from local images is suitable if you can control node configuration. However, it will not work reliably on cloud providers that replace nodes automatically, such as GCE, because a replacement node does not carry the locally saved images. If you are running on Google Container Engine (GKE), there will already be a .dockercfg file on each node with Google Container Registry credentials.

Debugging Using the RHEL Tools Container Image

Red Hat distributes a rhel-tools container image, which packages tools that aid in debugging scaling or performance problems. This container image:

- Allows users to deploy minimal footprint container hosts by moving packages out of the base distribution and into this support container.
- Provides debugging capabilities for Red Hat Enterprise Linux 7 Atomic Host, which has an immutable package tree.

rhel-tools includes utilities such as tcpdump, sosreport, git, gdb, perf, and many more common system administration utilities.

Run the rhel-tools container with the following command:

# atomic run rhel7/rhel-tools

See the RHEL Tools Container documentation for more information.

Debugging Using Ansible-based Health Checks

Additional diagnostic health checks are available through the Ansible-based tooling used to install and manage OKD clusters. They can report common deployment problems for the current OKD installation.

These checks can be run either using the ansible-playbook command (the same method used during cluster installations) or as a containerized version of openshift-ansible. For the ansible-playbook method, the checks are provided by the openshift-ansible Git repository. For the containerized method, the openshift/origin-ansible container image is distributed via Docker Hub.
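For the ansible-playbook method, an invocation might look like the following sketch. The function name run_health_checks and the ANSIBLE_PLAYBOOK override are hypothetical. The playbook location (playbooks/openshift-checks/health.yml under the openshift-ansible directory) and the default directory are assumptions based on the 3.x layout of that repository, so verify them against your own checkout or installed package.

```shell
#!/bin/sh
# Run the health-check playbook against a cluster inventory.
# Arguments: inventory file, openshift-ansible directory (assumed default below).
# ANSIBLE_PLAYBOOK is a hypothetical override hook for the command to run.
run_health_checks() {
    inventory="$1"
    repo_dir="${2:-/usr/share/ansible/openshift-ansible}"
    runner="${ANSIBLE_PLAYBOOK:-ansible-playbook}"
    if [ ! -r "$inventory" ]; then
        echo "inventory file not readable: $inventory" >&2
        return 1
    fi
    # Assumed playbook path; check that it exists in your openshift-ansible version.
    "$runner" -i "$inventory" "$repo_dir/playbooks/openshift-checks/health.yml"
}

# Example (inventory path is a placeholder):
# run_health_checks /etc/ansible/hosts
```

The inventory is the same one used to install the cluster, so the checks run against the hosts and variables that the installation used.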

See Ansible-based Health Checks in the Cluster Administration guide for information on the available health checks and example usage.