Using NVIDIA GPU resources with serverless applications - Integrations | Serverless | OpenShift Container Platform 4.9

NVIDIA supports using GPU resources on OpenShift Container Platform. See GPU Operator on OpenShift for more information about setting up GPU resources on OpenShift Container Platform.

Specifying GPU requirements for a service

After GPU resources are enabled for your OpenShift Container Platform cluster, you can specify GPU requirements for a Knative service using the Knative (kn) CLI.

Prerequisites
  • The OpenShift Serverless Operator, Knative Serving and Knative Eventing are installed on the cluster.

  • You have installed the Knative (kn) CLI.

  • GPU resources are enabled for your OpenShift Container Platform cluster.

  • You have created a project or have access to a project with the appropriate roles and permissions to create applications and other workloads in OpenShift Container Platform.

Using NVIDIA GPU resources is not supported for IBM Z and IBM Power.
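
One way to check the GPU prerequisite is to list the nvidia.com/gpu resource that each node advertises as allocatable. The following is a minimal sketch, not an official procedure: the helper name list_gpu_nodes is hypothetical, and it assumes that the OpenShift CLI (oc) is installed and that you are logged in to the cluster.

```shell
# Hypothetical helper (not part of oc or kn): print each node name with its
# allocatable nvidia.com/gpu count. Nodes that expose no GPU resource
# show <none> in the GPU column.
list_gpu_nodes() {
  # The backslash escapes the dot inside the resource name nvidia.com/gpu.
  oc get nodes -o 'custom-columns=NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu'
}

# Usage (requires a logged-in cluster session):
#   list_gpu_nodes
```

If every node shows <none>, the cluster probably does not expose GPU resources yet, and a service that sets a GPU limit is unlikely to be scheduled.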

Procedure
  1. Create a Knative service and set the GPU resource requirement limit to 1 by using the --limit nvidia.com/gpu=1 flag:

    $ kn service create hello --image <service-image> --limit nvidia.com/gpu=1

    A GPU resource requirement limit of 1 means that the service has 1 GPU resource dedicated to it. Services do not share GPU resources. Any other services that require GPU resources must wait until the GPU resource is no longer in use.

    A limit of 1 GPU also means that an application cannot use more than 1 GPU resource. If a service requests more than 1 GPU resource, it is deployed on a node where the GPU resource requirements can be met.

  2. Optional. For an existing service, you can change the GPU resource requirement limit to 3 by using the --limit nvidia.com/gpu=3 flag:

    $ kn service update hello --limit nvidia.com/gpu=3
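
The --limit flag sets a resource limit on the service's container. As a rough declarative counterpart to step 1, the following sketch writes a Knative Service manifest with the same GPU limit. The file name hello-gpu.yaml is an assumption chosen for illustration, and <service-image> is the same placeholder used in the procedure. The manifest that kn generates may contain additional fields.

```shell
# Write a Knative Service manifest that sets the same GPU limit as
# `kn service create hello --image <service-image> --limit nvidia.com/gpu=1`.
# hello-gpu.yaml is a hypothetical file name; replace <service-image> before applying.
cat > hello-gpu.yaml <<'EOF'
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello
spec:
  template:
    spec:
      containers:
        - image: <service-image>
          resources:
            limits:
              nvidia.com/gpu: 1
EOF

# Apply the manifest with the OpenShift CLI (requires cluster access), for example:
#   oc apply -f hello-gpu.yaml
```

Keeping the limit in a manifest can make the GPU requirement easier to review and version than a one-off CLI invocation. Changing the value to 3 and applying the file again has the same intent as step 2.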