This is a cache of https://docs.openshift.com/container-platform/4.17/networking/hardware_networks/configuring-sriov-operator.html. It is a snapshot of the page at 2024-11-28T06:46:16.971+0000.
Configuring the SR-IOV Operator - Hardware networks | Networking | OpenShift Container Platform 4.17
×

Configuring the SR-IOV Network Operator

  • Create a SriovOperatorConfig custom resource (CR) to deploy all the SR-IOV Operator components:

    1. Create a file named sriovOperatorConfig.yaml using the following YAML:

      apiVersion: sriovnetwork.openshift.io/v1
      kind: SriovOperatorConfig
      metadata:
        name: default
        namespace: openshift-sriov-network-operator
      spec:
        disableDrain: false
        enableInjector: true
        enableOperatorWebhook: true
        logLevel: 2
        featureGates:
          metricsExporter: false

      The only valid name for the SriovOperatorConfig resource is default and it must be in the namespace where the Operator is deployed.

    2. Create the resource by running the following command:

      $ oc apply -f sriovOperatorConfig.yaml

SR-IOV Network Operator config custom resource

The fields for the sriovoperatorconfig custom resource are described in the following table:

Table 1. SR-IOV Network Operator config custom resource
Field Type Description

metadata.name

string

Specifies the name of the SR-IOV Network Operator instance. The default value is default. Do not set a different value.

metadata.namespace

string

Specifies the namespace of the SR-IOV Network Operator instance. The default value is openshift-sriov-network-operator. Do not set a different value.

spec.configDaemonNodeSelector

string

Specifies the node selection to control scheduling the SR-IOV Network Config Daemon on selected nodes. By default, this field is not set and the Operator deploys the SR-IOV Network Config daemon set on worker nodes.

spec.disableDrain

boolean

Specifies whether to disable the node draining process or enable the node draining process when you apply a new policy to configure the NIC on a node. Setting this field to true facilitates software development and installing OpenShift Container Platform on a single node. By default, this field is not set.

For single-node clusters, set this field to true after installing the Operator. This field must remain set to true.

spec.enableInjector

boolean

Specifies whether to enable or disable the Network Resources Injector daemon set. By default, this field is set to true.

spec.enableOperatorWebhook

boolean

Specifies whether to enable or disable the Operator Admission Controller webhook daemon set. By default, this field is set to true.

spec.logLevel

integer

Specifies the log verbosity level of the Operator. Set to 0 to show only the basic logs. Set to 2 to show all the available logs. By default, this field is set to 2.

spec.featureGates

map[string]bool

Specifies whether to enable or disable the optional features. For example, metricsExporter.

spec.featureGates.metricsExporter

boolean

Specifies whether to enable or disable the SR-IOV Network Operator metrics. By default, this field is set to false.

About the Network Resources Injector

The Network Resources Injector is a Kubernetes Dynamic Admission Controller application. It provides the following capabilities:

  • Mutation of resource requests and limits in a pod specification to add an SR-IOV resource name according to an SR-IOV network attachment definition annotation.

  • Mutation of a pod specification with a Downward API volume to expose pod annotations, labels, and huge pages requests and limits. Containers that run in the pod can access the exposed information as files under the /etc/podnetinfo path.

By default, the Network Resources Injector is enabled by the SR-IOV Network Operator and runs as a daemon set on all control plane nodes. The following is an example of Network Resources Injector pods running in a cluster with three control plane nodes:

$ oc get pods -n openshift-sriov-network-operator
Example output
NAME                                      READY   STATUS    RESTARTS   AGE
network-resources-injector-5cz5p          1/1     Running   0          10m
network-resources-injector-dwqpx          1/1     Running   0          10m
network-resources-injector-lktz5          1/1     Running   0          10m

About the SR-IOV Network Operator admission controller webhook

The SR-IOV Network Operator Admission Controller webhook is a Kubernetes Dynamic Admission Controller application. It provides the following capabilities:

  • Validation of the SriovNetworkNodePolicy CR when it is created or updated.

  • Mutation of the SriovNetworkNodePolicy CR by setting the default value for the priority and deviceType fields when the CR is created or updated.

By default the SR-IOV Network Operator Admission Controller webhook is enabled by the Operator and runs as a daemon set on all control plane nodes.

Use caution when disabling the SR-IOV Network Operator Admission Controller webhook. You can disable the webhook under specific circumstances, such as troubleshooting, or if you want to use unsupported devices. For information about configuring unsupported devices, see Configuring the SR-IOV Network Operator to use an unsupported NIC.

The following is an example of the Operator Admission Controller webhook pods running in a cluster with three control plane nodes:

$ oc get pods -n openshift-sriov-network-operator
Example output
NAME                                      READY   STATUS    RESTARTS   AGE
operator-webhook-9jkw6                    1/1     Running   0          16m
operator-webhook-kbr5p                    1/1     Running   0          16m
operator-webhook-rpfrl                    1/1     Running   0          16m

About custom node selectors

The SR-IOV Network Config daemon discovers and configures the SR-IOV network devices on cluster nodes. By default, it is deployed to all the worker nodes in the cluster. You can use node labels to specify on which nodes the SR-IOV Network Config daemon runs.

Disabling or enabling the Network Resources Injector

To disable or enable the Network Resources Injector, which is enabled by default, complete the following procedure.

Prerequisites
  • Install the OpenShift CLI (oc).

  • Log in as a user with cluster-admin privileges.

  • You must have installed the SR-IOV Network Operator.

Procedure
  • Set the enableInjector field. Replace <value> with false to disable the feature or true to enable the feature.

    $ oc patch sriovoperatorconfig default \
      --type=merge -n openshift-sriov-network-operator \
      --patch '{ "spec": { "enableInjector": <value> } }'

    You can alternatively apply the following YAML to update the Operator:

    apiVersion: sriovnetwork.openshift.io/v1
    kind: SriovOperatorConfig
    metadata:
      name: default
      namespace: openshift-sriov-network-operator
    spec:
      enableInjector: <value>

Disabling or enabling the SR-IOV Network Operator admission controller webhook

To disable or enable the admission controller webhook, which is enabled by default, complete the following procedure.

Prerequisites
  • Install the OpenShift CLI (oc).

  • Log in as a user with cluster-admin privileges.

  • You must have installed the SR-IOV Network Operator.

Procedure
  • Set the enableOperatorWebhook field. Replace <value> with false to disable the feature or true to enable it:

    $ oc patch sriovoperatorconfig default --type=merge \
      -n openshift-sriov-network-operator \
      --patch '{ "spec": { "enableOperatorWebhook": <value> } }'

    You can alternatively apply the following YAML to update the Operator:

    apiVersion: sriovnetwork.openshift.io/v1
    kind: SriovOperatorConfig
    metadata:
      name: default
      namespace: openshift-sriov-network-operator
    spec:
      enableOperatorWebhook: <value>

Configuring a custom NodeSelector for the SR-IOV Network Config daemon

The SR-IOV Network Config daemon discovers and configures the SR-IOV network devices on cluster nodes. By default, it is deployed to all the worker nodes in the cluster. You can use node labels to specify on which nodes the SR-IOV Network Config daemon runs.

To specify the nodes where the SR-IOV Network Config daemon is deployed, complete the following procedure.

When you update the configDaemonNodeSelector field, the SR-IOV Network Config daemon is recreated on each selected node. While the daemon is recreated, cluster users are unable to apply any new SR-IOV Network node policy or create new SR-IOV pods.

Procedure
  • To update the node selector for the operator, enter the following command:

    $ oc patch sriovoperatorconfig default --type=json \
      -n openshift-sriov-network-operator \
      --patch '[{
          "op": "replace",
          "path": "/spec/configDaemonNodeSelector",
          "value": {<node_label>}
        }]'

    Replace <node_label> with a label to apply as in the following example: "node-role.kubernetes.io/worker": "".

    You can alternatively apply the following YAML to update the Operator:

    apiVersion: sriovnetwork.openshift.io/v1
    kind: SriovOperatorConfig
    metadata:
      name: default
      namespace: openshift-sriov-network-operator
    spec:
      configDaemonNodeSelector:
        <node_label>

Configuring the SR-IOV Network Operator for single node installations

By default, the SR-IOV Network Operator drains workloads from a node before every policy change. The Operator performs this action to ensure that there no workloads using the virtual functions before the reconfiguration.

For installations on a single node, there are no other nodes to receive the workloads. As a result, the Operator must be configured not to drain the workloads from the single node.

After performing the following procedure to disable draining workloads, you must remove any workload that uses an SR-IOV network interface before you change any SR-IOV network node policy.

Prerequisites
  • Install the OpenShift CLI (oc).

  • Log in as a user with cluster-admin privileges.

  • You must have installed the SR-IOV Network Operator.

Procedure
  • To set the disableDrain field to true and the configDaemonNodeSelector field to node-role.kubernetes.io/master: "", enter the following command:

    $ oc patch sriovoperatorconfig default --type=merge -n openshift-sriov-network-operator --patch '{ "spec": { "disableDrain": true, "configDaemonNodeSelector": { "node-role.kubernetes.io/master": "" } } }'

    You can alternatively apply the following YAML to update the Operator:

    apiVersion: sriovnetwork.openshift.io/v1
    kind: SriovOperatorConfig
    metadata:
      name: default
      namespace: openshift-sriov-network-operator
    spec:
      disableDrain: true
      configDaemonNodeSelector:
       node-role.kubernetes.io/master: ""

Deploying the SR-IOV Operator for hosted control planes

After you configure and deploy your hosting service cluster, you can create a subscription to the SR-IOV Operator on a hosted cluster. The SR-IOV pod runs on worker machines rather than the control plane.

Prerequisites

You must configure and deploy the hosted cluster on AWS.

Procedure
  1. Create a namespace and an Operator group:

    apiVersion: v1
    kind: Namespace
    metadata:
      name: openshift-sriov-network-operator
    ---
    apiVersion: operators.coreos.com/v1
    kind: OperatorGroup
    metadata:
      name: sriov-network-operators
      namespace: openshift-sriov-network-operator
    spec:
      targetNamespaces:
      - openshift-sriov-network-operator
  2. Create a subscription to the SR-IOV Operator:

    apiVersion: operators.coreos.com/v1alpha1
    kind: Subscription
    metadata:
      name: sriov-network-operator-subsription
      namespace: openshift-sriov-network-operator
    spec:
      channel: stable
      name: sriov-network-operator
      config:
        nodeSelector:
          node-role.kubernetes.io/worker: ""
      source: s/qe-app-registry/redhat-operators
      sourceNamespace: openshift-marketplace
Verification
  1. To verify that the SR-IOV Operator is ready, run the following command and view the resulting output:

    $ oc get csv -n openshift-sriov-network-operator
    Example output
    NAME                                         DISPLAY                   VERSION               REPLACES                                     PHASE
    sriov-network-operator.4.17.0-202211021237   SR-IOV Network Operator   4.17.0-202211021237   sriov-network-operator.4.17.0-202210290517   Succeeded
  2. To verify that the SR-IOV pods are deployed, run the following command:

    $ oc get pods -n openshift-sriov-network-operator

About the SR-IOV network metrics exporter

The Single Root I/O Virtualization (SR-IOV) network metrics exporter reads the metrics for SR-IOV virtual functions (VFs) and exposes these VF metrics in Prometheus format. When the SR-IOV network metrics exporter is enabled, you can query the SR-IOV VF metrics by using the OpenShift Container Platform web console to monitor the networking activity of the SR-IOV pods.

When you query the SR-IOV VF metrics by using the web console, the SR-IOV network metrics exporter fetches and returns the VF network statistics along with the name and namespace of the pod that the VF is attached to.

The SR-IOV VF metrics that the metrics exporter reads and exposes in Prometheus format are described in the following table:

Table 2. SR-IOV VF metrics
Metric Description Example PromQL query to examine the VF metric

sriov_vf_rx_bytes

Received bytes per virtual function.

sriov_vf_rx_bytes * on (pciAddr,node) group_left(pod,namespace,dev_type) sriov_kubepoddevice

sriov_vf_tx_bytes

Transmitted bytes per virtual function.

sriov_vf_tx_bytes * on (pciAddr,node) group_left(pod,namespace,dev_type) sriov_kubepoddevice

sriov_vf_rx_packets

Received packets per virtual function.

sriov_vf_rx_packets * on (pciAddr,node) group_left(pod,namespace,dev_type) sriov_kubepoddevice

sriov_vf_tx_packets

Transmitted packets per virtual function.

sriov_vf_tx_packets * on (pciAddr,node) group_left(pod,namespace,dev_type) sriov_kubepoddevice

sriov_vf_rx_dropped

Dropped packets upon receipt per virtual function.

sriov_vf_rx_dropped * on (pciAddr,node) group_left(pod,namespace,dev_type) sriov_kubepoddevice

sriov_vf_tx_dropped

Dropped packets during transmission per virtual function.

sriov_vf_tx_dropped * on (pciAddr,node) group_left(pod,namespace,dev_type) sriov_kubepoddevice

sriov_vf_rx_multicast

Received multicast packets per virtual function.

sriov_vf_rx_multicast * on (pciAddr,node) group_left(pod,namespace,dev_type) sriov_kubepoddevice

sriov_vf_rx_broadcast

Received broadcast packets per virtual function.

sriov_vf_rx_broadcast * on (pciAddr,node) group_left(pod,namespace,dev_type) sriov_kubepoddevice

sriov_kubepoddevice

Virtual functions linked to active pods.

-

You can also combine these queries with the kube-state-metrics to get more information about the SR-IOV pods. For example, you can use the following query to get the VF network statistics along with the application name from the standard Kubernetes pod label:

(sriov_vf_tx_packets * on (pciAddr,node)  group_left(pod,namespace)  sriov_kubepoddevice) * on (pod,namespace) group_left (label_app_kubernetes_io_name) kube_pod_labels

Enabling the SR-IOV network metrics exporter

The Single Root I/O Virtualization (SR-IOV) network metrics exporter is disabled by default. To enable the metrics exporter, you must set the spec.featureGates.metricsExporter field to true.

When the metrics exporter is enabled, the SR-IOV Network Operator deploys the metrics exporter only on nodes with SR-IOV capabilities.

Prerequisites
  • You have installed the OpenShift CLI (oc).

  • You have logged in as a user with cluster-admin privileges.

  • You have installed the SR-IOV Network Operator.

Procedure
  1. Enable cluster monitoring by running the following command:

    $ oc label ns/openshift-sriov-network-operator openshift.io/cluster-monitoring=true

    To enable cluster monitoring, you must add the openshift.io/cluster-monitoring=true label in the namespace where you have installed the SR-IOV Network Operator.

  2. Set the spec.featureGates.metricsExporter field to true by running the following command:

    $ oc patch -n openshift-sriov-network-operator sriovoperatorconfig/default \
        --type='merge' -p='{"spec": {"featureGates": {"metricsExporter": true}}}'
Verification
  1. Check that the SR-IOV network metrics exporter is enabled by running the following command:

    $ oc get pods -n openshift-sriov-network-operator
    Example output
    NAME                                     READY   STATUS    RESTARTS   AGE
    operator-webhook-hzfg4                   1/1     Running   0          5d22h
    sriov-network-config-daemon-tr54m        1/1     Running   0          5d22h
    sriov-network-metrics-exporter-z5d7t     1/1     Running   0          10s
    sriov-network-operator-cc6fd88bc-9bsmt   1/1     Running   0          5d22h

    The sriov-network-metrics-exporter pod must be in the READY state.

  2. Optional: Examine the SR-IOV virtual function (VF) metrics by using the OpenShift Container Platform web console. For more information, see "Querying metrics".