Learn about the Node Feature Discovery (NFD) Operator and how you can use it to expose node-level information by orchestrating Node Feature Discovery, a Kubernetes add-on for detecting hardware features and system configuration.
The Node Feature Discovery Operator (NFD) manages the detection of hardware features and configuration in an OKD cluster by labeling the nodes with hardware-specific information. NFD labels the host with node-specific attributes, such as PCI cards, kernel, operating system version, and so on.
The NFD Operator can be found on the Operator Hub by searching for “Node Feature Discovery”.
The Node Feature Discovery (NFD) Operator orchestrates all resources needed to run the NFD daemon set. As a cluster administrator, you can install the NFD Operator by using the OKD CLI or the web console.
As a cluster administrator, you can install the NFD Operator by using the CLI.
An OKD cluster.
Install the OpenShift CLI (oc).
Log in as a user with cluster-admin privileges.
Create a namespace for the NFD Operator.
Create the following Namespace custom resource (CR) that defines the openshift-nfd namespace, and then save the YAML in the nfd-namespace.yaml file. Set cluster-monitoring to "true".
apiVersion: v1
kind: Namespace
metadata:
  name: openshift-nfd
  labels:
    name: openshift-nfd
    openshift.io/cluster-monitoring: "true"
Create the namespace by running the following command:
$ oc create -f nfd-namespace.yaml
Install the NFD Operator in the namespace you created in the previous step by creating the following objects:
Create the following OperatorGroup CR and save the YAML in the nfd-operatorgroup.yaml file:
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  generateName: openshift-nfd-
  name: openshift-nfd
  namespace: openshift-nfd
spec:
  targetNamespaces:
  - openshift-nfd
Create the OperatorGroup CR by running the following command:
$ oc create -f nfd-operatorgroup.yaml
Create the following Subscription CR and save the YAML in the nfd-sub.yaml file:
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: nfd
  namespace: openshift-nfd
spec:
  channel: "stable"
  installPlanApproval: Automatic
  name: nfd
  source: redhat-operators
  sourceNamespace: openshift-marketplace
Create the subscription object by running the following command:
$ oc create -f nfd-sub.yaml
Change to the openshift-nfd project:
$ oc project openshift-nfd
To verify that the Operator deployment is successful, run:
$ oc get pods
NAME READY STATUS RESTARTS AGE
nfd-controller-manager-7f86ccfb58-vgr4x 2/2 Running 0 10m
A successful deployment shows a Running status.
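The Running check can also be scripted. The following is a minimal sketch, assuming the default oc get pods column layout shown above, with STATUS in the third column. The function name all_pods_running is hypothetical and is not part of oc.

```shell
# Hypothetical helper: reads `oc get pods` output on stdin and succeeds
# only if at least one pod is listed and every pod has STATUS "Running".
all_pods_running() {
  awk 'NR > 1 { seen = 1; if ($3 != "Running") bad = 1 }
       END { exit ((seen && !bad) ? 0 : 1) }'
}

# Usage (assumes you are logged in and in the openshift-nfd project):
#   oc get pods | all_pods_running && echo "NFD Operator is running"
```

The helper treats an empty pod list as a failure, because an Operator that has not been scheduled yet should not count as deployed.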
As a cluster administrator, you can install the NFD Operator using the web console.
In the OKD web console, click Operators → OperatorHub.
Choose Node Feature Discovery from the list of available Operators, and then click Install.
On the Install Operator page, select A specific namespace on the cluster, and then click Install. You do not need to create a namespace because it is created for you.
To verify that the NFD Operator installed successfully:
Navigate to the Operators → Installed Operators page.
Ensure that Node Feature Discovery is listed in the openshift-nfd project with a Status of InstallSucceeded.
During installation, an Operator might display a Failed status. If the installation later succeeds with an InstallSucceeded message, you can ignore the Failed message.
If the Operator does not appear as installed, troubleshoot further:
Navigate to the Operators → Installed Operators page and inspect the Operator Subscriptions and Install Plans tabs for any failure or errors under Status.
Navigate to the Workloads → Pods page and check the logs for pods in the openshift-nfd project.
The Node Feature Discovery (NFD) Operator orchestrates all resources needed to run the Node-Feature-Discovery daemon set by watching for a NodeFeatureDiscovery custom resource (CR). Based on the NodeFeatureDiscovery CR, the Operator creates the operand (NFD) components in the selected namespace. You can edit the CR to use another namespace, image, image pull policy, and nfd-worker-conf config map, among other options.
As a cluster administrator, you can create a NodeFeatureDiscovery CR by using the OpenShift CLI (oc) or the web console.
As a cluster administrator, you can create a NodeFeatureDiscovery CR instance by using the OpenShift CLI (oc).
The following example shows the use of -rhel9 to acquire the correct image.
You have access to an OKD cluster.
You installed the OpenShift CLI (oc).
You logged in as a user with cluster-admin privileges.
You installed the NFD Operator.
Create a NodeFeatureDiscovery CR:
Example NodeFeatureDiscovery CR
apiVersion: nfd.openshift.io/v1
kind: NodeFeatureDiscovery
metadata:
  name: nfd-instance
  namespace: openshift-nfd
spec:
  instance: "" # instance is empty by default
  topologyupdater: false # False by default
  operand:
    image: registry.redhat.io/openshift4/ose-node-feature-discovery-rhel9:v4.13
    imagePullPolicy: Always
  workerConfig:
    configData: |
      core:
      #  labelWhiteList:
      #  noPublish: false
        sleepInterval: 60s
      #  sources: [all]
      #  klog:
      #    addDirHeader: false
      #    alsologtostderr: false
      #    logBacktraceAt:
      #    logtostderr: true
      #    skipHeaders: false
      #    stderrthreshold: 2
      #    v: 0
      #    vmodule:
      ##   NOTE: the following options are not dynamically run-time configurable
      ##         and require a nfd-worker restart to take effect after being changed
      #    logDir:
      #    logFile:
      #    logFileMaxSize: 1800
      #    skipLogHeaders: false
      sources:
        cpu:
          cpuid:
      #     NOTE: whitelist has priority over blacklist
            attributeBlacklist:
              - "BMI1"
              - "BMI2"
              - "CLMUL"
              - "CMOV"
              - "CX16"
              - "ERMS"
              - "F16C"
              - "HTT"
              - "LZCNT"
              - "MMX"
              - "MMXEXT"
              - "NX"
              - "POPCNT"
              - "RDRAND"
              - "RDSEED"
              - "RDTSCP"
              - "SGX"
              - "SSE"
              - "SSE2"
              - "SSE3"
              - "SSE4.1"
              - "SSE4.2"
              - "SSSE3"
            attributeWhitelist:
        kernel:
          kconfigFile: "/path/to/kconfig"
          configOpts:
            - "NO_HZ"
            - "X86"
            - "DMI"
        pci:
          deviceClassWhitelist:
            - "0200"
            - "03"
            - "12"
          deviceLabelFields:
            - "class"
  customConfig:
    configData: |
      - name: "more.kernel.features"
        matchOn:
        - loadedKMod: ["example_kmod3"]
Create the NodeFeatureDiscovery CR by running the following command:
$ oc apply -f <filename>
Check that the NodeFeatureDiscovery CR was created by running the following command:
$ oc get pods
NAME READY STATUS RESTARTS AGE
nfd-controller-manager-7f86ccfb58-vgr4x 2/2 Running 0 11m
nfd-master-hcn64 1/1 Running 0 60s
nfd-master-lnnxx 1/1 Running 0 60s
nfd-master-mp6hr 1/1 Running 0 60s
nfd-worker-vgcz9 1/1 Running 0 60s
nfd-worker-xqbws 1/1 Running 0 60s
A successful deployment shows a Running status.
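After the worker pods are running, the labels that NFD publishes can be inspected per node. The following is a minimal sketch, assuming the feature.node.kubernetes.io label prefix that appears in the configuration reference later in this document. The function name nfd_labels is hypothetical.

```shell
# Hypothetical helper: reads a comma-separated label list on stdin
# (the LABELS column of `oc get node <node_name> --show-labels`) and prints
# only the labels under the feature.node.kubernetes.io prefix, one per line.
nfd_labels() {
  tr ',' '\n' | grep '^feature\.node\.kubernetes\.io/'
}

# Usage (<node_name> is a placeholder):
#   oc get node <node_name> --show-labels --no-headers | awk '{print $NF}' | nfd_labels
```

An empty result shortly after creating the CR is not necessarily an error, because labels appear only after the first feature detection pass.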
As a cluster administrator, you can create a NodeFeatureDiscovery CR instance by using the OpenShift CLI (oc).
You have access to an OKD cluster.
You installed the OpenShift CLI (oc).
You logged in as a user with cluster-admin privileges.
You installed the NFD Operator.
You have access to a mirror registry with the required images.
You installed the skopeo CLI tool.
Determine the digest of the registry image:
Run the following command:
$ skopeo inspect docker://registry.redhat.io/openshift4/ose-node-feature-discovery:<openshift_version>
Example command:
$ skopeo inspect docker://registry.redhat.io/openshift4/ose-node-feature-discovery:v4.12
Inspect the output to identify the image digest:
{
  ...
  "Digest": "sha256:1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcdef",
  ...
}
Use the skopeo CLI tool to copy the image from registry.redhat.io to your mirror registry by running the following command:
$ skopeo copy docker://registry.redhat.io/openshift4/ose-node-feature-discovery@<image_digest> docker://<mirror_registry>/openshift4/ose-node-feature-discovery@<image_digest>
Example command:
$ skopeo copy docker://registry.redhat.io/openshift4/ose-node-feature-discovery@sha256:1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcdef docker://<mirror_registry>/openshift4/ose-node-feature-discovery@sha256:1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcdef
Create a NodeFeatureDiscovery CR:
Example NodeFeatureDiscovery CR
apiVersion: nfd.openshift.io/v1
kind: NodeFeatureDiscovery
metadata:
  name: nfd-instance
spec:
  operand:
    image: <mirror_registry>/openshift4/ose-node-feature-discovery@<image_digest>
    imagePullPolicy: Always
  workerConfig:
    configData: |
      core:
      #  labelWhiteList:
      #  noPublish: false
        sleepInterval: 60s
      #  sources: [all]
      #  klog:
      #    addDirHeader: false
      #    alsologtostderr: false
      #    logBacktraceAt:
      #    logtostderr: true
      #    skipHeaders: false
      #    stderrthreshold: 2
      #    v: 0
      #    vmodule:
      ##   NOTE: the following options are not dynamically run-time configurable
      ##         and require a nfd-worker restart to take effect after being changed
      #    logDir:
      #    logFile:
      #    logFileMaxSize: 1800
      #    skipLogHeaders: false
      sources:
        cpu:
          cpuid:
      #     NOTE: whitelist has priority over blacklist
            attributeBlacklist:
              - "BMI1"
              - "BMI2"
              - "CLMUL"
              - "CMOV"
              - "CX16"
              - "ERMS"
              - "F16C"
              - "HTT"
              - "LZCNT"
              - "MMX"
              - "MMXEXT"
              - "NX"
              - "POPCNT"
              - "RDRAND"
              - "RDSEED"
              - "RDTSCP"
              - "SGX"
              - "SSE"
              - "SSE2"
              - "SSE3"
              - "SSE4.1"
              - "SSE4.2"
              - "SSSE3"
            attributeWhitelist:
        kernel:
          kconfigFile: "/path/to/kconfig"
          configOpts:
            - "NO_HZ"
            - "X86"
            - "DMI"
        pci:
          deviceClassWhitelist:
            - "0200"
            - "03"
            - "12"
          deviceLabelFields:
            - "class"
  customConfig:
    configData: |
      - name: "more.kernel.features"
        matchOn:
        - loadedKMod: ["example_kmod3"]
Create the NodeFeatureDiscovery CR by running the following command:
$ oc apply -f <filename>
Check the status of the NodeFeatureDiscovery CR by running the following command:
$ oc get nodefeaturediscovery nfd-instance -o yaml
Check that the pods are running without ImagePullBackOff errors by running the following command:
$ oc get pods -n <nfd_namespace>
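The digest lookup and the image copy earlier in this procedure can be chained in a script. The following is a minimal sketch: image_digest and mirror_copy_command are hypothetical helper names, the digest is extracted by text matching rather than by a JSON parser, and the copy command is only printed so that you can review it before running it.

```shell
# Hypothetical helper: reads `skopeo inspect` JSON on stdin and prints the
# value of the first "Digest" field found (text match, not a JSON parser).
image_digest() {
  sed -n 's/.*"Digest"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Hypothetical helper: prints (does not run) the skopeo copy command.
#   $1 = image digest, for example sha256:...
#   $2 = mirror registry host, the <mirror_registry> placeholder in the procedure
mirror_copy_command() {
  src="registry.redhat.io/openshift4/ose-node-feature-discovery@$1"
  dst="$2/openshift4/ose-node-feature-discovery@$1"
  printf 'skopeo copy docker://%s docker://%s\n' "$src" "$dst"
}

# Usage (<openshift_version> and <mirror_registry> are placeholders):
#   digest=$(skopeo inspect docker://registry.redhat.io/openshift4/ose-node-feature-discovery:<openshift_version> | image_digest)
#   mirror_copy_command "$digest" <mirror_registry>
```

Copying by digest rather than by tag keeps the mirrored image identical to the one that was inspected, which is why the CR in this procedure references the image with @<image_digest>.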
As a cluster administrator, you can create a NodeFeatureDiscovery CR by using the OKD web console.
You have access to an OKD cluster.
You logged in as a user with cluster-admin privileges.
You installed the NFD Operator.
Navigate to the Operators → Installed Operators page.
In the Node Feature Discovery section, under Provided APIs, click Create instance.
Edit the values of the NodeFeatureDiscovery CR.
Click Create.
The core section contains common configuration settings that are not specific to any particular feature source.
core.sleepInterval specifies the interval between consecutive passes of feature detection or re-detection, and thus also the interval between node re-labeling. A non-positive value implies an infinite sleep interval; no re-detection or re-labeling is done.
This value is overridden by the deprecated --sleep-interval command line flag, if specified.
core:
  sleepInterval: 60s (1)
(1) The default value is 60s.
core.sources specifies the list of enabled feature sources. The special value all enables all feature sources.
This value is overridden by the deprecated --sources command line flag, if specified.
Default: [all]
core:
  sources:
    - system
    - custom
core.labelWhiteList specifies a regular expression for filtering feature labels based on the label name. Non-matching labels are not published.
The regular expression is matched only against the basename part of the label, which is the part of the name after '/'. The label prefix, or namespace, is omitted.
This value is overridden by the deprecated --label-whitelist command line flag, if specified.
Default: null
core:
  labelWhiteList: '^cpu-cpuid'
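Because the pattern is matched against the basename only, a pattern that targets the label prefix never matches. The following is a minimal sketch of that filtering rule. The function name label_passes_whitelist is hypothetical, and grep -E is used as a stand-in for the regular expression engine that NFD uses, so syntax details might differ.

```shell
# Hypothetical helper: succeeds if a label would pass a core.labelWhiteList
# pattern. Only the part of the label name after "/" is matched.
#   $1 = full label name
#   $2 = regular expression
label_passes_whitelist() {
  label=$1
  pattern=$2
  base=${label##*/}          # drop the prefix (namespace) up to the "/"
  printf '%s\n' "$base" | grep -Eq -- "$pattern"
}

# Usage:
#   label_passes_whitelist 'feature.node.kubernetes.io/cpu-cpuid.AVX2' '^cpu-cpuid' && echo published
```

With the example pattern '^cpu-cpuid', a label whose basename starts with pci- is filtered out, and so is any attempt to anchor the pattern on feature.node.kubernetes.io.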
Setting core.noPublish to true disables all communication with the nfd-master. It is effectively a dry run flag; nfd-worker runs feature detection normally, but no labeling requests are sent to nfd-master.
This value is overridden by the --no-publish command line flag, if specified.
Example:
core:
  noPublish: true (1)
(1) The default value is false.
The following options specify the logger configuration, most of which can be dynamically adjusted at run-time.
The logger options can also be specified using command line flags, which take precedence over any corresponding config file options.
core.klog.addDirHeader, if set to true, adds the file directory to the header of the log messages.
Default: false
Run-time configurable: yes
core.klog.alsologtostderr logs to standard error as well as to files.
Default: false
Run-time configurable: yes
core.klog.logBacktraceAt emits a stack trace when logging hits line file:N.
Default: empty
Run-time configurable: yes
core.klog.logDir, if non-empty, writes log files in this directory.
Default: empty
Run-time configurable: no
core.klog.logFile, if non-empty, uses this log file.
Default: empty
Run-time configurable: no
core.klog.logFileMaxSize defines the maximum size a log file can grow to. The unit is megabytes. If the value is 0, the maximum file size is unlimited.
Default: 1800
Run-time configurable: no
core.klog.logtostderr logs to standard error instead of files.
Default: true
Run-time configurable: yes
core.klog.skipHeaders, if set to true, avoids header prefixes in the log messages.
Default: false
Run-time configurable: yes
core.klog.skipLogHeaders, if set to true, avoids headers when opening log files.
Default: false
Run-time configurable: no
core.klog.stderrthreshold sends logs at or above this threshold to stderr.
Default: 2
Run-time configurable: yes
core.klog.v is the number for the log level verbosity.
Default: 0
Run-time configurable: yes
core.klog.vmodule is a comma-separated list of pattern=N settings for file-filtered logging.
Default: empty
Run-time configurable: yes
The sources section contains feature source specific configuration parameters.
sources.cpu.cpuid.attributeBlacklist prevents publishing the cpuid features listed in this option.
This value is overridden by sources.cpu.cpuid.attributeWhitelist, if specified.
Default: [BMI1, BMI2, CLMUL, CMOV, CX16, ERMS, F16C, HTT, LZCNT, MMX, MMXEXT, NX, POPCNT, RDRAND, RDSEED, RDTSCP, SGX, SGXLC, SSE, SSE2, SSE3, SSE4.1, SSE4.2, SSSE3]
sources:
  cpu:
    cpuid:
      attributeBlacklist: [MMX, MMXEXT]
sources.cpu.cpuid.attributeWhitelist publishes only the cpuid features listed in this option.
sources.cpu.cpuid.attributeWhitelist takes precedence over sources.cpu.cpuid.attributeBlacklist.
Default: empty
sources:
  cpu:
    cpuid:
      attributeWhitelist: [AVX512BW, AVX512CD, AVX512DQ, AVX512F, AVX512VL]
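The precedence between the two lists can be illustrated with a small decision function. This is a sketch of the rule as described above, not the NFD implementation: a non-empty whitelist decides alone, and the blacklist is consulted only when no whitelist is set. The function name cpuid_is_published is hypothetical.

```shell
# Hypothetical helper: decides whether a cpuid attribute would be published.
#   $1 = attribute name
#   $2 = space-separated attributeWhitelist (may be empty)
#   $3 = space-separated attributeBlacklist (may be empty)
cpuid_is_published() {
  attr=$1; whitelist=$2; blacklist=$3
  if [ -n "$whitelist" ]; then
    # A non-empty whitelist takes precedence: the blacklist is ignored.
    case " $whitelist " in *" $attr "*) return 0 ;; *) return 1 ;; esac
  fi
  # No whitelist: publish everything that is not blacklisted.
  case " $blacklist " in *" $attr "*) return 1 ;; *) return 0 ;; esac
}

# Usage:
#   cpuid_is_published AVX512F "AVX512F AVX512BW" "MMX MMXEXT" && echo published
```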
sources.kernel.kconfigFile is the path of the kernel config file. If empty, NFD runs a search in the well-known standard locations.
Default: empty
sources:
  kernel:
    kconfigFile: "/path/to/kconfig"
sources.kernel.configOpts represents kernel configuration options to publish as feature labels.
Default: [NO_HZ, NO_HZ_IDLE, NO_HZ_FULL, PREEMPT]
sources:
  kernel:
    configOpts: [NO_HZ, X86, DMI]
sources.pci.deviceClassWhitelist is a list of PCI device class IDs for which to publish a label. It can be specified as a main class only (for example, 03) or as a full class-subclass combination (for example, 0300). The former implies that all subclasses are accepted. The format of the labels can be further configured with deviceLabelFields.
Default: ["03", "0b40", "12"]
sources:
  pci:
    deviceClassWhitelist: ["0200", "03"]
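The difference between a main-class entry and a full class-subclass entry comes down to prefix matching. The following is a minimal sketch, assuming device class IDs are written as four hexadecimal digits (class followed by subclass). The function name pci_class_whitelisted is hypothetical.

```shell
# Hypothetical helper: succeeds if a 4-digit PCI class ID is covered by a
# space-separated deviceClassWhitelist.
#   $1 = class ID of the device, for example 0300
#   $2 = whitelist entries, for example "0200 03"
pci_class_whitelisted() {
  class_id=$1
  for entry in $2; do
    case "$class_id" in
      "$entry"*) return 0 ;;   # "03" covers 0300, 0302, ...; "0200" covers only 0200
    esac
  done
  return 1
}

# Usage:
#   pci_class_whitelisted 0302 "0200 03" && echo "label is published"
```

With the example whitelist above, every display controller subclass (03xx) is accepted, while among network controllers only the 0200 subclass is.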
sources.pci.deviceLabelFields is the set of PCI ID fields to use when constructing the name of the feature label. Valid fields are class, vendor, device, subsystem_vendor, and subsystem_device.
Default: [class, vendor]
sources:
  pci:
    deviceLabelFields: [class, vendor, device]
With the example config above, NFD would publish labels such as feature.node.kubernetes.io/pci-<class-id>_<vendor-id>_<device-id>.present=true.
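The way the configured fields shape the label name can be sketched as follows. The function name pci_label and the key=value argument format are hypothetical, and the device ID in the usage line is an illustrative value. The label layout follows the pattern shown above.

```shell
# Hypothetical helper: builds the label name for one PCI device.
#   $1 = space-separated deviceLabelFields, for example "class vendor device"
#   remaining arguments = key=value pairs describing the device
pci_label() {
  fields=$1; shift
  name=""
  for field in $fields; do
    for pair in "$@"; do
      case "$pair" in
        # Append the value of each configured field, joined with "_".
        "$field="*) name="${name:+${name}_}${pair#*=}" ;;
      esac
    done
  done
  printf 'feature.node.kubernetes.io/pci-%s.present=true\n' "$name"
}

# Usage:
#   pci_label "class vendor device" class=0200 vendor=8086 device=1533
```

Fields are emitted in the order given by deviceLabelFields, not in the order the device attributes are passed, so the same configuration always yields the same label layout.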
sources.usb.deviceClassWhitelist is a list of USB device class IDs for which to publish a feature label. The format of the labels can be further configured with deviceLabelFields.
Default: ["0e", "ef", "fe", "ff"]
sources:
  usb:
    deviceClassWhitelist: ["ef", "ff"]
sources.usb.deviceLabelFields is the set of USB ID fields from which to compose the name of the feature label. Valid fields are class, vendor, and device.
Default: [class, vendor, device]
sources:
  usb:
    deviceLabelFields: [class, vendor]
With the example config above, NFD would publish labels such as feature.node.kubernetes.io/usb-<class-id>_<vendor-id>.present=true.
sources.custom is the list of rules to process in the custom feature source to create user-specific labels.
Default: empty
sources:
  custom:
  - name: "my.custom.feature"
    matchOn:
    - loadedKMod: ["e1000e"]
    - pciId:
        class: ["0200"]
        vendor: ["8086"]
NodeFeatureRule objects are a NodeFeatureDiscovery custom resource designed for rule-based custom labeling of nodes. Some use cases include application-specific labeling or distribution by hardware vendors to create specific labels for their devices.
NodeFeatureRule objects provide a method to create vendor- or application-specific labels and taints. They use a flexible rule-based mechanism for creating labels, and optionally taints, based on node features.
Create a NodeFeatureRule object to label nodes if a set of rules match the conditions.
Create a custom resource file named nodefeaturerule.yaml that contains the following text:
apiVersion: nfd.openshift.io/v1
kind: NodeFeatureRule
metadata:
  name: example-rule
spec:
  rules:
    - name: "example rule"
      labels:
        "example-custom-feature": "true"
      # Label is created if all of the rules below match
      matchFeatures:
        # Match if "veth" kernel module is loaded
        - feature: kernel.loadedmodule
          matchExpressions:
            veth: {op: Exists}
        # Match if any PCI device with vendor 8086 exists in the system
        - feature: pci.device
          matchExpressions:
            vendor: {op: In, value: ["8086"]}
This custom resource specifies that labeling occurs when the veth module is loaded and any PCI device with vendor code 8086 exists in the cluster.
Apply the nodefeaturerule.yaml file to your cluster by running the following command:
$ oc apply -f https://raw.githubusercontent.com/kubernetes-sigs/node-feature-discovery/v0.13.6/examples/nodefeaturerule.yaml
The example applies the feature label on nodes that have the veth module loaded and on which any PCI device with vendor code 8086 exists.
A relabeling delay of up to 1 minute might occur.
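The example rule combines its two matchFeatures entries with a logical AND: a node is labeled only if both hold. The following is a minimal sketch of that evaluation for a single node. It illustrates the semantics only and is not how NFD evaluates rules; the function name example_rule_matches and its argument format are hypothetical.

```shell
# Hypothetical helper mirroring the example rule for one node.
#   $1 = space-separated names of loaded kernel modules
#   $2 = space-separated PCI vendor IDs present on the node
example_rule_matches() {
  case " $1 " in *" veth "*) ;; *) return 1 ;; esac   # kernel.loadedmodule: veth Exists
  case " $2 " in *" 8086 "*) ;; *) return 1 ;; esac   # pci.device: vendor In ["8086"]
  return 0                                            # both conditions matched
}

# Usage:
#   example_rule_matches "veth e1000e" "8086 10de" && echo "node is labeled"
```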
The Node Feature Discovery (NFD) Topology Updater is a daemon responsible for examining allocated resources on a worker node. It accounts for resources that are available to be allocated to new pods on a per-zone basis, where a zone can be a Non-Uniform Memory Access (NUMA) node. The NFD Topology Updater communicates the information to nfd-master, which creates a NodeResourceTopology custom resource (CR) corresponding to all of the worker nodes in the cluster. One instance of the NFD Topology Updater runs on each node of the cluster.
To enable the Topology Updater workers in NFD, set the topologyupdater variable to true in the NodeFeatureDiscovery CR, as described in the section Using the Node Feature Discovery Operator.
When run with NFD Topology Updater, NFD creates custom resource instances corresponding to the node resource hardware topology, such as:
apiVersion: topology.node.k8s.io/v1alpha1
kind: NodeResourceTopology
metadata:
  name: node1
topologyPolicies: ["SingleNUMANodeContainerLevel"]
zones:
  - name: node-0
    type: Node
    resources:
      - name: cpu
        capacity: 20
        allocatable: 16
        available: 10
      - name: vendor/nic1
        capacity: 3
        allocatable: 3
        available: 3
  - name: node-1
    type: Node
    resources:
      - name: cpu
        capacity: 30
        allocatable: 30
        available: 15
      - name: vendor/nic2
        capacity: 6
        allocatable: 6
        available: 6
  - name: node-2
    type: Node
    resources:
      - name: cpu
        capacity: 30
        allocatable: 30
        available: 15
      - name: vendor/nic1
        capacity: 3
        allocatable: 3
        available: 3
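To see how the per-zone figures add up for a node, the available values of one resource can be totaled across zones. The following is a minimal sketch that works on a manifest laid out like the example above, with unquoted numeric values. It uses line-oriented text matching rather than a YAML parser, and the function name total_available is hypothetical.

```shell
# Hypothetical helper: reads a NodeResourceTopology manifest on stdin and
# prints the total "available" value of the named resource across all zones.
#   $1 = resource name, for example cpu or vendor/nic1
total_available() {
  awk -v res="$1" '
    $1 == "-" && $2 == "name:" { current = $3 }          # start of a zone or resource entry
    $1 == "available:" && current == res { sum += $2 }   # count only the requested resource
    END { print sum + 0 }
  '
}

# Usage (topology.yaml is a placeholder for a file with the manifest above):
#   total_available cpu < topology.yaml
```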
To view available command line flags, run the nfd-topology-updater -help command. For example, in a podman container, run the following command:
$ podman run gcr.io/k8s-staging-nfd/node-feature-discovery:master nfd-topology-updater -help
The -ca-file flag is one of the three flags, together with the -cert-file and -key-file flags, that control mutual TLS authentication on the NFD Topology Updater. This flag specifies the TLS root certificate that is used for verifying the authenticity of nfd-master.
Default: empty
$ nfd-topology-updater -ca-file=/opt/nfd/ca.crt -cert-file=/opt/nfd/updater.crt -key-file=/opt/nfd/updater.key
The -cert-file flag is one of the three flags, together with the -ca-file and -key-file flags, that control mutual TLS authentication on the NFD Topology Updater. This flag specifies the TLS certificate presented for authenticating outgoing requests.
Default: empty
$ nfd-topology-updater -cert-file=/opt/nfd/updater.crt -key-file=/opt/nfd/updater.key -ca-file=/opt/nfd/ca.crt
The -help flag prints usage and exits.
The -key-file flag is one of the three flags, together with the -ca-file and -cert-file flags, that control mutual TLS authentication on the NFD Topology Updater. This flag specifies the private key corresponding to the given certificate file, or -cert-file, that is used for authenticating outgoing requests.
Default: empty
$ nfd-topology-updater -key-file=/opt/nfd/updater.key -cert-file=/opt/nfd/updater.crt -ca-file=/opt/nfd/ca.crt
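Each of the three TLS examples passes all three flags at once. A small wrapper can enforce that habit before the daemon is started. The following is a minimal sketch, assuming the flags are only meaningful as a set, as the examples suggest. The function name tls_updater_command is hypothetical, and the command is printed rather than executed.

```shell
# Hypothetical helper: prints the nfd-topology-updater command line with the
# three TLS flags, or fails if any of the three file paths is missing.
#   $1 = CA file, $2 = certificate file, $3 = key file
tls_updater_command() {
  ca=$1; cert=$2; key=$3
  if [ -z "$ca" ] || [ -z "$cert" ] || [ -z "$key" ]; then
    echo "ca, cert, and key files must all be provided" >&2
    return 1
  fi
  printf 'nfd-topology-updater -ca-file=%s -cert-file=%s -key-file=%s\n' "$ca" "$cert" "$key"
}

# Usage:
#   tls_updater_command /opt/nfd/ca.crt /opt/nfd/updater.crt /opt/nfd/updater.key
```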
The -kubelet-config-file flag specifies the path to the Kubelet’s configuration file.
Default: /host-var/lib/kubelet/config.yaml
$ nfd-topology-updater -kubelet-config-file=/var/lib/kubelet/config.yaml
The -no-publish
flag disables all communication with the nfd-master, making it a dry run flag for nfd-topology-updater. NFD Topology Updater runs resource hardware topology detection normally, but no CR requests are sent to nfd-master.
Default: false
$ nfd-topology-updater -no-publish
The -oneshot
flag causes the NFD Topology Updater to exit after one pass of resource hardware topology detection.
Default: false
$ nfd-topology-updater -oneshot -no-publish
The -podresources-socket
flag specifies the path to the Unix socket where kubelet exports a gRPC service to enable discovery of in-use CPUs and devices, and to provide metadata for them.
Default: /host-var/lib/kubelet/pod-resources/kubelet.sock
$ nfd-topology-updater -podresources-socket=/var/lib/kubelet/pod-resources/kubelet.sock
The -server
flag specifies the address of the nfd-master endpoint to connect to.
Default: localhost:8080
$ nfd-topology-updater -server=nfd-master.nfd.svc.cluster.local:443
The -server-name-override flag specifies the common name (CN) to expect from the nfd-master TLS certificate. This flag is mostly intended for development and debugging purposes.
Default: empty
$ nfd-topology-updater -server-name-override=localhost
The -sleep-interval flag specifies the interval between resource hardware topology re-examination and custom resource updates. A non-positive value implies an infinite sleep interval, and no re-detection is done.
Default: 60s
$ nfd-topology-updater -sleep-interval=1h
The -version flag prints the version and exits.
The -watch-namespace flag specifies the namespace to ensure that resource hardware topology examination only happens for the pods running in the specified namespace. Pods that are not running in the specified namespace are not considered during resource accounting. This is particularly useful for testing and debugging purposes. A * value means that all of the pods across all namespaces are considered during the accounting process.
Default: *
$ nfd-topology-updater -watch-namespace=rte