Use zero touch provisioning (ZTP) to provision distributed units at new edge sites in a disconnected environment. The workflow starts when the site is connected to the network and ends with the CNF workload deployed and running on the site nodes.
ZTP for RAN deployments is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process. For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.
Telco edge computing presents extraordinary challenges with managing hundreds to tens of thousands of clusters in hundreds of thousands of locations. These challenges require fully-automated management solutions with, as closely as possible, zero human interaction.
Zero touch provisioning (ZTP) allows you to provision new edge sites with declarative configurations of bare-metal equipment at remote sites. Template or overlay configurations install OpenShift Container Platform features that are required for CNF workloads. End-to-end functional test suites are used to verify CNF related features. All configurations are declarative in nature.
You start the workflow by creating declarative configurations for ISO images that are delivered to the edge nodes to begin the installation process. The images are used to repeatedly provision large numbers of nodes efficiently and quickly, allowing you to keep up with requirements from the field for far edge nodes.
Service providers are deploying a more distributed mobile network architecture allowed by the modular functional framework defined for 5G. This allows service providers to move from appliance-based radio access networks (RAN) to open cloud RAN architecture, gaining flexibility and agility in delivering services to end users.
The following diagram shows how ZTP works within a far edge framework.
ZTP uses the GitOps deployment set of practices for infrastructure deployment that allows developers to perform tasks that would otherwise fall under the purview of IT operations. GitOps achieves these tasks using declarative specifications stored in Git repositories, such as YAML files and other defined patterns, that provide a framework for deploying the infrastructure. The declarative output is leveraged by the Open Cluster Manager for multisite deployment.
One of the motivators for a GitOps approach is the requirement for reliability at scale. This is a significant challenge that GitOps helps solve.
GitOps addresses the reliability issue by providing traceability, RBAC, and a single source of truth for the desired state of each site. Scale issues are addressed by GitOps providing structure, tooling, and event driven operations through webhooks.
You can install a distributed unit (DU) on a single node at scale with Red Hat Advanced Cluster Management (RHACM), referred to here as ACM, by using the assisted installer (AI) and the policy generator with core-reduction technology enabled. The DU installation is done using zero touch provisioning (ZTP) in a disconnected environment.
ACM manages clusters in a hub and spoke architecture, where a single hub cluster manages many spoke clusters. ACM applies radio access network (RAN) policies from predefined custom resources (CRs). Hub clusters running ACM provision and deploy the spoke clusters using ZTP and AI. DU installation follows the AI installation of OpenShift Container Platform on a single node.
The AI service handles provisioning of OpenShift Container Platform on single nodes running on bare metal. ACM ships with and deploys the assisted installer when the MultiClusterHub custom resource is installed.
With ZTP and AI, you can provision OpenShift Container Platform single nodes to run your DUs at scale. A high level overview of ZTP for distributed units in a disconnected environment is as follows:
A hub cluster running ACM manages a disconnected internal registry that mirrors the OpenShift Container Platform release images. The internal registry is used to provision the spoke single nodes.
You manage the bare-metal host machines for your DUs in an inventory file that uses YAML for formatting. You store the inventory file in a Git repository.
You install the DU bare-metal host machines on site, and make the hosts ready for provisioning. To be ready for provisioning, the following is required for each bare-metal host:
Network connectivity - including DNS for your network. Hosts should be reachable through the hub and managed spoke clusters. Ensure there is layer 3 connectivity between the hub and the host where you want to install your spoke cluster.
Baseboard Management Controller (BMC) details for each host - ZTP uses the BMC URL and credentials to connect to the BMC.
Create spoke cluster definition CRs. These define the relevant elements for the managed clusters. Required CRs are as follows:
Custom Resource | Description |
---|---|
Namespace | Namespace for the managed single-node cluster. |
BMCSecret CR | Credentials for the host BMC. |
Image Pull Secret CR | Pull secret for the disconnected registry. |
AgentClusterInstall | Specifies the single-node cluster’s configuration such as networking, number of supervisor (control plane) nodes, and so on. |
ClusterDeployment | Defines the cluster name, domain, and other details. |
KlusterletAddonConfig | Manages installation and termination of add-ons on the ManagedCluster for ACM. |
ManagedCluster | Describes the managed cluster for ACM. |
InfraEnv | Describes the installation ISO to be mounted on the destination node that the assisted installer service creates. This is the final step of the manifest creation phase. |
BareMetalHost | Describes the details of the bare-metal host, including BMC and credentials details. |
When a change is detected in the host inventory repository, a host management event is triggered to provision the new or updated host.
The host is provisioned. When the host is provisioned and successfully rebooted, the host agent reports Ready status to the hub cluster.
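As a rough illustration of the first two CRs listed in the table earlier, the following shell sketch renders a Namespace and a BMC credentials Secret for one managed cluster. The cluster name, the `-bmc-secret` naming pattern, and the credentials are hypothetical placeholders rather than values from this guide; the sketch only prints the manifests and does not apply them.

```shell
# Hypothetical values for illustration only; substitute your own.
cluster_name='example-sno'
bmc_user='admin'
bmc_password='changeme'

# Base64-encode a value without a trailing newline or line wrapping.
b64() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

# Render a Namespace and an Opaque Secret that holds the BMC credentials.
manifest=$(cat <<EOF
apiVersion: v1
kind: Namespace
metadata:
  name: ${cluster_name}
---
apiVersion: v1
kind: Secret
metadata:
  name: ${cluster_name}-bmc-secret
  namespace: ${cluster_name}
type: Opaque
data:
  username: $(b64 "$bmc_user")
  password: $(b64 "$bmc_password")
EOF
)

# Print the manifests for review before committing them to Git.
printf '%s\n' "$manifest"
```

Keeping the rendered manifests in the Git repository, rather than applying them by hand, matches the GitOps flow described above.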
ACM deploys single-node OpenShift, which is OpenShift Container Platform installed on single nodes, leveraging zero touch provisioning (ZTP). The initial site plan is broken down into smaller components and initial configuration data is stored in a Git repository. Zero touch provisioning uses a declarative GitOps approach to deploy these nodes. The deployment of the nodes includes:
Installing the host operating system (RHCOS) on a blank server.
Deploying OpenShift Container Platform on single nodes.
Creating cluster policies and site subscriptions.
Leveraging a GitOps deployment topology for a develop once, deploy anywhere model.
Making the necessary network configurations to the server operating system.
Deploying profile Operators and performing any needed software-related configuration, such as performance profile, PTP, and SR-IOV.
Downloading images needed to run workloads (CNFs).
You use zero touch provisioning (ZTP) to deploy single-node OpenShift clusters to run distributed units (DUs) on small hardware footprints at disconnected far edge sites. A single-node cluster runs OpenShift Container Platform on top of one bare-metal host, hence the single node. Edge servers contain a single node with supervisor functions and worker functions on the same host that are deployed at low bandwidth or disconnected edge sites.
OpenShift Container Platform is configured on the single node to use workload partitioning. Workload partitioning separates cluster management workloads from user workloads and can run the cluster management workloads on a reserved set of CPUs. Workload partitioning is useful for resource-constrained environments, such as single-node production deployments, where you want to reserve most of the CPU resources for user workloads and configure OpenShift Container Platform to use fewer CPU resources within the host.
A single-node cluster hosting a DU application on a node is divided into the following configuration categories:
Common - Values are the same for all single-node cluster sites managed by a hub cluster.
Pools of sites - Common across a pool of sites where a pool size can be 1 to n.
Site specific - Likely specific to a site with no overlap with other sites, for example, a VLAN.
Site planning for distributed unit (DU) deployments is complex. The following is an overview of the tasks that you complete before the DU hosts are brought online in the production environment.
Develop a network model. The network model depends on various factors such as the size of the area of coverage, number of hosts, projected traffic load, DNS, and DHCP requirements.
Decide how many DU radio nodes are required to provide sufficient coverage and redundancy for your network.
Develop mechanical and electrical specifications for the DU host hardware.
Develop a construction plan for individual DU site installations.
Tune host BIOS settings for production, and deploy the BIOS configuration to the hosts.
Install the equipment on-site, connect hosts to the network, and apply power.
Configure on-site switches and routers.
Perform basic connectivity tests for the host machines.
Establish production network connectivity, and verify host connections to the network.
Provision and deploy on-site DU hosts at scale.
Test and verify on-site operations, performing load and scale testing of the DU hosts before finally bringing the DU infrastructure online in the live production environment.
Low latency is an integral part of the development of 5G networks. Telecommunications networks require as little signal delay as possible to ensure quality of service in a variety of critical use cases.
Low latency processing is essential for any communication with timing constraints that affect functionality and security. For example, 5G Telco applications require a guaranteed one millisecond one-way latency to meet Internet of Things (IoT) requirements. Low latency is also critical for the future development of autonomous vehicles, smart factories, and online gaming. Networks in these environments require almost a real-time flow of data.
Low latency systems are about guarantees with regards to response and processing times. This includes keeping a communication protocol running smoothly, ensuring device security with fast responses to error conditions, or just making sure a system is not lagging behind when receiving a lot of data. Low latency is key for optimal synchronization of radio transmissions.
OpenShift Container Platform enables low latency processing for DUs running on COTS hardware by using a number of technologies and specialized hardware devices:
Real-time kernel - Ensures workloads are handled with a high degree of process determinism.
CPU isolation - Avoids CPU scheduling delays and ensures CPU capacity is available consistently.
NUMA awareness - Aligns memory and huge pages with CPU and PCI devices to pin guaranteed container memory and huge pages to the NUMA node. This decreases latency and improves performance of the node.
Huge pages - Using huge page sizes improves system performance by reducing the amount of system resources required to access page tables.
Precision timing synchronization using PTP - Allows synchronization between nodes in the network with sub-microsecond accuracy.
Distributed unit (DU) hosts require the BIOS to be configured before the host can be provisioned. The BIOS configuration is dependent on the specific hardware that runs your DUs and the particular requirements of your installation.
In this Developer Preview release, configuration and tuning of BIOS for DU bare-metal host machines is the responsibility of the customer. Automatic setting of BIOS is not handled by the zero touch provisioning workflow.
Set the UEFI/BIOS Boot Mode to UEFI.
In the host boot sequence order, set Hard drive first.
Apply the specific BIOS configuration for your hardware. The following table describes a representative BIOS configuration for an Intel Xeon Skylake or Intel Cascade Lake server, based on the Intel FlexRAN 4G and 5G baseband PHY reference design.
The exact BIOS configuration depends on your specific hardware and network requirements. The following sample configuration is for illustrative purposes only.
BIOS Setting | Configuration |
---|---|
CPU Power and Performance Policy | Performance |
Uncore Frequency Scaling | Disabled |
Performance P-limit | Disabled |
Enhanced Intel SpeedStep® Tech | Enabled |
Intel Configurable TDP | Enabled |
Configurable TDP Level | Level 2 |
Intel® Turbo Boost Technology | Enabled |
Energy Efficient Turbo | Disabled |
Hardware P-States | Disabled |
Package C-State | C0/C1 state |
C1E | Disabled |
Processor C6 | Disabled |
Enable global SR-IOV and VT-d settings in the BIOS for the host. These settings are relevant to bare-metal environments.
Before you can provision distributed units (DU) at scale, you must install Red Hat Advanced Cluster Management (RHACM), which handles the provisioning of the DUs.
RHACM is deployed as an Operator on the OpenShift Container Platform hub cluster. It controls clusters and applications from a single console with built-in security policies. RHACM provisions and manages your DU hosts. To install RHACM in a disconnected environment, you create a mirror registry that mirrors the Operator Lifecycle Manager (OLM) catalog that contains the required Operator images. OLM manages, installs, and upgrades Operators and their dependencies in the cluster.
You also use a disconnected mirror host to serve the RHCOS ISO and RootFS disk images that provision the DU bare-metal host operating system.
Before you install a cluster on infrastructure that you provision in a restricted network, you must mirror the required container images into that environment. You can also use this procedure in unrestricted networks to ensure your clusters only use container images that have satisfied your organizational controls on external content.
You must have access to the internet to obtain the necessary container images. In this procedure, you place the mirror registry on a mirror host that has access to both your network and the internet. If you do not have access to a mirror host, use the disconnected procedure to copy images to a device that you can move across network boundaries.
You must have a container image registry that supports Docker v2-2 in the location that will host the OpenShift Container Platform cluster, such as Red Hat Quay, JFrog Artifactory, Sonatype Nexus Repository, or Harbor.
If you have an entitlement to Red Hat Quay, see the documentation on deploying Red Hat Quay for proof-of-concept purposes or by using the Quay Operator. If you need additional assistance selecting and installing a registry, contact your sales representative or Red Hat support.
Red Hat does not test third party registries with OpenShift Container Platform.
You can mirror the images that are required for OpenShift Container Platform installation and subsequent product updates to a container mirror registry such as Red Hat Quay, JFrog Artifactory, Sonatype Nexus Repository, or Harbor. If you do not have access to a large-scale container registry, you can use the mirror registry for Red Hat OpenShift, a small-scale container registry included with OpenShift Container Platform subscriptions.
You can use any container registry that supports Docker v2-2, such as Red Hat Quay, the mirror registry for Red Hat OpenShift, Artifactory, Sonatype Nexus Repository, or Harbor. Regardless of your chosen registry, the procedure to mirror content from Red Hat hosted sites on the internet to an isolated image registry is the same. After you mirror the content, you configure each cluster to retrieve this content from your mirror registry.
The internal registry of the OpenShift Container Platform cluster cannot be used as the target registry because it does not support pushing without a tag, which is required during the mirroring process.
If choosing a container registry that is not the mirror registry for Red Hat OpenShift, it must be reachable by every machine in the clusters that you provision. If the registry is unreachable, installation, updating, or normal operations such as workload relocation might fail. For that reason, you must run mirror registries in a highly available way, and the mirror registries must at least match the production availability of your OpenShift Container Platform clusters.
When you populate your mirror registry with OpenShift Container Platform images, you can follow two scenarios. If you have a host that can access both the internet and your mirror registry, but not your cluster nodes, you can directly mirror the content from that machine. This process is referred to as connected mirroring. If you have no such host, you must mirror the images to a file system and then bring that host or removable media into your restricted environment. This process is referred to as disconnected mirroring.
For mirrored registries, to view the source of pulled images, you must review the Trying to access log entry in the CRI-O logs. Other methods to view the image pull source, such as using the crictl images command on a node, show the non-mirrored image name, even though the image is pulled from the mirrored location.
For information on viewing the CRI-O logs to view the image source, see Viewing the image pull source.
Before you perform the mirror procedure, you must prepare the host to retrieve content and push it to the remote location.
You can install the OpenShift CLI (oc) to interact with OpenShift Container Platform from a command-line interface. You can install oc on Linux, Windows, or macOS.
If you installed an earlier version of |
You can install the OpenShift CLI (oc) binary on Linux by using the following procedure.
Navigate to the OpenShift Container Platform downloads page on the Red Hat Customer Portal.
Select the appropriate version in the Version drop-down menu.
Click Download Now next to the OpenShift v4.9 Linux Client entry and save the file.
Unpack the archive:
$ tar xvf <file>
Place the oc binary in a directory that is on your PATH.
To check your PATH, execute the following command:
$ echo $PATH
After you install the OpenShift CLI, it is available using the oc command:
$ oc <command>
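Reading the PATH output by eye is error prone on hosts with long search paths. The following is a minimal sketch of a helper that checks whether a given directory is an entry in PATH; the function name `dir_on_path` and the `/usr/local/bin` target are assumptions for illustration, not part of the documented procedure.

```shell
# Return 0 when the given directory is an entry in PATH, 1 otherwise.
dir_on_path() {
  # Wrap PATH in colons so the first and last entries match like any other.
  case ":${PATH}:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# /usr/local/bin is only an example target directory for the oc binary.
if dir_on_path /usr/local/bin; then
  echo "/usr/local/bin is on PATH"
else
  echo "/usr/local/bin is not on PATH"
fi
```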
You can install the OpenShift CLI (oc) binary on Windows by using the following procedure.
Navigate to the OpenShift Container Platform downloads page on the Red Hat Customer Portal.
Select the appropriate version in the Version drop-down menu.
Click Download Now next to the OpenShift v4.9 Windows Client entry and save the file.
Unzip the archive with a ZIP program.
Move the oc binary to a directory that is on your PATH.
To check your PATH, open the command prompt and execute the following command:
C:\> path
After you install the OpenShift CLI, it is available using the oc command:
C:\> oc <command>
You can install the OpenShift CLI (oc) binary on macOS by using the following procedure.
Navigate to the OpenShift Container Platform downloads page on the Red Hat Customer Portal.
Select the appropriate version in the Version drop-down menu.
Click Download Now next to the OpenShift v4.9 MacOSX Client entry and save the file.
Unpack and unzip the archive.
Move the oc binary to a directory on your PATH.
To check your PATH, open a terminal and execute the following command:
$ echo $PATH
After you install the OpenShift CLI, it is available using the oc command:
$ oc <command>
Create a container image registry credentials file that allows mirroring images from Red Hat to your mirror.
You configured a mirror registry to use in your disconnected environment.
Complete the following steps on the installation host:
Download your registry.redhat.io pull secret from the Red Hat OpenShift Cluster Manager and save it to a .json file.
Generate the base64-encoded user name and password or token for your mirror registry:
$ echo -n '<user_name>:<password>' | base64 -w0 (1)
BGVtbYk3ZHAtqXs=
1 | For <user_name> and <password>, specify the user name and password that you configured for your registry. |
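A quick way to confirm that the encoded value is what you expect is to decode it again. The sketch below uses hypothetical credentials (`demo_user` and `demo_password`) and pipes the output through `tr -d '\n'` instead of relying on the GNU-specific `-w0` flag, which is an assumption made here for portability.

```shell
# Hypothetical credentials, for illustration only.
mirror_user='demo_user'
mirror_password='demo_password'

# Encode "<user_name>:<password>" with no trailing newline in the input
# and no line wrapping in the output.
encoded_auth=$(printf '%s:%s' "$mirror_user" "$mirror_password" | base64 | tr -d '\n')

# Decode the value again to confirm the round trip.
decoded_auth=$(printf '%s' "$encoded_auth" | base64 -d)

echo "encoded: ${encoded_auth}"
echo "decoded: ${decoded_auth}"
```

Using `printf` rather than `echo` avoids accidentally encoding a trailing newline, which would produce credentials that the registry rejects.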
Make a copy of your pull secret in JSON format:
$ cat ./pull-secret.text | jq . > <path>/<pull_secret_file_in_json> (1)
1 | Specify the path to the folder to store the pull secret in and a name for the JSON file that you create. |
Save the file either as ~/.docker/config.json or $XDG_RUNTIME_DIR/containers/auth.json.
The contents of the file resemble the following example:
{
  "auths": {
    "cloud.openshift.com": {
      "auth": "b3BlbnNo...",
      "email": "you@example.com"
    },
    "quay.io": {
      "auth": "b3BlbnNo...",
      "email": "you@example.com"
    },
    "registry.connect.redhat.com": {
      "auth": "NTE3Njg5Nj...",
      "email": "you@example.com"
    },
    "registry.redhat.io": {
      "auth": "NTE3Njg5Nj...",
      "email": "you@example.com"
    }
  }
}
Edit the new file and add a section that describes your registry to it:
  "auths": {
    "<mirror_registry>": { (1)
      "auth": "<credentials>", (2)
      "email": "you@example.com"
    }
  },
1 | For <mirror_registry>, specify the registry domain name, and optionally the port, that your mirror registry uses to serve content. For example, registry.example.com or registry.example.com:8443 |
2 | For <credentials>, specify the base64-encoded user name and password for the mirror registry. |
The file resembles the following example:
{
  "auths": {
    "registry.example.com": {
      "auth": "BGVtbYk3ZHAtqXs=",
      "email": "you@example.com"
    },
    "cloud.openshift.com": {
      "auth": "b3BlbnNo...",
      "email": "you@example.com"
    },
    "quay.io": {
      "auth": "b3BlbnNo...",
      "email": "you@example.com"
    },
    "registry.connect.redhat.com": {
      "auth": "NTE3Njg5Nj...",
      "email": "you@example.com"
    },
    "registry.redhat.io": {
      "auth": "NTE3Njg5Nj...",
      "email": "you@example.com"
    }
  }
}
Mirror the OpenShift Container Platform image repository to your registry to use during cluster installation or upgrade.
Your mirror host has access to the internet.
You configured a mirror registry to use in your restricted network and can access the certificate and credentials that you configured.
You downloaded the pull secret from the Red Hat OpenShift Cluster Manager and modified it to include authentication to your mirror repository.
If you use self-signed certificates that do not set a Subject Alternative Name, you must precede the oc commands in this procedure with GODEBUG=x509ignoreCN=0. If you do not set this variable, the oc commands will fail with the following error:
x509: certificate relies on legacy Common Name field, use SANs or temporarily enable Common Name matching with GODEBUG=x509ignoreCN=0
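Preceding a command with a variable assignment sets that variable only for the one command, so the rest of the shell session is unaffected. The following sketch wraps this in a small helper; the function name `run_with_cn_matching` is hypothetical, and `sh -c` stands in for an `oc` command so that the example runs without a cluster.

```shell
# Run any command with Common Name matching temporarily enabled.
# The assignment applies only to the command that follows it.
run_with_cn_matching() {
  GODEBUG=x509ignoreCN=0 "$@"
}

# Stand-in for an oc command: print the value that the child process sees.
run_with_cn_matching sh -c 'echo "child sees GODEBUG=${GODEBUG}"'
```

In the mirroring procedure you would presumably pass the `oc adm release mirror` command and its arguments to the helper instead of `sh -c`.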
Complete the following steps on the mirror host:
Review the OpenShift Container Platform downloads page to determine the version of OpenShift Container Platform that you want to install and determine the corresponding tag on the Repository Tags page.
Set the required environment variables:
Export the release version:
$ OCP_RELEASE=<release_version>
For <release_version>, specify the tag that corresponds to the version of OpenShift Container Platform to install, such as 4.5.4.
Export the local registry name and host port:
$ LOCAL_REGISTRY='<local_registry_host_name>:<local_registry_host_port>'
For <local_registry_host_name>, specify the registry domain name for your mirror repository, and for <local_registry_host_port>, specify the port that it serves content on.
Export the local repository name:
$ LOCAL_REPOSITORY='<local_repository_name>'
For <local_repository_name>, specify the name of the repository to create in your registry, such as ocp4/openshift4.
Export the name of the repository to mirror:
$ PRODUCT_REPO='openshift-release-dev'
For a production release, you must specify openshift-release-dev.
Export the path to your registry pull secret:
$ LOCAL_SECRET_JSON='<path_to_pull_secret>'
For <path_to_pull_secret>, specify the absolute path to and file name of the pull secret for your mirror registry that you created.
Export the release mirror:
$ RELEASE_NAME="ocp-release"
For a production release, you must specify ocp-release.
Export the type of architecture for your server, such as x86_64:
$ ARCHITECTURE=<server_architecture>
Export the path to the directory to host the mirrored images:
$ REMOVABLE_MEDIA_PATH=<path> (1)
1 | Specify the full path, including the initial forward slash (/) character. |
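Because the mirroring commands combine several of these variables into image references, it can help to check that none is empty and to preview the resulting references first. The sketch below uses hypothetical example values (including the `4.9.0` release tag and the `registry.example.com:8443` host) and only prints the references; it does not contact any registry.

```shell
# Hypothetical example values; substitute your own.
OCP_RELEASE='4.9.0'
LOCAL_REGISTRY='registry.example.com:8443'
LOCAL_REPOSITORY='ocp4/openshift4'
PRODUCT_REPO='openshift-release-dev'
RELEASE_NAME='ocp-release'
ARCHITECTURE='x86_64'

# Collect the names of any variables that are unset or empty.
missing_vars=''
for var_name in OCP_RELEASE LOCAL_REGISTRY LOCAL_REPOSITORY PRODUCT_REPO RELEASE_NAME ARCHITECTURE; do
  eval "var_value=\${${var_name}:-}"
  if [ -z "$var_value" ]; then
    missing_vars="${missing_vars} ${var_name}"
  fi
done

if [ -n "$missing_vars" ]; then
  echo "Unset variables:${missing_vars}" >&2
fi

# Compose the references in the same form that the mirror commands use.
source_image="quay.io/${PRODUCT_REPO}/${RELEASE_NAME}:${OCP_RELEASE}-${ARCHITECTURE}"
target_image="${LOCAL_REGISTRY}/${LOCAL_REPOSITORY}:${OCP_RELEASE}-${ARCHITECTURE}"

echo "from: ${source_image}"
echo "to:   ${target_image}"
```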
Mirror the version images to the mirror registry:
If your mirror host does not have internet access, take the following actions:
Connect the removable media to a system that is connected to the internet.
Review the images and configuration manifests to mirror:
$ oc adm release mirror -a ${LOCAL_SECRET_JSON} \
--from=quay.io/${PRODUCT_REPO}/${RELEASE_NAME}:${OCP_RELEASE}-${ARCHITECTURE} \
--to=${LOCAL_REGISTRY}/${LOCAL_REPOSITORY} \
--to-release-image=${LOCAL_REGISTRY}/${LOCAL_REPOSITORY}:${OCP_RELEASE}-${ARCHITECTURE} --dry-run
Record the entire imageContentSources section from the output of the previous command. The information about your mirrors is unique to your mirrored repository, and you must add the imageContentSources section to the install-config.yaml file during installation.
Mirror the images to a directory on the removable media:
$ oc adm release mirror -a ${LOCAL_SECRET_JSON} --to-dir=${REMOVABLE_MEDIA_PATH}/mirror quay.io/${PRODUCT_REPO}/${RELEASE_NAME}:${OCP_RELEASE}-${ARCHITECTURE}
Take the media to the restricted network environment and upload the images to the local container registry.
$ oc image mirror -a ${LOCAL_SECRET_JSON} --from-dir=${REMOVABLE_MEDIA_PATH}/mirror "file://openshift/release:${OCP_RELEASE}*" ${LOCAL_REGISTRY}/${LOCAL_REPOSITORY} (1)
1 | For REMOVABLE_MEDIA_PATH , you must use the same path that you specified when you mirrored the images. |
If the local container registry is connected to the mirror host, take the following actions:
Directly push the release images to the local registry by using the following command:
$ oc adm release mirror -a ${LOCAL_SECRET_JSON} \
--from=quay.io/${PRODUCT_REPO}/${RELEASE_NAME}:${OCP_RELEASE}-${ARCHITECTURE} \
--to=${LOCAL_REGISTRY}/${LOCAL_REPOSITORY} \
--to-release-image=${LOCAL_REGISTRY}/${LOCAL_REPOSITORY}:${OCP_RELEASE}-${ARCHITECTURE}
This command pulls the release information as a digest, and its output includes the imageContentSources data that you require when you install your cluster.
Record the entire imageContentSources section from the output of the previous command. The information about your mirrors is unique to your mirrored repository, and you must add the imageContentSources section to the install-config.yaml file during installation.
The image name gets patched to Quay.io during the mirroring process, and the podman images will show Quay.io in the registry on the bootstrap virtual machine.
To create the installation program that is based on the content that you mirrored, extract it and pin it to the release:
If your mirror host does not have internet access, run the following command:
$ oc adm release extract -a ${LOCAL_SECRET_JSON} --command=openshift-install "${LOCAL_REGISTRY}/${LOCAL_REPOSITORY}:${OCP_RELEASE}"
If the local container registry is connected to the mirror host, run the following command:
$ oc adm release extract -a ${LOCAL_SECRET_JSON} --command=openshift-install "${LOCAL_REGISTRY}/${LOCAL_REPOSITORY}:${OCP_RELEASE}-${ARCHITECTURE}"
To ensure that you use the correct images for the version of OpenShift Container Platform that you selected, you must extract the installation program from the mirrored content. You must perform this step on a machine with an active internet connection. If you are in a disconnected environment, use the |
For clusters using installer-provisioned infrastructure, run the following command:
$ openshift-install
Before you install a cluster on infrastructure that you provision, you must create Red Hat Enterprise Linux CoreOS (RHCOS) machines for it to use. Use a disconnected mirror to host the RHCOS images you require to provision your distributed unit (DU) bare-metal hosts.
Deploy and configure an HTTP server to host the RHCOS image resources on the network. You must be able to access the HTTP server from your computer, and from the machines that you create.
The RHCOS images might not change with every release of OpenShift Container Platform. You must download images with the highest version that is less than or equal to the OpenShift Container Platform version that you install. Use the image versions that match your OpenShift Container Platform version if they are available. You require ISO and RootFS images to install RHCOS on the DU hosts. RHCOS qcow2 images are not supported for this installation type.
Log in to the mirror host.
Obtain the RHCOS ISO and RootFS images from mirror.openshift.com, for example:
Export the required image names and OpenShift Container Platform version as environment variables:
$ export ISO_IMAGE_NAME=<iso_image_name> (1)
$ export ROOTFS_IMAGE_NAME=<rootfs_image_name> (2)
$ export OCP_VERSION=<ocp_version> (3)
1 | ISO image name, for example, rhcos-4.9.0-fc.1-x86_64-live.x86_64.iso |
2 | RootFS image name, for example, rhcos-4.9.0-fc.1-x86_64-live-rootfs.x86_64.img |
3 | OpenShift Container Platform version, for example, latest-4.9 |
Download the required images:
$ sudo wget https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/pre-release/${OCP_VERSION}/${ISO_IMAGE_NAME} -O /var/www/html/${ISO_IMAGE_NAME}
$ sudo wget https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/pre-release/${OCP_VERSION}/${ROOTFS_IMAGE_NAME} -O /var/www/html/${ROOTFS_IMAGE_NAME}
Verify that the images downloaded successfully and are being served on the disconnected mirror host, for example:
$ wget http://$(hostname)/${ISO_IMAGE_NAME}
...
Saving to: rhcos-4.9.0-fc.1-x86_64-live.x86_64.iso
rhcos-4.9.0-fc.1-x86_64- 11%[====> ] 10.01M 4.71MB/s
...
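The download commands above build each URL from the version directory and the image name. The following sketch previews the URLs and local destinations before anything is downloaded; it reuses the example image names from the callouts and treats `/var/www/html` as the HTTP server document root, as the commands above do. The variable names `base_url`, `iso_url`, and `rootfs_url` are introduced here for illustration.

```shell
# Example values from the callouts above.
ISO_IMAGE_NAME='rhcos-4.9.0-fc.1-x86_64-live.x86_64.iso'
ROOTFS_IMAGE_NAME='rhcos-4.9.0-fc.1-x86_64-live-rootfs.x86_64.img'
OCP_VERSION='latest-4.9'

# Directory on mirror.openshift.com that holds the images for this version.
base_url="https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/pre-release/${OCP_VERSION}"

iso_url="${base_url}/${ISO_IMAGE_NAME}"
rootfs_url="${base_url}/${ROOTFS_IMAGE_NAME}"

# Print a source and destination pair for each image without downloading.
for image_name in "$ISO_IMAGE_NAME" "$ROOTFS_IMAGE_NAME"; do
  echo "${base_url}/${image_name} -> /var/www/html/${image_name}"
done
```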
You use Red Hat Advanced Cluster Management (RHACM) on a hub cluster in the disconnected environment to manage the deployment of distributed unit (DU) profiles on multiple managed spoke clusters.
Install the OpenShift Container Platform CLI (oc).
Log in as a user with cluster-admin privileges.
Configure a disconnected mirror registry for use in the cluster.
If you want to deploy Operators to the spoke clusters, you must also add them to this registry. See Mirroring an Operator catalog for more information.
Install RHACM on the hub cluster in the disconnected environment. See Installing RHACM in a disconnected environment.
The Assisted Installer Service (AIS) deploys OpenShift Container Platform clusters. Red Hat Advanced Cluster Management (RHACM) ships with AIS. AIS is deployed when you enable the MultiClusterHub Operator on the RHACM hub cluster.
For distributed units (DUs), RHACM supports OpenShift Container Platform deployments that run on a single bare-metal host. The single-node cluster acts as both a control plane and a worker node.
Install OpenShift Container Platform 4.9 on a hub cluster.
Install RHACM and create the MultiClusterHub resource.
Create persistent volume custom resources (CRs) for database and file system storage.
You have installed the OpenShift CLI (oc).
Modify the HiveConfig resource to enable the feature gate for Assisted Installer:
$ oc patch hiveconfig hive --type merge -p '{"spec":{"targetNamespace":"hive","logLevel":"debug","featureGates":{"custom":{"enabled":["AlphaAgentInstallStrategy"]},"featureSet":"Custom"}}}'
Modify the Provisioning resource to allow the Bare Metal Operator to watch all namespaces:
$ oc patch provisioning provisioning-configuration --type merge -p '{"spec":{"watchAllNamespaces": true }}'
Create the AgentServiceConfig CR.
Save the following YAML in the agent_service_config.yaml file:
apiVersion: agent-install.openshift.io/v1beta1
kind: AgentServiceConfig
metadata:
  name: agent
spec:
  databaseStorage:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: <db_volume_size> (1)
  filesystemStorage:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: <fs_volume_size> (2)
  osImages: (3)
    - openshiftVersion: "<ocp_version>" (4)
      version: "<ocp_release_version>" (5)
      url: "<iso_url>" (6)
      rootFSUrl: "<root_fs_url>" (7)
      cpuArchitecture: "x86_64"
1 | Volume size for the databaseStorage field, for example, 10Gi. |
2 | Volume size for the filesystemStorage field, for example, 20Gi. |
3 | List of OS image details. The example describes a single OpenShift Container Platform OS version. |
4 | OpenShift Container Platform version to install, for example, 4.8. |
5 | Specific install version, for example, 47.83.202103251640-0. |
6 | ISO URL, for example, https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/4.7/4.7.7/rhcos-4.7.7-x86_64-live.x86_64.iso |
7 | Root FS image URL, for example, https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/4.7/4.7.7/rhcos-live-rootfs.x86_64.img |
Create the AgentServiceConfig CR by running the following command:
$ oc create -f agent_service_config.yaml
agentserviceconfig.agent-install.openshift.io/agent created
Zero touch provisioning (ZTP) uses custom resource (CR) objects to extend the Kubernetes API or introduce your own API into a project or a cluster. These CRs contain the site-specific data required to install and configure a cluster for RAN applications.
A custom resource definition (CRD) file defines your own object kinds. Deploying a CRD into the managed cluster causes the Kubernetes API server to begin serving the specified CR for the entire lifecycle.
For each CR in the <site>.yaml file on the managed cluster, ZTP uses the data to create installation CRs in a directory named for the cluster.
ZTP provides two ways for defining and installing CRs on managed clusters: a manual approach when you are provisioning a single cluster and an automated approach when provisioning multiple clusters.
Use this method when you are creating CRs for a single cluster. This is a good way to test your CRs before deploying on a larger scale.
Use the automated SiteConfig method when you are installing multiple managed clusters, for example, in batches of up to 100 clusters. SiteConfig uses ArgoCD as the engine for the GitOps method of site deployment. After completing a site plan that contains all of the required parameters for deployment, a policy generator creates the manifests and applies them to the hub cluster.
Both methods create the CRs shown in the following table. On the cluster site, an automated Discovery image ISO file creates a directory with the site name and a file with the cluster name. Every cluster has its own namespace, and all of the CRs are under that namespace. The namespace and the CR names match the cluster name.
Resource | Description | Usage |
---|---|---|
BareMetalHost | Contains the connection information for the Baseboard Management Controller (BMC) of the target bare-metal host. | Provides access to the BMC in order to load and boot the Discovery image ISO on the target server by using the Redfish protocol. |
InfraEnv | Contains information for pulling OpenShift Container Platform onto the target bare-metal host. | Used with ClusterDeployment to generate the Discovery ISO for the managed cluster. |
AgentClusterInstall | Specifies the managed cluster’s configuration such as networking and the number of supervisor (control plane) nodes. | Specifies the managed cluster configuration information and provides status during the installation of the cluster. |
ClusterDeployment | References the AgentClusterInstall resource. | Used with the InfraEnv resource. |
NMStateConfig | Provides network configuration information. | Sets up a static IP address for the managed cluster’s Kube API server. |
Agent | Contains hardware information about the target bare-metal host. | Created automatically on the hub when the target machine’s Discovery image ISO boots. |
ManagedCluster | When a cluster is managed by the hub, it must be imported and known. This Kubernetes object provides that interface. | The hub uses this resource to manage and show the status of managed clusters. |
KlusterletAddonConfig | Contains the list of services provided by the hub to be deployed to a ManagedCluster. | Tells the hub which addon services to deploy to a ManagedCluster. |
Namespace | Logical space for ManagedCluster resources. | Propagates resources to the ManagedCluster. |
Secret | Two custom resources are created: BMC Secret and Image Pull Secret. | |
ClusterImageSet | Contains OpenShift Container Platform image information such as the repository and image name. | Passed into resources to provide OpenShift Container Platform images. |
This procedure tells you how to manually create and deploy a single managed cluster. If you are creating multiple clusters, perhaps hundreds, use the SiteConfig method described in “Creating ZTP custom resources for multiple managed clusters”.
Enable Assisted Installer Service.
Ensure network connectivity:
The container within the hub must be able to reach the Baseboard Management Controller (BMC) address of the target bare-metal host.
The managed cluster must be able to resolve and reach the hub’s API hostname and *.app hostname.
Example of the hub’s API and *.app hostname:
console-openshift-console.apps.hub-cluster.internal.domain.com
api.hub-cluster.internal.domain.com
The hub must be able to resolve and reach the API and *.app hostname of the managed cluster.
Here is an example of the managed cluster’s API and *.app hostname:
console-openshift-console.apps.sno-managed-cluster-1.internal.domain.com
api.sno-managed-cluster-1.internal.domain.com
A DNS server that is reachable by IP from the target bare-metal host.
A target bare-metal host for the managed cluster with the following hardware minimums:
4 CPU or 8 vCPU
32 GiB RAM
120 GiB Disk for root filesystem
When working in a disconnected environment, the release image needs to be mirrored. Use this command to mirror the release image:
$ oc adm release mirror -a <pull_secret.json> \
    --from=quay.io/openshift-release-dev/ocp-release:{{ mirror_version_spoke_release }} \
    --to={{ provisioner_cluster_registry }}/ocp4 \
    --to-release-image={{ provisioner_cluster_registry }}/ocp4:{{ mirror_version_spoke_release }}
You mirrored the ISO and rootfs used to generate the spoke cluster ISO to an HTTP server and configured the settings to pull images from there.
The images must match the version of the ClusterImageSet. To deploy a 4.9.0 version, the rootfs and ISO need to be set at 4.9.0.
Create a ClusterImageSet for each specific cluster version that needs to be deployed. A ClusterImageSet has the following format:
apiVersion: hive.openshift.io/v1
kind: ClusterImageSet
metadata:
  name: openshift-4.9.0-rc.0 (1)
spec:
  releaseImage: quay.io/openshift-release-dev/ocp-release:4.9.0-x86_64 (2)
1 | The descriptive version that you want to deploy. |
2 | Points to the specific release image to deploy. |
Create the Namespace definition for the managed cluster:
apiVersion: v1
kind: Namespace
metadata:
  name: <cluster_name> (1)
  labels:
    name: <cluster_name> (1)
1 | The name of the managed cluster to provision. |
Create the BMC Secret custom resource:
apiVersion: v1
data:
  password: <bmc_password> (1)
  username: <bmc_username> (2)
kind: Secret
metadata:
  name: <cluster_name>-bmc-secret
  namespace: <cluster_name>
type: Opaque
1 | The password to the target bare-metal host. Must be base-64 encoded. |
2 | The username to the target bare-metal host. Must be base-64 encoded. |
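The base-64 encoding required for the BMC Secret values can be produced on the command line. The following is a minimal sketch; the credentials `admin` and `changeme` are hypothetical placeholders, not values from this procedure.

```shell
# Hypothetical example credentials; replace with the real BMC values.
bmc_username='admin'
bmc_password='changeme'

# printf '%s' avoids the trailing newline that echo would add,
# which would otherwise become part of the encoded value.
encoded_username=$(printf '%s' "$bmc_username" | base64)
encoded_password=$(printf '%s' "$bmc_password" | base64)

echo "username: $encoded_username"
echo "password: $encoded_password"
```

The printed values can be pasted into the `username` and `password` fields of the Secret.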
Create the Image Pull Secret custom resource:
apiVersion: v1
data:
  .dockerconfigjson: <pull_secret> (1)
kind: Secret
metadata:
  name: assisted-deployment-pull-secret
  namespace: <cluster_name>
type: kubernetes.io/dockerconfigjson
1 | The OpenShift Container Platform pull secret. Must be base-64 encoded. |
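Pull secrets are long enough that some `base64` implementations wrap the encoded output across several lines, which does not fit a single-line YAML value. The following sketch, which uses a hypothetical stand-in file rather than a real pull secret, encodes the file on one line and checks that it decodes back to the original content.

```shell
# Hypothetical stand-in for a real pull secret file.
pull_secret_file=$(mktemp)
printf '%s' '{"auths":{"registry.example.com:5000":{"auth":"dXNlcjpwYXNzd29yZA==","email":"user@example.com"}}}' > "$pull_secret_file"

# Encode the file and strip any line wrapping added by base64.
encoded_pull_secret=$(base64 < "$pull_secret_file" | tr -d '\n')

# Round-trip check: decoding must reproduce the original file content.
decoded_pull_secret=$(printf '%s' "$encoded_pull_secret" | base64 -d)

echo ".dockerconfigjson: $encoded_pull_secret"
```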
Create the AgentClusterInstall custom resource:
apiVersion: extensions.hive.openshift.io/v1beta1
kind: AgentClusterInstall
metadata:
  # Only include the annotation if using OVN, otherwise omit the annotation
  annotations:
    agent-install.openshift.io/install-config-overrides: '{"networking":{"networkType":"OVNKubernetes"}}'
  name: <cluster_name>
  namespace: <cluster_name>
spec:
  clusterDeploymentRef:
    name: <cluster_name>
  imageSetRef:
    name: <cluster_image_set> (1)
  networking:
    clusterNetwork:
      - cidr: <cluster_network_cidr> (2)
        hostPrefix: 23
    machineNetwork:
      - cidr: <machine_network_cidr> (3)
    serviceNetwork:
      - <service_network_cidr> (4)
  provisionRequirements:
    controlPlaneAgents: 1
    workerAgents: 0
  sshPublicKey: <public_key> (5)
1 | The name of the ClusterImageSet custom resource used to install OpenShift Container Platform on the bare-metal host. |
2 | A block of IPv4 or IPv6 addresses in CIDR notation used for communication among cluster nodes. |
3 | A block of IPv4 or IPv6 addresses in CIDR notation used for the target bare-metal host external communication. Also used to determine the API and Ingress VIP addresses when provisioning DU single-node clusters. |
4 | A block of IPv4 or IPv6 addresses in CIDR notation used for cluster services internal communication. |
5 | Entered as plain text. You can use the public key to SSH into the node after it has finished installing. |
If you want to configure a static IP for the managed cluster at this point, see the procedure in this document for configuring static IP addresses for managed clusters.
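The `hostPrefix` value in the AgentClusterInstall example determines how much of the cluster network each node receives. The following arithmetic sketch works through `hostPrefix: 23` for IPv4. The `/14` cluster network prefix is an assumption chosen for illustration, because the example leaves `<cluster_network_cidr>` as a placeholder.

```shell
host_prefix=23            # hostPrefix from the AgentClusterInstall example
cluster_prefix=14         # assumed prefix length of <cluster_network_cidr>
address_bits=32           # length of an IPv4 address

# Each node receives one /23 subnet: 2^(32-23) addresses.
addresses_per_node=$(( 1 << (address_bits - host_prefix) ))

# The cluster network can be split into 2^(23-14) such subnets.
node_subnets=$(( 1 << (host_prefix - cluster_prefix) ))

echo "addresses per node: $addresses_per_node"
echo "node subnets available: $node_subnets"
```

For a single-node DU cluster only one of these node subnets is used, so a much smaller cluster network would also satisfy the arithmetic.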
Create the ClusterDeployment custom resource:
apiVersion: hive.openshift.io/v1
kind: ClusterDeployment
metadata:
  name: <cluster_name>
  namespace: <cluster_name>
spec:
  baseDomain: <base_domain> (1)
  clusterInstallRef:
    group: extensions.hive.openshift.io
    kind: AgentClusterInstall
    name: <cluster_name>
    version: v1beta1
  clusterName: <cluster_name>
  platform:
    agentBareMetal:
      agentSelector:
        matchLabels:
          cluster-name: <cluster_name>
  pullSecretRef:
    name: assisted-deployment-pull-secret
1 | The managed cluster’s base domain. |
Create the KlusterletAddonConfig custom resource:
apiVersion: agent.open-cluster-management.io/v1
kind: KlusterletAddonConfig
metadata:
  name: <cluster_name>
  namespace: <cluster_name>
spec:
  clusterName: <cluster_name>
  clusterNamespace: <cluster_name>
  clusterLabels:
    cloud: auto-detect
    vendor: auto-detect
  applicationManager:
    enabled: true
  certPolicyController:
    enabled: false
  iamPolicyController:
    enabled: false
  policyController:
    enabled: true
  searchCollector:
    enabled: false (1)
1 | Set to true to enable KlusterletAddonConfig or false to disable the KlusterletAddonConfig. Keep searchCollector disabled. |
Create the ManagedCluster custom resource:
apiVersion: cluster.open-cluster-management.io/v1
kind: ManagedCluster
metadata:
  name: <cluster_name>
spec:
  hubAcceptsClient: true
Create the InfraEnv custom resource:
apiVersion: agent-install.openshift.io/v1beta1
kind: InfraEnv
metadata:
  name: <cluster_name>
  namespace: <cluster_name>
spec:
  clusterRef:
    name: <cluster_name>
    namespace: <cluster_name>
  sshAuthorizedKey: <public_key> (1)
  agentLabels: (2)
    location: "<label-name>"
  pullSecretRef:
    name: assisted-deployment-pull-secret
1 | Entered as plain text. You can use the public key to SSH into the target bare-metal host when it boots from the ISO. |
2 | Sets a label to match. The labels apply when the agents boot. |
Create the BareMetalHost custom resource:
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
  name: <cluster_name>
  namespace: <cluster_name>
  annotations:
    inspect.metal3.io: disabled
  labels:
    infraenvs.agent-install.openshift.io: "<cluster_name>"
spec:
  bootMode: "UEFI"
  bmc:
    address: <bmc_address> (1)
    disableCertificateVerification: true
    credentialsName: <cluster_name>-bmc-secret
  bootMACAddress: <mac_address> (2)
  automatedCleaningMode: disabled
  online: true
1 | The baseboard management controller (BMC) address used to load the installation ISO on the target bare-metal host. |
2 | The MAC address of the target bare-metal host. |
Optionally, you can add bmac.agent-install.openshift.io/hostname: <host-name> as an annotation to set the managed cluster’s hostname. If you don’t add the annotation, the hostname defaults to either a hostname from the DHCP server or localhost.
After you have created the custom resources, push the entire directory of generated custom resources to the Git repository you created for storing the custom resources.
To provision additional clusters, repeat this procedure for each cluster.
Optionally, after creating the AgentClusterInstall custom resource, you can configure static IP addresses for the managed clusters.
You must create this custom resource before creating the |
Deploy and configure the AgentClusterInstall custom resource.
Create an NMStateConfig custom resource:
apiVersion: agent-install.openshift.io/v1beta1
kind: NMStateConfig
metadata:
  name: <cluster_name>
  namespace: <cluster_name>
  labels:
    sno-cluster-<cluster-name>: <cluster_name>
spec:
  config:
    interfaces:
      - name: eth0
        type: ethernet
        state: up
        ipv4:
          enabled: true
          address:
            - ip: <ip_address> (1)
              prefix-length: <public_network_prefix> (2)
          dhcp: false
    dns-resolver:
      config:
        server:
          - <dns_resolver> (3)
    routes:
      config:
        - destination: 0.0.0.0/0
          next-hop-address: <gateway> (4)
          next-hop-interface: eth0
          table-id: 254
  interfaces:
    - name: "eth0" (5)
      macAddress: <mac_address> (6)
1 | The static IP address of the target bare-metal host. |
2 | The static IP address’s subnet prefix for the target bare-metal host. |
3 | The DNS server for the target bare-metal host. |
4 | The gateway for the target bare-metal host. |
5 | Must match the name specified in the interfaces section. |
6 | The MAC address of the interface. |
When creating the BareMetalHost custom resource, ensure that one of its MAC addresses matches a MAC address in the NMStateConfig custom resource for the target bare-metal host.
When creating the InfraEnv custom resource, reference the label from the NMStateConfig custom resource in the InfraEnv custom resource:
apiVersion: agent-install.openshift.io/v1beta1
kind: InfraEnv
metadata:
  name: <cluster_name>
  namespace: <cluster_name>
spec:
  clusterRef:
    name: <cluster_name>
    namespace: <cluster_name>
  sshAuthorizedKey: <public_key>
  agentLabels: (1)
    location: "<label-name>"
  pullSecretRef:
    name: assisted-deployment-pull-secret
  nmStateConfigLabelSelector:
    matchLabels:
      sno-cluster-<cluster-name>: <cluster_name> # Match this label
1 | Sets a label to match. The labels apply when the agents boot. |
After you create the custom resources, the following actions happen automatically:
A Discovery image ISO file is generated and booted on the target machine.
When the ISO file successfully boots on the target machine, it reports the hardware information of the target machine.
After all hosts are discovered, OpenShift Container Platform is installed.
When OpenShift Container Platform finishes installing, the hub installs the klusterlet service on the target cluster.
The requested add-on services are installed on the target cluster.
The Discovery image ISO process finishes when the Agent custom resource is created on the hub for the managed cluster.
Ensure that cluster provisioning was successful by checking the cluster status.
All of the custom resources have been configured and provisioned, and the Agent custom resource is created on the hub for the managed cluster.
Check the status of the managed cluster:
$ oc get managedcluster
True indicates the managed cluster is ready.
Check the agent status:
$ oc get agent -n <cluster_name>
Use the describe command to provide an in-depth description of the agent’s condition. Statuses to be aware of include BackendError, InputError, ValidationsFailing, InstallationFailed, and AgentIsConnected. These statuses are relevant to the Agent and AgentClusterInstall custom resources.
$ oc describe agent -n <cluster_name>
Check the cluster provisioning status:
$ oc get agentclusterinstall -n <cluster_name>
Use the describe command to provide an in-depth description of the cluster provisioning status:
$ oc describe agentclusterinstall -n <cluster_name>
Check the status of the managed cluster’s add-on services:
$ oc get managedclusteraddon -n <cluster_name>
Retrieve the authentication information of the kubeconfig file for the managed cluster:
$ oc get secret -n <cluster_name> <cluster_name>-admin-kubeconfig -o jsonpath={.data.kubeconfig} | base64 -d > <directory>/<cluster_name>-kubeconfig
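After the kubeconfig file is written, a quick way to confirm that it works is to list the nodes of the managed cluster with it. The following is a minimal sketch that assumes `oc` is on the PATH; the function name `list_managed_cluster_nodes` and the example path are hypothetical.

```shell
# Hypothetical helper: list the nodes of the managed cluster by using
# the kubeconfig file extracted in the previous step.
list_managed_cluster_nodes() {
  kubeconfig_path=$1
  oc --kubeconfig "$kubeconfig_path" get nodes
}

# Example call (the path is illustrative):
# list_managed_cluster_nodes "$HOME/clusters/sno-managed-cluster-1-kubeconfig"
```

Passing the file with `--kubeconfig` keeps the hub session in the current shell untouched, which avoids running hub commands against the managed cluster by mistake.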
After you have completed the preceding procedure, follow these steps to configure the managed cluster for a disconnected environment.
A disconnected installation of Red Hat Advanced Cluster Management (RHACM) 2.3.
Host the rootfs and iso images on an HTTPD server.
Create a ConfigMap containing the mirror registry config:
apiVersion: v1
kind: ConfigMap
metadata:
  name: assisted-installer-mirror-config
  namespace: assisted-installer
  labels:
    app: assisted-service
data:
  ca-bundle.crt: <certificate> (1)
  registries.conf: | (2)
    unqualified-search-registries = ["registry.access.redhat.com", "docker.io"]
    [[registry]]
    location = <mirror_registry_url> (3)
    insecure = false
    mirror-by-digest-only = true
1 | The mirror registry’s certificate used when creating the mirror registry. |
2 | The configuration for the mirror registry. |
3 | The URL of the mirror registry. |
This updates mirrorRegistryRef in the AgentServiceConfig custom resource, as shown below:
apiVersion: agent-install.openshift.io/v1beta1
kind: AgentServiceConfig
metadata:
  name: agent
  namespace: assisted-installer
spec:
  databaseStorage:
    volumeName: <db_pv_name>
    accessModes:
      - ReadWriteOnce
    resources:
      requests:
        storage: <db_storage_size>
  filesystemStorage:
    volumeName: <fs_pv_name>
    accessModes:
      - ReadWriteOnce
    resources:
      requests:
        storage: <fs_storage_size>
  mirrorRegistryRef:
    name: 'assisted-installer-mirror-config'
  osImages:
    - openshiftVersion: <ocp_version>
      rootFSUrl: <rootfs_url> (1)
      url: <iso_url> (1)
1 | Must match the URLs of the HTTPD server. |
For disconnected installations, you must deploy an NTP clock that is reachable through the disconnected network.
You can do this by configuring chrony to act as a server, editing the /etc/chrony.conf file, and adding the following allowed IPv6 range:
# Allow NTP client access from local network.
#allow 192.168.0.0/16
local stratum 10
bindcmdaddress ::
allow 2620:52:0:1310::/64
Optionally, when you are creating the AgentClusterInstall custom resource, you can configure IPv6 addresses for the managed clusters.
In the AgentClusterInstall custom resource, modify the IP addresses in clusterNetwork and serviceNetwork for IPv6 addresses:
apiVersion: extensions.hive.openshift.io/v1beta1
kind: AgentClusterInstall
metadata:
  # Only include the annotation if using OVN, otherwise omit the annotation
  annotations:
    agent-install.openshift.io/install-config-overrides: '{"networking":{"networkType":"OVNKubernetes"}}'
  name: <cluster_name>
  namespace: <cluster_name>
spec:
  clusterDeploymentRef:
    name: <cluster_name>
  imageSetRef:
    name: <cluster_image_set>
  networking:
    clusterNetwork:
      - cidr: "fd01::/48"
        hostPrefix: 64
    machineNetwork:
      - cidr: <machine_network_cidr>
    serviceNetwork:
      - "fd02::/112"
  provisionRequirements:
    controlPlaneAgents: 1
    workerAgents: 0
  sshPublicKey: <public_key>
Update the NMStateConfig custom resource with the IPv6 addresses you defined.
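The IPv6 prefixes in the AgentClusterInstall example can be checked with the same prefix arithmetic used for IPv4. The following sketch only works through the numbers from the example (`fd01::/48` with `hostPrefix: 64`, and `fd02::/112`); the variable names are illustrative.

```shell
cluster_prefix=48    # prefix length of the clusterNetwork cidr fd01::/48
host_prefix=64       # hostPrefix value
service_prefix=112   # prefix length of the serviceNetwork fd02::/112
address_bits=128     # length of an IPv6 address

# Number of /64 node subnets that fit into the /48 cluster network.
node_subnets=$(( 1 << (host_prefix - cluster_prefix) ))

# Number of addresses in the /112 service network.
service_addresses=$(( 1 << (address_bits - service_prefix) ))

echo "node subnets: $node_subnets"
echo "service addresses: $service_addresses"
```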
Use this procedure to diagnose any installation issues that might occur with the managed clusters.
Check the status of the managed cluster:
$ oc get managedcluster
NAME HUB ACCEPTED MANAGED CLUSTER URLS JOINED AVAILABLE AGE
SNO-cluster true True True 2d19h
If the status in the AVAILABLE column is True, the managed cluster is being managed by the hub.
If the status in the AVAILABLE column is Unknown, the managed cluster is not being managed by the hub.
Use the following steps to continue checking to get more information.
Check the AgentClusterInstall install status:
$ oc get clusterdeployment -n <cluster_name>
NAME PLATFORM REGION CLUSTERTYPE INSTALLED INFRAID VERSION POWERSTATE AGE
Sno0026 agent-baremetal false Initialized 2d14h
If the status in the INSTALLED column is false, the installation was unsuccessful.
If the installation failed, enter the following command to review the status of the AgentClusterInstall resource:
$ oc describe agentclusterinstall -n <cluster_name> <cluster_name>
Resolve the errors and reset the cluster:
Remove the cluster’s managed cluster resource:
$ oc delete managedcluster <cluster_name>
Remove the cluster’s namespace:
$ oc delete namespace <cluster_name>
This deletes all of the namespace-scoped custom resources created for this cluster. You must wait for the ManagedCluster CR deletion to complete before proceeding.
Recreate the custom resources for the managed cluster.
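The reset steps in this procedure can be wrapped in a small helper so that the namespace is only removed after the ManagedCluster deletion has returned. This is a sketch under assumptions: the function name `reset_managed_cluster` is hypothetical, and it relies on `oc delete` waiting for the object to be removed, which `--wait=true` makes explicit.

```shell
# Hypothetical helper: delete the ManagedCluster first, then the namespace.
# The && ensures the namespace is not touched if the first delete fails.
reset_managed_cluster() {
  cluster_name=$1
  oc delete managedcluster "$cluster_name" --wait=true &&
    oc delete namespace "$cluster_name" --wait=true
}

# Example call (the cluster name is illustrative):
# reset_managed_cluster sno-managed-cluster-1
```

After the helper returns successfully, the custom resources for the managed cluster can be recreated.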
Zero touch provisioning (ZTP) uses Red Hat Advanced Cluster Management (RHACM) to apply the radio access network (RAN) policies using a policy-based governance approach to automatically monitor cluster activity.
The policy generator (PolicyGen) is a Kustomize plugin that facilitates creating ACM policies from predefined custom resources. There are three main items: Policy Categorization, Source CR policy, and PolicyGenTemplate. PolicyGen relies on these to generate the policies and their placement bindings and rules.
The following diagram shows how the RAN policy generator interacts with GitOps and ACM.
RAN policies are categorized into three main groups:
A policy that exists in the Common category is applied to all clusters to be represented by the site plan.
A policy that exists in the Groups category is applied to a group of clusters. Every group of clusters could have their own policies that exist under the Groups category. For example, Groups/group1 could have its own policies that are applied to the clusters belonging to group1.
A policy that exists in the Sites category is applied to a specific cluster. Any cluster could have its own policies that exist in the Sites category. For example, Sites/cluster1 will have its own policies applied to cluster1.
The following diagram shows how policies are generated.
Source custom resource policies include the following:
SR-IOV policies
PTP policies
Performance Add-on Operator policies
MachineConfigPool policies
SCTP policies
You need to define the source custom resource that generates the ACM policy with consideration of possible overlay to its metadata or spec/data.
For example, a common-namespace-policy contains a Namespace definition that exists in all managed clusters.
This namespace is placed under the Common category and there are no changes for its spec or data across all clusters.
The following example shows the source custom resource for this namespace:
apiVersion: v1
kind: Namespace
metadata:
  name: openshift-sriov-network-operator
  labels:
    openshift.io/run-level: "1"
The generated policy that applies this namespace includes the namespace as it is defined above without any change, as shown in this example:
apiVersion: policy.open-cluster-management.io/v1
kind: Policy
metadata:
  name: common-sriov-sub-ns-policy
  namespace: common-sub
  annotations:
    policy.open-cluster-management.io/categories: CM Configuration Management
    policy.open-cluster-management.io/controls: CM-2 Baseline Configuration
    policy.open-cluster-management.io/standards: NIST SP 800-53
spec:
  remediationAction: enforce
  disabled: false
  policy-templates:
    - objectDefinition:
        apiVersion: policy.open-cluster-management.io/v1
        kind: ConfigurationPolicy
        metadata:
          name: common-sriov-sub-ns-policy-config
        spec:
          remediationAction: enforce
          severity: low
          namespaceSelector:
            exclude:
              - kube-*
            include:
              - '*'
          object-templates:
            - complianceType: musthave
              objectDefinition:
                apiVersion: v1
                kind: Namespace
                metadata:
                  labels:
                    openshift.io/run-level: "1"
                  name: openshift-sriov-network-operator
The following example shows a SriovNetworkNodePolicy definition that exists in different clusters with a different specification for each cluster.
The example also shows the source custom resource for the SriovNetworkNodePolicy:
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: sriov-nnp
  namespace: openshift-sriov-network-operator
spec:
  # The $ tells the policy generator to overlay/remove the spec.item in the generated policy.
  deviceType: $deviceType
  isRdma: false
  nicSelector:
    pfNames: [$pfNames]
  nodeSelector:
    node-role.kubernetes.io/worker: ""
  numVfs: $numVfs
  priority: $priority
  resourceName: $resourceName
The SriovNetworkNodePolicy name and namespace are the same for all clusters, so both are defined in the source SriovNetworkNodePolicy.
However, the generated policy requires input parameters such as $deviceType and $numVfs in order to adjust the policy for each cluster.
The generated policy is shown in this example:
apiVersion: policy.open-cluster-management.io/v1
kind: Policy
metadata:
  name: site-du-sno-1-sriov-nnp-mh-policy
  namespace: sites-sub
  annotations:
    policy.open-cluster-management.io/categories: CM Configuration Management
    policy.open-cluster-management.io/controls: CM-2 Baseline Configuration
    policy.open-cluster-management.io/standards: NIST SP 800-53
spec:
  remediationAction: enforce
  disabled: false
  policy-templates:
    - objectDefinition:
        apiVersion: policy.open-cluster-management.io/v1
        kind: ConfigurationPolicy
        metadata:
          name: site-du-sno-1-sriov-nnp-mh-policy-config
        spec:
          remediationAction: enforce
          severity: low
          namespaceSelector:
            exclude:
              - kube-*
            include:
              - '*'
          object-templates:
            - complianceType: musthave
              objectDefinition:
                apiVersion: sriovnetwork.openshift.io/v1
                kind: SriovNetworkNodePolicy
                metadata:
                  name: sriov-nnp-du-mh
                  namespace: openshift-sriov-network-operator
                spec:
                  deviceType: vfio-pci
                  isRdma: false
                  nicSelector:
                    pfNames:
                      - ens7f0
                  nodeSelector:
                    node-role.kubernetes.io/worker: ""
                  numVfs: 8
                  resourceName: du_mh
Defining the required input parameters as |
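The overlay behavior described for the `$` placeholders can be illustrated with a small shell sketch: a placeholder with a supplied value is replaced, and a placeholder with no value (here `$priority`) is removed from the output. This only illustrates the idea and is not how the PolicyGen plugin is implemented; the helper names `lookup_value` and `overlay_line` are hypothetical, and the values mirror the generated policy shown above.

```shell
# Illustrative per-site values; anything not listed counts as "not provided".
lookup_value() {
  case "$1" in
    deviceType)   echo "vfio-pci" ;;
    numVfs)       echo "8" ;;
    resourceName) echo "du_mh" ;;
    *)            return 1 ;;
  esac
}

# Replace "key: $name" with "key: value", or drop the line if no value exists.
overlay_line() {
  line=$1
  case "$line" in
    *'$'*)
      name=${line##*\$}                       # text after the $
      if value=$(lookup_value "$name"); then
        printf '%s\n' "${line%%\$*}$value"    # text before the $ plus the value
      fi
      ;;
    *) printf '%s\n' "$line" ;;               # no placeholder: copy unchanged
  esac
}

generated=$(
  while IFS= read -r line; do
    overlay_line "$line"
  done <<'EOF'
deviceType: $deviceType
isRdma: false
numVfs: $numVfs
priority: $priority
resourceName: $resourceName
EOF
)
printf '%s\n' "$generated"
```

The `priority` line disappears from the result because no value was supplied for it, which matches the generated policy in the example, where `priority` is absent.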
The PolicyGenTemplate.yaml file is a Custom Resource Definition (CRD) that tells PolicyGen where to categorize the generated policies and which items need to be overlaid.
The following example shows the PolicyGenTemplate.yaml file:
apiVersion: ran.openshift.io/v1
kind: PolicyGenTemplate
metadata:
  name: "group-du-sno"
  namespace: "group-du-sno"
spec:
  bindingRules:
    group-du-sno: ""
  mcp: "master"
  sourceFiles:
    - fileName: ConsoleOperatorDisable.yaml
      policyName: "console-policy"
    - fileName: ClusterLogging.yaml
      policyName: "cluster-log-policy"
      spec:
        curation:
          curator:
            schedule: "30 3 * * *"
        collection:
          logs:
            type: "fluentd"
            fluentd: {}
The group-du-ranGen.yaml file defines a group of policies under a group named group-du. This file defines a MachineConfigPool worker-du that is used as the node selector for any other policy defined in sourceFiles. An ACM policy is generated for every source file that exists in sourceFiles. In addition, a single placement binding and placement rule are generated to apply the cluster selection rule for group-du policies.
Using the source file PtpConfigSlave.yaml as an example, the PtpConfigSlave has a definition of a PtpConfig custom resource (CR). The generated policy for the PtpConfigSlave example is named group-du-ptp-config-policy. The PtpConfig CR defined in the generated group-du-ptp-config-policy is named du-ptp-slave. The spec defined in PtpConfigSlave.yaml is placed under du-ptp-slave along with the other spec items defined under the source file.
The following example shows the group-du-ptp-config-policy:
apiVersion: policy.open-cluster-management.io/v1
kind: Policy
metadata:
  name: group-du-ptp-config-policy
  namespace: groups-sub
  annotations:
    policy.open-cluster-management.io/categories: CM Configuration Management
    policy.open-cluster-management.io/controls: CM-2 Baseline Configuration
    policy.open-cluster-management.io/standards: NIST SP 800-53
spec:
  remediationAction: enforce
  disabled: false
  policy-templates:
    - objectDefinition:
        apiVersion: policy.open-cluster-management.io/v1
        kind: ConfigurationPolicy
        metadata:
          name: group-du-ptp-config-policy-config
        spec:
          remediationAction: enforce
          severity: low
          namespaceSelector:
            exclude:
              - kube-*
            include:
              - '*'
          object-templates:
            - complianceType: musthave
              objectDefinition:
                apiVersion: ptp.openshift.io/v1
                kind: PtpConfig
                metadata:
                  name: slave
                  namespace: openshift-ptp
                spec:
                  recommend:
                    - match:
                        - nodeLabel: node-role.kubernetes.io/worker-du
                      priority: 4
                      profile: slave
                  profile:
                    - interface: ens5f0
                      name: slave
                      phc2sysOpts: -a -r -n 24
                      ptp4lConf: |
                        [global]
                        #
                        # Default Data Set
                        #
                        twoStepFlag 1
                        slaveOnly 0
                        priority1 128
                        priority2 128
                        domainNumber 24
                        .....
The custom resources used to create the ACM policies should be defined with consideration of possible overlay to their metadata and spec/data. For example, if the custom resource metadata.name does not change between clusters, set the metadata.name value in the custom resource file. If the custom resource will have multiple instances in the same cluster, the custom resource metadata.name must be defined in the policy template file.
To apply the node selector for a specific machine config pool, set the node selector value to $mcp so that the policy generator overlays the $mcp value with the mcp defined in the policy template.
Subscription source files do not change.
Install Kustomize
Install the Kustomize Policy Generator plug-in
Configure the kustomization.yaml file to reference the policyGenerator.yaml file. The following example shows the PolicyGenerator definition:
apiVersion: policyGenerator/v1
kind: PolicyGenerator
metadata:
  name: acm-policy
  namespace: acm-policy-generator
# The arguments should be given and defined as below with same order --policyGenTempPath= --sourcePath= --outPath= --stdout --customResources
argsOneLiner: ./ranPolicyGenTempExamples ./sourcePolicies ./out true false
Where:
policyGenTempPath: The path to the policyGenTemp files.
sourcePath: The path to the source policies.
outPath: The path to save the generated ACM policies.
stdout: If true, prints the generated policies to the console.
customResources: If true, generates the CRs from the sourcePolicies files without ACM policies.
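Because argsOneLiner is positional, the order of the values matters. The following sketch only illustrates how the five values from the example map to the argument names listed above; the shell variable names are illustrative and are not part of the plugin.

```shell
# The value of argsOneLiner from the PolicyGenerator example.
args_one_liner='./ranPolicyGenTempExamples ./sourcePolicies ./out true false'

# Split on whitespace into positional parameters (the paths contain no spaces).
set -- $args_one_liner

policy_gen_temp_path=$1   # --policyGenTempPath
source_path=$2            # --sourcePath
out_path=$3               # --outPath
print_to_stdout=$4        # --stdout
custom_resources=$5       # --customResources

echo "policyGenTempPath=$policy_gen_temp_path"
echo "sourcePath=$source_path"
echo "outPath=$out_path"
echo "stdout=$print_to_stdout"
echo "customResources=$custom_resources"
```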
Test PolicyGen by running the following commands:
$ cd cnf-features-deploy/ztp/ztp-policy-generator/
$ XDG_CONFIG_HOME=./ kustomize build --enable-alpha-plugins
An out directory is created with the expected policies, as shown in this example:
out
├── common
│   ├── common-log-sub-ns-policy.yaml
│   ├── common-log-sub-oper-policy.yaml
│   ├── common-log-sub-policy.yaml
│   ├── common-pao-sub-catalog-policy.yaml
│   ├── common-pao-sub-ns-policy.yaml
│   ├── common-pao-sub-oper-policy.yaml
│   ├── common-pao-sub-policy.yaml
│   ├── common-policies-placementbinding.yaml
│   ├── common-policies-placementrule.yaml
│   ├── common-ptp-sub-ns-policy.yaml
│   ├── common-ptp-sub-oper-policy.yaml
│   ├── common-ptp-sub-policy.yaml
│   ├── common-sriov-sub-ns-policy.yaml
│   ├── common-sriov-sub-oper-policy.yaml
│   └── common-sriov-sub-policy.yaml
├── groups
│   ├── group-du
│   │   ├── group-du-mc-chronyd-policy.yaml
│   │   ├── group-du-mc-mount-ns-policy.yaml
│   │   ├── group-du-mcp-du-policy.yaml
│   │   ├── group-du-mc-sctp-policy.yaml
│   │   ├── group-du-policies-placementbinding.yaml
│   │   ├── group-du-policies-placementrule.yaml
│   │   ├── group-du-ptp-config-policy.yaml
│   │   └── group-du-sriov-operconfig-policy.yaml
│   └── group-sno-du
│       ├── group-du-sno-policies-placementbinding.yaml
│       ├── group-du-sno-policies-placementrule.yaml
│       ├── group-sno-du-console-policy.yaml
│       ├── group-sno-du-log-forwarder-policy.yaml
│       └── group-sno-du-log-policy.yaml
└── sites
    └── site-du-sno-1
        ├── site-du-sno-1-policies-placementbinding.yaml
        ├── site-du-sno-1-policies-placementrule.yaml
        ├── site-du-sno-1-sriov-nn-fh-policy.yaml
        ├── site-du-sno-1-sriov-nnp-mh-policy.yaml
        ├── site-du-sno-1-sriov-nw-fh-policy.yaml
        ├── site-du-sno-1-sriov-nw-mh-policy.yaml
        └── site-du-sno-1-.yaml
The common policies are flat because they will be applied to all clusters. However, the groups and sites have subdirectories for each group and site as they will be applied to different clusters.
Zero touch provisioning (ZTP) provisions clusters using a layered approach. The base components consist of Red Hat Enterprise Linux CoreOS (RHCOS), the basic operating system for the cluster, and OpenShift Container Platform. After these components are installed, the worker node can join the existing cluster. When the node has joined the existing cluster, the 5G RAN profile Operators are applied.
The following diagram illustrates this architecture.
The following RAN Operators are deployed on every cluster:
Machine Config
Precision Time Protocol (PTP)
Performance Addon Operator
SR-IOV
Local Storage Operator
Logging Operator
The Machine Config Operator enables system definitions and low-level system settings such as workload partitioning, NTP, and SCTP. This Operator is installed with OpenShift Container Platform.
A performance profile and its created products are applied to a node according to an associated machine config pool (MCP). The MCP holds valuable information about the progress of applying the machine configurations created by performance addons that encompass kernel args, kube config, huge pages allocation, and deployment of the realtime kernel (rt-kernel). The performance addons controller monitors changes in the MCP and updates the performance profile status accordingly.
The Performance Addon Operator provides the ability to enable advanced node performance tunings on a set of nodes.
OpenShift Container Platform provides a Performance Addon Operator to implement automatic tuning to achieve low latency performance for OpenShift Container Platform applications. The cluster administrator uses a performance profile configuration to make these changes more easily and reliably. The administrator can specify updating the kernel to rt-kernel, reserving CPUs for management workloads, and using CPUs for running the workloads.
The Single Root I/O Virtualization (SR-IOV) Network Operator manages the SR-IOV network devices and network attachments in your cluster.
The SR-IOV Operator allows network interfaces to be virtual and shared at a device level with networking functions running within the cluster.
The SR-IOV Network Operator adds the SriovOperatorConfig.sriovnetwork.openshift.io CustomResourceDefinition resource. The Operator automatically creates a SriovOperatorConfig custom resource named default in the openshift-sriov-network-operator namespace. The default custom resource contains the SR-IOV Network Operator configuration for your cluster.
The Precision Time Protocol (PTP) Operator configures PTP, a protocol used to synchronize clocks in a network. When used in conjunction with hardware support, PTP is capable of sub-microsecond accuracy. PTP support is divided between the kernel and user space.
The clocks synchronized by PTP are organized in a master-worker hierarchy. The workers are synchronized to their masters, which may be workers to their own masters. The hierarchy is created and updated automatically by the best master clock (BMC) algorithm, which runs on every clock. A clock that has only one port can be either master or worker, and is called an ordinary clock (OC). A clock with multiple ports can be master on one port and worker on another, and is called a boundary clock (BC). The top-level master is called the grandmaster clock, which can be synchronized by using a Global Positioning System (GPS) time source. By using a GPS-based time source, disparate networks can be synchronized with a high degree of accuracy.
If you are installing multiple managed clusters, zero touch provisioning (ZTP) uses ArgoCD and SiteConfig to manage the processes that create the custom resources (CRs) and generate and apply the policies for multiple clusters, in batches of no more than 100, using the GitOps approach.
Installing and deploying the clusters is a two-stage process, as shown here:
OpenShift Container Platform cluster version 4.8 or higher and the Red Hat GitOps Operator are installed.
Red Hat Advanced Cluster Management (RHACM) version 2.3 or above is installed.
For disconnected environments, make sure your source data Git repository and ztp-site-generator container image are accessible from the hub cluster.
If you want additional custom content, such as extra install manifests or custom resources (CRs) for policies, add them to the /usr/src/hook/ztp/source-crs/extra-manifest/ directory. Similarly, you can add additional configuration CRs, as referenced from a PolicyGenTemplate, to the /usr/src/hook/ztp/source-crs/ directory.
Create a Containerfile that adds your additional manifests to the Red Hat provided image, for example:
FROM <registry fqdn>/ztp-site-generator:latest (1)
COPY myInstallManifest.yaml /usr/src/hook/ztp/source-crs/extra-manifest/
COPY mySourceCR.yaml /usr/src/hook/ztp/source-crs/
1 | <registry fqdn> must point to a registry containing the ztp-site-generator container image provided by Red Hat. |
Build a new container image that includes these additional files:
$ podman build -f Containerfile.example .
The procedures in this section tell you how to complete the following tasks:
Prepare the Git repository you need to host site configuration data.
Configure the hub cluster for generating the required installation and policy custom resources (CR).
Deploy the managed clusters using zero touch provisioning (ZTP).
Create a Git repository for hosting site configuration data. The zero touch provisioning (ZTP) pipeline requires read access to this repository.
Create a directory structure with separate paths for the SiteConfig and PolicyGenTemplate custom resources (CRs).
Add pre-sync.yaml and post-sync.yaml from resource-hook-example/<policygentemplates>/ to the path for the PolicyGenTemplate CRs.
Add pre-sync.yaml and post-sync.yaml from resource-hook-example/<siteconfig>/ to the path for the SiteConfig CRs.
If your hub cluster operates in a disconnected environment, you must update the |
Apply the policygentemplates.ran.openshift.io and siteconfigs.ran.openshift.io CR definitions.
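As a rough sketch of the directory structure described in this procedure, the following commands create separate paths for the two CR types. The repository location and the directory names siteconfig and policygentemplates are assumptions chosen to mirror the example paths used elsewhere in this document; any names work as long as the ArgoCD applications point at them.

```shell
# Sketch only: the repository path and directory names are illustrative assumptions.
repo_dir="${REPO_DIR:-$(mktemp -d)}"

# One path per CR type, so that each ArgoCD application can track its own directory.
mkdir -p "$repo_dir/siteconfig" "$repo_dir/policygentemplates"

# List the resulting layout.
ls -1 "$repo_dir"
```

The pre-sync.yaml and post-sync.yaml hook files would then be copied into each of these directories before the first commit.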
You can configure your hub cluster with a set of ArgoCD applications that generate the required installation and policy custom resources (CR) for each site based on a zero touch provisioning (ZTP) GitOps flow.
Install the Red Hat OpenShift GitOps Operator on your hub cluster.
Extract the administrator password for ArgoCD:
$ oc get secret openshift-gitops-cluster -n openshift-gitops -o jsonpath='{.data.admin\.password}' | base64 -d
Prepare the ArgoCD pipeline configuration:
Extract the ArgoCD deployment CRs from the ZTP site generator container using the latest container image version:
$ mkdir ztp
$ podman run --rm -v `pwd`/ztp:/mnt/ztp:Z registry.redhat.io/openshift4/ztp-site-generate-rhel8:v4.9.0-1 /bin/bash -c "cp -ar /usr/src/hook/ztp/* /mnt/ztp/"
The remaining steps in this section relate to the ztp/gitops-subscriptions/argocd/ directory.
Modify the source values of the two ArgoCD applications, deployment/clusters-app.yaml and deployment/policies-app.yaml, with the appropriate URL, targetRevision branch, and path values. The path values must match those used in your Git repository.
Modify deployment/clusters-app.yaml:
apiVersion: v1
kind: Namespace
metadata:
  name: clusters-sub
---
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: clusters
  namespace: openshift-gitops
spec:
  destination:
    server: https://kubernetes.default.svc
    namespace: clusters-sub
  project: default
  source:
    path: ztp/gitops-subscriptions/argocd/resource-hook-example/siteconfig (1)
    repoURL: https://github.com/openshift-kni/cnf-features-deploy (2)
    targetRevision: master (3)
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
    - CreateNamespace=true
1 | The ztp/gitops-subscriptions/argocd/ file path that contains the siteconfig CRs for the clusters. |
2 | The URL of the Git repository that contains the siteconfig custom resources that define site configuration for installing clusters. |
3 | The branch on the Git repository that contains the relevant site configuration data. |
Modify deployment/policies-app.yaml:
apiVersion: v1
kind: Namespace
metadata:
  name: policies-sub
---
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: policies
  namespace: openshift-gitops
spec:
  destination:
    server: https://kubernetes.default.svc
    namespace: policies-sub
  project: default
  source:
    directory:
      recurse: true
    path: ztp/gitops-subscriptions/argocd/resource-hook-example/policygentemplates (1)
    repoURL: https://github.com/openshift-kni/cnf-features-deploy (2)
    targetRevision: master (3)
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
    - CreateNamespace=true
1 | The ztp/gitops-subscriptions/argocd/ file path that contains the policygentemplates CRs for the clusters. |
2 | The URL of the Git repository that contains the policygentemplates custom resources that specify configuration data for the site. |
3 | The branch on the Git repository that contains the relevant configuration data. |
To apply the pipeline configuration to your hub cluster, enter this command:
$ oc apply -k ./deployment
Add the required secrets for the site to the hub cluster. These resources must be in a namespace with a name that matches the cluster name.
Create a secret for authenticating to the site Baseboard Management Controller (BMC). Ensure the secret name matches the name used in the SiteConfig.
In this example, the secret name is test-sno-bmh-secret:
apiVersion: v1
kind: Secret
metadata:
  name: test-sno-bmh-secret
  namespace: test-sno
data:
  password: dGVtcA==
  username: cm9vdA==
type: Opaque
Create the pull secret for the site. The pull secret must contain all credentials necessary for installing OpenShift and all add-on Operators. In this example, the secret name is assisted-deployment-pull-secret:
apiVersion: v1
kind: Secret
metadata:
  name: assisted-deployment-pull-secret
  namespace: test-sno
type: kubernetes.io/dockerconfigjson
data:
  .dockerconfigjson: <Your pull secret base64 encoded>
The secrets are referenced from the |
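The username and password values in the data field of the BMC secret are base64 encoded. A minimal sketch of how the example values could be produced, assuming the plain-text credentials are root and temp (which is what the example strings decode to):

```shell
# printf is used instead of echo so that no trailing newline is encoded.
bmc_username="$(printf '%s' 'root' | base64)"
bmc_password="$(printf '%s' 'temp' | base64)"

# Prints the two values used in the test-sno-bmh-secret example.
echo "username: $bmc_username"   # username: cm9vdA==
echo "password: $bmc_password"   # password: dGVtcA==
```

Encoding a value together with a stray newline is a common cause of authentication failures, which is why the sketch avoids echo for the input.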
ArgoCD acts as the engine for the GitOps method of site deployment. After completing a site plan that contains the required custom resources for the site installation, a policy generator creates the manifests and applies them to the hub cluster.
Create one or more SiteConfig custom resources, saved as site-config.yaml files, that contain the site-plan data for the clusters. For example:
apiVersion: ran.openshift.io/v1
kind: SiteConfig
metadata:
  name: "test-sno"
  namespace: "test-sno"
spec:
  baseDomain: "clus2.t5g.lab.eng.bos.redhat.com"
  pullSecretRef:
    name: "assisted-deployment-pull-secret"
  clusterImageSetNameRef: "openshift-4.9"
  sshPublicKey: "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQDB3dwhI5X0ZxGBb9VK7wclcPHLc8n7WAyKjTNInFjYNP9J+Zoc/ii+l3YbGUTuqilDwZN5rVIwBux2nUyVXDfaM5kPd9kACmxWtfEWTyVRootbrNWwRfKuC2h6cOd1IlcRBM1q6IzJ4d7+JVoltAxsabqLoCbK3svxaZoKAaK7jdGG030yvJzZaNM4PiTy39VQXXkCiMDmicxEBwZx1UsA8yWQsiOQ5brod9KQRXWAAST779gbvtgXR2L+MnVNROEHf1nEjZJwjwaHxoDQYHYKERxKRHlWFtmy5dNT6BbvOpJ2e5osDFPMEd41d2mUJTfxXiC1nvyjk9Irf8YJYnqJgBIxi0IxEllUKH7mTdKykHiPrDH5D2pRlp+Donl4n+sw6qoDc/3571O93+RQ6kUSAgAsvWiXrEfB/7kGgAa/BD5FeipkFrbSEpKPVu+gue1AQeJcz9BuLqdyPUQj2VUySkSg0FuGbG7fxkKeF1h3Sga7nuDOzRxck4I/8Z7FxMF/e8DmaBpgHAUIfxXnRqAImY9TyAZUEMT5ZPSvBRZNNmLbfex1n3NLcov/GEpQOqEYcjG5y57gJ60/av4oqjcVmgtaSOOAS0kZ3y9YDhjsaOcpmRYYijJn8URAH7NrW8EZsvAoF6GUt6xHq5T258c6xSYUm5L0iKvBqrOW9EjbLw== root@cnfdc2.clus2.t5g.lab.eng.bos.redhat.com"
  clusters:
  - clusterName: "test-sno"
    clusterType: "sno"
    clusterProfile: "du"
    clusterLabels:
      group-du-sno: ""
      common: true
      sites : "test-sno"
    clusterNetwork:
      - cidr: 1001:db9::/48
        hostPrefix: 64
    machineNetwork:
      - cidr: 2620:52:0:10e7::/64
    serviceNetwork:
      - 1001:db7::/112
    additionalNTPSources:
      - 2620:52:0:1310::1f6
    nodes:
      - hostName: "test-sno.clus2.t5g.lab.eng.bos.redhat.com"
        bmcAddress: "idrac-virtualmedia+https://[2620:52::10e7:f602:70ff:fee4:f4e2]/redfish/v1/Systems/System.Embedded.1"
        bmcCredentialsName:
          name: "test-sno-bmh-secret"
        bmcDisableCertificateVerification: true (1)
        bootMACAddress: "0C:42:A1:8A:74:EC"
        bootMode: "UEFI"
        rootDeviceHints:
          hctl: '0:1:0'
        cpuset: "0-1,52-53"
        nodeNetwork:
          interfaces:
            - name: eno1
              macAddress: "0C:42:A1:8A:74:EC"
          config:
            interfaces:
              - name: eno1
                type: ethernet
                state: up
                macAddress: "0C:42:A1:8A:74:EC"
                ipv4:
                  enabled: false
                ipv6:
                  enabled: true
                  address:
                  - ip: 2620:52::10e7:e42:a1ff:fe8a:900
                    prefix-length: 64
            dns-resolver:
              config:
                search:
                - clus2.t5g.lab.eng.bos.redhat.com
                server:
                - 2620:52:0:1310::1f6
            routes:
              config:
              - destination: ::/0
                next-hop-interface: eno1
                next-hop-address: 2620:52:0:10e7::fc
                table-id: 254
1 | If you are using UEFI SecureBoot, add this line to prevent failures due to invalid or local certificates. |
Save the files and push them to the zero touch provisioning (ZTP) Git repository accessible from the hub cluster and defined as a source repository of the ArgoCD application.
ArgoCD detects that the application is out of sync. Upon sync, either automatic or manual, ArgoCD synchronizes the PolicyGenTemplate to the hub cluster and launches the associated resource hooks. These hooks are responsible for generating the policy wrapped configuration CRs that apply to the spoke cluster. The resource hooks convert the site definitions to installation custom resources and apply them to the hub cluster:
Namespace - Unique per site
AgentClusterInstall
BareMetalHost
ClusterDeployment
InfraEnv
NMStateConfig
ExtraManifestsConfigMap - Extra manifests. The additional manifests include workload partitioning, chronyd, mountpoint hiding, SCTP enablement, and more.
ManagedCluster
KlusterletAddonConfig
Red Hat Advanced Cluster Management (ACM) deploys the hub cluster.
Use the following procedure to create the PolicyGenTemplates you need for generating policies in your Git repository for the hub cluster.
Create the PolicyGenTemplates and save them to the zero touch provisioning (ZTP) Git repository accessible from the hub cluster and defined as a source repository of the ArgoCD application.
ArgoCD detects that the application is out of sync. Upon sync, either automatic or manual, ArgoCD applies the new PolicyGenTemplate to the hub cluster and launches the associated resource hooks. These hooks are responsible for generating the policy wrapped configuration CRs that apply to the spoke cluster and perform the following actions:
Create the Red Hat Advanced Cluster Management (ACM) policies according to the basic distributed unit (DU) profile and required customizations.
Apply the generated policies to the hub cluster.
The ZTP process creates policies that direct ACM to apply the desired configuration to the cluster nodes.
The ArgoCD pipeline detects the SiteConfig and PolicyGenTemplate custom resources (CRs) in the Git repository and syncs them to the hub cluster. In the process, it generates installation and policy CRs and applies them to the hub cluster. You can monitor the progress of this synchronization in the ArgoCD dashboard.
Monitor the progress of cluster installation using the following commands:
$ export CLUSTER=<cluster_name>
$ oc get agentclusterinstall -n $CLUSTER $CLUSTER -o jsonpath='{.status.conditions[?(@.type=="Completed")]}' | jq
$ curl -sk $(oc get agentclusterinstall -n $CLUSTER $CLUSTER -o jsonpath='{.status.debugInfo.eventsURL}') | jq '.[-2,-1]'
Use the Red Hat Advanced Cluster Management (ACM) dashboard to monitor the progress of policy reconciliation.
To remove a site and the associated installation and policy custom resources (CRs), remove the SiteConfig and site-specific PolicyGenTemplate CRs from the Git repository. The pipeline hooks remove the generated CRs.
Before removing a |
Use the following procedure if you want to remove the ArgoCD pipeline and all generated artifacts.
Detach all clusters from ACM.
Delete all SiteConfig and PolicyGenTemplate custom resources (CRs) from your Git repository.
Delete the following namespaces:
All policy namespaces:
$ oc get policy -A
clusters-sub
policies-sub
Process the directory using the Kustomize tool:
$ oc delete -k cnf-features-deploy/ztp/gitops-subscriptions/argocd/deployment
As noted, the ArgoCD pipeline synchronizes the SiteConfig and PolicyGenTemplate custom resources (CRs) from the Git repository to the hub cluster. During this process, post-sync hooks create the installation and policy CRs that are also applied to the hub cluster. Use the following procedures to troubleshoot issues that might occur in this process.
SiteConfig applies installation custom resources (CRs) to the hub cluster in a namespace with a name that matches the site name. To check the status, enter the following command:
$ oc get AgentClusterInstall -n <cluster_name>
If no object is returned, use the following procedure to troubleshoot the ArgoCD pipeline flow from SiteConfig to the installation CRs.
Check the synchronization of the SiteConfig to the hub cluster using either of the following commands:
$ oc get siteconfig -A
or
$ oc get siteconfig -n clusters-sub
If the SiteConfig is missing, one of the following situations has occurred:
The clusters application failed to synchronize the CR from the Git repository to the hub. Use the following command to verify this:
$ oc describe -n openshift-gitops application clusters
Check for Status: Synced and that the Revision: is the SHA of the commit you pushed to the subscribed repository.
The pre-sync hook failed, possibly due to a failure to pull the container image. Check the ArgoCD dashboard for the status of the pre-sync job in the clusters application.
Verify the post hook job ran:
$ oc describe job -n clusters-sub siteconfig-post
If successful, the returned output indicates succeeded: 1.
If the job fails, ArgoCD retries it. In some cases, the first pass will fail and the second pass will indicate that the job passed.
Check for errors in the post hook job:
$ oc get pod -n clusters-sub
Note the name of the siteconfig-post-xxxxx pod:
$ oc logs -n clusters-sub siteconfig-post-xxxxx
If the logs indicate errors, correct the conditions and push the corrected SiteConfig or PolicyGenTemplate to the Git repository.
ArgoCD generates the policy custom resources (CRs) in the same namespace as the PolicyGenTemplate from which they were created. The same troubleshooting flow applies to all policy CRs generated from PolicyGenTemplates regardless of whether they are common, group, or site based.
To check the status of the policy CRs, enter the following commands:
$ export NS=<namespace>
$ oc get policy -n $NS
The returned output displays the expected set of policy wrapped CRs. If no object is returned, use the following procedure to troubleshoot the ArgoCD pipeline flow from PolicyGenTemplate to the policy CRs.
Check the synchronization of the PolicyGenTemplate to the hub cluster:
$ oc get policygentemplate -A
or
$ oc get policygentemplate -n $NS
If the PolicyGenTemplate is not synchronized, one of the following situations has occurred:
The policies application failed to synchronize the CR from the Git repository to the hub. Use the following command to verify this:
$ oc describe -n openshift-gitops application policies
Check for Status: Synced and that the Revision: is the SHA of the commit you pushed to the subscribed repository.
The pre-sync hook failed, possibly due to a failure to pull the container image. Check the ArgoCD dashboard for the status of the pre-sync job in the policies application.
Ensure the policies were copied to the cluster namespace. When ACM recognizes that policies apply to a ManagedCluster, ACM applies the policy CR objects to the cluster namespace:
$ oc get policy -n <cluster_name>
ACM copies all applicable common, group, and site policies here. The copied policy names have the format <policyNamespace>.<policyName>.
Check the placement rule for any policies not copied to the cluster namespace. The matchSelector in the PlacementRule for those policies should match the labels on the ManagedCluster:
$ oc get placementrule -n $NS
Make a note of the PlacementRule name for the missing common, group, or site policy:
$ oc get placementrule -n $NS <placementRuleName> -o yaml
The status decisions value should include your cluster name.
The key value of the matchSelector in the spec should match the labels on your managed cluster. Check the labels on ManagedCluster:
$ oc get ManagedCluster $CLUSTER -o jsonpath='{.metadata.labels}' | jq
apiVersion: apps.open-cluster-management.io/v1
kind: PlacementRule
metadata:
  name: group-test1-policies-placementrules
  namespace: group-test1-policies
spec:
  clusterSelector:
    matchExpressions:
    - key: group-test1
      operator: In
      values:
      - ""
status:
  decisions:
  - clusterName: <cluster_name>
    clusterNamespace: <cluster_name>
Ensure all policies are compliant:
$ oc get policy -n $CLUSTER
If the Namespace, OperatorGroup, and Subscription policies are compliant but the Operator configuration policies are not, it is likely that the Operators did not install.
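The label comparison described in the PlacementRule check can be reasoned about with a small sketch. The helper name label_matches and the sample label set are hypothetical. The function only models an In expression with a single value by looking for an exact key=value pair, and it is not how ACM evaluates selectors internally.

```shell
# label_matches KEY VALUE LABELS
# LABELS is a newline-separated list of key=value pairs (hypothetical input format).
# Succeeds only if an exact "KEY=VALUE" line is present.
label_matches() {
  printf '%s\n' "$3" | grep -F -x -q -- "$1=$2"
}

# Sample labels, modeled on the clusterLabels in the SiteConfig example.
labels='common=true
group-du-sno=
sites=test-sno'

# A selector on group-du-sno with an empty value matches this cluster.
if label_matches "group-du-sno" "" "$labels"; then result_match=yes; else result_match=no; fi

# A selector on group-test1 does not, so the policy would not be copied.
if label_matches "group-test1" "" "$labels"; then result_miss=yes; else result_miss=no; fi

echo "group-du-sno: $result_match, group-test1: $result_miss"
```

A mismatch of this kind, where the selector key is absent from the ManagedCluster labels, is one reason the status decisions list might not include your cluster.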