$ oc get managedclusters local-cluster
You can deploy hosted control planes by configuring a cluster to function as a hosting cluster. The hosting cluster is an OKD cluster where the control planes are hosted. The hosting cluster is also known as the management cluster.
The management cluster is not the managed cluster. A managed cluster is a cluster that the hub cluster manages. |
The multicluster engine Operator supports only the default local-cluster
, which is a hub cluster that is managed, and the hub cluster as the hosting cluster.
To provision hosted control planes on bare metal, you can use the Agent platform. The Agent platform uses the central infrastructure management service to add worker nodes to a hosted cluster. For more information, see "Enabling the central infrastructure management service".
Each IBM Power host must be started with a Discovery Image that the central infrastructure management provides. After each host starts, it runs an Agent process to discover the details of the host and completes the installation. An Agent custom resource represents each host.
When you create a hosted cluster with the Agent platform, HyperShift installs the Agent Cluster API provider in the hosted control plane namespace.
The multicluster engine for Kubernetes Operator version 2.7 and later installed on an OKD cluster. The multicluster engine Operator is automatically installed when you install Red Hat Advanced Cluster Management (RHACM). You can also install the multicluster engine Operator without RHACM as an Operator from the OKD OperatorHub.
The multicluster engine Operator must have at least one managed OKD cluster. The local-cluster
managed hub cluster is automatically imported in the multicluster engine Operator version 2.7 and later. For more information about local-cluster
, see Advanced configuration in the RHACM documentation. You can check the status of your hub cluster by running the following command:
$ oc get managedclusters local-cluster
You need a hosting cluster with at least 3 worker nodes to run the HyperShift Operator.
You need to enable the central infrastructure management service. For more information, see "Enabling the central infrastructure management service".
You need to install the hosted control plane command-line interface. For more information, see "Installing the hosted control plane command-line interface".
The hosted control planes feature is enabled by default. If you disabled the feature and want to manually enable it, see "Manually enabling the hosted control planes feature". If you need to disable the feature, see "Disabling the hosted control planes feature".
The Agent platform does not create any infrastructure, but requires the following resources for infrastructure:
Agents: An Agent represents a host that is booted with a discovery image and is ready to be provisioned as an OKD node.
DNS: The API and Ingress endpoints must be routable.
The API server for the hosted cluster is exposed. A DNS entry must exist for the api.<hosted_cluster_name>.<basedomain>
entry that points to the destination where the API server is reachable.
The DNS entry can be as simple as a record that points to one of the nodes in the managed cluster that is running the hosted control plane.
The entry can also point to a load balancer that is deployed to redirect incoming traffic to the ingress pods.
See the following example of a DNS configuration:
$ cat /var/named/<example.krnl.es.zone>
$ TTL 900
@ IN SOA bastion.example.krnl.es.com. hostmaster.example.krnl.es.com. (
2019062002
1D 1H 1W 3H )
IN NS bastion.example.krnl.es.com.
;
;
api IN A 1xx.2x.2xx.1xx (1)
api-int IN A 1xx.2x.2xx.1xx
;
;
*.apps.<hosted-cluster-name>.<basedomain> IN A 1xx.2x.2xx.1xx
;
;EOF
1 | The record refers to the IP address of the API load balancer that handles ingress and egress traffic for hosted control planes. |
For IBM Power, add IP addresses that correspond to the IP address of the agent.
compute-0 IN A 1xx.2x.2xx.1yy
compute-1 IN A 1xx.2x.2xx.1yy
When you create a hosted cluster with the Agent platform, HyperShift installs the Agent Cluster API provider in the hosted control plane namespace. You can create a hosted cluster on bare metal or import one.
As you create a hosted cluster, keep the following guidelines in mind:
Each hosted cluster must have a cluster-wide unique name. A hosted cluster name cannot be the same as any existing managed cluster in order for multicluster engine Operator to manage it.
Do not use clusters
as a hosted cluster name.
A hosted cluster cannot be created in the namespace of a multicluster engine Operator managed cluster.
The most common service publishing strategy is to expose services through a load balancer. That strategy is the preferred method for exposing the Kubernetes API server. If you create a hosted cluster by using the web console or by using Red Hat Advanced Cluster Management, to set a publishing strategy for a service besides the Kubernetes API server, you must manually specify the servicePublishingStrategy
information in the HostedCluster
custom resource.
Create the hosted control plane namespace by entering the following command:
$ oc create ns <hosted_cluster_namespace>-<hosted_cluster_name>
Replace <hosted_cluster_namespace>
with your hosted cluster namespace name, for example, clusters
. Replace <hosted_cluster_name>
with your hosted cluster name.
Verify that you have a default storage class configured for your cluster. Otherwise, you might see pending PVCs. Run the following command:
$ hcp create cluster agent \
--name=<hosted_cluster_name> \(1)
--pull-secret=<path_to_pull_secret> \(2)
--agent-namespace=<hosted_control_plane_namespace> \(3)
--base-domain=<basedomain> \(4)
--api-server-address=api.<hosted_cluster_name>.<basedomain> \(5)
--etcd-storage-class=<etcd_storage_class> \(6)
--ssh-key <path_to_ssh_public_key> \(7)
--namespace <hosted_cluster_namespace> \(8)
--control-plane-availability-policy HighlyAvailable \(9)
--release-image=quay.io/openshift-release-dev/ocp-release:<ocp_release_image> \(10)
--node-pool-replicas <node_pool_replica_count> (11)
1 | Specify the name of your hosted cluster, for instance, example . |
2 | Specify the path to your pull secret, for example, /user/name/pullsecret . |
3 | Specify your hosted control plane namespace, for example, clusters-example . Ensure that agents are available in this namespace by using the oc get agent -n <hosted_control_plane_namespace> command. |
4 | Specify your base domain, for example, krnl.es . |
5 | The --api-server-address flag defines the IP address that is used for the Kubernetes API communication in the hosted cluster. If you do not set the --api-server-address flag, you must log in to connect to the management cluster. |
6 | Specify the etcd storage class name, for example, lvm-storageclass . |
7 | Specify the path to your SSH public key. The default file path is ~/.ssh/id_rsa.pub . |
8 | Specify your hosted cluster namespace. |
9 | Specify the availability policy for the hosted control plane components. Supported options are SingleReplica and HighlyAvailable . The default value is HighlyAvailable . |
10 | Specify the supported OKD version that you want to use, for example, 4.19.0-multi . If you are using a disconnected environment, replace <ocp_release_image> with the digest image. To extract the OKD release image digest, see Extracting the OKD release image digest. |
11 | Specify the node pool replica count, for example, 3 . You must specify the replica count as 0 or greater to create the same number of replicas. Otherwise, no node pools are created. |
After a few moments, verify that your hosted control plane pods are up and running by entering the following command:
$ oc -n <hosted_cluster_namespace>-<hosted_cluster_name> get pods
NAME READY STATUS RESTARTS AGE
capi-provider-7dcf5fc4c4-nr9sq 1/1 Running 0 4m32s
catalog-operator-6cd867cc7-phb2q 2/2 Running 0 2m50s
certified-operators-catalog-884c756c4-zdt64 1/1 Running 0 2m51s
cluster-api-f75d86f8c-56wfz 1/1 Running 0 4m32s
On the agent platform, you can create heterogeneous node pools so that your clusters can run diverse machine types, such as x86_64
or ppc64le
, within a single hosted cluster.
A node pool is a group of nodes within a cluster that share the same configuration. Heterogeneous node pools are pools that have different configurations, allowing you to create pools optimized for various workloads.
You can create heterogeneous node pools on the agent platform. It enables clusters to run diverse machine types, for example, x86_64
or ppc64le
, within a single hosted cluster.
To create a heterogeneous node pool, perform the following general steps, as described in the following sections:
Create an AgentServiceConfig
custom resource (CR) that informs the Operator how much storage is needed for components such as the database and filesystem. The CR also defines which OKD versions to maintain.
Create an agent cluster.
Create the heterogeneous node pool.
Configure DNS for hosted control planes
Create an InfraEnv
custom resource (CR) for each architecture.
Add agents to the heterogeneous cluster.
To create heterogeneous node pools on an agent hosted cluster, you first need to create the AgentServiceConfig
CR with two heterogeneous architecture operating system (OS) images.
Run the following command:
$ envsubst <<"EOF" | oc apply -f -
apiVersion: agent-install.openshift.io/v1beta1
kind: AgentServiceConfig
metadata:
name: agent
spec:
databaseStorage:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: <db_volume_name> (1)
filesystemStorage:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: <fs_volume_name> (2)
osImages:
- openshiftVersion: <ocp_version> (3)
version: <ocp_release_version_x86> (4)
url: <iso_url_x86> (5)
rootFSUrl: <root_fs_url_x8> (6)
cpuArchitecture: <arch_x86> (7)
- openshiftVersion: <ocp_version> (8)
version: <ocp_release_version_ppc64le> (9)
url: <iso_url_ppc64le> (10)
rootFSUrl: <root_fs_url_ppc64le> (11)
cpuArchitecture: <arch_ppc64le> (12)
EOF
1 | Specify the multicluster engine for Kubernetes Operator agentserviceconfig config, database volume name. |
2 | Specify the multicluster engine Operator agentserviceconfig config, filesystem volume name. |
3 | Specify the current version of OKD. |
4 | Specify the current OKD release version for x86. |
5 | Specify the ISO URL for x86. |
6 | Specify the root filesystem URL for X86. |
7 | Specify the CPU architecture for x86. |
8 | Specify the current OKD version. |
9 | Specify the OKD release version for ppc64le . |
10 | Specify the ISO URL for ppc64le . |
11 | Specify the root filesystem URL for ppc64le . |
12 | Specify the CPU architecture for ppc64le . |
An agent cluster is a cluster that is managed and provisioned using an agent-based approach. An agent cluster can utilize heterogeneous node pools, allowing for different types of worker nodes to be used within the same cluster.
You used a multi-architecture release image to enable support for heterogeneous node pools when creating a hosted cluster. Find the latest multi-architecture images on the Multi-arch release images page.
Create an environment variable for the cluster namespace by running the following command:
$ export CLUSTERS_NAMESPACE=<hosted_cluster_namespace>
Create an environment variable for the machine classless inter-domain routing (CIDR) notation by running the following command:
$ export MACHINE_CIDR=192.168.122.0/24
Create the hosted control namespace by running the following command:
$ oc create ns <hosted_control_plane_namespace>
Create the cluster by running the following command:
$ hcp create cluster agent \
--name=<hosted_cluster_name> \(1)
--pull-secret=<pull_secret_file> \(2)
--agent-namespace=<hosted_control_plane_namespace> \(3)
--base-domain=<basedomain> \(4)
--api-server-address=api.<hosted_cluster_name>.<basedomain> \
--release-image=quay.io/openshift-release-dev/ocp-release:<ocp_release> (5)
1 | Specify the hosted cluster name. |
2 | Specify the pull secret file path. |
3 | Specify the namespace for the hosted control plane. |
4 | Specify the base domain for the hosted cluster. |
5 | Specify the current OKD release version. |
You create heterogeneous node pools by using the NodePool
custom resource (CR).
To define a NodePool
CR, create a YAML file similar to the following example:
envsubst <<"EOF" | oc apply -f -
apiVersion:apiVersion: hypershift.openshift.io/v1beta1
kind: NodePool
metadata:
name: <hosted_cluster_name>
namespace: <clusters_namespace>
spec:
arch: <arch_ppc64le>
clusterName: <hosted_cluster_name>
management:
autoRepair: false
upgradeType: InPlace
nodeDrainTimeout: 0s
nodeVolumeDetachTimeout: 0s
platform:
agent:
agentLabelSelector:
matchLabels:
inventory.agent-install.openshift.io/cpu-architecture: <arch_ppc64le> (1)
type: Agent
release:
image: quay.io/openshift-release-dev/ocp-release:<ocp_release>
replicas: 0
EOF
1 | This selector block is selects the agents that match the specified label. To create a node pool of architecture ppc64le with zero replicas, specify ppc64le . This ensures that it selects only agents from ppc64le architecture when it scales. |
Domain Name Service (DNS) configuration for hosted control planes enables external clients to reach ingress controllers so that they can route traffic to internal components. Configuring this setting ensures that traffic is routed to either ppc64le
or x86_64
compute node.
You can point an *.apps.<cluster_name>
record to either of the compute nodes where the ingress application is hosted. Or, if you are able to set up a load balancer on top of the compute nodes, point this record to this load balancer. When you are creating a heterogeneous node pool, make sure the compute nodes can reach each other or keep them in the same network.
For heterogeneous node pools, you must create an infraEnv
custom resource (CR) for each architecture. For example, for node pools with x86_64
and ppc64le
architectures, create an InfraEnv
CR for x86_64
and ppc64le
.
Before proceeding, make sure that the operating system (OS) images for both |
Create the InfraEnv
resource with x86_64
architecture for heterogeneous node pools by running the following command:
$ envsubst <<"EOF" | oc apply -f -
apiVersion: agent-install.openshift.io/v1beta1
kind: InfraEnv
metadata:
name: <hosted_cluster_name>-<arch_x86> (1) (2)
namespace: <hosted_control_plane_namespace> (3)
spec:
cpuArchitecture: <arch_x86>
pullSecretRef:
name: pull-secret
sshAuthorizedKey: <ssh_pub_key> (4)
EOF
1 | The hosted cluster name. |
2 | The x86_64 architecture. |
3 | The hosted control plane namespace. |
4 | The ssh public key. |
Create the InfraEnv
resource with ppc64le
architecture for heterogeneous node pools by running the following command:
envsubst <<"EOF" | oc apply -f -
apiVersion: agent-install.openshift.io/v1beta1
kind: InfraEnv
metadata:
name: <hosted_cluster_name>-<arch_ppc64le> (1) (2)
namespace: <hosted_control_plane_namespace> (3)
spec:
cpuArchitecture: <arch_ppc64le>
pullSecretRef:
name: pull-secret
sshAuthorizedKey: <ssh_pub_key> (4)
EOF
1 | The hosted cluster name. |
2 | The ppc64le architecture. |
3 | The hosted control plane namespace. |
4 | The ssh public key. |
To verify that the InfraEnv
resources are created successfully run the following commands:
Verify that the x86_64
InfraEnv
resource is created successfully:
$ oc describe InfraEnv <hosted_cluster_name>-<arch_x86>
Verify that the ppc64le
InfraEnv
resource is created successfully:
$ oc describe InfraEnv <hosted_cluster_name>-<arch_ppc64le>
Generate a live ISO that allows either a virtual machine or a bare-metal machine to join as agents by running the following commands:
Generate a live ISO for x86_64
:
$ oc -n <hosted_control_plane_namespace> get InfraEnv <hosted_cluster_name>-<arch_x86> -ojsonpath="{.status.isoDownloadURL}"
Generate a live ISO for ppc64le
:
$ oc -n <hosted_control_plane_namespace> get InfraEnv <hosted_cluster_name>-<arch_ppc64le> -ojsonpath="{.status.isoDownloadURL}"
You add agents by manually configuring the machine to boot with a live ISO. You can download the live ISO and use it to boot a bare-metal node or a virtual machine. On boot, the node communicates with the assisted-service
and registers as an agent in the same namespace as the InfraEnv
resource. When each agent is created, you can optionally set its installation_disk_id
and hostname
parameters in the specifications. When you are done, approve it to indicate that the agent is ready for use.
Obtain a list of agents by running the following command:
oc -n <hosted_control_plane_namespace> get agents
NAME CLUSTER APPROVED ROLE STAGE 86f7ac75-4fc4-4b36-8130-40fa12602218 auto-assign e57a637f-745b-496e-971d-1abbf03341ba auto-assign
Patch an agent by running the following command:
oc -n <hosted_control_plane_namespace> patch agent 86f7ac75-4fc4-4b36-8130-40fa12602218 -p '{"spec":{"installation_disk_id":"/dev/sda","approved":true,"hostname":"worker-0.example.krnl.es"}}' --type merge
Patch the second agent by running the following command:
oc -n <hosted_control_plane_namespace> patch agent 23d0c614-2caa-43f5-b7d3-0b3564688baa -p '{"spec":{"installation_disk_id":"/dev/sda","approved":true,"hostname":"worker-1.example.krnl.es"}}' --type merge
Check the agent approval status by running the following command:
oc -n <hosted_control_plane_namespace> get agents
NAME CLUSTER APPROVED ROLE STAGE 86f7ac75-4fc4-4b36-8130-40fa12602218 true auto-assign e57a637f-745b-496e-971d-1abbf03341ba true auto-assign
After your agents are approved, you can scale the node pools. The agentLabelSelector
value that is configured in the node pool ensures that only matching agents are added to the cluster. This also helps scale down the node pool. To remove specific architecture nodes from the cluster, scale down the corresponding node pool.
Scale the node pool by running the following command:
$ oc -n <clusters_namespace> scale nodepool <nodepool_name> --replicas 2
The Cluster API agent provider picks two agents randomly to assign to the hosted cluster. These agents pass through different states and then join the hosted cluster as OKD nodes. The various agent states are |
List the agents by running the following command:
$ oc -n <hosted_control_plane_namespace> get agent
NAME CLUSTER APPROVED ROLE STAGE 4dac1ab2-7dd5-4894-a220-6a3473b67ee6 hypercluster1 true auto-assign d9198891-39f4-4930-a679-65fb142b108b true auto-assign da503cf1-a347-44f2-875c-4960ddb04091 hypercluster1 true auto-assign
Check the status of a specific scaled agent by running the following command:
$ oc -n <hosted_control_plane_namespace> get agent -o jsonpath='{range .items[*]}BMH: {@.metadata.labels.agent-install\.openshift\.io/bmh} Agent: {@.metadata.name} State: {@.status.debugInfo.state}{"\n"}{end}'
BMH: ocp-worker-2 Agent: 4dac1ab2-7dd5-4894-a220-6a3473b67ee6 State: binding BMH: ocp-worker-0 Agent: d9198891-39f4-4930-a679-65fb142b108b State: known-unbound BMH: ocp-worker-1 Agent: da503cf1-a347-44f2-875c-4960ddb04091 State: insufficient
Once the agents reach the added-to-existing-cluster
state, verify that the OKD nodes are ready by running the following command:
$ oc --kubeconfig <hosted_cluster_name>.kubeconfig get nodes
NAME STATUS ROLES AGE VERSION ocp-worker-1 Ready worker 5m41s v1.24.0+3882f8f ocp-worker-2 Ready worker 6m3s v1.24.0+3882f8f
Adding workloads to the nodes can reconcile some cluster operators. The following command displays that two machines are created after scaling up the node pool:
$ oc -n <hosted_control_plane_namespace> get machines
NAME CLUSTER NODENAME PROVIDERID PHASE AGE VERSION hypercluster1-c96b6f675-m5vch hypercluster1-b2qhl ocp-worker-1 agent://da503cf1-a347-44f2-875c-4960ddb04091 Running 15m 4.11.5 hypercluster1-c96b6f675-tl42p hypercluster1-b2qhl ocp-worker-2 agent://4dac1ab2-7dd5-4894-a220-6a3473b67ee6 Running 15m 4.11.5