This is a cache of https://docs.openshift.com/container-platform/4.17/installing/installing_with_agent_based_installer/installing-with-agent-based-installer.html. It is a snapshot of the page at 2024-11-27T06:11:45.100+0000.
Installing a cluster with Agent-based Installer - Installing an on-premise cluster with the Agent-based Installer | Installing | OpenShift Container Platform 4.17
×

Prerequisites

Installing OpenShift Container Platform with the Agent-based Installer

The following procedures deploy a single-node OpenShift Container Platform in a disconnected environment. You can use these procedures as a basis and modify according to your requirements.

Downloading the Agent-based Installer

Use this procedure to download the Agent-based Installer and the CLI needed for your installation.

Procedure
  1. Log in to the OpenShift Container Platform web console using your login credentials.

  2. Navigate to Datacenter.

  3. Click Run Agent-based Installer locally.

  4. Select the operating system and architecture for the OpenShift Installer and Command line interface.

  5. Click Download Installer to download and extract the install program.

  6. Download or copy the pull secret by clicking on Download pull secret or Copy pull secret.

  7. Click Download command-line tools and place the openshift-install binary in a directory that is on your PATH.

Verifying the supported architecture for an Agent-based installation

Before installing an OpenShift Container Platform cluster using the Agent-based Installer, you can verify the supported architecture on which you can install the cluster. This procedure is optional.

Prerequisites
  • You installed the OpenShift CLI (oc).

  • You have downloaded the installation program.

Procedure
  1. Log in to the OpenShift CLI (oc).

  2. Check your release payload by running the following command:

    $ ./openshift-install version
    Example output
    ./openshift-install 4.17.0
    built from commit abc123def456
    release image quay.io/openshift-release-dev/ocp-release@sha256:123abc456def789ghi012jkl345mno678pqr901stu234vwx567yz0
    release architecture amd64

    If you are using the release image with the multi payload, the release architecture displayed in the output of this command is the default architecture.

  3. To check the architecture of the payload, run the following command:

    $ oc adm release info <release_image> -o jsonpath="{ .metadata.metadata}" (1)
    1 Replace <release_image> with the release image. For example: quay.io/openshift-release-dev/ocp-release@sha256:123abc456def789ghi012jkl345mno678pqr901stu234vwx567yz0.
    Example output when the release image uses the multi payload
    {"release.openshift.io architecture":"multi"}

    If you are using the release image with the multi payload, you can install the cluster on different architectures such as arm64, amd64, s390x, and ppc64le. Otherwise, you can install the cluster only on the release architecture displayed in the output of the openshift-install version command.

Creating the preferred configuration inputs

Use this procedure to create the preferred configuration inputs used to create the agent image.

Procedure
  1. Install nmstate dependency by running the following command:

    $ sudo dnf install /usr/bin/nmstatectl -y
  2. Place the openshift-install binary in a directory that is on your PATH.

  3. Create a directory to store the install configuration by running the following command:

    $ mkdir ~/<directory_name>

    This is the preferred method for the Agent-based installation. Using GitOps ZTP manifests is optional.

  4. Create the install-config.yaml file by running the following command:

    $ cat << EOF > ./<directory_name>/install-config.yaml
    apiVersion: v1
    baseDomain: test.example.com
    compute:
    - architecture: amd64 (1)
      hyperthreading: Enabled
      name: worker
      replicas: 0
    controlPlane:
      architecture: amd64
      hyperthreading: Enabled
      name: master
      replicas: 1
    metadata:
      name: sno-cluster (2)
    networking:
      clusterNetwork:
      - cidr: 10.128.0.0/14
        hostPrefix: 23
      machineNetwork:
      - cidr: 192.168.0.0/16
      networkType: OVNKubernetes (3)
      serviceNetwork:
      - 172.30.0.0/16
    platform: (4)
      none: {}
    pullSecret: '<pull_secret>' (5)
    sshKey: '<ssh_pub_key>' (6)
    EOF
    1 Specify the system architecture. Valid values are amd64, arm64, ppc64le, and s390x.

    If you are using the release image with the multi payload, you can install the cluster on different architectures such as arm64, amd64, s390x, and ppc64le. Otherwise, you can install the cluster only on the release architecture displayed in the output of the openshift-install version command. For more information, see "Verifying the supported architecture for installing an Agent-based Installer cluster".

    2 Required. Specify your cluster name.
    3 The cluster network plugin to install. The default value OVNKubernetes is the only supported value.
    4 Specify your platform.

    For bare metal platforms, host settings made in the platform section of the install-config.yaml file are used by default, unless they are overridden by configurations made in the agent-config.yaml file.

    5 Specify your pull secret.
    6 Specify your SSH public key.

    If you set the platform to vSphere or baremetal, you can configure IP address endpoints for cluster nodes in three ways:

    • IPv4

    • IPv6

    • IPv4 and IPv6 in parallel (dual-stack)

    IPv6 is supported only on bare metal platforms.

    Example of dual-stack networking
    networking:
      clusterNetwork:
        - cidr: 172.21.0.0/16
          hostPrefix: 23
        - cidr: fd02::/48
          hostPrefix: 64
      machineNetwork:
        - cidr: 192.168.11.0/16
        - cidr: 2001:DB8::/32
      serviceNetwork:
        - 172.22.0.0/16
        - fd03::/112
      networkType: OVNKubernetes
    platform:
      baremetal:
        apiVIPs:
        - 192.168.11.3
        - 2001:DB8::4
        ingressVIPs:
        - 192.168.11.4
        - 2001:DB8::5

    When you use a disconnected mirror registry, you must add the certificate file that you created previously for your mirror registry to the additionalTrustBundle field of the install-config.yaml file.

  5. Create the agent-config.yaml file by running the following command:

    $ cat > agent-config.yaml << EOF
    apiVersion: v1beta1
    kind: AgentConfig
    metadata:
      name: sno-cluster
    rendezvousIP: 192.168.111.80 (1)
    hosts: (2)
      - hostname: master-0 (3)
        interfaces:
          - name: eno1
            macAddress: 00:ef:44:21:e6:a5
        rootDeviceHints: (4)
          deviceName: /dev/sdb
        networkConfig: (5)
          interfaces:
            - name: eno1
              type: ethernet
              state: up
              mac-address: 00:ef:44:21:e6:a5
              ipv4:
                enabled: true
                address:
                  - ip: 192.168.111.80
                    prefix-length: 23
                dhcp: false
          dns-resolver:
            config:
              server:
                - 192.168.111.1
          routes:
            config:
              - destination: 0.0.0.0/0
                next-hop-address: 192.168.111.2
                next-hop-interface: eno1
                table-id: 254
    EOF
    1 This IP address is used to determine which node performs the bootstrapping process as well as running the assisted-service component. You must provide the rendezvous IP address when you do not specify at least one host’s IP address in the networkConfig parameter. If this address is not provided, one IP address is selected from the provided hosts' networkConfig.
    2 Optional: Host configuration. The number of hosts defined must not exceed the total number of hosts defined in the install-config.yaml file, which is the sum of the values of the compute.replicas and controlPlane.replicas parameters.
    3 Optional: Overrides the hostname obtained from either the Dynamic Host Configuration Protocol (DHCP) or a reverse DNS lookup. Each host must have a unique hostname supplied by one of these methods.
    4 Enables provisioning of the Red Hat Enterprise Linux CoreOS (RHCOS) image to a particular device. The installation program examines the devices in the order it discovers them, and compares the discovered values with the hint values. It uses the first discovered device that matches the hint value.
    5 Optional: Configures the network interface of a host in NMState format.

Creating additional manifest files

As an optional task, you can create additional manifests to further configure your cluster beyond the configurations available in the install-config.yaml and agent-config.yaml files.

Creating a directory to contain additional manifests

If you create additional manifests to configure your Agent-based installation beyond the install-config.yaml and agent-config.yaml files, you must create an openshift subdirectory within your installation directory. All of your additional machine configurations must be located within this subdirectory.

The most common type of additional manifest you can add is a MachineConfig object. For examples of MachineConfig objects you can add during the Agent-based installation, see "Using MachineConfig objects to configure nodes" in the "Additional resources" section.

Procedure
  • On your installation host, create an openshift subdirectory within the installation directory by running the following command:

    $ mkdir <installation_directory>/openshift

Disk partitioning

In general, you should use the default disk partitioning that is created during the RHCOS installation. However, there are cases where you might want to create a separate partition for a directory that you expect to grow.

OpenShift Container Platform supports the addition of a single partition to attach storage to either the /var directory or a subdirectory of /var. For example:

  • /var/lib/containers: Holds container-related content that can grow as more images and containers are added to a system.

  • /var/lib/etcd: Holds data that you might want to keep separate for purposes such as performance optimization of etcd storage.

  • /var: Holds data that you might want to keep separate for purposes such as auditing.

    For disk sizes larger than 100GB, and especially larger than 1TB, create a separate /var partition.

Storing the contents of a /var directory separately makes it easier to grow storage for those areas as needed and reinstall OpenShift Container Platform at a later date and keep that data intact. With this method, you will not have to pull all your containers again, nor will you have to copy massive log files when you update systems.

The use of a separate partition for the /var directory or a subdirectory of /var also prevents data growth in the partitioned directory from filling up the root file system.

The following procedure sets up a separate /var partition by adding a machine config manifest that is wrapped into the Ignition config file for a node type during the preparation phase of an installation.

Prerequisites
  • You have created an openshift subdirectory within your installation directory.

Procedure
  1. Create a Butane config that configures the additional partition. For example, name the file $HOME/clusterconfig/98-var-partition.bu, change the disk device name to the name of the storage device on the worker systems, and set the storage size as appropriate. This example places the /var directory on a separate partition:

    variant: openshift
    version: 4.17.0
    metadata:
      labels:
        machineconfiguration.openshift.io/role: worker
      name: 98-var-partition
    storage:
      disks:
      - device: /dev/disk/by-id/<device_name> (1)
        partitions:
        - label: var
          start_mib: <partition_start_offset> (2)
          size_mib: <partition_size> (3)
          number: 5
      filesystems:
        - device: /dev/disk/by-partlabel/var
          path: /var
          format: xfs
          mount_options: [defaults, prjquota] (4)
          with_mount_unit: true
    1 The storage device name of the disk that you want to partition.
    2 When adding a data partition to the boot disk, a minimum offset value of 25000 mebibytes is recommended. The root file system is automatically resized to fill all available space up to the specified offset. If no offset value is specified, or if the specified value is smaller than the recommended minimum, the resulting root file system will be too small, and future reinstalls of RHCOS might overwrite the beginning of the data partition.
    3 The size of the data partition in mebibytes.
    4 The prjquota mount option must be enabled for filesystems used for container storage.

    When creating a separate /var partition, you cannot use different instance types for compute nodes, if the different instance types do not have the same device name.

  2. Create a manifest from the Butane config and save it to the clusterconfig/openshift directory. For example, run the following command:

    $ butane $HOME/clusterconfig/98-var-partition.bu -o $HOME/clusterconfig/openshift/98-var-partition.yaml

Using ZTP manifests

As an optional task, you can use GitOps Zero Touch Provisioning (ZTP) manifests to configure your installation beyond the options available through the install-config.yaml and agent-config.yaml files.

GitOps ZTP manifests can be generated with or without configuring the install-config.yaml and agent-config.yaml files beforehand. If you chose to configure the install-config.yaml and agent-config.yaml files, the configurations will be imported to the ZTP cluster manifests when they are generated.

Prerequisites
  • You have placed the openshift-install binary in a directory that is on your PATH.

  • Optional: You have created and configured the install-config.yaml and agent-config.yaml files.

Procedure
  1. Generate ZTP cluster manifests by running the following command:

    $ openshift-install agent create cluster-manifests --dir <installation_directory>

    If you have created the install-config.yaml and agent-config.yaml files, those files are deleted and replaced by the cluster manifests generated through this command.

    Any configurations made to the install-config.yaml and agent-config.yaml files are imported to the ZTP cluster manifests when you run the openshift-install agent create cluster-manifests command.

  2. Navigate to the cluster-manifests directory by running the following command:

    $ cd <installation_directory>/cluster-manifests
  3. Configure the manifest files in the cluster-manifests directory. For sample files, see the "Sample GitOps ZTP custom resources" section.

  4. Disconnected clusters: If you did not define mirror configuration in the install-config.yaml file before generating the ZTP manifests, perform the following steps:

    1. Navigate to the mirror directory by running the following command:

      $ cd ../mirror
    2. Configure the manifest files in the mirror directory.

Additional resources

Encrypting the disk

As an optional task, you can use this procedure to encrypt your disk or partition while installing OpenShift Container Platform with the Agent-based Installer.

Prerequisites
  • You have created and configured the install-config.yaml and agent-config.yaml files, unless you are using ZTP manifests.

  • You have placed the openshift-install binary in a directory that is on your PATH.

Procedure
  1. Generate ZTP cluster manifests by running the following command:

    $ openshift-install agent create cluster-manifests --dir <installation_directory>

    If you have created the install-config.yaml and agent-config.yaml files, those files are deleted and replaced by the cluster manifests generated through this command.

    Any configurations made to the install-config.yaml and agent-config.yaml files are imported to the ZTP cluster manifests when you run the openshift-install agent create cluster-manifests command.

    If you have already generated ZTP manifests, skip this step.

  2. Navigate to the cluster-manifests directory by running the following command:

    $ cd <installation_directory>/cluster-manifests
  3. Add the following section to the agent-cluster-install.yaml file:

    diskEncryption:
        enableOn: all (1)
        mode: tang (2)
        tangServers: "server1": "http://tang-server-1.example.com:7500" (3)
    1 Specify which nodes to enable disk encryption on. Valid values are none, all, master, and worker.
    2 Specify which disk encryption mode to use. Valid values are tpmv2 and tang.
    3 Optional: If you are using Tang, specify the Tang servers.
Additional resources

Creating and booting the agent image

Use this procedure to boot the agent image on your machines.

Procedure
  1. Create the agent image by running the following command:

    $ openshift-install --dir <install_directory> agent create image
    Red Hat Enterprise Linux CoreOS (RHCOS) supports multipathing on the primary disk, allowing stronger resilience to hardware failure to achieve higher host availability. Multipathing is enabled by default in the agent ISO image, with a default /etc/multipath.conf configuration.
  2. Boot the agent.x86_64.iso, agent.aarch64.iso, or agent.s390x.iso image on the bare metal machines.

Adding IBM Z agents with RHEL KVM

Use the following procedure to manually add IBM Z® agents with RHEL KVM. Only use this procedure for IBM Z® clusters with RHEL KVM.

The nmstateconfig parameter must be configured for the KVM boot.

Procedure
  1. Boot your RHEL KVM machine.

  2. To deploy the virtual server, run the virt-install command with the following parameters:

    $ virt-install
        --name <vm_name> \
        --autostart \
        --memory=<memory> \
        --cpu host \
        --vcpus=<vcpus> \
        --cdrom <agent_iso_image> \ (1)
        --disk pool=default,size=<disk_pool_size> \
        --network network:default,mac=<mac_address> \
        --graphics none \
        --noautoconsole \
        --os-variant rhel9.0 \
        --wait=-1
    1 For the --cdrom parameter, specify the location of the ISO image on the HTTP or HTTPS server.

Verifying that the current installation host can pull release images

After you boot the agent image and network services are made available to the host, the agent console application performs a pull check to verify that the current host can retrieve release images.

If the primary pull check passes, you can quit the application to continue with the installation. If the pull check fails, the application performs additional checks, as seen in the Additional checks section of the TUI, to help you troubleshoot the problem. A failure for any of the additional checks is not necessarily critical as long as the primary pull check succeeds.

If there are host network configuration issues that might cause an installation to fail, you can use the console application to make adjustments to your network configurations.

If the agent console application detects host network configuration issues, the installation workflow will be halted until the user manually stops the console application and signals the intention to proceed.

Procedure
  1. Wait for the agent console application to check whether or not the configured release image can be pulled from a registry.

  2. If the agent console application states that the installer connectivity checks have passed, wait for the prompt to time out to continue with the installation.

    You can still choose to view or change network configuration settings even if the connectivity checks have passed.

    However, if you choose to interact with the agent console application rather than letting it time out, you must manually quit the TUI to proceed with the installation.

  3. If the agent console application checks have failed, which is indicated by a red icon beside the Release image URL pull check, use the following steps to reconfigure the host’s network settings:

    1. Read the Check Errors section of the TUI. This section displays error messages specific to the failed checks.

      The home screen of the agent console application  displaying check errors
    2. Select Configure network to launch the NetworkManager TUI.

    3. Select Edit a connection and select the connection you want to reconfigure.

    4. Edit the configuration and select OK to save your changes.

    5. Select Back to return to the main screen of the NetworkManager TUI.

    6. Select Activate a Connection.

    7. Select the reconfigured network to deactivate it.

    8. Select the reconfigured network again to reactivate it.

    9. Select Back and then select Quit to return to the agent console application.

    10. Wait at least five seconds for the continuous network checks to restart using the new network configuration.

    11. If the Release image URL pull check succeeds and displays a green icon beside the URL, select Quit to exit the agent console application and continue with the installation.

Tracking and verifying installation progress

Use the following procedure to track installation progress and to verify a successful installation.

Prerequisites
  • You have configured a DNS record for the Kubernetes API server.

Procedure
  1. Optional: To know when the bootstrap host (rendezvous host) reboots, run the following command:

    $ ./openshift-install --dir <install_directory> agent wait-for bootstrap-complete \(1)
        --log-level=info (2)
    1 For <install_directory>, specify the path to the directory where the agent ISO was generated.
    2 To view different installation details, specify warn, debug, or error instead of info.
    Example output
    ...................................................................
    ...................................................................
    INFO Bootstrap configMap status is complete
    INFO cluster bootstrap is complete

    The command succeeds when the Kubernetes API server signals that it has been bootstrapped on the control plane machines.

  2. To track the progress and verify successful installation, run the following command:

    $ openshift-install --dir <install_directory> agent wait-for install-complete (1)
    1 For <install_directory> directory, specify the path to the directory where the agent ISO was generated.
    Example output
    ...................................................................
    ...................................................................
    INFO Cluster is installed
    INFO Install complete!
    INFO To access the cluster as the system:admin user when using 'oc', run
    INFO     export KUBECONFIG=/home/core/installer/auth/kubeconfig
    INFO Access the OpenShift web-console here: https://console-openshift-console.apps.sno-cluster.test.example.com

If you are using the optional method of GitOps ZTP manifests, you can configure IP address endpoints for cluster nodes through the AgentClusterInstall.yaml file in three ways:

  • IPv4

  • IPv6

  • IPv4 and IPv6 in parallel (dual-stack)

IPv6 is supported only on bare metal platforms.

Example of dual-stack networking
apiVIP: 192.168.11.3
ingressVIP: 192.168.11.4
clusterDeploymentRef:
  name: mycluster
imageSetRef:
  name: openshift-4.17
networking:
  clusterNetwork:
  - cidr: 172.21.0.0/16
    hostPrefix: 23
  - cidr: fd02::/48
    hostPrefix: 64
  machineNetwork:
  - cidr: 192.168.11.0/16
  - cidr: 2001:DB8::/32
  serviceNetwork:
  - 172.22.0.0/16
  - fd03::/112
  networkType: OVNKubernetes
Additional resources

Sample GitOps ZTP custom resources

You can optionally use GitOps Zero Touch Provisioning (ZTP) custom resource (CR) objects to install an OpenShift Container Platform cluster with the Agent-based Installer.

You can customize the following GitOps ZTP custom resources to specify more details about your OpenShift Container Platform cluster. The following sample GitOps ZTP custom resources are for a single-node cluster.

Example agent-cluster-install.yaml file
  apiVersion: extensions.hive.openshift.io/v1beta1
  kind: AgentClusterInstall
  metadata:
    name: test-agent-cluster-install
    namespace: cluster0
  spec:
    clusterDeploymentRef:
      name: ostest
    imageSetRef:
      name: openshift-4.17
    networking:
      clusterNetwork:
      - cidr: 10.128.0.0/14
        hostPrefix: 23
      serviceNetwork:
      - 172.30.0.0/16
    provisionRequirements:
      controlPlaneAgents: 1
      workerAgents: 0
    sshPublicKey: <ssh_public_key>
Example cluster-deployment.yaml file
apiVersion: hive.openshift.io/v1
kind: ClusterDeployment
metadata:
  name: ostest
  namespace: cluster0
spec:
  baseDomain: test.metalkube.org
  clusterInstallRef:
    group: extensions.hive.openshift.io
    kind: AgentClusterInstall
    name: test-agent-cluster-install
    version: v1beta1
  clusterName: ostest
  controlPlaneConfig:
    servingCertificates: {}
  platform:
    agentBareMetal:
      agentSelector:
        matchLabels:
          bla: aaa
  pullSecretRef:
    name: pull-secret
Example cluster-image-set.yaml file
apiVersion: hive.openshift.io/v1
kind: ClusterImageSet
metadata:
  name: openshift-4.17
spec:
  releaseImage: registry.ci.openshift.org/ocp/release:4.17.0-0.nightly-2022-06-06-025509
Example infra-env.yaml file
apiVersion: agent-install.openshift.io/v1beta1
kind: InfraEnv
metadata:
  name: myinfraenv
  namespace: cluster0
spec:
  clusterRef:
    name: ostest
    namespace: cluster0
  cpuArchitecture: aarch64
  pullSecretRef:
    name: pull-secret
  sshAuthorizedKey: <ssh_public_key>
  nmStateConfigLabelSelector:
    matchLabels:
      cluster0-nmstate-label-name: cluster0-nmstate-label-value
Example nmstateconfig.yaml file
apiVersion: agent-install.openshift.io/v1beta1
kind: NMStateConfig
metadata:
  name: master-0
  namespace: openshift-machine-api
  labels:
    cluster0-nmstate-label-name: cluster0-nmstate-label-value
spec:
  config:
    interfaces:
      - name: eth0
        type: ethernet
        state: up
        mac-address: 52:54:01:aa:aa:a1
        ipv4:
          enabled: true
          address:
            - ip: 192.168.122.2
              prefix-length: 23
          dhcp: false
    dns-resolver:
      config:
        server:
          - 192.168.122.1
    routes:
      config:
        - destination: 0.0.0.0/0
          next-hop-address: 192.168.122.1
          next-hop-interface: eth0
          table-id: 254
  interfaces:
    - name: "eth0"
      macAddress: 52:54:01:aa:aa:a1
Example pull-secret.yaml file
apiVersion: v1
kind: Secret
type: kubernetes.io/dockerconfigjson
metadata:
  name: pull-secret
  namespace: cluster0
stringData:
  .dockerconfigjson: <pull_secret>
Additional resources

Gathering log data from a failed Agent-based installation

Use the following procedure to gather log data about a failed Agent-based installation to provide for a support case.

Prerequisites
  • You have configured a DNS record for the Kubernetes API server.

Procedure
  1. Run the following command and collect the output:

    $ ./openshift-install --dir <installation_directory> agent wait-for bootstrap-complete --log-level=debug
    Example error message
    ...
    ERROR Bootstrap failed to complete: : bootstrap process timed out: context deadline exceeded
  2. If the output from the previous command indicates a failure, or if the bootstrap is not progressing, run the following command to connect to the rendezvous host and collect the output:

    $ ssh core@<node-ip> agent-gather -O >agent-gather.tar.xz

    Red Hat Support can diagnose most issues using the data gathered from the rendezvous host, but if some hosts are not able to register, gathering this data from every host might be helpful.

  3. If the bootstrap completes and the cluster nodes reboot, run the following command and collect the output:

    $ ./openshift-install --dir <install_directory> agent wait-for install-complete --log-level=debug
  4. If the output from the previous command indicates a failure, perform the following steps:

    1. Export the kubeconfig file to your environment by running the following command:

      $ export KUBECONFIG=<install_directory>/auth/kubeconfig
    2. Gather information for debugging by running the following command:

      $ oc adm must-gather
    3. Create a compressed file from the must-gather directory that was just created in your working directory by running the following command:

      $ tar cvaf must-gather.tar.gz <must_gather_directory>
  5. Excluding the /auth subdirectory, attach the installation directory used during the deployment to your support case on the Red Hat Customer Portal.

  6. Attach all other data gathered from this procedure to your support case.