Understanding the custom metrics autoscaler triggers

Triggers, also known as scalers, provide the metrics that the Custom Metrics Autoscaler Operator uses to scale your pods.

The custom metrics autoscaler currently supports only the Prometheus, CPU, memory, Apache Kafka, and cron triggers.

You use a ScaledObject or ScaledJob custom resource to configure triggers for specific objects, as described in the sections that follow.

You can configure a certificate authority to use with your scaled objects or for all scalers in the cluster.

Understanding the Prometheus trigger

You can scale pods based on Prometheus metrics, which can use the installed OKD monitoring or an external Prometheus server as the metrics source. See "Configuring the custom metrics autoscaler to use OKD monitoring" for information on the configurations required to use the OKD monitoring as a source for metrics.

If Prometheus is collecting metrics from the application that the custom metrics autoscaler is scaling, do not set the minimum replicas to 0 in the custom resource. If there are no application pods, the custom metrics autoscaler does not have any metrics to scale on.
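For example, you can keep at least one replica available as a metrics source by setting the minReplicaCount field in the scaled object. This is a minimal sketch; the rest of the spec is elided:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: prom-scaledobject
  namespace: my-namespace
spec:
# ...
  minReplicaCount: 1
# ...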

Example scaled object with a Prometheus target
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: prom-scaledobject
  namespace: my-namespace
spec:
# ...
  triggers:
  - type: prometheus (1)
    metadata:
      serverAddress: https://thanos-querier.openshift-monitoring.svc.cluster.local:9092 (2)
      namespace: kedatest (3)
      metricName: http_requests_total (4)
      threshold: '5' (5)
      query: sum(rate(http_requests_total{job="test-app"}[1m])) (6)
      authModes: basic (7)
      cortexOrgID: my-org (8)
      ignoreNullValues: "false" (9)
      unsafeSsl: "false" (10)
1 Specifies Prometheus as the trigger type.
2 Specifies the address of the Prometheus server. This example uses OKD monitoring.
3 Optional: Specifies the namespace of the object you want to scale. This parameter is mandatory if using OKD monitoring as a source for the metrics.
4 Specifies the name to identify the metric in the external.metrics.k8s.io API. If you are using more than one trigger, all metric names must be unique.
5 Specifies the value that triggers scaling. Must be specified as a quoted string value.
6 Specifies the Prometheus query to use.
7 Specifies the authentication method to use. Prometheus scalers support bearer authentication (bearer), basic authentication (basic), or TLS authentication (tls). You configure the specific authentication parameters in a trigger authentication, as discussed in a following section. As needed, you can also use a secret.
8 Optional: Passes the X-Scope-OrgID header to multi-tenant Cortex or Mimir storage for Prometheus. This parameter is required only with multi-tenant Prometheus storage, to indicate which data Prometheus should return.
9 Optional: Specifies how the trigger should proceed if the Prometheus target is lost.
  • If true, the trigger continues to operate if the Prometheus target is lost. This is the default behavior.

  • If false, the trigger returns an error if the Prometheus target is lost.

10 Optional: Specifies whether the certificate check should be skipped. For example, you might skip the check if you are running in a test environment and using self-signed certificates at the Prometheus endpoint.
  • If false, the certificate check is performed. This is the default behavior.

  • If true, the certificate check is not performed.

    Skipping the check is not recommended.
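For example, a trigger authentication for the basic authentication mode might look like the following sketch. The secret name prometheus-basic-auth and its keys are assumptions for illustration:

apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: keda-trigger-auth-prom-basic
  namespace: my-namespace
spec:
  secretTargetRef:
  - parameter: username (1)
    name: prometheus-basic-auth
    key: username
  - parameter: password (2)
    name: prometheus-basic-auth
    key: password
1 Supplies the user name for basic authentication from the referenced secret.
2 Supplies the password for basic authentication from the referenced secret.

The trigger then sets authModes: basic and references this object through triggers.authenticationRef.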

Configuring the custom metrics autoscaler to use OKD monitoring

You can use the installed OKD Prometheus monitoring as a source for the metrics used by the custom metrics autoscaler. However, there are some additional configurations you must perform.

These steps are not required for an external Prometheus source.

You must perform the following tasks, as described in this section:

  • Create a service account.

  • Create a secret that generates a token for the service account.

  • Create the trigger authentication.

  • Create a role.

  • Add that role to the service account.

  • Reference the token in the trigger authentication object used by Prometheus.

Prerequisites
  • OKD monitoring must be installed.

  • Monitoring of user-defined workloads must be enabled in OKD monitoring, as described in the Creating a user-defined workload monitoring config map section.

  • The Custom Metrics Autoscaler Operator must be installed.

Procedure
  1. Change to the project with the object you want to scale:

    $ oc project my-project
  2. Create a service account and token, if your cluster does not have one:

    1. Create a service account object by using the following command:

      $ oc create serviceaccount thanos (1)
      1 Specifies the name of the service account.
    2. Create a secret YAML to generate a service account token:

      apiVersion: v1
      kind: Secret
      metadata:
        name: thanos-token
        annotations:
          kubernetes.io/service-account.name: thanos (1)
      type: kubernetes.io/service-account-token
      1 Specifies the name of the service account.
    3. Create the secret object by using the following command:

      $ oc create -f <file_name>.yaml
    4. Use the following command to locate the token assigned to the service account:

      $ oc describe serviceaccount thanos (1)
      1 Specifies the name of the service account.
      Example output
      Name:                thanos
      Namespace:           my-project
      Labels:              <none>
      Annotations:         <none>
      Image pull secrets:  thanos-dockercfg-nnwgj
      Mountable secrets:   thanos-dockercfg-nnwgj
      Tokens:              thanos-token (1)
      Events:              <none>
      
      1 Use this token in the trigger authentication.
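      Alternatively, you can read the token value directly from the secret by using standard oc and base64 tooling, as in this sketch:

      $ oc get secret thanos-token -o jsonpath='{.data.token}' | base64 --decode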
  3. Create a trigger authentication with the service account token:

    1. Create a YAML file similar to the following:

      apiVersion: keda.sh/v1alpha1
      kind: TriggerAuthentication
      metadata:
        name: keda-trigger-auth-prometheus
      spec:
        secretTargetRef: (1)
        - parameter: bearerToken (2)
          name: thanos-token (3)
          key: token (4)
        - parameter: ca
          name: thanos-token
          key: ca.crt
      1 Specifies that this object uses a secret for authorization.
      2 Specifies the authentication parameter to supply by using the token.
      3 Specifies the name of the token to use.
      4 Specifies the key in the token to use with the specified parameter.
    2. Create the CR object:

      $ oc create -f <file_name>.yaml
  4. Create a role for reading Thanos metrics:

    1. Create a YAML file with the following parameters:

      apiVersion: rbac.authorization.k8s.io/v1
      kind: Role
      metadata:
        name: thanos-metrics-reader
      rules:
      - apiGroups:
        - ""
        resources:
        - pods
        verbs:
        - get
      - apiGroups:
        - metrics.k8s.io
        resources:
        - pods
        - nodes
        verbs:
        - get
        - list
        - watch
    2. Create the CR object:

      $ oc create -f <file_name>.yaml
  5. Create a role binding for reading Thanos metrics:

    1. Create a YAML file similar to the following:

      apiVersion: rbac.authorization.k8s.io/v1
      kind: RoleBinding
      metadata:
        name: thanos-metrics-reader (1)
        namespace: my-project (2)
      roleRef:
        apiGroup: rbac.authorization.k8s.io
        kind: Role
        name: thanos-metrics-reader
      subjects:
      - kind: ServiceAccount
        name: thanos (3)
        namespace: my-project (4)
      1 Specifies the name of the role you created.
      2 Specifies the namespace of the object you want to scale.
      3 Specifies the name of the service account to bind to the role.
      4 Specifies the namespace of the object you want to scale.
    2. Create the CR object:

      $ oc create -f <file_name>.yaml

You can now deploy a scaled object or scaled job to enable autoscaling for your application, as described in "Understanding how to add custom metrics autoscalers". To use OKD monitoring as the source, you must include the following parameters in the trigger, also known as the scaler, as shown in the sketch after this list:

  • triggers.type must be prometheus

  • triggers.metadata.serverAddress must be https://thanos-querier.openshift-monitoring.svc.cluster.local:9092

  • triggers.metadata.authModes must be bearer

  • triggers.metadata.namespace must be set to the namespace of the object to scale

  • triggers.authenticationRef must point to the trigger authentication resource specified in the previous step
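The following sketch pulls these parameters together into a scaled object. The target deployment name my-app, the query, and the threshold are assumptions for illustration:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: prom-scaledobject
  namespace: my-project
spec:
  scaleTargetRef:
    name: my-app (1)
  triggers:
  - type: prometheus
    metadata:
      serverAddress: https://thanos-querier.openshift-monitoring.svc.cluster.local:9092
      namespace: my-project (2)
      metricName: http_requests_total
      threshold: '5'
      query: sum(rate(http_requests_total{job="my-app"}[1m]))
      authModes: bearer
    authenticationRef:
      name: keda-trigger-auth-prometheus (3)
1 A hypothetical deployment to scale.
2 The namespace of the object you want to scale.
3 The trigger authentication created earlier in this procedure.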

Understanding the CPU trigger

You can scale pods based on CPU metrics. This trigger uses cluster metrics as the source for metrics.

The custom metrics autoscaler scales the pods associated with an object to maintain the CPU usage that you specify. The autoscaler increases or decreases the number of replicas between the minimum and maximum numbers to maintain the specified CPU utilization across all pods. The CPU trigger considers the CPU utilization of the entire pod. If the pod has multiple containers, the trigger considers the total CPU utilization of all containers in the pod.

  • This trigger cannot be used with the ScaledJob custom resource.

  • When using a CPU trigger to scale an object, the object does not scale to 0, even if you are using multiple triggers.

Example scaled object with a CPU target
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: cpu-scaledobject
  namespace: my-namespace
spec:
# ...
  triggers:
  - type: cpu (1)
    metricType: Utilization (2)
    metadata:
      value: '60' (3)
  minReplicaCount: 1 (4)
1 Specifies CPU as the trigger type.
2 Specifies the type of metric to use, either Utilization or AverageValue.
3 Specifies the value that triggers scaling. Must be specified as a quoted string value.
  • When using Utilization, the target value is the average of the resource metrics across all relevant pods, represented as a percentage of the requested value of the resource for the pods.

  • When using AverageValue, the target value is the average of the metrics across all relevant pods.

4 Specifies the minimum number of replicas when scaling down. For a CPU trigger, enter a value of 1 or greater, because the HPA cannot scale to zero if you are using only CPU metrics.
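For comparison, a trigger that targets an average CPU value instead of a utilization percentage might look like the following sketch; the 500m target is an assumption for illustration:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: cpu-scaledobject
  namespace: my-namespace
spec:
# ...
  triggers:
  - type: cpu
    metricType: AverageValue
    metadata:
      value: '500m'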

Understanding the memory trigger

You can scale pods based on memory metrics. This trigger uses cluster metrics as the source for metrics.

The custom metrics autoscaler scales the pods associated with an object to maintain the average memory usage that you specify. The autoscaler increases and decreases the number of replicas between the minimum and maximum numbers to maintain the specified memory utilization across all pods. The memory trigger considers the memory utilization of the entire pod. If the pod has multiple containers, the memory utilization is the sum across all of the containers.

  • This trigger cannot be used with the ScaledJob custom resource.

  • When using a memory trigger to scale an object, the object does not scale to 0, even if you are using multiple triggers.

Example scaled object with a memory target
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: memory-scaledobject
  namespace: my-namespace
spec:
# ...
  triggers:
  - type: memory (1)
    metricType: Utilization (2)
    metadata:
      value: '60' (3)
      containerName: api (4)
1 Specifies memory as the trigger type.
2 Specifies the type of metric to use, either Utilization or AverageValue.
3 Specifies the value that triggers scaling. Must be specified as a quoted string value.
  • When using Utilization, the target value is the average of the resource metrics across all relevant pods, represented as a percentage of the requested value of the resource for the pods.

  • When using AverageValue, the target value is the average of the metrics across all relevant pods.

4 Optional: Specifies an individual container to scale, based on the memory utilization of only that container, rather than the entire pod. In this example, only the container named api is to be scaled.

Understanding the Kafka trigger

You can scale pods based on an Apache Kafka topic or other services that support the Kafka protocol. The custom metrics autoscaler does not scale higher than the number of Kafka partitions, unless you set the allowIdleConsumers parameter to true in the scaled object or scaled job.

If the number of consumers in a consumer group exceeds the number of partitions in a topic, the extra consumers remain idle. To avoid this, by default the number of replicas does not exceed the following values:

  • The number of partitions on a topic, if a topic is specified

  • The number of partitions of all topics in the consumer group, if no topic is specified

  • The maxReplicaCount specified in the scaled object or scaled job CR

You can use the allowIdleConsumers parameter to disable these default behaviors.

Example scaled object with a Kafka target
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: kafka-scaledobject
  namespace: my-namespace
spec:
# ...
  triggers:
  - type: kafka (1)
    metadata:
      topic: my-topic (2)
      bootstrapServers: my-cluster-kafka-bootstrap.openshift-operators.svc:9092 (3)
      consumerGroup: my-group (4)
      lagThreshold: '10' (5)
      activationLagThreshold: '5' (6)
      offsetResetPolicy: latest (7)
      allowIdleConsumers: true (8)
      scaleToZeroOnInvalidOffset: false (9)
      excludePersistentLag: false (10)
      version: '1.0.0' (11)
      partitionLimitation: '1,2,10-20,31' (12)
      tls: enable (13)
1 Specifies Kafka as the trigger type.
2 Specifies the name of the Kafka topic for which the offset lag is calculated.
3 Specifies a comma-separated list of Kafka brokers to connect to.
4 Specifies the name of the Kafka consumer group used for checking the offset on the topic and processing the related lag.
5 Optional: Specifies the average target value that triggers scaling. Must be specified as a quoted string value. The default is 5.
6 Optional: Specifies the target value for the activation phase. Must be specified as a quoted string value.
7 Optional: Specifies the Kafka offset reset policy for the Kafka consumer. The available values are: latest and earliest. The default is latest.
8 Optional: Specifies whether the number of Kafka replicas can exceed the number of partitions on a topic.
  • If true, the number of Kafka replicas can exceed the number of partitions on a topic. This allows for idle Kafka consumers.

  • If false, the number of Kafka replicas cannot exceed the number of partitions on a topic. This is the default.

9 Optional: Specifies how the trigger behaves when a Kafka partition does not have a valid offset.
  • If true, the consumers are scaled to zero for that partition.

  • If false, the scaler keeps a single consumer for that partition. This is the default.

10 Optional: Specifies whether the trigger includes or excludes partition lag for partitions whose current offset is the same as the current offset of the previous polling cycle.
  • If true, the scaler excludes partition lag in these partitions.

  • If false, the trigger includes all consumer lag in all partitions. This is the default.

11 Optional: Specifies the version of your Kafka brokers. Must be specified as a quoted string value. The default is 1.0.0.
12 Optional: Specifies a comma-separated list of partition IDs to limit scaling to. If set, only the listed IDs are considered when calculating lag. Must be specified as a quoted string value. The default is to consider all partitions.
13 Optional: Specifies whether to use TLS client authentication for Kafka. The default is disable. For information on configuring TLS, see "Understanding custom metrics autoscaler trigger authentications".
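As a sketch of what that TLS configuration might look like, the following trigger authentication supplies a certificate authority, client certificate, and private key from a secret. The secret name kafka-tls-certs and its keys are assumptions for illustration:

apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: keda-trigger-auth-kafka-tls
  namespace: my-namespace
spec:
  secretTargetRef:
  - parameter: ca (1)
    name: kafka-tls-certs
    key: ca.crt
  - parameter: cert (2)
    name: kafka-tls-certs
    key: tls.crt
  - parameter: key (3)
    name: kafka-tls-certs
    key: tls.key
1 Supplies the certificate authority used to verify the Kafka brokers.
2 Supplies the client certificate for TLS client authentication.
3 Supplies the private key for the client certificate.

The scaled object then references this object through triggers.authenticationRef.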

Understanding the Cron trigger

You can scale pods based on a time range.

When the time range starts, the custom metrics autoscaler scales the pods associated with an object from the configured minimum number of pods to the specified number of desired pods. At the end of the time range, the pods are scaled back to the configured minimum. The time period must be configured in cron format.

The following example scales the pods associated with this scaled object from 0 to 100 from 6:00 AM to 6:30 PM India Standard Time.

Example scaled object with a Cron trigger
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: cron-scaledobject
  namespace: default
spec:
  scaleTargetRef:
    name: my-deployment
  minReplicaCount: 0 (1)
  maxReplicaCount: 100 (2)
  cooldownPeriod: 300
  triggers:
  - type: cron (3)
    metadata:
      timezone: Asia/Kolkata (4)
      start: "0 6 * * *" (5)
      end: "30 18 * * *" (6)
      desiredReplicas: "100" (7)
1 Specifies the minimum number of pods to scale down to at the end of the time frame.
2 Specifies the maximum number of replicas when scaling up. This value should be the same as desiredReplicas. The default is 100.
3 Specifies a Cron trigger.
4 Specifies the timezone for the time frame. This value must be from the IANA Time Zone Database.
5 Specifies the start of the time frame.
6 Specifies the end of the time frame.
7 Specifies the number of pods to scale to between the start and end of the time frame. This value should be the same as maxReplicaCount.