Parts of Cluster Monitoring are configurable. The API is accessible through parameters defined in various configmaps.
Depending on which part of the stack you want to configure, edit the following:
The configuration of OpenShift Container Platform monitoring components in a configmap called cluster-monitoring-config
in the openshift-monitoring
namespace. Defined by ClusterMonitoringConfiguration.
The configuration of components that monitor user-defined projects in a configmap called user-workload-monitoring-config
in the openshift-user-workload-monitoring
namespace. Defined by UserWorkloadConfiguration.
The configuration file itself is always defined under the config.yaml
key within the configmap data.
Not all configuration parameters are exposed. Configuring Cluster Monitoring is optional. If the configuration does not exist or is empty or malformed, defaults are used. |
AdditionalAlertmanagerConfig
defines configuration on how a component should communicate with aditional Alertmanager instances.
apiVersion
Appears in: PrometheusK8sConfig, PrometheusRestrictedConfig, ThanosRulerConfig
Property | Type | Description |
---|---|---|
apiVersion |
string |
APIVersion defines the api version of Alertmanager. |
bearerToken |
BearerToken defines the bearer token to use when authenticating to Alertmanager. |
|
pathPrefix |
string |
PathPrefix defines the path prefix to add in front of the push endpoint path. |
scheme |
string |
Scheme the URL scheme to use when talking to Alertmanagers. |
staticConfigs |
array(string) |
StaticConfigs a list of statically configured Alertmanagers. |
timeout |
string |
Timeout defines the timeout used when sending alerts. |
tlsConfig |
TLSConfig defines the TLS Config to use for alertmanager connection. |
AlertmanagerMainConfig
defines configuration related with the main Alertmanager instance.
Appears in: ClusterMonitoringConfiguration
Property | Type | Description |
---|---|---|
enabled |
bool |
Enabled a boolean flag to enable or disable the main Alertmanager instance under openshift-monitoring default: true |
enableUserAlertmanagerConfig |
bool |
EnableUserAlertManagerConfig boolean flag to enable or disable user-defined namespaces to be selected for AlertmanagerConfig lookup, by default Alertmanager only looks for configuration in the namespace where it was deployed to. This will only work if the UWM Alertmanager instance is not enabled. default: false |
logLevel |
string |
LogLevel defines the log level for Alertmanager. Possible values are: error, warn, info, debug. default: info |
nodeSelector |
map[string]string |
NodeSelector defines which Nodes the Pods are scheduled on. |
resources |
Resources define resources requests and limits for single Pods. |
|
tolerations |
array(v1.Toleration) |
Tolerations defines the Pods tolerations. |
volumeClaimTemplate |
VolumeClaimTemplate defines persistent storage for Alertmanager. It’s possible to configure storageClass and size of volume. |
ClusterMonitoringConfiguration
defines configuration that allows users to customise the platform monitoring stack through the cluster-monitoring-config configmap in the openshift-monitoring namespace
Property | Type | Description |
---|---|---|
alertmanagerMain |
AlertmanagerMainConfig defines configuration related with the main Alertmanager instance. |
|
enableUserWorkload |
bool |
UserWorkloadEnabled boolean flag to enable monitoring for user-defined projects. |
k8sPrometheusAdapter |
K8sPrometheusAdapter defines configuration related with prometheus-adapter |
|
kubeStateMetrics |
KubeStateMetricsConfig defines configuration related with kube-state-metrics agent |
|
prometheusK8s |
PrometheusK8sConfig defines configuration related with prometheus |
|
prometheusOperator |
PrometheusOperatorConfig defines configuration related with prometheus-operator |
|
openshiftStateMetrics |
OpenShiftMetricsConfig defines configuration related with openshift-state-metrics agent |
|
thanosQuerier |
ThanosQuerierConfig defines configuration related with the Thanos Querier component |
K8sPrometheusAdapter
defines configuration related with Prometheus Adapater
Appears in: ClusterMonitoringConfiguration
Property | Type | Description |
---|---|---|
audit |
Audit |
Audit defines the audit configuration to be used by the prometheus adapter instance. Possible profile values are: "metadata, request, requestresponse, none". default: metadata |
nodeSelector |
map[string]string |
NodeSelector defines which Nodes the Pods are scheduled on. |
tolerations |
array(v1.Toleration) |
Tolerations defines the Pods tolerations. |
KubeStateMetricsConfig
defines configuration related with the kube-state-metrics agent.
Appears in: ClusterMonitoringConfiguration
Property | Type | Description |
---|---|---|
nodeSelector |
map[string]string |
NodeSelector defines which Nodes the Pods are scheduled on. |
tolerations |
array(v1.Toleration) |
Tolerations defines the Pods tolerations. |
OpenShiftStateMetricsConfig
holds configuration related to openshift-state-metrics agent.
Appears in: ClusterMonitoringConfiguration
Property | Type | Description |
---|---|---|
nodeSelector |
map[string]string |
NodeSelector defines which Nodes the Pods are scheduled on. |
tolerations |
array(v1.Toleration) |
Tolerations defines the Pods tolerations. |
PrometheusK8sConfig
holds configuration related to the Prometheus component.
Appears in: ClusterMonitoringConfiguration
Property | Type | Description |
---|---|---|
additionalAlertmanagerConfigs |
array(AdditionalAlertmanagerConfig) |
AlertmanagerConfigs holds configuration about how the Prometheus component should communicate with aditional Alertmanager instances. default: nil |
externalLabels |
map[string]string |
ExternalLabels defines labels to be added to any time series or alerts when communicating with external systems (federation, remote storage, Alertmanager). default: nil |
logLevel |
string |
LogLevel defines the log level for Prometheus. Possible values are: error, warn, info, debug. default: info |
nodeSelector |
map[string]string |
NodeSelector defines which Nodes the Pods are scheduled on. |
queryLogFile |
string |
QueryLogFile specifies the file to which PromQL
queries are logged. Suports both just a filename in which case they will
be saved to an emptyDir volume at |
remoteWrite |
array(remotewritespec) |
RemoteWrite Holds the remote write configuration, everything from url, authorization to relabeling |
resources |
Resources define resources requests and limits for single Pods. |
|
retention |
string |
Retention defines the Time duration Prometheus shall retain data for. Must match the regular expression [0-9]+(ms|s|m|h|d|w|y) (milliseconds seconds minutes hours days weeks years). default: 15d |
tolerations |
array(v1.Toleration) |
Tolerations defines the Pods tolerations. |
volumeClaimTemplate |
VolumeClaimTemplate defines persistent storage for Prometheus. It’s possible to configure storageClass and size of volume. |
PrometheusOperatorConfig
holds configuration related to Prometheus Operator.
Appears in: ClusterMonitoringConfiguration, UserWorkloadConfiguration
Property | Type | Description |
---|---|---|
logLevel |
string |
LogLevel defines the log level for Prometheus Operator. Possible values are: error, warn, info, debug. default: info |
nodeSelector |
map[string]string |
NodeSelector defines which Nodes the Pods are scheduled on. |
tolerations |
array(v1.Toleration) |
Tolerations defines the Pods tolerations. |
PrometheusRestrictedConfig
defines configuration related to the Prometheus component that will monitor user-defined projects.
Appears in: UserWorkloadConfiguration
Property | Type | Description |
---|---|---|
additionalAlertmanagerConfigs |
array(additionalalertmanagerconfig) |
AlertmanagerConfigs holds configuration about how the Prometheus component should communicate with aditional Alertmanager instances. default: nil |
enforcedSampleLimit |
uint64 |
EnforcedSampleLimit defines a global limit on the number of scraped samples that will be accepted. This overrides any SampleLimit set per ServiceMonitor or/and PodMonitor. It is meant to be used by admins to enforce the SampleLimit to keep the overall number of samples/series under the desired limit. Note that if SampleLimit is lower that value will be taken instead. default: 0 |
enforcedTargetLimit |
uint64 |
EnforcedTargetLimit defines a global limit on the number of scraped targets. This overrides any TargetLimit set per ServiceMonitor or/and PodMonitor. It is meant to be used by admins to enforce the TargetLimit to keep the overall number of targets under the desired limit. Note that if TargetLimit is lower, that value will be taken instead, except if either value is zero, in which case the non-zero value will be used. If both values are zero, no limit is enforced. default: 0 |
externalLabels |
map[string]string |
ExternalLabels defines labels to be added to any time series or alerts when communicating with external systems (federation, remote storage, Alertmanager). default: nil |
logLevel |
string |
LogLevel defines the log level for Prometheus. Possible values are: error, warn, info, debug. default: info |
nodeSelector |
map[string]string |
NodeSelector defines which Nodes the Pods are scheduled on. |
queryLogFile |
string |
QueryLogFile specifies the file to which PromQL queries are logged. Suports both just a filename in which case they will be saved to an emptyDir volume at /var/log/prometheus, if a full path is given an emptyDir volume will be mounted at that location. Relative paths not supported, also not supported writing to linux std streams. default: "" |
remoteWrite |
array(remotewritespec) |
RemoteWrite Holds the remote write configuration, everything from url, authorization to relabeling |
resources |
Resources define resources requests and limits for single Pods. |
|
retention |
string |
Retention defines the Time duration Prometheus shall retain data for. Must match the regular expression [0-9]+(ms|s|m|h|d|w|y) (milliseconds seconds minutes hours days weeks years). default: 15d |
tolerations |
array(v1.Toleration) |
Tolerations defines the Pods tolerations. |
volumeClaimTemplate |
VolumeClaimTemplate defines persistent storage for Prometheus. It’s possible to configure storageClass and size of volume. |
RemoteWriteSpec
is an almost identical copy of monv1.RemoteWriteSpec
but with the BearerToken
field removed. In the future other fields might be added here.
url
Appears in: PrometheusK8sConfig, PrometheusRestrictedConfig
Property | Type | Description |
---|---|---|
authorization |
monv1.SafeAuthorization |
Authorization defines the authorization section for remote write |
basicAuth |
BasicAuth defines configuration for basic authentication for the URL. |
|
bearerTokenFile |
string |
BearerTokenFile defines the file where the bearer token for remote write resides. |
headers |
map[string]string |
Headers custom HTTP headers to be sent along with each remote write request. Be aware that headers that are set by Prometheus itself can’t be overwritten. |
metadataConfig |
MetadataConfig configures the sending of series metadata to remote storage. |
|
name |
string |
Name defines the name of the remote write queue, must be unique if specified. The name is used in metrics and logging in order to differentiate queues. |
oauth2 |
monv1.OAuth2 |
OAuth2 configures OAuth2 authentication for remote write. |
proxyUrl |
string |
ProxyURL defines an optional proxy URL |
queueConfig |
QueueConfig allows tuning of the remote write queue parameters. |
|
remoteTimeout |
string |
RemoteTimeout defines the timeout for requests to the remote write endpoint. |
sigv4 |
monv1.Sigv4 |
Sigv4 allows to configures AWS’s Signature Verification 4 |
tlsConfig |
TLSConfig defines the TLS configuration to use for remote write. |
|
url |
string |
URL defines the URL of the endpoint to send samples to. |
writeRelabelConfigs |
array(monv1.RelabelConfig) |
WriteRelabelConfigs defines the list of remote write relabel configurations. |
insecureSkipVerify
Appears in: AdditionalAlertmanagerConfig
Property | Type | Description |
---|---|---|
ca |
CA defines the CA cert in the Prometheus container to use for the targets. |
|
cert |
Cert defines the client cert in the Prometheus container to use for the targets. |
|
key |
Key defines the client key in the Prometheus container to use for the targets. |
|
serverName |
string |
ServerName used to verify the hostname for the targets. |
insecureSkipVerify |
bool |
InsecureSkipVerify disable target certificate validation. |
ThanosQuerierConfig
holds configuration related to Thanos Querier component.
Appears in: ClusterMonitoringConfiguration
Property | Type | Description |
---|---|---|
enableRequestLogging |
bool |
EnableRequestLogging boolean flag to enable or disable request logging default: false |
logLevel |
string |
LogLevel defines the log level for Thanos Querier. Possible values are: error, warn, info, debug. default: info |
nodeSelector |
map[string]string |
NodeSelector defines which Nodes the Pods are scheduled on. |
resources |
Resources define resources requests and limits for single Pods. |
|
tolerations |
array(v1.Toleration) |
Tolerations defines the Pods tolerations. |
ThanosRulerConfig
defines configuration for the Thanos Ruler instance for user-defined projects.
Appears in: UserWorkloadConfiguration
Property | Type | Description |
---|---|---|
additionalAlertmanagerConfigs |
array(additionalalertmanagerconfig) |
AlertmanagerConfigs holds configuration about how the Thanos Ruler component should communicate with aditional Alertmanager instances. default: nil |
logLevel |
string |
LogLevel defines the log level for Thanos Ruler. Possible values are: error, warn, info, debug. default: info |
nodeSelector |
map[string]string |
NodeSelector defines which Nodes the Pods are scheduled on. |
resources |
Resources define resources requests and limits for single Pods. |
|
retention |
string |
Retention defines the time duration Thanos Ruler shall retain data for. Must match the regular expression [0-9]+(ms|s|m|h|d|w|y) (milliseconds seconds minutes hours days weeks years). default: 15d |
tolerations |
array(v1.Toleration) |
Tolerations defines the Pods tolerations. |
volumeClaimTemplate |
VolumeClaimTemplate defines persistent storage for Thanos Ruler. It’s possible to configure storageClass and size of volume. |
UserWorkloadConfiguration
defines configuration that allows users to customise the monitoring stack responsible for user-defined projects through the user-workload-monitoring-config configmap in the openshift-user-workload-monitoring namespace
Property | Type | Description |
---|---|---|
prometheus |
Prometheus defines configuration for Prometheus component. |
|
prometheusOperator |
PrometheusOperator defines configuration for prometheus-operator component. |
|
thanosRuler |
ThanosRuler defines configuration for the Thanos Ruler component |