This is a cache of https://docs.openshift.com/container-platform/4.9/logging/cluster-logging-release-notes.html. It is a snapshot of the page at 2024-11-29T18:44:22.909+0000.
Release notes | Logging | OpenShift Container Platform 4.9
×
Logging Compatibility

The logging subsystem for Red Hat OpenShift is provided as an installable component, with a distinct release cycle from the core OpenShift Container Platform. The Red Hat OpenShift Container Platform Life Cycle Policy outlines release compatibility.

Logging 5.5.8

Bug fixes

  • Before this update, the priority field was missing from systemd logs due to an error in how the collector set level fields. With this update, these fields are set correctly, resolving the issue. (LOG-3630)

Logging 5.5.7

Bug fixes

  • Before this update, the LokiStack Gateway Labels Enforcer generated parsing errors for valid LogQL queries when using combined label filters with boolean expressions. With this update, the LokiStack LogQL implementation supports label filters with boolean expression and resolves the issue. (LOG-3534)

  • Before this update, the ClusterLogForwarder custom resource (CR) did not pass TLS credentials for syslog output to Fluentd, resulting in errors during forwarding. With this update, credentials pass correctly to Fluentd, resolving the issue. (LOG-3533)

Logging 5.5.6

Known issues

Bug fixes

  • Before this update, the Pod Security admission controller added the label podSecurityLabelSync = true to the openshift-logging namespace. This resulted in our specified security labels being overwritten, and as a result Collector pods would not start. With this update, the label podSecurityLabelSync = false preserves security labels. Collector pods deploy as expected. (LOG-3340)

  • Before this update, the Operator installed the console view plugin, even when it was not enabled on the cluster. This caused the Operator to crash. With this update, if an account for a cluster does not have the console view enabled, the Operator functions normally and does not install the console view. (LOG-3407)

  • Before this update, a prior fix to support a regression where the status of the Elasticsearch deployment was not being updated caused the Operator to crash unless the Red Hat Elasticsearch Operator was deployed. With this update, that fix has been reverted so the Operator is now stable but re-introduces the previous issue related to the reported status. (LOG-3428)

  • Before this update, the Loki Operator only deployed one replica of the LokiStack gateway regardless of the chosen stack size. With this update, the number of replicas is correctly configured according to the selected size. (LOG-3478)

  • Before this update, records written to Elasticsearch would fail if multiple label keys had the same prefix and some keys included dots. With this update, underscores replace dots in label keys, resolving the issue. (LOG-3341)

  • Before this update, the logging view plugin contained an incompatible feature for certain versions of OpenShift Container Platform. With this update, the correct release stream of the plugin resolves the issue. (LOG-3467)

  • Before this update, the reconciliation of the ClusterLogForwarder custom resource would incorrectly report a degraded status of one or more pipelines causing the collector pods to restart every 8-10 seconds. With this update, reconciliation of the ClusterLogForwarder custom resource processes correctly, resolving the issue. (LOG-3469)

  • Before this change the spec for the outputDefaults field of the ClusterLogForwarder custom resource would apply the settings to every declared Elasticsearch output type. This change corrects the behavior to match the enhancement specification where the setting specifically applies to the default managed Elasticsearch store. (LOG-3342)

  • Before this update, the the OpenShift CLI (oc) must-gather script did not complete because the OpenShift CLI (oc) needs a folder with write permission to build its cache. With this update, the OpenShift CLI (oc) has write permissions to a folder, and the must-gather script completes successfully. (LOG-3472)

  • Before this update, the Loki Operator webhook server caused TLS errors. With this update, the Loki Operator webhook PKI is managed by the Operator Lifecycle Manager’s dynamic webhook management resolving the issue. (LOG-3511)

Logging 5.5.5

Bug fixes

  • Before this update, Kibana had a fixed 24h OAuth cookie expiration time, which resulted in 401 errors in Kibana whenever the accessTokenInactivityTimeout field was set to a value lower than 24h. With this update, Kibana’s OAuth cookie expiration time synchronizes to the accessTokenInactivityTimeout, with a default value of 24h. (LOG-3305)

  • Before this update, Vector parsed the message field when JSON parsing was enabled without also defining structuredTypeKey or structuredTypeName values. With this update, a value is required for either structuredTypeKey or structuredTypeName when writing structured logs to Elasticsearch. (LOG-3284)

  • Before this update, the FluentdQueueLengthIncreasing alert could fail to fire when there was a cardinality issue with the set of labels returned from this alert expression. This update reduces labels to only include those required for the alert. (LOG-3226)

  • Before this update, Loki did not have support to reach an external storage in a disconnected cluster. With this update, proxy environment variables and proxy trusted CA bundles are included in the container image to support these connections. (LOG-2860)

  • Before this update, OpenShift Container Platform web console users could not choose the ConfigMap object that includes the CA certificate for Loki, causing pods to operate without the CA. With this update, web console users can select the config map, resolving the issue. (LOG-3310)

  • Before this update, the CA key was used as volume name for mounting the CA into Loki, causing error states when the CA Key included non-conforming characters (such as dots). With this update, the volume name is standardized to an internal string which resolves the issue. (LOG-3332)

Logging 5.5.4

Bug fixes

  • Before this update, an error in the query parser of the logging view plugin caused parts of the logs query to disappear if the query contained curly brackets {}. This made the queries invalid, leading to errors being returned for valid queries. With this update, the parser correctly handles these queries. (LOG-3042)

  • Before this update, the Operator could enter a loop of removing and recreating the collector daemonset while the Elasticsearch or Kibana deployments changed their status. With this update, a fix in the status handling of the Operator resolves the issue. (LOG-3049)

  • Before this update, no alerts were implemented to support the collector implementation of Vector. This change adds Vector alerts and deploys separate alerts, depending upon the chosen collector implementation. (LOG-3127)

  • Before this update, the secret creation component of the Elasticsearch Operator modified internal secrets constantly. With this update, the existing secret is properly handled. (LOG-3138)

  • Before this update, a prior refactoring of the logging must-gather scripts removed the expected location for the artifacts. This update reverts that change to write artifacts to the /must-gather folder. (LOG-3213)

  • Before this update, on certain clusters, the Prometheus exporter would bind on IPv4 instead of IPv6. After this update, Fluentd detects the IP version and binds to 0.0.0.0 for IPv4 or [::] for IPv6. (LOG-3162)

Logging 5.5.3

Bug fixes

  • Before this update, log entries that had structured messages included the original message field, which made the entry larger. This update removes the message field for structured logs to reduce the increased size. (LOG-2759)

  • Before this update, the collector configuration excluded logs from collector, default-log-store, and visualization pods, but was unable to exclude logs archived in a .gz file. With this update, archived logs stored as .gz files of collector, default-log-store, and visualization pods are also excluded. (LOG-2844)

  • Before this update, when requests to an unavailable pod were sent through the gateway, no alert would warn of the disruption. With this update, individual alerts will generate if the gateway has issues completing a write or read request. (LOG-2884)

  • Before this update, pod metadata could be altered by fluent plugins because the values passed through the pipeline by reference. This update ensures each log message receives a copy of the pod metadata so each message processes independently. (LOG-3046)

  • Before this update, selecting unknown severity in the OpenShift Console Logs view excluded logs with a level=unknown value. With this update, logs without level and with level=unknown values are visible when filtering by unknown severity. (LOG-3062)

  • Before this update, log records sent to Elasticsearch had an extra field named write-index that contained the name of the index to which the logs needed to be sent. This field is not a part of the data model. After this update, this field is no longer sent. (LOG-3075)

  • With the introduction of the new built-in Pod Security Admission Controller, Pods not configured in accordance with the enforced security standards defined globally or on the namespace level cannot run. With this update, the Operator and collectors allow privileged execution and run without security audit warnings or errors. (LOG-3077)

  • Before this update, the Operator removed any custom outputs defined in the ClusterLogForwarder custom resource when using LokiStack as the default log storage. With this update, the Operator merges custom outputs with the default outputs when processing the ClusterLogForwarder custom resource. (LOG-3095)

Logging 5.5.2

Bug fixes

  • Before this update, alerting rules for the Fluentd collector did not adhere to the OpenShift Container Platform monitoring style guidelines. This update modifies those alerts to include the namespace label, resolving the issue. (LOG-1823)

  • Before this update, the index management rollover script failed to generate a new index name whenever there was more than one hyphen character in the name of the index. With this update, index names generate correctly. (LOG-2644)

  • Before this update, the Kibana route was setting a caCertificate value without a certificate present. With this update, no caCertificate value is set. (LOG-2661)

  • Before this update, a change in the collector dependencies caused it to issue a warning message for unused parameters. With this update, removing unused configuration parameters resolves the issue. (LOG-2859)

  • Before this update, pods created for deployments that Loki Operator created were mistakenly scheduled on nodes with non-Linux operating systems, if such nodes were available in the cluster the Operator was running in. With this update, the Operator attaches an additional node-selector to the pod definitions which only allows scheduling the pods on Linux-based nodes. (LOG-2895)

  • Before this update, the OpenShift Console Logs view did not filter logs by severity due to a LogQL parser issue in the LokiStack gateway. With this update, a parser fix resolves the issue and the OpenShift Console Logs view can filter by severity. (LOG-2908)

  • Before this update, a refactoring of the Fluentd collector plugins removed the timestamp field for events. This update restores the timestamp field, sourced from the event’s received time. (LOG-2923)

  • Before this update, absence of a level field in audit logs caused an error in vector logs. With this update, the addition of a level field in the audit log record resolves the issue. (LOG-2961)

  • Before this update, if you deleted the Kibana Custom Resource, the OpenShift Container Platform web console continued displaying a link to Kibana. With this update, removing the Kibana Custom Resource also removes that link. (LOG-3053)

  • Before this update, each rollover job created empty indices when the ClusterLogForwarder custom resource had JSON parsing defined. With this update, new indices are not empty. (LOG-3063)

  • Before this update, when the user deleted the LokiStack after an update to Loki Operator 5.5 resources originally created by Loki Operator 5.4 remained. With this update, the resources' owner-references point to the 5.5 LokiStack. (LOG-2945)

  • Before this update, a user was not able to view the application logs of namespaces they have access to. With this update, the Loki Operator automatically creates a cluster role and cluster role binding allowing users to read application logs. (LOG-2918)

  • Before this update, users with cluster-admin privileges were not able to properly view infrastructure and audit logs using the logging console. With this update, the authorization check has been extended to also recognize users in cluster-admin and dedicated-admin groups as admins. (LOG-2970)

Logging 5.5.1

Enhancements

  • This enhancement adds an Aggregated Logs tab to the Pod Details page of the OpenShift Container Platform web console when the Logging Console Plugin is in use. This enhancement is only available on OpenShift Container Platform 4.10 and later. (LOG-2647)

  • This enhancement adds Google Cloud Logging as an output option for log forwarding. (LOG-1482)

Bug fixes

  • Before this update, the Operator did not ensure that the pod was ready, which caused the cluster to reach an inoperable state during a cluster restart. With this update, the Operator marks new pods as ready before continuing to a new pod during a restart, which resolves the issue. (LOG-2745)

  • Before this update, Fluentd would sometimes not recognize that the Kubernetes platform rotated the log file and would no longer read log messages. This update corrects that by setting the configuration parameter suggested by the upstream development team. (LOG-2995)

  • Before this update, the addition of multi-line error detection caused internal routing to change and forward records to the wrong destination. With this update, the internal routing is correct. (LOG-2801)

  • Before this update, changing the OpenShift Container Platform web console’s refresh interval created an error when the Query field was empty. With this update, changing the interval is not an available option when the Query field is empty. (LOG-2917)

Logging 5.5

The following advisories are available for Logging 5.5:Release 5.5

Enhancements

  • With this update, you can forward structured logs from different containers within the same pod to different indices. To use this feature, you must configure the pipeline with multi-container support and annotate the pods. (LOG-1296)

JSON formatting of logs varies by application. Because creating too many indices impacts performance, limit your use of this feature to creating indices for logs that have incompatible JSON formats. Use queries to separate logs from different namespaces, or applications with compatible JSON formats.

  • With this update, you can filter logs with Elasticsearch outputs by using the Kubernetes common labels, app.kubernetes.io/component, app.kubernetes.io/managed-by, app.kubernetes.io/part-of, and app.kubernetes.io/version. Non-Elasticsearch output types can use all labels included in kubernetes.labels. (LOG-2388)

  • With this update, clusters with AWS Security Token Service (STS) enabled may use STS authentication to forward logs to Amazon CloudWatch. (LOG-1976)

  • With this update, the 'LokiOperator' Operator and Vector collector move from Technical Preview to General Availability. Full feature parity with prior releases are pending, and some APIs remain Technical Previews. See the Logging with the LokiStack section for details.

Bug fixes

  • Before this update, clusters configured to forward logs to Amazon CloudWatch wrote rejected log files to temporary storage, causing cluster instability over time. With this update, chunk backup for all storage options has been disabled, resolving the issue. (LOG-2746)

  • Before this update, the Operator was using versions of some APIs that are deprecated and planned for removal in future versions of OpenShift Container Platform. This update moves dependencies to the supported API versions. (LOG-2656)

Before this update, the Operator was using versions of some APIs that are deprecated and planned for removal in future versions of OpenShift Container Platform. This update moves dependencies to the supported API versions. (LOG-2656)

  • Before this update, multiple ClusterLogForwarder pipelines configured for multiline error detection caused the collector to go into a crashloopbackoff error state. This update fixes the issue where multiple configuration sections had the same unique ID. (LOG-2241)

  • Before this update, the collector could not save non UTF-8 symbols to the Elasticsearch storage logs. With this update the collector encodes non UTF-8 symbols, resolving the issue. (LOG-2203)

  • Before this update, non-latin characters displayed incorrectly in Kibana. With this update, Kibana displays all valid UTF-8 symbols correctly. (LOG-2784)

Logging 5.4.13

Bug fixes

  • Before this update, a problem with the Fluentd collector caused it to not capture OAuth login events stored in /var/log/auth-server/audit.log. This led to incomplete collection of login events from the OAuth service. With this update, the Fluentd collector now resolves this issue by capturing all login events from the OAuth service, including those stored in /var/log/auth-server/audit.log, as expected. (LOG-3731)

Logging 5.4.9

Bug fixes

  • Before this update, the Fluentd collector would warn of unused configuration parameters. This update removes those configuration parameters and their warning messages. (LOG-3074)

  • Before this update, Kibana had a fixed 24h OAuth cookie expiration time, which resulted in 401 errors in Kibana whenever the accessTokenInactivityTimeout field was set to a value lower than 24h. With this update, Kibana’s OAuth cookie expiration time synchronizes to the accessTokenInactivityTimeout, with a default value of 24h. (LOG-3306)

Logging 5.4.6

Bug fixes

  • Before this update, Fluentd would sometimes not recognize that the Kubernetes platform rotated the log file and would no longer read log messages. This update corrects that by setting the configuration parameter suggested by the upstream development team. (LOG-2792)

  • Before this update, each rollover job created empty indices when the ClusterLogForwarder custom resource had JSON parsing defined. With this update, new indices are not empty. (LOG-2823)

  • Before this update, if you deleted the Kibana Custom Resource, the OpenShift Container Platform web console continued displaying a link to Kibana. With this update, removing the Kibana Custom Resource also removes that link. (LOG-3054)

Logging 5.4.5

Bug fixes

  • Before this update, the Operator did not ensure that the pod was ready, which caused the cluster to reach an inoperable state during a cluster restart. With this update, the Operator marks new pods as ready before continuing to a new pod during a restart, which resolves the issue. (LOG-2881)

  • Before this update, the addition of multi-line error detection caused internal routing to change and forward records to the wrong destination. With this update, the internal routing is correct. (LOG-2946)

  • Before this update, the Operator could not decode index setting JSON responses with a quoted Boolean value and would result in an error. With this update, the Operator can properly decode this JSON response. (LOG-3009)

  • Before this update, Elasticsearch index templates defined the fields for labels with the wrong types. This change updates those templates to match the expected types forwarded by the log collector. (LOG-2972)

Logging 5.4.4

Bug fixes

  • Before this update, non-latin characters displayed incorrectly in Elasticsearch. With this update, Elasticsearch displays all valid UTF-8 symbols correctly. (LOG-2794)

  • Before this update, non-latin characters displayed incorrectly in Fluentd. With this update, Fluentd displays all valid UTF-8 symbols correctly. (LOG-2657)

  • Before this update, the metrics server for the collector attempted to bind to the address using a value exposed by an environment value. This change modifies the configuration to bind to any available interface. (LOG-2821)

  • Before this update, the cluster-logging Operator relied on the cluster to create a secret. This cluster behavior changed in OpenShift Container Platform 4.11, which caused logging deployments to fail. With this update, the cluster-logging Operator resolves the issue by creating the secret if needed. (LOG-2840)

Logging 5.4.3

Elasticsearch Operator deprecation notice

In logging subsystem 5.4.3 the Elasticsearch Operator is deprecated and is planned to be removed in a future release. Red Hat will provide bug fixes and support for this feature during the current release lifecycle, but this feature will no longer receive enhancements and will be removed. As an alternative to using the Elasticsearch Operator to manage the default log storage, you can use the Loki Operator.

Bug fixes

  • Before this update, the OpenShift Logging Dashboard showed the number of active primary shards instead of all active shards. With this update, the dashboard displays all active shards. (LOG-2781)

  • Before this update, a bug in a library used by elasticsearch-operator contained a denial of service attack vulnerability. With this update, the library has been updated to a version that does not contain this vulnerability. (LOG-2816)

  • Before this update, when configuring Vector to forward logs to Loki, it was not possible to set a custom bearer token or use the default token if Loki had TLS enabled. With this update, Vector can forward logs to Loki using tokens with TLS enabled. (LOG-2786

  • Before this update, the ElasticSearch Operator omitted the referencePolicy property of the ImageStream custom resource when selecting an oauth-proxy image. This omission caused the Kibana deployment to fail in specific environments. With this update, using referencePolicy resolves the issue, and the Operator can deploy Kibana successfully. (LOG-2791)

  • Before this update, alerting rules for the ClusterLogForwarder custom resource did not take multiple forward outputs into account. This update resolves the issue. (LOG-2640)

  • Before this update, clusters configured to forward logs to Amazon CloudWatch wrote rejected log files to temporary storage, causing cluster instability over time. With this update, chunk backup for CloudWatch has been disabled, resolving the issue. (LOG-2768)

Logging 5.4.2

Bug fixes

  • Before this update, editing the Collector configuration using oc edit was difficult because it had inconsistent use of white-space. This change introduces logic to normalize and format the configuration prior to any updates by the Operator so that it is easy to edit using oc edit. (LOG-2319)

  • Before this update, the FluentdNodeDown alert could not provide instance labels in the message section appropriately. This update resolves the issue by fixing the alert rule to provide instance labels in cases of partial instance failures. (LOG-2607)

  • Before this update, several log levels, such as`critical`, that were documented as supported by the product were not. This update fixes the discrepancy so the documented log levels are now supported by the product. (LOG-2033)

Logging 5.4.1

Bug fixes

  • Before this update, the log file metric exporter only reported logs created while the exporter was running, which resulted in inaccurate log growth data. This update resolves this issue by monitoring /var/log/pods. (LOG-2442)

  • Before this update, the collector would be blocked because it continually tried to use a stale connection when forwarding logs to fluentd forward receivers. With this release, the keepalive_timeout value has been set to 30 seconds (30s) so that the collector recycles the connection and re-attempts to send failed messages within a reasonable amount of time. (LOG-2534)

  • Before this update, an error in the gateway component enforcing tenancy for reading logs limited access to logs with a Kubernetes namespace causing "audit" and some "infrastructure" logs to be unreadable. With this update, the proxy correctly detects users with admin access and allows access to logs without a namespace. (LOG-2448)

  • Before this update, the system:serviceaccount:openshift-monitoring:prometheus-k8s service account had cluster level privileges as a clusterrole and clusterrolebinding. This update restricts the service account` to the openshift-logging namespace with a role and rolebinding. (LOG-2437)

  • Before this update, Linux audit log time parsing relied on an ordinal position of a key/value pair. This update changes the parsing to use a regular expression to find the time entry. (LOG-2321)

Logging 5.4

The following advisories are available for logging 5.4: Logging subsystem for Red Hat OpenShift Release 5.4

Technology Previews

Vector is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.

For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.

About Vector

Vector is a log collector offered as a tech-preview alternative to the current default collector for the logging subsystem.

The following outputs are supported:

  • elasticsearch. An external Elasticsearch instance. The elasticsearch output can use a TLS connection.

  • kafka. A Kafka broker. The kafka output can use an unsecured or TLS connection.

  • loki. Loki, a horizontally scalable, highly available, multi-tenant log aggregation system.

Enabling Vector

Vector is not enabled by default. Use the following steps to enable Vector on your OpenShift Container Platform cluster.

Vector does not support FIPS Enabled Clusters.

Prerequisites
  • OpenShift Container Platform: 4.10

  • Logging subsystem for Red Hat OpenShift: 5.4

  • FIPS disabled

Procedure
  1. Edit the ClusterLogging custom resource (CR) in the openshift-logging project:

    $ oc -n openshift-logging edit ClusterLogging instance
  2. Add a logging.openshift.io/preview-vector-collector: enabled annotation to the ClusterLogging custom resource (CR).

  3. Add vector as a collection type to the ClusterLogging custom resource (CR).

  apiVersion: "logging.openshift.io/v1"
  kind: "ClusterLogging"
  metadata:
    name: "instance"
    namespace: "openshift-logging"
    annotations:
      logging.openshift.io/preview-vector-collector: enabled
  spec:
    collection:
    logs:
      type: "vector"
      vector: {}
Additional resources

Loki Operator is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.

For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.

About Loki

Loki is a horizontally scalable, highly available, multi-tenant log aggregation system currently offered as an alternative to Elasticsearch as a log store for the logging subsystem.

Additional resources

Deploying the Lokistack

You can use the OpenShift Container Platform web console to install the LokiOperator.

Prerequisites
  • OpenShift Container Platform: 4.10

  • Logging subsystem for Red Hat OpenShift: 5.4

To install the LokiOperator using the OpenShift Container Platform web console:

  1. Install the LokiOperator:

    1. In the OpenShift Container Platform web console, click OperatorsOperatorHub.

    2. Choose LokiOperator from the list of available Operators, and click Install.

    3. Under Installation Mode, select All namespaces on the cluster.

    4. Under Installed Namespace, select openshift-operators-redhat.

      You must specify the openshift-operators-redhat namespace. The openshift-operators namespace might contain Community Operators, which are untrusted and could publish a metric with the same name as an OpenShift Container Platform metric, which would cause conflicts.

    5. Select Enable operator recommended cluster monitoring on this namespace.

      This option sets the openshift.io/cluster-monitoring: "true" label in the Namespace object. You must select this option to ensure that cluster monitoring scrapes the openshift-operators-redhat namespace.

    6. Select an Approval Strategy.

      • The Automatic strategy allows Operator Lifecycle Manager (OLM) to automatically update the Operator when a new version is available.

      • The Manual strategy requires a user with appropriate credentials to approve the Operator update.

    7. Click Install.

    8. Verify that you installed the LokiOperator. Visit the OperatorsInstalled Operators page and look for "LokiOperator."

    9. Ensure that LokiOperator is listed in all the projects whose Status is Succeeded.

Bug fixes

  • Before this update, the cluster-logging-operator used cluster scoped roles and bindings to establish permissions for the Prometheus service account to scrape metrics. These permissions were created when deploying the Operator using the console interface but were missing when deploying from the command line. This update fixes the issue by making the roles and bindings namespace-scoped. (LOG-2286)

  • Before this update, a prior change to fix dashboard reconciliation introduced a ownerReferences field to the resource across namespaces. As a result, both the config map and dashboard were not created in the namespace. With this update, the removal of the ownerReferences field resolves the issue, and the OpenShift Logging dashboard is available in the console. (LOG-2163)

  • Before this update, changes to the metrics dashboards did not deploy because the cluster-logging-operator did not correctly compare existing and modified config maps that contain the dashboard. With this update, the addition of a unique hash value to object labels resolves the issue. (LOG-2071)

  • Before this update, the OpenShift Logging dashboard did not correctly display the pods and namespaces in the table, which displays the top producing containers collected over the last 24 hours. With this update, the pods and namespaces are displayed correctly. (LOG-2069)

  • Before this update, when the ClusterLogForwarder was set up with Elasticsearch OutputDefault and Elasticsearch outputs did not have structured keys, the generated configuration contained the incorrect values for authentication. This update corrects the secret and certificates used. (LOG-2056)

  • Before this update, the OpenShift Logging dashboard displayed an empty CPU graph because of a reference to an invalid metric. With this update, the correct data point has been selected, resolving the issue. (LOG-2026)

  • Before this update, the Fluentd container image included builder tools that were unnecessary at run time. This update removes those tools from the image.(LOG-1927)

  • Before this update, a name change of the deployed collector in the 5.3 release caused the logging collector to generate the FluentdNodeDown alert. This update resolves the issue by fixing the job name for the Prometheus alert. (LOG-1918)

  • Before this update, the log collector was collecting its own logs due to a refactoring of the component name change. This lead to a potential feedback loop of the collector processing its own log that might result in memory and log message size issues. This update resolves the issue by excluding the collector logs from the collection. (LOG-1774)

  • Before this update, Elasticsearch generated the error Unable to create PersistentVolumeClaim due to forbidden: exceeded quota: infra-storage-quota. if the PVC already existed. With this update, Elasticsearch checks for existing PVCs, resolving the issue. (LOG-2131)

  • Before this update, Elasticsearch was unable to return to the ready state when the elasticsearch-signing secret was removed. With this update, Elasticsearch is able to go back to the ready state after that secret is removed. (LOG-2171)

  • Before this update, the change of the path from which the collector reads container logs caused the collector to forward some records to the wrong indices. With this update, the collector now uses the correct configuration to resolve the issue. (LOG-2160)

  • Before this update, clusters with a large number of namespaces caused Elasticsearch to stop serving requests because the list of namespaces reached the maximum header size limit. With this update, headers only include a list of namespace names, resolving the issue. (LOG-1899)

  • Before this update, the OpenShift Container Platform Logging dashboard showed the number of shards 'x' times larger than the actual value when Elasticsearch had 'x' nodes. This issue occurred because it was printing all primary shards for each Elasticsearch pod and calculating a sum on it, although the output was always for the whole Elasticsearch cluster. With this update, the number of shards is now correctly calculated. (LOG-2156)

  • Before this update, the secrets kibana and kibana-proxy were not recreated if they were deleted manually. With this update, the elasticsearch-operator will watch the resources and automatically recreate them if deleted. (LOG-2250)

  • Before this update, tuning the buffer chunk size could cause the collector to generate a warning about the chunk size exceeding the byte limit for the event stream. With this update, you can also tune the read line limit, resolving the issue. (LOG-2379)

  • Before this update, the logging console link in OpenShift web console was not removed with the ClusterLogging CR. With this update, deleting the CR or uninstalling the Cluster Logging Operator removes the link. (LOG-2373)

  • Before this update, a change to the container logs path caused the collection metric to always be zero with older releases configured with the original path. With this update, the plugin which exposes metrics about collected logs supports reading from either path to resolve the issue. (LOG-2462)

Logging 5.3.11

Bug fixes

  • Before this update, the Operator did not ensure that the pod was ready, which caused the cluster to reach an inoperable state during a cluster restart. With this update, the Operator marks new pods as ready before continuing to a new pod during a restart, which resolves the issue. (LOG-2871)

Logging 5.3.9

Bug fixes

  • Before this update, the logging collector included a path as a label for the metrics it produced. This path changed frequently and contributed to significant storage changes for the Prometheus server. With this update, the label has been dropped to resolve the issue and reduce storage consumption. (LOG-2682)

OpenShift Logging 5.3.7

Bug fixes

  • Before this update, Linux audit log time parsing relied on an ordinal position of key/value pair. This update changes the parsing to utilize a regex to find the time entry. (LOG-2322)

  • Before this update, some log forwarder outputs could re-order logs with the same time-stamp. With this update, a sequence number has been added to the log record to order entries that have matching timestamps. (LOG-2334)

  • Before this update, clusters with a large number of namespaces caused Elasticsearch to stop serving requests because the list of namespaces reached the maximum header size limit. With this update, headers only include a list of namespace names, resolving the issue. (LOG-2450)

  • Before this update, system:serviceaccount:openshift-monitoring:prometheus-k8s had cluster level privileges as a clusterrole and clusterrolebinding. This update restricts the serviceaccount to the openshift-logging namespace with a role and rolebinding. (LOG-2481))

OpenShift Logging 5.3.6

Bug fixes

  • Before this update, defining a toleration with no key and the existing Operator caused the Operator to be unable to complete an upgrade. With this update, this toleration no longer blocks the upgrade from completing. (LOG-2126)

  • Before this change, it was possible for the collector to generate a warning where the chunk byte limit was exceeding an emitted event. With this change, you can tune the readline limit to resolve the issue as advised by the upstream documentation. (LOG-2380)

OpenShift Logging 5.3.5

Bug fixes

  • Before this update, if you removed OpenShift Logging from OpenShift Container Platform, the web console continued displaying a link to the Logging page. With this update, removing or uninstalling OpenShift Logging also removes that link. (LOG-2182)

OpenShift Logging 5.3.4

Bug fixes

  • Before this update, changes to the metrics dashboards had not yet been deployed because the cluster-logging-operator did not correctly compare existing and desired config maps that contained the dashboard. This update fixes the logic by adding a unique hash value to the object labels. (LOG-2066)

  • Before this update, Elasticsearch pods failed to start after updating with FIPS enabled. With this update, Elasticsearch pods start successfully. (LOG-1974)

  • Before this update, elasticsearch generated the error "Unable to create PersistentVolumeClaim due to forbidden: exceeded quota: infra-storage-quota." if the PVC already existed. With this update, elasticsearch checks for existing PVCs, resolving the issue. (LOG-2127)

OpenShift Logging 5.3.3

Bug fixes

  • Before this update, changes to the metrics dashboards had not yet been deployed because the cluster-logging-operator did not correctly compare existing and desired configmaps containing the dashboard. This update fixes the logic by adding a dashboard unique hash value to the object labels.(LOG-2066)

  • This update changes the log4j dependency to 2.17.1 to resolve CVE-2021-44832.(LOG-2102)

OpenShift Logging 5.3.2

Bug fixes

  • Before this update, Elasticsearch rejected logs from the Event Router due to a parsing error. This update changes the data model to resolve the parsing error. However, as a result, previous indices might cause warnings or errors within Kibana. The kubernetes.event.metadata.resourceVersion field causes errors until existing indices are removed or reindexed. If this field is not used in Kibana, you can ignore the error messages. If you have a retention policy that deletes old indices, the policy eventually removes the old indices and stops the error messages. Otherwise, manually reindex to stop the error messages. (LOG-2087)

  • Before this update, the OpenShift Logging Dashboard displayed the wrong pod namespace in the table that displays top producing and collected containers over the last 24 hours. With this update, the OpenShift Logging Dashboard displays the correct pod namespace. (LOG-2051)

  • Before this update, if outputDefaults.elasticsearch.structuredTypeKey in the ClusterLogForwarder custom resource (CR) instance did not have a structured key, the CR replaced the output secret with the default secret used to communicate to the default log store. With this update, the defined output secret is correctly used. (LOG-2046)

OpenShift Logging 5.3.1

Bug fixes

  • Before this update, the Fluentd container image included builder tools that were unnecessary at run time. This update removes those tools from the image. (LOG-1998)

  • Before this update, the Logging dashboard displayed an empty CPU graph because of a reference to an invalid metric. With this update, the Logging dashboard displays CPU graphs correctly. (LOG-1925)

  • Before this update, the Elasticsearch Prometheus exporter plugin compiled index-level metrics using a high-cost query that impacted the Elasticsearch node performance. This update implements a lower-cost query that improves performance. (LOG-1897)

OpenShift Logging 5.3.0

New features and enhancements

  • With this update, authorization options for Log Forwarding have been expanded. Outputs may now be configured with SASL, username/password, or TLS.

Bug fixes

  • Before this update, if you forwarded logs using the syslog protocol, serializing a ruby hash encoded key/value pairs to contain a '⇒' character and replaced tabs with "#11". This update fixes the issue so that log messages are correctly serialized as valid JSON. (LOG-1494)

  • Before this update, application logs were not correctly configured to forward to the proper Cloudwatch stream with multi-line error detection enabled. (LOG-1939)

  • Before this update, a name change of the deployed collector in the 5.3 release caused the alert 'fluentnodedown' to generate. (LOG-1918)

  • Before this update, a regression introduced in a prior release configuration caused the collector to flush its buffered messages before shutdown, creating a delay the termination and restart of collector Pods. With this update, fluentd no longer flushes buffers at shutdown, resolving the issue. (LOG-1735)

  • Before this update, a regression introduced in a prior release intentionally disabled JSON message parsing. This update re-enables JSON parsing. It also sets the log entry "level" based on the "level" field in parsed JSON message or by using regex to extract a match from a message field. (LOG-1199)

  • Before this update, the ClusterLogging custom resource (CR) applied the value of the totalLimitSize field to the Fluentd total_limit_size field, even if the required buffer space was not available. With this update, the CR applies the lesser of the two totalLimitSize or 'default' values to the Fluentd total_limit_size field, resolving the issue. (LOG-1776)

Known issues

  • If you forward logs to an external Elasticsearch server and then change a configured value in the pipeline secret, such as the username and password, the Fluentd forwarder loads the new secret but uses the old value to connect to an external Elasticsearch server. This issue happens because the Red Hat OpenShift Logging Operator does not currently monitor secrets for content changes. (LOG-1652)

    As a workaround, if you change the secret, you can force the Fluentd pods to redeploy by entering:

    $ oc delete pod -l component=collector

Deprecated and removed features

Some features available in previous releases have been deprecated or removed.

Deprecated functionality is still included in OpenShift Logging and continues to be supported; however, it will be removed in a future release of this product and is not recommended for new deployments.

Forwarding logs using the legacy Fluentd and legacy syslog methods have been removed

In OpenShift Logging 5.3, the legacy methods of forwarding logs to Syslog and Fluentd are removed. Bug fixes and support are provided through the end of the OpenShift Logging 5.2 life cycle. After which, no new feature enhancements are made.

Instead, use the following non-legacy methods:

Configuration mechanisms for legacy forwarding methods have been removed

In OpenShift Logging 5.3, the legacy configuration mechanism for log forwarding is removed: You cannot forward logs using the legacy Fluentd method and legacy Syslog method. Use the standard log forwarding methods instead.

CVEs

Click to expand CVEs

OpenShift Logging 5.2.10

Bug fixes

  • Before this update some log forwarder outputs could re-order logs with the same time-stamp. With this update, a sequence number has been added to the log record to order entries that have matching timestamps.(LOG-2335)

  • Before this update, clusters with a large number of namespaces caused Elasticsearch to stop serving requests because the list of namespaces reached the maximum header size limit. With this update, headers only include a list of namespace names, resolving the issue. (LOG-2475)

  • Before this update, system:serviceaccount:openshift-monitoring:prometheus-k8s had cluster level privileges as a clusterrole and clusterrolebinding. This update restricts the serviceaccount to the openshift-logging namespace with a role and rolebinding. (LOG-2480)

  • Before this update, the cluster-logging-operator utilized cluster scoped roles and bindings to establish permissions for the Prometheus service account to scrape metrics. These permissions were only created when deploying the Operator using the console interface and were missing when the Operator was deployed from the command line. This fixes the issue by making this role and binding namespace scoped. (LOG-1972)

OpenShift Logging 5.2.9

Bug fixes

  • Before this update, defining a toleration with no key and the existing Operator caused the Operator to be unable to complete an upgrade. With this update, this toleration no longer blocks the upgrade from completing. (LOG-2304)

OpenShift Logging 5.2.8

Bug fixes

  • Before this update, if you removed OpenShift Logging from OpenShift Container Platform, the web console continued displaying a link to the Logging page. With this update, removing or uninstalling OpenShift Logging also removes that link. (LOG-2180)

OpenShift Logging 5.2.7

Bug fixes

  • Before this update, Elasticsearch pods with FIPS enabled failed to start after updating. With this update, Elasticsearch pods start successfully. (LOG-2000)

  • Before this update, if a persistent volume claim (PVC) already existed, Elasticsearch generated an error, "Unable to create PersistentVolumeClaim due to forbidden: exceeded quota: infra-storage-quota." With this update, Elasticsearch checks for existing PVCs, resolving the issue. (LOG-2118)

OpenShift Logging 5.2.6

Bug fixes

  • Before this update, the release did not include a filter change which caused Fluentd to crash. With this update, the missing filter has been corrected. (LOG-2104)

  • This update changes the log4j dependency to 2.17.1 to resolve CVE-2021-44832.(LOG-2101)

OpenShift Logging 5.2.5

Bug fixes

  • Before this update, Elasticsearch rejected logs from the Event Router due to a parsing error. This update changes the data model to resolve the parsing error. However, as a result, previous indices might cause warnings or errors within Kibana. The kubernetes.event.metadata.resourceVersion field causes errors until existing indices are removed or reindexed. If this field is not used in Kibana, you can ignore the error messages. If you have a retention policy that deletes old indices, the policy eventually removes the old indices and stops the error messages. Otherwise, manually reindex to stop the error messages. LOG-2087)

OpenShift Logging 5.2.4

Bug fixes

  • Before this update, records shipped via syslog would serialize a ruby hash encoding key/value pairs to contain a '⇒' character, as well as replace tabs with "#11". This update serializes the message correctly as proper JSON. (LOG-1775)

  • Before this update, the Elasticsearch Prometheus exporter plugin compiled index-level metrics using a high-cost query that impacted the Elasticsearch node performance. This update implements a lower-cost query that improves performance. (LOG-1970)

  • Before this update, Elasticsearch sometimes rejected messages when Log Forwarding was configured with multiple outputs. This happened because configuring one of the outputs modified message content to be a single message. With this update, Log Forwarding duplicates the messages for each output so that output-specific processing does not affect the other outputs. (LOG-1824)

OpenShift Logging 5.2.3

Bug fixes

  • Before this update, some alerts did not include a namespace label. This omission does not comply with the OpenShift Monitoring Team’s guidelines for writing alerting rules in OpenShift Container Platform. With this update, all the alerts in Elasticsearch Operator include a namespace label and follow all the guidelines for writing alerting rules in OpenShift Container Platform. (LOG-1857)

  • Before this update, a regression introduced in a prior release intentionally disabled JSON message parsing. This update re-enables JSON parsing. It also sets the log entry level based on the level field in parsed JSON message or by using regex to extract a match from a message field. (LOG-1759)

OpenShift Logging 5.2.2

Bug fixes

  • Before this update, the ClusterLogging custom resource (CR) applied the value of the totalLimitSize field to the Fluentd total_limit_size field, even if the required buffer space was not available. With this update, the CR applies the lesser of the two totalLimitSize or 'default' values to the Fluentd total_limit_size field, resolving the issue.(LOG-1738)

  • Before this update, a regression introduced in a prior release configuration caused the collector to flush its buffered messages before shutdown, creating a delay to the termination and restart of collector pods. With this update, Fluentd no longer flushes buffers at shutdown, resolving the issue. (LOG-1739)

  • Before this update, an issue in the bundle manifests prevented installation of the Elasticsearch Operator through OLM on OpenShift Container Platform 4.9. With this update, a correction to bundle manifests re-enables installation and upgrade in 4.9.(LOG-1780)

OpenShift Logging 5.2.1

Bug fixes

  • Before this update, due to an issue in the release pipeline scripts, the value of the olm.skipRange field remained unchanged at 5.2.0 instead of reflecting the current release number. This update fixes the pipeline scripts to update the value of this field when the release numbers change. (LOG-1743)

CVEs

(None)

OpenShift Logging 5.2.0

New features and enhancements

  • With this update, you can forward log data to Amazon CloudWatch, which provides application and infrastructure monitoring. For more information, see Forwarding logs to Amazon CloudWatch. (LOG-1173)

  • With this update, you can forward log data to Loki, a horizontally scalable, highly available, multi-tenant log aggregation system. For more information, see Forwarding logs to Loki. (LOG-684)

  • With this update, if you use the Fluentd forward protocol to forward log data over a TLS-encrypted connection, now you can use a password-encrypted private key file and specify the passphrase in the Cluster Log Forwarder configuration. For more information, see Forwarding logs using the Fluentd forward protocol. (LOG-1525)

  • This enhancement enables you to use a username and password to authenticate a log forwarding connection to an external Elasticsearch instance. For example, if you cannot use mutual TLS (mTLS) because a third-party operates the Elasticsearch instance, you can use HTTP or HTTPS and set a secret that contains the username and password. For more information, see Forwarding logs to an external Elasticsearch instance. (LOG-1022)

  • With this update, you can collect OVN network policy audit logs for forwarding to a logging server. (LOG-1526)

  • By default, the data model introduced in OpenShift Container Platform 4.5 gave logs from different namespaces a single index in common. This change made it harder to see which namespaces produced the most logs.

    The current release adds namespace metrics to the Logging dashboard in the OpenShift Container Platform console. With these metrics, you can see which namespaces produce logs and how many logs each namespace produces for a given timestamp.

    To see these metrics, open the Administrator perspective in the OpenShift Container Platform web console, and navigate to ObserveDashboardsLogging/Elasticsearch. (LOG-1680)

  • The current release, OpenShift Logging 5.2, enables two new metrics: For a given timestamp or duration, you can see the total logs produced or logged by individual containers, and the total logs collected by the collector. These metrics are labeled by namespace, pod, and container name so that you can see how many logs each namespace and pod collects and produces. (LOG-1213)

Bug fixes

  • Before this update, when the OpenShift Elasticsearch Operator created index management cronjobs, it added the POLICY_MAPPING environment variable twice, which caused the apiserver to report the duplication. This update fixes the issue so that the POLICY_MAPPING environment variable is set only once per cronjob, and there is no duplication for the apiserver to report. (LOG-1130)

  • Before this update, suspending an Elasticsearch cluster to zero nodes did not suspend the index-management cronjobs, which put these cronjobs into maximum backoff. Then, after unsuspending the Elasticsearch cluster, these cronjobs stayed halted due to maximum backoff reached. This update resolves the issue by suspending the cronjobs and the cluster. (LOG-1268)

  • Before this update, in the Logging dashboard in the OpenShift Container Platform console, the list of top 10 log-producing containers was missing the "chart namespace" label and provided the incorrect metric name, fluentd_input_status_total_bytes_logged. With this update, the chart shows the namespace label and the correct metric name, log_logged_bytes_total. (LOG-1271)

  • Before this update, if an index management cronjob terminated with an error, it did not report the error exit code: instead, its job status was "complete." This update resolves the issue by reporting the error exit codes of index management cronjobs that terminate with errors. (LOG-1273)

  • The priorityclasses.v1beta1.scheduling.k8s.io was removed in 1.22 and replaced by priorityclasses.v1.scheduling.k8s.io (v1beta1 was replaced by v1). Before this update, APIRemovedInNextReleaseInUse alerts were generated for priorityclasses because v1beta1 was still present . This update resolves the issue by replacing v1beta1 with v1. The alert is no longer generated. (LOG-1385)

  • Previously, the OpenShift Elasticsearch Operator and Red Hat OpenShift Logging Operator did not have the annotation that was required for them to appear in the OpenShift Container Platform web console list of Operators that can run in a disconnected environment. This update adds the operators.openshift.io/infrastructure-features: '["Disconnected"]' annotation to these two Operators so that they appear in the list of Operators that run in disconnected environments. (LOG-1420)

  • Before this update, Red Hat OpenShift Logging Operator pods were scheduled on CPU cores that were reserved for customer workloads on performance-optimized single-node clusters. With this update, cluster logging Operator pods are scheduled on the correct CPU cores. (LOG-1440)

  • Before this update, some log entries had unrecognized UTF-8 bytes, which caused Elasticsearch to reject the messages and block the entire buffered payload. With this update, rejected payloads drop the invalid log entries and resubmit the remaining entries to resolve the issue. (LOG-1499)

  • Before this update, the kibana-proxy pod sometimes entered the CrashLoopBackoff state and logged the following message Invalid configuration: cookie_secret must be 16, 24, or 32 bytes to create an AES cipher when pass_access_token == true or cookie_refresh != 0, but is 29 bytes. The exact actual number of bytes could vary. With this update, the generation of the Kibana session secret has been corrected, and the kibana-proxy pod no longer enters a CrashLoopBackoff state due to this error. (LOG-1446)

  • Before this update, the AWS CloudWatch Fluentd plugin logged its AWS API calls to the Fluentd log at all log levels, consuming additional OpenShift Container Platform node resources. With this update, the AWS CloudWatch Fluentd plugin logs AWS API calls only at the "debug" and "trace" log levels. This way, at the default "warn" log level, Fluentd does not consume extra node resources. (LOG-1071)

  • Before this update, the Elasticsearch OpenDistro security plugin caused user index migrations to fail. This update resolves the issue by providing a newer version of the plugin. Now, index migrations proceed without errors. (LOG-1276)

  • Before this update, in the Logging dashboard in the OpenShift Container Platform console, the list of top 10 log-producing containers lacked data points. This update resolves the issue, and the dashboard displays all data points. (LOG-1353)

  • Before this update, if you were tuning the performance of the Fluentd log forwarder by adjusting the chunkLimitSize and totalLimitSize values, the Setting queued_chunks_limit_size for each buffer to message reported values that were too low. The current update fixes this issue so that this message reports the correct values. (LOG-1411)

  • Before this update, the Kibana OpenDistro security plugin caused user index migrations to fail. This update resolves the issue by providing a newer version of the plugin. Now, index migrations proceed without errors. (LOG-1558)

  • Before this update, using a namespace input filter prevented logs in that namespace from appearing in other inputs. With this update, logs are sent to all inputs that can accept them. (LOG-1570)

  • Before this update, a missing license file for the viaq/logerr dependency caused license scanners to abort without success. With this update, the viaq/logerr dependency is licensed under Apache 2.0 and the license scanners run successfully. (LOG-1590)

  • Before this update, an incorrect brew tag for curator5 within the elasticsearch-operator-bundle build pipeline caused the pull of an image pinned to a dummy SHA1. With this update, the build pipeline uses the logging-curator5-rhel8 reference for curator5, enabling index management cronjobs to pull the correct image from registry.redhat.io. (LOG-1624)

  • Before this update, an issue with the ServiceAccount permissions caused errors such as no permissions for [indices:admin/aliases/get]. With this update, a permission fix resolves the issue. (LOG-1657)

  • Before this update, the Custom Resource Definition (CRD) for the Red Hat OpenShift Logging Operator was missing the Loki output type, which caused the admission controller to reject the ClusterLogForwarder custom resource object. With this update, the CRD includes Loki as an output type so that administrators can configure ClusterLogForwarder to send logs to a Loki server. (LOG-1683)

  • Before this update, OpenShift Elasticsearch Operator reconciliation of the ServiceAccounts overwrote third-party-owned fields that contained secrets. This issue caused memory and CPU spikes due to frequent recreation of secrets. This update resolves the issue. Now, the OpenShift Elasticsearch Operator does not overwrite third-party-owned fields. (LOG-1714)

  • Before this update, in the ClusterLogging custom resource (CR) definition, if you specified a flush_interval value but did not set flush_mode to interval, the Red Hat OpenShift Logging Operator generated a Fluentd configuration. However, the Fluentd collector generated an error at runtime. With this update, the Red Hat OpenShift Logging Operator validates the ClusterLogging CR definition and only generates the Fluentd configuration if both fields are specified. (LOG-1723)

Known issues

  • If you forward logs to an external Elasticsearch server and then change a configured value in the pipeline secret, such as the username and password, the Fluentd forwarder loads the new secret but uses the old value to connect to an external Elasticsearch server. This issue happens because the Red Hat OpenShift Logging Operator does not currently monitor secrets for content changes. (LOG-1652)

    As a workaround, if you change the secret, you can force the Fluentd pods to redeploy by entering:

    $ oc delete pod -l component=collector

Deprecated and removed features

Some features available in previous releases have been deprecated or removed.

Deprecated functionality is still included in OpenShift Logging and continues to be supported; however, it will be removed in a future release of this product and is not recommended for new deployments.

Forwarding logs using the legacy Fluentd and legacy syslog methods have been deprecated

From OpenShift Container Platform 4.6 to the present, forwarding logs by using the following legacy methods have been deprecated and will be removed in a future release:

  • Forwarding logs using the legacy Fluentd method

  • Forwarding logs using the legacy syslog method

Instead, use the following non-legacy methods: