This is a cache of https://docs.openshift.com/container-platform/4.10/logging/cluster-logging-release-notes.html. It is a snapshot of the page at 2024-11-29T16:56:54.401+0000.
Release notes | Logging | OpenShift Container Platform 4.10

The logging subsystem for Red Hat OpenShift is provided as an installable component, with a distinct release cycle from the core OpenShift Container Platform. The Red Hat OpenShift Container Platform Life Cycle Policy outlines release compatibility.

The stable channel only provides updates to the most recent release of logging. To continue receiving updates for prior releases, you must change your subscription channel to stable-X where X is the version of logging you have installed.

Logging 5.6.11

Bug fixes

  • Before this update, the LokiStack gateway cached authorized requests very broadly, which caused incorrect authorization results. With this update, the LokiStack gateway caches on a more fine-grained basis, which resolves this issue. (LOG-4435)

Logging 5.6.9

Bug fixes

  • Before this update, when multiple roles were used to authenticate using STS with AWS Cloudwatch forwarding, a recent update caused the credentials to be non-unique. With this update, multiple combinations of STS roles and static credentials can once again be used to authenticate with AWS Cloudwatch. (LOG-4084)

  • Before this update, the Vector collector occasionally panicked with the following error message in its log: thread 'vector-worker' panicked at 'all branches are disabled and there is no else branch', src/kubernetes/reflector.rs:26:9. With this update, the error has been resolved. (LOG-4276)

  • Before this update, Loki filtered label values for active streams but did not remove duplicates, making Grafana’s Label Browser unusable. With this update, Loki filters out duplicate label values for active streams, resolving the issue. (LOG-4390)

Logging 5.6.8

Bug fixes

  • Before this update, the vector collector terminated unexpectedly when input match label values contained a / character within the ClusterLogForwarder. This update resolves the issue by quoting the match label, enabling the collector to start and collect logs. (LOG-4091)

  • Before this update, when viewing logs within the OpenShift Container Platform web console, clicking the more data available option loaded more log entries only the first time it was clicked. With this update, more entries are loaded with each click. (OU-187)

  • Before this update, when viewing logs within the OpenShift Container Platform web console, clicking the streaming option would only display the streaming logs message without showing the actual logs. With this update, both the message and the log stream are displayed correctly. (OU-189)

  • Before this update, the Loki Operator reset errors in a way that made configuration problems difficult to troubleshoot. With this update, errors persist until the configuration error is resolved. (LOG-4158)

  • Before this update, clusters with more than 8,000 namespaces caused Elasticsearch to reject queries because the list of namespaces was larger than the http.max_header_size setting. With this update, the default value for header size has been increased, resolving the issue. (LOG-4278)

Logging 5.6.7

Bug fixes

  • Before this update, the LokiStack gateway returned label values for namespaces without applying the access rights of a user. With this update, the LokiStack gateway applies permissions to label value requests, resolving the issue. (LOG-3728)

  • Before this update, the time field of log messages did not parse as structured.time by default in Fluentd when the messages included a timestamp. With this update, parsed log messages will include a structured.time field if the output destination supports it. (LOG-4090)

  • Before this update, the LokiStack route configuration caused queries running longer than 30 seconds to time out. With this update, the LokiStack global and per-tenant queryTimeout settings affect the route timeout settings, resolving the issue. (LOG-4130)

  • Before this update, LokiStack CRs with values defined for tenant limits but not global limits caused the Loki Operator to crash. With this update, the Operator is able to process LokiStack CRs with only tenant limits defined, resolving the issue. (LOG-4199)

  • Before this update, the OpenShift Container Platform web console generated errors after an upgrade due to cached files of the prior version retained by the web browser. With this update, these files are no longer cached, resolving the issue. (LOG-4099)

  • Before this update, Vector generated certificate errors when forwarding to the default Loki instance. With this update, logs can be forwarded without errors to Loki by using Vector. (LOG-4184)

  • Before this update, the Cluster Logging Operator API required a certificate to be provided by a secret when the tls.insecureSkipVerify option was set to true. With this update, the Cluster Logging Operator API no longer requires a certificate to be provided by a secret in such cases. The following configuration has been added to the Operator’s CR:

    tls.verify_certificate = false
    tls.verify_hostname = false

Logging 5.6.6

Bug fixes

  • Before this update, an error caused messages to be dropped when the ClusterLogForwarder custom resource was configured to write to a Kafka output topic that matched a key in the payload. With this update, the issue is resolved by prefixing Fluentd’s buffer name with an underscore. (LOG-3458)

  • Before this update, Fluentd closed watches prematurely when inodes were reused and there were multiple entries with the same inode. With this update, the issue of premature closure of watches in the Fluentd position file is resolved. (LOG-3629)

  • Before this update, Fluentd failed to detect JavaScript client multi-line exceptions and printed them as multiple lines. With this update, exceptions are output as a single line, resolving the issue. (LOG-3761)

  • Before this update, direct upgrades from the Red Hat OpenShift Logging Operator version 4.6 to version 5.6 were allowed, resulting in functionality issues. With this update, upgrades must be within two versions, resolving the issue. (LOG-3837)

  • Before this update, metrics were not displayed for Splunk or Google Logging outputs. With this update, the issue is resolved by sending metrics for HTTP endpoints. (LOG-3932)

  • Before this update, when the ClusterLogForwarder custom resource was deleted, collector pods remained running. With this update, collector pods do not run when log forwarding is not enabled. (LOG-4030)

  • Before this update, a time range could not be selected in the OpenShift Container Platform web console by clicking and dragging over the logs histogram. With this update, clicking and dragging can be used to successfully select a time range. (LOG-4101)

  • Before this update, Fluentd hash values for watch files were generated using the paths to log files, resulting in a non-unique hash upon log rotation. With this update, hash values for watch files are created with inode numbers, resolving the issue. (LOG-3633)

  • Before this update, clicking on the Show Resources link in the OpenShift Container Platform web console did not produce any effect. With this update, the issue is resolved by fixing the functionality of the Show Resources link to toggle the display of resources for each log entry. (LOG-4118)
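
The watch file fix above (LOG-3633) replaces path-based hash values with inode-based ones. The following Python sketch illustrates why that helps. It is not Fluentd's implementation: the function names, the SHA-256 digest, and the file names are assumptions chosen for illustration.

```python
import hashlib
import os
import tempfile


def path_key(path: str) -> str:
    """Key derived from the file path (hypothetical helper)."""
    return hashlib.sha256(path.encode("utf-8")).hexdigest()[:16]


def inode_key(path: str) -> str:
    """Key derived from the inode number of the file (hypothetical helper)."""
    inode = os.stat(path).st_ino
    return hashlib.sha256(str(inode).encode("utf-8")).hexdigest()[:16]


def demo_rotation() -> dict:
    """Rotate a log file and compare the keys of the old and new files."""
    with tempfile.TemporaryDirectory() as tmp:
        current = os.path.join(tmp, "app.log")
        rotated = os.path.join(tmp, "app.log.1")

        with open(current, "w") as handle:
            handle.write("old entry\n")
        old_keys = (path_key(current), inode_key(current))

        # Rotation: the old file moves aside and a new file takes its path.
        os.rename(current, rotated)
        with open(current, "w") as handle:
            handle.write("new entry\n")
        new_keys = (path_key(current), inode_key(current))

    return {
        "path_keys_collide": old_keys[0] == new_keys[0],
        "inode_keys_collide": old_keys[1] == new_keys[1],
    }
```

Because the rotated file still exists when the new file is created, the two files have different inode numbers, so the inode-based keys stay distinct while the path-based keys are identical.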

Logging 5.6.5

Bug fixes

  • Before this update, the template definitions prevented Elasticsearch from indexing some labels and namespace_labels, causing issues with data ingestion. With this update, the fix replaces dots and slashes in labels to ensure proper ingestion, effectively resolving the issue. (LOG-3419)

  • Before this update, if the Logs page of the OpenShift Web Console failed to connect to the LokiStack, a generic error message was displayed, providing no additional context or troubleshooting suggestions. With this update, the error message has been enhanced to include more specific details and recommendations for troubleshooting. (LOG-3750)

  • Before this update, time range formats were not validated, leading to errors selecting a custom date range. With this update, time formats are now validated, enabling users to select a valid range. If an invalid time range format is selected, an error message is displayed to the user. (LOG-3583)

  • Before this update, when searching logs in Loki, even if the length of an expression did not exceed 5120 characters, the query would fail in many cases. With this update, query authorization label matchers have been optimized, resolving the issue. (LOG-3480)

  • Before this update, the Loki Operator failed to produce a memberlist configuration that was sufficient for locating all the components when using a memberlist for private IPs. With this update, the fix ensures that the generated configuration includes the advertised port, allowing for successful lookup of all components. (LOG-4008)

Logging 5.6.4

Bug fixes

  • Before this update, when LokiStack was deployed as the log store, the logs generated by Loki pods were collected and sent to LokiStack. With this update, the logs generated by Loki are excluded from collection and will not be stored. (LOG-3280)

  • Before this update, when the query editor on the Logs page of the OpenShift Web Console was empty, the drop-down menus did not populate. With this update, if an empty query is attempted, an error message is displayed and the drop-down menus now populate as expected. (LOG-3454)

  • Before this update, when the tls.insecureSkipVerify option was set to true, the Cluster Logging Operator would generate incorrect configuration. As a result, the operator would fail to send data to Elasticsearch when attempting to skip certificate validation. With this update, the Cluster Logging Operator generates the correct TLS configuration even when tls.insecureSkipVerify is enabled. As a result, data can be sent successfully to Elasticsearch even when attempting to skip certificate validation. (LOG-3475)

  • Before this update, when structured parsing was enabled and messages were forwarded to multiple destinations, they were not deep copied. This resulted in some of the received logs including the structured message, while others did not. With this update, the configuration generation has been modified to deep copy messages before JSON parsing. As a result, all received messages now have structured messages included, even when they are forwarded to multiple destinations. (LOG-3640)

  • Before this update, if the collection field contained {}, the Operator could crash. With this update, the Operator ignores this value and continues running without interruption. (LOG-3733)

  • Before this update, the nodeSelector attribute for the Gateway component of LokiStack did not have any effect. With this update, the nodeSelector attribute functions as expected. (LOG-3783)

  • Before this update, the static LokiStack memberlist configuration relied solely on private IP networks. As a result, when the OpenShift Container Platform cluster pod network was configured with a public IP range, the LokiStack pods would crashloop. With this update, the LokiStack administrator now has the option to use the pod network for the memberlist configuration. This resolves the issue and prevents the LokiStack pods from entering a crashloop state when the OpenShift Container Platform cluster pod network is configured with a public IP range. (LOG-3814)

  • Before this update, if the tls.insecureSkipVerify field was set to true, the Cluster Logging Operator would generate an incorrect configuration. As a result, the Operator would fail to send data to Elasticsearch when attempting to skip certificate validation. With this update, the Operator generates the correct TLS configuration even when tls.insecureSkipVerify is enabled. As a result, data can be sent successfully to Elasticsearch even when attempting to skip certificate validation. (LOG-3838)

  • Before this update, if the Cluster Logging Operator (CLO) was installed without the Elasticsearch Operator, the CLO pod would continuously display an error message related to the deletion of Elasticsearch. With this update, the CLO now performs additional checks before displaying any error messages. As a result, error messages related to Elasticsearch deletion are no longer displayed in the absence of the Elasticsearch Operator. (LOG-3763)
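
The structured parsing fix above (LOG-3640) deep copies messages before JSON parsing when they are forwarded to multiple destinations. The following Python sketch shows the general idea. The record layout, the function names, and the destination names are assumptions for illustration and do not reflect the generated collector configuration.

```python
import copy
import json


def parse_structured(record: dict) -> dict:
    """Parse the JSON text in 'message' and attach it as 'structured'."""
    record["structured"] = json.loads(record["message"])
    return record


def fan_out(record: dict, destinations: list) -> dict:
    """Give every destination its own deep copy before parsing.

    Each destination receives an independent record, so a change made for
    one destination cannot leak into another or into the original record.
    """
    return {
        name: parse_structured(copy.deepcopy(record))
        for name in destinations
    }
```

For example, `fan_out(record, ["elasticsearch", "loki"])` returns two separate records that both contain the structured field, and the original record is left unchanged.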

Logging 5.6.3

Bug fixes

  • Before this update, the operator stored gateway tenant secret information in a config map. With this update, the operator stores this information in a secret. (LOG-3717)

  • Before this update, the Fluentd collector did not capture OAuth login events stored in /var/log/auth-server/audit.log. With this update, Fluentd captures these OAuth login events, resolving the issue. (LOG-3729)

Logging 5.6.2

Bug fixes

  • Before this update, the collector did not set level fields correctly based on priority for systemd logs. With this update, level fields are set correctly. (LOG-3429)

  • Before this update, the Operator incorrectly generated incompatibility warnings on OpenShift Container Platform 4.12 or later. With this update, the Operator max OpenShift Container Platform version value has been corrected, resolving the issue. (LOG-3584)

  • Before this update, creating a ClusterLogForwarder custom resource (CR) with an output value of default did not generate any errors. With this update, an error warning that this value is invalid is generated. (LOG-3437)

  • Before this update, when the ClusterLogForwarder custom resource (CR) had multiple pipelines configured with one output set as default, the collector pods restarted. With this update, the logic for output validation has been corrected, resolving the issue. (LOG-3559)

  • Before this update, collector pods restarted after being created. With this update, the deployed collector does not restart on its own. (LOG-3608)

  • Before this update, patch releases removed previous versions of the Operators from the catalog. This made installing the old versions impossible. This update changes bundle configurations so that previous releases of the same minor version stay in the catalog. (LOG-3635)
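
The systemd fix above (LOG-3429) concerns deriving level fields from the priority of systemd logs. The following Python sketch shows one plausible mapping. It assumes that the priority is the numeric syslog severity from 0 to 7 and that level names follow the conventional syslog severity keywords. The exact names used by the collector are not stated in these notes.

```python
# Conventional syslog severity keywords, indexed by numeric priority 0-7.
SYSLOG_LEVELS = (
    "emerg", "alert", "crit", "err",
    "warning", "notice", "info", "debug",
)


def level_from_priority(priority) -> str:
    """Map a systemd priority value to a level name.

    Returns "unknown" when the priority is missing, is not a number,
    or falls outside the 0-7 range (an assumed fallback).
    """
    try:
        index = int(priority)
    except (TypeError, ValueError):
        return "unknown"
    if 0 <= index < len(SYSLOG_LEVELS):
        return SYSLOG_LEVELS[index]
    return "unknown"
```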

Logging 5.6.1

Bug fixes

  • Before this update, the compactor would report TLS certificate errors from communications with the querier when retention was active. With this update, the compactor and querier no longer communicate erroneously over HTTP. (LOG-3494)

  • Before this update, the Loki Operator would not retry setting the status of the LokiStack CR, which caused stale status information. With this update, the Operator retries status information updates on conflict. (LOG-3496)

  • Before this update, the Loki Operator Webhook server caused TLS errors when the kube-apiserver-operator Operator checked the webhook validity. With this update, the Loki Operator Webhook PKI is managed by the Operator Lifecycle Manager (OLM), resolving the issue. (LOG-3510)

  • Before this update, the LokiStack Gateway Labels Enforcer generated parsing errors for valid LogQL queries when using combined label filters with boolean expressions. With this update, the LokiStack LogQL implementation supports label filters with boolean expressions, resolving the issue. (LOG-3441), (LOG-3397)

  • Before this update, records written to Elasticsearch would fail if multiple label keys had the same prefix and some keys included dots. With this update, underscores replace dots in label keys, resolving the issue. (LOG-3463)

  • Before this update, the Red Hat OpenShift Logging Operator was not available for OpenShift Container Platform 4.10 clusters because of an incompatibility between OpenShift Container Platform console and the logging-view-plugin. With this update, the plugin is properly integrated with the OpenShift Container Platform 4.10 admin console. (LOG-3447)

  • Before this update, the reconciliation of the ClusterLogForwarder custom resource would incorrectly report a degraded status of pipelines that reference the default logstore. With this update, the pipeline validates properly. (LOG-3477)
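
The label key fix above (LOG-3463) replaces dots with underscores. Elasticsearch generally treats a dot in a field name as an object path, so a key such as app and a key such as app.kubernetes.io/name can conflict, because app would need to be both a value and an object. The following Python sketch shows the replacement. The function name is hypothetical and the collector's actual implementation may differ.

```python
def flatten_label_keys(labels: dict) -> dict:
    """Return a copy of labels with dots in the keys replaced by underscores.

    Values are left untouched. If two keys become identical after the
    replacement, the later one wins, which is a limitation of this sketch.
    """
    return {key.replace(".", "_"): value for key, value in labels.items()}
```

For example, the keys app and app.kubernetes.io/name become app and app_kubernetes_io/name, which no longer share an object path.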

Logging 5.6

This release includes OpenShift Logging Release 5.6.

Deprecation notice

In Logging 5.6, Fluentd is deprecated and is planned to be removed in a future release. Red Hat will provide bug fixes and support for this feature during the current release lifecycle, but this feature will no longer receive enhancements and will be removed. As an alternative to Fluentd, you can use Vector.

Enhancements

  • With this update, Logging is compliant with OpenShift Container Platform cluster-wide cryptographic policies. (LOG-895)

  • With this update, you can declare per-tenant, per-stream, and global retention policies through the LokiStack custom resource, ordered by priority. (LOG-2695)

  • With this update, Splunk is an available output option for log forwarding. (LOG-2913)

  • With this update, Vector replaces Fluentd as the default Collector. (LOG-2222)

  • With this update, the Developer role can access the per-project workload logs they are assigned to within the Log Console Plugin on clusters running OpenShift Container Platform 4.11 and higher. (LOG-3388)

  • With this update, logs from any source contain a field openshift.cluster_id, the unique identifier of the cluster in which the Operator is deployed. You can view the clusterID value with the command below. (LOG-2715)

$ oc get clusterversion/version -o jsonpath='{.spec.clusterID}{"\n"}'

Known Issues

  • Before this update, Elasticsearch would reject logs if multiple label keys had the same prefix and some keys included the . character. This update addresses the Elasticsearch limitation by replacing . in the label keys with _. As a workaround for this issue, remove the labels that cause errors, or add a namespace to the label. (LOG-3463)

Bug fixes

  • Before this update, if you deleted the Kibana Custom Resource, the OpenShift Container Platform web console continued displaying a link to Kibana. With this update, removing the Kibana Custom Resource also removes that link. (LOG-2993)

  • Before this update, a user was not able to view the application logs of namespaces they have access to. With this update, the Loki Operator automatically creates a cluster role and cluster role binding allowing users to read application logs. (LOG-3072)

  • Before this update, the Operator removed any custom outputs defined in the ClusterLogForwarder custom resource when using LokiStack as the default log storage. With this update, the Operator merges custom outputs with the default outputs when processing the ClusterLogForwarder custom resource. (LOG-3090)

  • Before this update, the CA key was used as the volume name for mounting the CA into Loki, causing error states when the CA Key included non-conforming characters, such as dots. With this update, the volume name is standardized to an internal string which resolves the issue. (LOG-3331)

  • Before this update, a default value set within the LokiStack Custom Resource Definition made it impossible to create a LokiStack instance without a ReplicationFactor of 1. With this update, the operator sets the actual value for the size used. (LOG-3296)

  • Before this update, Vector parsed the message field when JSON parsing was enabled without also defining structuredTypeKey or structuredTypeName values. With this update, a value is required for either structuredTypeKey or structuredTypeName when writing structured logs to Elasticsearch. (LOG-3195)

  • Before this update, the secret creation component of the Elasticsearch Operator modified internal secrets constantly. With this update, the existing secret is properly handled. (LOG-3161)

  • Before this update, the Operator could enter a loop of removing and recreating the collector daemonset while the Elasticsearch or Kibana deployments changed their status. With this update, a fix in the status handling of the Operator resolves the issue. (LOG-3157)

  • Before this update, Kibana had a fixed 24h OAuth cookie expiration time, which resulted in 401 errors in Kibana whenever the accessTokenInactivityTimeout field was set to a value lower than 24h. With this update, Kibana’s OAuth cookie expiration time synchronizes to the accessTokenInactivityTimeout, with a default value of 24h. (LOG-3129)

  • Before this update, the Operators' general pattern for reconciling resources was to try to create before attempting to get or update, which would lead to constant HTTP 409 responses after creation. With this update, Operators first attempt to retrieve an object and only create or update it if it is either missing or not as specified. (LOG-2919)

  • Before this update, the .level and .structure.level fields in Fluentd could contain different values. With this update, the values are the same for each field. (LOG-2819)

  • Before this update, the Operator did not wait for the population of the trusted CA bundle and deployed the collector a second time once the bundle updated. With this update, the Operator waits briefly to see if the bundle has been populated before it continues the collector deployment. (LOG-2789)

  • Before this update, logging telemetry info appeared twice when reviewing metrics. With this update, logging telemetry info displays as expected. (LOG-2315)

  • Before this update, Fluentd pod logs contained a warning message after enabling the JSON parsing addition. With this update, that warning message does not appear. (LOG-1806)

  • Before this update, the must-gather script did not complete because oc needs a folder with write permission to build its cache. With this update, oc has write permissions to a folder, and the must-gather script completes successfully. (LOG-3446)

  • Before this update, the log collector SCC could be superseded by other SCCs on the cluster, rendering the collector unusable. This update sets the priority of the log collector SCC so that it takes precedence over the others. (LOG-3235)

  • Before this update, Vector was missing the field sequence, which was added to Fluentd as a way to deal with a lack of actual nanosecond precision. With this update, the field openshift.sequence has been added to the event logs. (LOG-3106)
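
The reconciliation fix above (LOG-2919) changes the order of operations to get first, then create or update only when needed. The following Python sketch models that pattern against an in-memory store. The InMemoryStore class, the reconcile function, and the resource name are hypothetical stand-ins, not the Operators' actual code or the Kubernetes API.

```python
class InMemoryStore:
    """Hypothetical stand-in for an API server that counts write calls."""

    def __init__(self):
        self.objects = {}
        self.writes = 0

    def get(self, name):
        return self.objects.get(name)

    def create(self, name, spec):
        if name in self.objects:
            # Mirrors the HTTP 409 response described in the note above.
            raise RuntimeError("409 Conflict: object already exists")
        self.objects[name] = dict(spec)
        self.writes += 1

    def update(self, name, spec):
        self.objects[name] = dict(spec)
        self.writes += 1


def reconcile(store, name, desired):
    """Retrieve first; create only if missing, update only if different."""
    current = store.get(name)
    if current is None:
        store.create(name, desired)
        return "created"
    if current != desired:
        store.update(name, desired)
        return "updated"
    return "unchanged"
```

With this ordering, repeated reconciliation of an unchanged object performs no writes, so the create call never runs against an object that already exists.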

Logging 5.5.16

Bug fixes

  • Before this update, the LokiStack gateway cached authorized requests very broadly, which caused incorrect authorization results. With this update, the LokiStack gateway caches on a more fine-grained basis, which resolves this issue. (LOG-4434)

Logging 5.5.14

Bug fixes

  • Before this update, the Vector collector occasionally panicked with the following error message in its log: thread 'vector-worker' panicked at 'all branches are disabled and there is no else branch', src/kubernetes/reflector.rs:26:9. With this update, the error has been resolved. (LOG-4279)

Logging 5.5.11

Bug fixes

  • Before this update, a time range could not be selected in the OpenShift Container Platform web console by clicking and dragging over the logs histogram. With this update, clicking and dragging can be used to successfully select a time range. (LOG-4102)

  • Before this update, clicking on the Show Resources link in the OpenShift Container Platform web console did not produce any effect. With this update, the issue is resolved by fixing the functionality of the Show Resources link to toggle the display of resources for each log entry. (LOG-4117)

Logging 5.5.10

Bug fixes

  • Before this update, the logging view plugin of the OpenShift Web Console showed only an error text when the LokiStack was not reachable. After this update, the plugin shows a proper error message with details on how to fix the unreachable LokiStack. (LOG-2874)

Logging 5.5.9

Bug fixes

  • Before this update, a problem with the Fluentd collector caused it to not capture OAuth login events stored in /var/log/auth-server/audit.log. This led to incomplete collection of login events from the OAuth service. With this update, the Fluentd collector captures all login events from the OAuth service, including those stored in /var/log/auth-server/audit.log, as expected. (LOG-3730)

  • Before this update, when structured parsing was enabled and messages were forwarded to multiple destinations, they were not deep copied. This resulted in some of the received logs including the structured message, while others did not. With this update, the configuration generation has been modified to deep copy messages before JSON parsing. As a result, all received logs now have structured messages included, even when they are forwarded to multiple destinations. (LOG-3767)

Logging 5.5.8

Bug fixes

  • Before this update, the priority field was missing from systemd logs due to an error in how the collector set level fields. With this update, these fields are set correctly, resolving the issue. (LOG-3630)

Logging 5.5.7

Bug fixes

  • Before this update, the LokiStack Gateway Labels Enforcer generated parsing errors for valid LogQL queries when using combined label filters with boolean expressions. With this update, the LokiStack LogQL implementation supports label filters with boolean expressions, resolving the issue. (LOG-3534)

  • Before this update, the ClusterLogForwarder custom resource (CR) did not pass TLS credentials for syslog output to Fluentd, resulting in errors during forwarding. With this update, credentials pass correctly to Fluentd, resolving the issue. (LOG-3533)

Logging 5.5.6

Bug fixes

  • Before this update, the Pod Security admission controller added the label podSecurityLabelSync = true to the openshift-logging namespace. This resulted in the specified security labels being overwritten, and as a result Collector pods would not start. With this update, the label podSecurityLabelSync = false preserves security labels. Collector pods deploy as expected. (LOG-3340)

  • Before this update, the Operator installed the console view plugin, even when it was not enabled on the cluster. This caused the Operator to crash. With this update, if an account for a cluster does not have the console view enabled, the Operator functions normally and does not install the console view. (LOG-3407)

  • Before this update, a prior fix for a regression in which the status of the Elasticsearch deployment was not being updated caused the Operator to crash unless the Red Hat Elasticsearch Operator was deployed. With this update, that fix has been reverted, so the Operator is now stable, but the previous issue related to the reported status is reintroduced. (LOG-3428)

  • Before this update, the Loki Operator only deployed one replica of the LokiStack gateway regardless of the chosen stack size. With this update, the number of replicas is correctly configured according to the selected size. (LOG-3478)

  • Before this update, records written to Elasticsearch would fail if multiple label keys had the same prefix and some keys included dots. With this update, underscores replace dots in label keys, resolving the issue. (LOG-3341)

  • Before this update, the logging view plugin contained an incompatible feature for certain versions of OpenShift Container Platform. With this update, the correct release stream of the plugin resolves the issue. (LOG-3467)

  • Before this update, the reconciliation of the ClusterLogForwarder custom resource would incorrectly report a degraded status of one or more pipelines causing the collector pods to restart every 8-10 seconds. With this update, reconciliation of the ClusterLogForwarder custom resource processes correctly, resolving the issue. (LOG-3469)

  • Before this change, the spec for the outputDefaults field of the ClusterLogForwarder custom resource would apply the settings to every declared Elasticsearch output type. This change corrects the behavior to match the enhancement specification where the setting specifically applies to the default managed Elasticsearch store. (LOG-3342)

  • Before this update, the OpenShift CLI (oc) must-gather script did not complete because the OpenShift CLI (oc) needs a folder with write permission to build its cache. With this update, the OpenShift CLI (oc) has write permissions to a folder, and the must-gather script completes successfully. (LOG-3472)

  • Before this update, the Loki Operator webhook server caused TLS errors. With this update, the Loki Operator webhook PKI is managed by the Operator Lifecycle Manager’s dynamic webhook management, resolving the issue. (LOG-3511)

Logging 5.5.5

Bug fixes

  • Before this update, Kibana had a fixed 24h OAuth cookie expiration time, which resulted in 401 errors in Kibana whenever the accessTokenInactivityTimeout field was set to a value lower than 24h. With this update, Kibana’s OAuth cookie expiration time synchronizes to the accessTokenInactivityTimeout, with a default value of 24h. (LOG-3305)

  • Before this update, Vector parsed the message field when JSON parsing was enabled without also defining structuredTypeKey or structuredTypeName values. With this update, a value is required for either structuredTypeKey or structuredTypeName when writing structured logs to Elasticsearch. (LOG-3284)

  • Before this update, the FluentdQueueLengthIncreasing alert could fail to fire when there was a cardinality issue with the set of labels returned from this alert expression. This update reduces labels to only include those required for the alert. (LOG-3226)

  • Before this update, Loki did not have support to reach an external storage in a disconnected cluster. With this update, proxy environment variables and proxy trusted CA bundles are included in the container image to support these connections. (LOG-2860)

  • Before this update, OpenShift Container Platform web console users could not choose the ConfigMap object that includes the CA certificate for Loki, causing pods to operate without the CA. With this update, web console users can select the config map, resolving the issue. (LOG-3310)

  • Before this update, the CA key was used as the volume name for mounting the CA into Loki, causing error states when the CA Key included non-conforming characters (such as dots). With this update, the volume name is standardized to an internal string which resolves the issue. (LOG-3332)

Logging 5.5.4

Bug fixes

  • Before this update, an error in the query parser of the logging view plugin caused parts of the logs query to disappear if the query contained curly brackets {}. This made the queries invalid, leading to errors being returned for valid queries. With this update, the parser correctly handles these queries. (LOG-3042)

  • Before this update, the Operator could enter a loop of removing and recreating the collector daemonset while the Elasticsearch or Kibana deployments changed their status. With this update, a fix in the status handling of the Operator resolves the issue. (LOG-3049)

  • Before this update, no alerts were implemented to support the collector implementation of Vector. This change adds Vector alerts and deploys separate alerts, depending upon the chosen collector implementation. (LOG-3127)

  • Before this update, the secret creation component of the Elasticsearch Operator modified internal secrets constantly. With this update, the existing secret is properly handled. (LOG-3138)

  • Before this update, a prior refactoring of the logging must-gather scripts removed the expected location for the artifacts. This update reverts that change to write artifacts to the /must-gather folder. (LOG-3213)

  • Before this update, on certain clusters, the Prometheus exporter would bind on IPv4 instead of IPv6. After this update, Fluentd detects the IP version and binds to 0.0.0.0 for IPv4 or [::] for IPv6. (LOG-3162)
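
The Prometheus exporter fix above (LOG-3162) selects the bind address from the detected IP version. The following Python sketch shows one way to make that choice. It assumes that the version is derived from a node or pod IP address. How Fluentd detects the version is not described in these notes, and the function name is hypothetical.

```python
import ipaddress


def bind_address(node_ip: str) -> str:
    """Return the wildcard bind address that matches the IP version.

    Uses 0.0.0.0 for IPv4 and [::] for IPv6, as described in the note above.
    """
    version = ipaddress.ip_address(node_ip).version
    return "[::]" if version == 6 else "0.0.0.0"
```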

Logging 5.5.3

Bug fixes

  • Before this update, log entries that had structured messages included the original message field, which made the entry larger. This update removes the message field for structured logs to reduce the increased size. (LOG-2759)

  • Before this update, the collector configuration excluded logs from collector, default-log-store, and visualization pods, but was unable to exclude logs archived in a .gz file. With this update, archived logs stored as .gz files of collector, default-log-store, and visualization pods are also excluded. (LOG-2844)

  • Before this update, when requests to an unavailable pod were sent through the gateway, no alert would warn of the disruption. With this update, individual alerts will generate if the gateway has issues completing a write or read request. (LOG-2884)

  • Before this update, pod metadata could be altered by fluent plugins because the values were passed through the pipeline by reference. This update ensures each log message receives a copy of the pod metadata so that each message is processed independently. (LOG-3046)

  • Before this update, selecting unknown severity in the OpenShift Console Logs view excluded logs with a level=unknown value. With this update, logs without level and with level=unknown values are visible when filtering by unknown severity. (LOG-3062)

  • Before this update, log records sent to Elasticsearch had an extra field named write-index that contained the name of the index to which the logs needed to be sent. This field is not a part of the data model. After this update, this field is no longer sent. (LOG-3075)

  • With the introduction of the new built-in Pod Security Admission Controller, Pods not configured in accordance with the enforced security standards defined globally or on the namespace level cannot run. With this update, the Operator and collectors allow privileged execution and run without security audit warnings or errors. (LOG-3077)

  • Before this update, the Operator removed any custom outputs defined in the ClusterLogForwarder custom resource when using LokiStack as the default log storage. With this update, the Operator merges custom outputs with the default outputs when processing the ClusterLogForwarder custom resource. (LOG-3095)
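
As an illustration of the last fix (LOG-3095), the following sketch shows a ClusterLogForwarder custom resource that sends application logs to both the default log store and a custom output. The output name remote-es, its URL, and the pipeline name are hypothetical placeholders, not values taken from this release.

```yaml
apiVersion: logging.openshift.io/v1
kind: ClusterLogForwarder
metadata:
  name: instance
  namespace: openshift-logging
spec:
  outputs:
    # Hypothetical custom output; replace the name and URL with your own.
    - name: remote-es
      type: elasticsearch
      url: https://elasticsearch.example.com:9200
  pipelines:
    - name: app-to-default-and-remote
      inputRefs:
        - application
      outputRefs:
        - default      # the default log store (LokiStack in this scenario)
        - remote-es    # the custom output that is kept alongside the default
```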

Logging 5.5.2

Bug fixes

  • Before this update, alerting rules for the Fluentd collector did not adhere to the OpenShift Container Platform monitoring style guidelines. This update modifies those alerts to include the namespace label, resolving the issue. (LOG-1823)

  • Before this update, the index management rollover script failed to generate a new index name whenever there was more than one hyphen character in the name of the index. With this update, index names generate correctly. (LOG-2644)

  • Before this update, the Kibana route was setting a caCertificate value without a certificate present. With this update, no caCertificate value is set. (LOG-2661)

  • Before this update, a change in the collector dependencies caused it to issue a warning message for unused parameters. With this update, removing unused configuration parameters resolves the issue. (LOG-2859)

  • Before this update, pods created for deployments that Loki Operator created were mistakenly scheduled on nodes with non-Linux operating systems, if such nodes were available in the cluster the Operator was running in. With this update, the Operator attaches an additional node-selector to the pod definitions which only allows scheduling the pods on Linux-based nodes. (LOG-2895)

  • Before this update, the OpenShift Console Logs view did not filter logs by severity due to a LogQL parser issue in the LokiStack gateway. With this update, a parser fix resolves the issue and the OpenShift Console Logs view can filter by severity. (LOG-2908)

  • Before this update, a refactoring of the Fluentd collector plugins removed the timestamp field for events. This update restores the timestamp field, sourced from the event’s received time. (LOG-2923)

  • Before this update, absence of a level field in audit logs caused an error in vector logs. With this update, the addition of a level field in the audit log record resolves the issue. (LOG-2961)

  • Before this update, if you deleted the Kibana Custom Resource, the OpenShift Container Platform web console continued displaying a link to Kibana. With this update, removing the Kibana Custom Resource also removes that link. (LOG-3053)

  • Before this update, each rollover job created empty indices when the ClusterLogForwarder custom resource had JSON parsing defined. With this update, new indices are not empty. (LOG-3063)

  • Before this update, when the user deleted the LokiStack after an update to Loki Operator 5.5, resources originally created by Loki Operator 5.4 remained. With this update, the resources' owner-references point to the 5.5 LokiStack. (LOG-2945)

  • Before this update, a user was not able to view the application logs of namespaces they have access to. With this update, the Loki Operator automatically creates a cluster role and cluster role binding allowing users to read application logs. (LOG-2918)

  • Before this update, users with cluster-admin privileges were not able to properly view infrastructure and audit logs using the logging console. With this update, the authorization check has been extended to also recognize users in cluster-admin and dedicated-admin groups as admins. (LOG-2970)
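
The scheduling fix (LOG-2895) relies on the standard Kubernetes node selector mechanism. The following is a minimal sketch of a pod definition restricted to Linux nodes, assuming the well-known kubernetes.io/os node label. The pod name and image are hypothetical, and these notes do not show the exact selector that the Operator applies.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-linux-only            # hypothetical name
spec:
  nodeSelector:
    kubernetes.io/os: linux           # well-known node label; schedules only on Linux nodes
  containers:
    - name: app
      image: registry.example.com/app:latest   # hypothetical image
```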

Logging 5.5.1

Enhancements

  • This enhancement adds an Aggregated Logs tab to the Pod Details page of the OpenShift Container Platform web console when the Logging Console Plugin is in use. This enhancement is only available on OpenShift Container Platform 4.10 and later. (LOG-2647)

  • This enhancement adds Google Cloud Logging as an output option for log forwarding. (LOG-1482)
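
A Google Cloud Logging output (LOG-1482) might be configured along the following lines. This is a sketch under assumptions: the secret name, project ID, and log ID are hypothetical, and the field names should be checked against the log forwarding documentation for your logging version.

```yaml
apiVersion: logging.openshift.io/v1
kind: ClusterLogForwarder
metadata:
  name: instance
  namespace: openshift-logging
spec:
  outputs:
    - name: gcp-1
      type: googleCloudLogging
      secret:
        name: gcp-secret              # hypothetical secret holding the service account key
      googleCloudLogging:
        projectId: my-gcp-project     # hypothetical project ID
        logId: app-gcp                # hypothetical log ID
  pipelines:
    - name: to-gcp
      inputRefs:
        - application
      outputRefs:
        - gcp-1
```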

Bug fixes

  • Before this update, the Operator did not ensure that the pod was ready, which caused the cluster to reach an inoperable state during a cluster restart. With this update, the Operator marks new pods as ready before continuing to a new pod during a restart, which resolves the issue. (LOG-2745)

  • Before this update, Fluentd would sometimes not recognize that the Kubernetes platform rotated the log file and would no longer read log messages. This update corrects that by setting the configuration parameter suggested by the upstream development team. (LOG-2995)

  • Before this update, the addition of multi-line error detection caused internal routing to change and forward records to the wrong destination. With this update, the internal routing is correct. (LOG-2801)

  • Before this update, changing the OpenShift Container Platform web console’s refresh interval created an error when the Query field was empty. With this update, changing the interval is not an available option when the Query field is empty. (LOG-2917)

Logging 5.5

The following advisories are available for Logging 5.5: Release 5.5

Enhancements

  • With this update, you can forward structured logs from different containers within the same pod to different indices. To use this feature, you must configure the pipeline with multi-container support and annotate the pods. (LOG-1296)

JSON formatting of logs varies by application. Because creating too many indices impacts performance, limit your use of this feature to creating indices for logs that have incompatible JSON formats. Use queries to separate logs from different namespaces, or applications with compatible JSON formats.

  • With this update, you can filter logs with Elasticsearch outputs by using the Kubernetes common labels, app.kubernetes.io/component, app.kubernetes.io/managed-by, app.kubernetes.io/part-of, and app.kubernetes.io/version. Non-Elasticsearch output types can use all labels included in kubernetes.labels. (LOG-2388)

  • With this update, clusters with AWS Security Token Service (STS) enabled may use STS authentication to forward logs to Amazon CloudWatch. (LOG-1976)

  • With this update, the Loki Operator and Vector collector move from Technical Preview to General Availability. Full feature parity with prior releases is pending, and some APIs remain Technical Previews. See the Logging with the LokiStack section for details.
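
The multi-container structured log feature (LOG-1296) requires both a pipeline setting and pod annotations. The following sketch shows one way the two pieces could fit together, assuming the enableStructuredContainerLogs setting and the containerType.logging.openshift.io annotation prefix described in the log forwarding documentation. The pod name, container names, images, and index names are hypothetical.

```yaml
apiVersion: logging.openshift.io/v1
kind: ClusterLogForwarder
metadata:
  name: instance
  namespace: openshift-logging
spec:
  outputDefaults:
    elasticsearch:
      structuredTypeKey: kubernetes.labels.logFormat
      enableStructuredContainerLogs: true   # enables multi-container support
  pipelines:
    - name: application-logs
      inputRefs:
        - application
      outputRefs:
        - default
      parse: json
---
apiVersion: v1
kind: Pod
metadata:
  name: example-pod                   # hypothetical pod
  annotations:
    # Pattern: containerType.logging.openshift.io/<container-name>: <index-name>
    containerType.logging.openshift.io/heavy: nginx-heavy
    containerType.logging.openshift.io/low: nginx-low
spec:
  containers:
    - name: heavy
      image: registry.example.com/nginx-heavy:latest   # hypothetical image
    - name: low
      image: registry.example.com/nginx-low:latest     # hypothetical image
```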

Bug fixes

  • Before this update, clusters configured to forward logs to Amazon CloudWatch wrote rejected log files to temporary storage, causing cluster instability over time. With this update, chunk backup for all storage options has been disabled, resolving the issue. (LOG-2746)

  • Before this update, the Operator was using versions of some APIs that are deprecated and planned for removal in future versions of OpenShift Container Platform. This update moves dependencies to the supported API versions. (LOG-2656)

  • Before this update, multiple ClusterLogForwarder pipelines configured for multiline error detection caused the collector to go into a crashloopbackoff error state. This update fixes the issue where multiple configuration sections had the same unique ID. (LOG-2241)

  • Before this update, the collector could not save non UTF-8 symbols to the Elasticsearch storage logs. With this update the collector encodes non UTF-8 symbols, resolving the issue. (LOG-2203)

  • Before this update, non-latin characters displayed incorrectly in Kibana. With this update, Kibana displays all valid UTF-8 symbols correctly. (LOG-2784)

Logging 5.4.13

Bug fixes

  • Before this update, a problem with the Fluentd collector caused it to not capture OAuth login events stored in /var/log/auth-server/audit.log. This led to incomplete collection of login events from the OAuth service. With this update, the Fluentd collector captures all login events from the OAuth service, including those stored in /var/log/auth-server/audit.log, as expected. (LOG-3731)

Logging 5.4.9

Bug fixes

  • Before this update, the Fluentd collector would warn of unused configuration parameters. This update removes those configuration parameters and their warning messages. (LOG-3074)

  • Before this update, Kibana had a fixed 24h OAuth cookie expiration time, which resulted in 401 errors in Kibana whenever the accessTokenInactivityTimeout field was set to a value lower than 24h. With this update, Kibana’s OAuth cookie expiration time synchronizes to the accessTokenInactivityTimeout, with a default value of 24h. (LOG-3306)
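
For context on LOG-3306, one place where the accessTokenInactivityTimeout field is set is the cluster-wide OAuth configuration. The following is a minimal sketch, assuming the OAuth resource named cluster. The 4h value is an arbitrary example of a timeout below 24h.

```yaml
apiVersion: config.openshift.io/v1
kind: OAuth
metadata:
  name: cluster
spec:
  tokenConfig:
    accessTokenInactivityTimeout: 4h   # example value; lower than the former fixed 24h cookie lifetime
```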

Logging 5.4.6

Bug fixes

  • Before this update, Fluentd would sometimes not recognize that the Kubernetes platform rotated the log file and would no longer read log messages. This update corrects that by setting the configuration parameter suggested by the upstream development team. (LOG-2792)

  • Before this update, each rollover job created empty indices when the ClusterLogForwarder custom resource had JSON parsing defined. With this update, new indices are not empty. (LOG-2823)

  • Before this update, if you deleted the Kibana Custom Resource, the OpenShift Container Platform web console continued displaying a link to Kibana. With this update, removing the Kibana Custom Resource also removes that link. (LOG-3054)

Logging 5.4.5

Bug fixes

  • Before this update, the Operator did not ensure that the pod was ready, which caused the cluster to reach an inoperable state during a cluster restart. With this update, the Operator marks new pods as ready before continuing to a new pod during a restart, which resolves the issue. (LOG-2881)

  • Before this update, the addition of multi-line error detection caused internal routing to change and forward records to the wrong destination. With this update, the internal routing is correct. (LOG-2946)

  • Before this update, the Operator could not decode index setting JSON responses that contained a quoted Boolean value, which resulted in an error. With this update, the Operator can properly decode this JSON response. (LOG-3009)

  • Before this update, Elasticsearch index templates defined the fields for labels with the wrong types. This change updates those templates to match the expected types forwarded by the log collector. (LOG-2972)

Logging 5.4.4

Bug fixes

  • Before this update, non-latin characters displayed incorrectly in Elasticsearch. With this update, Elasticsearch displays all valid UTF-8 symbols correctly. (LOG-2794)

  • Before this update, non-latin characters displayed incorrectly in Fluentd. With this update, Fluentd displays all valid UTF-8 symbols correctly. (LOG-2657)

  • Before this update, the metrics server for the collector attempted to bind to the address using a value exposed by an environment variable. This change modifies the configuration to bind to any available interface. (LOG-2821)

  • Before this update, the cluster-logging Operator relied on the cluster to create a secret. This cluster behavior changed in OpenShift Container Platform 4.11, which caused logging deployments to fail. With this update, the cluster-logging Operator resolves the issue by creating the secret if needed. (LOG-2840)

Logging 5.4.3

Elasticsearch Operator deprecation notice

In logging subsystem 5.4.3 the Elasticsearch Operator is deprecated and is planned to be removed in a future release. Red Hat will provide bug fixes and support for this feature during the current release lifecycle, but this feature will no longer receive enhancements and will be removed. As an alternative to using the Elasticsearch Operator to manage the default log storage, you can use the Loki Operator.

Bug fixes

  • Before this update, the OpenShift Logging Dashboard showed the number of active primary shards instead of all active shards. With this update, the dashboard displays all active shards. (LOG-2781)

  • Before this update, a bug in a library used by elasticsearch-operator contained a denial of service attack vulnerability. With this update, the library has been updated to a version that does not contain this vulnerability. (LOG-2816)

  • Before this update, when configuring Vector to forward logs to Loki, it was not possible to set a custom bearer token or use the default token if Loki had TLS enabled. With this update, Vector can forward logs to Loki using tokens with TLS enabled. (LOG-2786)

  • Before this update, the Elasticsearch Operator omitted the referencePolicy property of the ImageStream custom resource when selecting an oauth-proxy image. This omission caused the Kibana deployment to fail in specific environments. With this update, using referencePolicy resolves the issue, and the Operator can deploy Kibana successfully. (LOG-2791)

  • Before this update, alerting rules for the ClusterLogForwarder custom resource did not take multiple forward outputs into account. This update resolves the issue. (LOG-2640)

  • Before this update, clusters configured to forward logs to Amazon CloudWatch wrote rejected log files to temporary storage, causing cluster instability over time. With this update, chunk backup for CloudWatch has been disabled, resolving the issue. (LOG-2768)

Logging 5.4.2

Bug fixes

  • Before this update, editing the Collector configuration using oc edit was difficult because it had inconsistent use of white-space. This change introduces logic to normalize and format the configuration prior to any updates by the Operator so that it is easy to edit using oc edit. (LOG-2319)

  • Before this update, the FluentdNodeDown alert could not provide instance labels in the message section appropriately. This update resolves the issue by fixing the alert rule to provide instance labels in cases of partial instance failures. (LOG-2607)

  • Before this update, several log levels, such as critical, that were documented as supported by the product were not. This update fixes the discrepancy so the documented log levels are now supported by the product. (LOG-2033)

Logging 5.4.1

Bug fixes

  • Before this update, the log file metric exporter only reported logs created while the exporter was running, which resulted in inaccurate log growth data. This update resolves this issue by monitoring /var/log/pods. (LOG-2442)

  • Before this update, the collector would be blocked because it continually tried to use a stale connection when forwarding logs to fluentd forward receivers. With this release, the keepalive_timeout value has been set to 30 seconds (30s) so that the collector recycles the connection and re-attempts to send failed messages within a reasonable amount of time. (LOG-2534)

  • Before this update, an error in the gateway component enforcing tenancy for reading logs limited access to logs with a Kubernetes namespace causing "audit" and some "infrastructure" logs to be unreadable. With this update, the proxy correctly detects users with admin access and allows access to logs without a namespace. (LOG-2448)

  • Before this update, the system:serviceaccount:openshift-monitoring:prometheus-k8s service account had cluster level privileges as a clusterrole and clusterrolebinding. This update restricts the service account to the openshift-logging namespace with a role and rolebinding. (LOG-2437)

  • Before this update, Linux audit log time parsing relied on an ordinal position of a key/value pair. This update changes the parsing to use a regular expression to find the time entry. (LOG-2321)
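
The permission change in LOG-2437 replaces a cluster role and cluster role binding with their namespaced equivalents. The following is a minimal sketch of that pattern using standard Kubernetes RBAC objects. The object names and the rule list are hypothetical and are not the exact objects that the Operator creates.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: example-prometheus-scrape     # hypothetical name
  namespace: openshift-logging        # permissions apply only in this namespace
rules:
  - apiGroups: [""]
    resources: ["pods", "services", "endpoints"]   # hypothetical rule list
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: example-prometheus-scrape     # hypothetical name
  namespace: openshift-logging
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: example-prometheus-scrape
subjects:
  - kind: ServiceAccount
    name: prometheus-k8s
    namespace: openshift-monitoring
```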

Logging 5.4

The following advisories are available for logging 5.4: Logging subsystem for Red Hat OpenShift Release 5.4

Technology Previews

Vector is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.

For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.

About Vector

Vector is a log collector offered as a tech-preview alternative to the current default collector for the logging subsystem.

The following outputs are supported:

  • elasticsearch. An external Elasticsearch instance. The elasticsearch output can use a TLS connection.

  • kafka. A Kafka broker. The kafka output can use an unsecured or TLS connection.

  • loki. Loki, a horizontally scalable, highly available, multi-tenant log aggregation system.
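
As an example of one of the outputs listed above, the following sketch defines a kafka output that uses a TLS connection. The broker URL, topic, secret name, and pipeline name are hypothetical placeholders.

```yaml
apiVersion: logging.openshift.io/v1
kind: ClusterLogForwarder
metadata:
  name: instance
  namespace: openshift-logging
spec:
  outputs:
    - name: kafka-tls
      type: kafka
      url: tls://kafka.example.com:9093/app-topic   # hypothetical broker and topic
      secret:
        name: kafka-secret                          # hypothetical secret with TLS material
  pipelines:
    - name: app-to-kafka
      inputRefs:
        - application
      outputRefs:
        - kafka-tls
```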

Enabling Vector

Vector is not enabled by default. Use the following steps to enable Vector on your OpenShift Container Platform cluster.

Vector does not support FIPS-enabled clusters.

Prerequisites
  • OpenShift Container Platform: 4.10

  • Logging subsystem for Red Hat OpenShift: 5.4

  • FIPS disabled

Procedure
  1. Edit the ClusterLogging custom resource (CR) in the openshift-logging project:

    $ oc -n openshift-logging edit ClusterLogging instance
  2. Add a logging.openshift.io/preview-vector-collector: enabled annotation to the ClusterLogging custom resource (CR).

  3. Add vector as a collection type to the ClusterLogging custom resource (CR).

  apiVersion: "logging.openshift.io/v1"
  kind: "ClusterLogging"
  metadata:
    name: "instance"
    namespace: "openshift-logging"
    annotations:
      logging.openshift.io/preview-vector-collector: enabled
  spec:
    collection:
      logs:
        type: "vector"
        vector: {}

Loki Operator is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.

For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.

About Loki

Loki is a horizontally scalable, highly available, multi-tenant log aggregation system currently offered as an alternative to Elasticsearch as a log store for the logging subsystem.

Deploying the Lokistack

You can use the OpenShift Container Platform web console to install the Loki Operator.

Prerequisites
  • OpenShift Container Platform: 4.10

  • Logging subsystem for Red Hat OpenShift: 5.4

To install the Loki Operator using the OpenShift Container Platform web console:

  1. Install the Loki Operator:

    1. In the OpenShift Container Platform web console, click Operators → OperatorHub.

    2. Choose Loki Operator from the list of available Operators, and click Install.

    3. Under Installation Mode, select All namespaces on the cluster.

    4. Under Installed Namespace, select openshift-operators-redhat.

      You must specify the openshift-operators-redhat namespace. The openshift-operators namespace might contain Community Operators, which are untrusted and could publish a metric with the same name as an OpenShift Container Platform metric, which would cause conflicts.

    5. Select Enable operator recommended cluster monitoring on this namespace.

      This option sets the openshift.io/cluster-monitoring: "true" label in the Namespace object. You must select this option to ensure that cluster monitoring scrapes the openshift-operators-redhat namespace.

    6. Select an Approval Strategy.

      • The Automatic strategy allows Operator Lifecycle Manager (OLM) to automatically update the Operator when a new version is available.

      • The Manual strategy requires a user with appropriate credentials to approve the Operator update.

    7. Click Install.

    8. Verify that you installed the Loki Operator. Visit the Operators → Installed Operators page and look for "Loki Operator".

    9. Ensure that Loki Operator is listed in all projects with a Status of Succeeded.
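
For reference, the namespace and label described in steps 4 and 5 correspond to a Namespace object like the following. This is a sketch of the resulting object, not a replacement for the web console procedure.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: openshift-operators-redhat
  labels:
    # Set by the "Enable operator recommended cluster monitoring" option
    openshift.io/cluster-monitoring: "true"
```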

Bug fixes

  • Before this update, the cluster-logging-operator used cluster scoped roles and bindings to establish permissions for the Prometheus service account to scrape metrics. These permissions were created when deploying the Operator using the console interface but were missing when deploying from the command line. This update fixes the issue by making the roles and bindings namespace-scoped. (LOG-2286)

  • Before this update, a prior change to fix dashboard reconciliation introduced an ownerReferences field to the resource across namespaces. As a result, both the config map and dashboard were not created in the namespace. With this update, the removal of the ownerReferences field resolves the issue, and the OpenShift Logging dashboard is available in the console. (LOG-2163)

  • Before this update, changes to the metrics dashboards did not deploy because the cluster-logging-operator did not correctly compare existing and modified config maps that contain the dashboard. With this update, the addition of a unique hash value to object labels resolves the issue. (LOG-2071)

  • Before this update, the OpenShift Logging dashboard did not correctly display the pods and namespaces in the table, which displays the top producing containers collected over the last 24 hours. With this update, the pods and namespaces are displayed correctly. (LOG-2069)

  • Before this update, when the ClusterLogForwarder was set up with Elasticsearch OutputDefault and Elasticsearch outputs did not have structured keys, the generated configuration contained the incorrect values for authentication. This update corrects the secret and certificates used. (LOG-2056)

  • Before this update, the OpenShift Logging dashboard displayed an empty CPU graph because of a reference to an invalid metric. With this update, the correct data point has been selected, resolving the issue. (LOG-2026)

  • Before this update, the Fluentd container image included builder tools that were unnecessary at run time. This update removes those tools from the image. (LOG-1927)

  • Before this update, a name change of the deployed collector in the 5.3 release caused the logging collector to generate the FluentdNodeDown alert. This update resolves the issue by fixing the job name for the Prometheus alert. (LOG-1918)

  • Before this update, the log collector was collecting its own logs due to a refactoring of the component name change. This led to a potential feedback loop of the collector processing its own log that might result in memory and log message size issues. This update resolves the issue by excluding the collector logs from the collection. (LOG-1774)

  • Before this update, Elasticsearch generated the error Unable to create PersistentVolumeClaim due to forbidden: exceeded quota: infra-storage-quota. if the PVC already existed. With this update, Elasticsearch checks for existing PVCs, resolving the issue. (LOG-2131)

  • Before this update, Elasticsearch was unable to return to the ready state when the elasticsearch-signing secret was removed. With this update, Elasticsearch is able to go back to the ready state after that secret is removed. (LOG-2171)

  • Before this update, the change of the path from which the collector reads container logs caused the collector to forward some records to the wrong indices. With this update, the collector now uses the correct configuration to resolve the issue. (LOG-2160)

  • Before this update, clusters with a large number of namespaces caused Elasticsearch to stop serving requests because the list of namespaces reached the maximum header size limit. With this update, headers only include a list of namespace names, resolving the issue. (LOG-1899)

  • Before this update, the OpenShift Container Platform Logging dashboard showed the number of shards 'x' times larger than the actual value when Elasticsearch had 'x' nodes. This issue occurred because it was printing all primary shards for each Elasticsearch pod and calculating a sum on it, although the output was always for the whole Elasticsearch cluster. With this update, the number of shards is now correctly calculated. (LOG-2156)

  • Before this update, the secrets kibana and kibana-proxy were not recreated if they were deleted manually. With this update, the elasticsearch-operator will watch the resources and automatically recreate them if deleted. (LOG-2250)

  • Before this update, tuning the buffer chunk size could cause the collector to generate a warning about the chunk size exceeding the byte limit for the event stream. With this update, you can also tune the read line limit, resolving the issue. (LOG-2379)

  • Before this update, the logging console link in OpenShift web console was not removed with the ClusterLogging CR. With this update, deleting the CR or uninstalling the Cluster Logging Operator removes the link. (LOG-2373)

  • Before this update, a change to the container logs path caused the collection metric to always be zero with older releases configured with the original path. With this update, the plugin which exposes metrics about collected logs supports reading from either path to resolve the issue. (LOG-2462)
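
The chunk size and read line limit mentioned in LOG-2379 are tuned through the Fluentd forwarder settings of the ClusterLogging custom resource. The following is a minimal sketch, assuming the spec.forwarder.fluentd tuning fields. The values are arbitrary examples, and the field names should be checked against the API reference for your logging version.

```yaml
apiVersion: logging.openshift.io/v1
kind: ClusterLogging
metadata:
  name: instance
  namespace: openshift-logging
spec:
  forwarder:
    fluentd:
      buffer:
        chunkLimitSize: 8m     # example value: maximum size of each buffer chunk
      inFile:
        readLinesLimit: 50     # example value: number of lines read per I/O operation
```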

Logging 5.3.11

Bug fixes

  • Before this update, the Operator did not ensure that the pod was ready, which caused the cluster to reach an inoperable state during a cluster restart. With this update, the Operator marks new pods as ready before continuing to a new pod during a restart, which resolves the issue. (LOG-2871)

Logging 5.3.9

Bug fixes

  • Before this update, the logging collector included a path as a label for the metrics it produced. This path changed frequently and contributed to significant storage changes for the Prometheus server. With this update, the label has been dropped to resolve the issue and reduce storage consumption. (LOG-2682)

OpenShift Logging 5.3.7

Bug fixes

  • Before this update, Linux audit log time parsing relied on an ordinal position of a key/value pair. This update changes the parsing to use a regular expression to find the time entry. (LOG-2322)

  • Before this update, some log forwarder outputs could re-order logs with the same timestamp. With this update, a sequence number has been added to the log record to order entries that have matching timestamps. (LOG-2334)

  • Before this update, clusters with a large number of namespaces caused Elasticsearch to stop serving requests because the list of namespaces reached the maximum header size limit. With this update, headers only include a list of namespace names, resolving the issue. (LOG-2450)

  • Before this update, system:serviceaccount:openshift-monitoring:prometheus-k8s had cluster level privileges as a clusterrole and clusterrolebinding. This update restricts the serviceaccount to the openshift-logging namespace with a role and rolebinding. (LOG-2481)

OpenShift Logging 5.3.6

Bug fixes

  • Before this update, defining a toleration with no key and the existing Operator caused the Operator to be unable to complete an upgrade. With this update, this toleration no longer blocks the upgrade from completing. (LOG-2126)

  • Before this change, the collector could generate a warning that the chunk byte limit was exceeded for an emitted event. With this change, you can tune the readline limit to resolve the issue, as advised by the upstream documentation. (LOG-2380)

OpenShift Logging 5.3.5

Bug fixes

  • Before this update, if you removed OpenShift Logging from OpenShift Container Platform, the web console continued displaying a link to the Logging page. With this update, removing or uninstalling OpenShift Logging also removes that link. (LOG-2182)

OpenShift Logging 5.3.4

Bug fixes

  • Before this update, changes to the metrics dashboards had not yet been deployed because the cluster-logging-operator did not correctly compare existing and desired config maps that contained the dashboard. This update fixes the logic by adding a unique hash value to the object labels. (LOG-2066)

  • Before this update, Elasticsearch pods failed to start after updating with FIPS enabled. With this update, Elasticsearch pods start successfully. (LOG-1974)

  • Before this update, Elasticsearch generated the error "Unable to create PersistentVolumeClaim due to forbidden: exceeded quota: infra-storage-quota." if the PVC already existed. With this update, Elasticsearch checks for existing PVCs, resolving the issue. (LOG-2127)

OpenShift Logging 5.3.3

Bug fixes

  • Before this update, changes to the metrics dashboards had not yet been deployed because the cluster-logging-operator did not correctly compare existing and desired configmaps containing the dashboard. This update fixes the logic by adding a dashboard unique hash value to the object labels. (LOG-2066)

  • This update changes the log4j dependency to 2.17.1 to resolve CVE-2021-44832. (LOG-2102)

OpenShift Logging 5.3.2

Bug fixes

  • Before this update, Elasticsearch rejected logs from the Event Router due to a parsing error. This update changes the data model to resolve the parsing error. However, as a result, previous indices might cause warnings or errors within Kibana. The kubernetes.event.metadata.resourceVersion field causes errors until existing indices are removed or reindexed. If this field is not used in Kibana, you can ignore the error messages. If you have a retention policy that deletes old indices, the policy eventually removes the old indices and stops the error messages. Otherwise, manually reindex to stop the error messages. (LOG-2087)

  • Before this update, the OpenShift Logging Dashboard displayed the wrong pod namespace in the table that displays top producing and collected containers over the last 24 hours. With this update, the OpenShift Logging Dashboard displays the correct pod namespace. (LOG-2051)

  • Before this update, if outputDefaults.elasticsearch.structuredTypeKey in the ClusterLogForwarder custom resource (CR) instance did not have a structured key, the CR replaced the output secret with the default secret used to communicate to the default log store. With this update, the defined output secret is correctly used. (LOG-2046)

OpenShift Logging 5.3.1

Bug fixes

  • Before this update, the Fluentd container image included builder tools that were unnecessary at run time. This update removes those tools from the image. (LOG-1998)

  • Before this update, the Logging dashboard displayed an empty CPU graph because of a reference to an invalid metric. With this update, the Logging dashboard displays CPU graphs correctly. (LOG-1925)

  • Before this update, the Elasticsearch Prometheus exporter plugin compiled index-level metrics using a high-cost query that impacted the Elasticsearch node performance. This update implements a lower-cost query that improves performance. (LOG-1897)

OpenShift Logging 5.3.0

New features and enhancements

  • With this update, authorization options for Log Forwarding have been expanded. Outputs may now be configured with SASL, username/password, or TLS.

Bug fixes

  • Before this update, if you forwarded logs using the syslog protocol, serializing a Ruby hash encoded key/value pairs with a '⇒' character and replaced tabs with "#11". This update fixes the issue so that log messages are correctly serialized as valid JSON. (LOG-1494)

  • Before this update, application logs were not correctly configured to forward to the proper Cloudwatch stream with multi-line error detection enabled. (LOG-1939)

  • Before this update, a name change of the deployed collector in the 5.3 release caused the alert 'fluentnodedown' to generate. (LOG-1918)

  • Before this update, a regression introduced in a prior release configuration caused the collector to flush its buffered messages before shutdown, creating a delay to the termination and restart of collector pods. With this update, Fluentd no longer flushes buffers at shutdown, resolving the issue. (LOG-1735)

  • Before this update, a regression introduced in a prior release intentionally disabled JSON message parsing. This update re-enables JSON parsing. It also sets the log entry "level" based on the "level" field in the parsed JSON message or by using a regex to extract a match from a message field. (LOG-1199)

  • Before this update, the ClusterLogging custom resource (CR) applied the value of the totalLimitSize field to the Fluentd total_limit_size field, even if the required buffer space was not available. With this update, the CR applies the lesser of the two totalLimitSize or 'default' values to the Fluentd total_limit_size field, resolving the issue. (LOG-1776)
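
The level-detection behavior described for LOG-1199 can be sketched as follows. This is a minimal Python illustration, not the collector's implementation: the regular expression, the set of level names, and the `unknown` fallback are assumptions.

```python
import json
import re

# Hypothetical pattern: the level names matched here are an assumption.
LEVEL_RE = re.compile(
    r"\b(trace|debug|info|warn(?:ing)?|error|fatal|critical)\b", re.IGNORECASE
)


def detect_level(message: str, default: str = "unknown") -> str:
    """Prefer the "level" field of a JSON message; otherwise fall back to a regex match."""
    try:
        parsed = json.loads(message)
    except ValueError:
        parsed = None  # not JSON, so only the regex fallback applies
    if isinstance(parsed, dict) and isinstance(parsed.get("level"), str):
        return parsed["level"].lower()
    match = LEVEL_RE.search(message)
    return match.group(1).lower() if match else default


print(detect_level('{"level": "WARN", "msg": "disk almost full"}'))
print(detect_level("2021-10-01 12:00:00 ERROR disk full"))
print(detect_level("plain text without a level"))
```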

Known issues

  • If you forward logs to an external Elasticsearch server and then change a configured value in the pipeline secret, such as the username and password, the Fluentd forwarder loads the new secret but uses the old value to connect to an external Elasticsearch server. This issue happens because the Red Hat OpenShift Logging Operator does not currently monitor secrets for content changes. (LOG-1652)

    As a workaround, if you change the secret, you can force the Fluentd pods to redeploy by entering:

    $ oc delete pod -l component=collector

Deprecated and removed features

Some features available in previous releases have been deprecated or removed.

Deprecated functionality is still included in OpenShift Logging and continues to be supported; however, it will be removed in a future release of this product and is not recommended for new deployments.

Forwarding logs using the legacy Fluentd and legacy syslog methods has been removed

In OpenShift Logging 5.3, the legacy methods of forwarding logs to Syslog and Fluentd are removed. Bug fixes and support are provided through the end of the OpenShift Logging 5.2 life cycle. After that, no new feature enhancements are made.

Instead, use the following non-legacy methods:

Configuration mechanisms for legacy forwarding methods have been removed

In OpenShift Logging 5.3, the legacy configuration mechanism for log forwarding is removed: You cannot forward logs using the legacy Fluentd method and legacy Syslog method. Use the standard log forwarding methods instead.

OpenShift Logging 5.2.10

Bug fixes

  • Before this update, some log forwarder outputs could re-order logs with the same timestamp. With this update, a sequence number has been added to the log record to order entries that have matching timestamps. (LOG-2335)

  • Before this update, clusters with a large number of namespaces caused Elasticsearch to stop serving requests because the list of namespaces reached the maximum header size limit. With this update, headers only include a list of namespace names, resolving the issue. (LOG-2475)

  • Before this update, system:serviceaccount:openshift-monitoring:prometheus-k8s had cluster level privileges as a clusterrole and clusterrolebinding. This update restricts the serviceaccount to the openshift-logging namespace with a role and rolebinding. (LOG-2480)

  • Before this update, the cluster-logging-operator utilized cluster scoped roles and bindings to establish permissions for the Prometheus service account to scrape metrics. These permissions were only created when deploying the Operator using the console interface and were missing when the Operator was deployed from the command line. This update fixes the issue by making this role and binding namespace scoped. (LOG-1972)
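
The sequence-number fix for LOG-2335 relies on a tie-breaker when timestamps are equal. The following Python sketch illustrates the idea only. The `Record` type, its field names, and the way the sequence number is assigned are assumptions, not the actual log record schema.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Record:
    timestamp: str  # same-format UTC strings, so string order matches time order
    sequence: int   # assumed to be assigned in arrival order by the collector
    message: str


def stamp(entries):
    """Attach an increasing sequence number to (timestamp, message) pairs."""
    counter = count()
    return [Record(ts, next(counter), msg) for ts, msg in entries]


def ordered(records):
    """Sort by timestamp first, then by sequence number to break ties."""
    return sorted(records, key=lambda r: (r.timestamp, r.sequence))


arrived = stamp([
    ("2022-03-01T10:00:01Z", "a"),
    ("2022-03-01T10:00:01Z", "b"),  # same timestamp as "a"
    ("2022-03-01T10:00:00Z", "c"),
])
shuffled = list(reversed(arrived))  # simulate an output that re-orders records
print([r.message for r in ordered(shuffled)])
```

Without the sequence number, the relative order of "a" and "b" after re-ordering would depend on the receiver.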

OpenShift Logging 5.2.9

Bug fixes

  • Before this update, defining a toleration with no key and the existing Operator caused the Operator to be unable to complete an upgrade. With this update, this toleration no longer blocks the upgrade from completing. (LOG-2304)

OpenShift Logging 5.2.8

Bug fixes

  • Before this update, if you removed OpenShift Logging from OpenShift Container Platform, the web console continued displaying a link to the Logging page. With this update, removing or uninstalling OpenShift Logging also removes that link. (LOG-2180)

OpenShift Logging 5.2.7

Bug fixes

  • Before this update, Elasticsearch pods with FIPS enabled failed to start after updating. With this update, Elasticsearch pods start successfully. (LOG-2000)

  • Before this update, if a persistent volume claim (PVC) already existed, Elasticsearch generated an error, "Unable to create PersistentVolumeClaim due to forbidden: exceeded quota: infra-storage-quota." With this update, Elasticsearch checks for existing PVCs, resolving the issue. (LOG-2118)

OpenShift Logging 5.2.6

Bug fixes

  • Before this update, the release did not include a filter change which caused Fluentd to crash. With this update, the missing filter has been corrected. (LOG-2104)

  • This update changes the log4j dependency to 2.17.1 to resolve CVE-2021-44832. (LOG-2101)

OpenShift Logging 5.2.5

Bug fixes

  • Before this update, Elasticsearch rejected logs from the Event Router due to a parsing error. This update changes the data model to resolve the parsing error. However, as a result, previous indices might cause warnings or errors within Kibana. The kubernetes.event.metadata.resourceVersion field causes errors until existing indices are removed or reindexed. If this field is not used in Kibana, you can ignore the error messages. If you have a retention policy that deletes old indices, the policy eventually removes the old indices and stops the error messages. Otherwise, manually reindex to stop the error messages. (LOG-2087)

OpenShift Logging 5.2.4

Bug fixes

  • Before this update, records shipped via syslog serialized a Ruby hash, encoding key/value pairs with a '⇒' character and replacing tabs with "#11". This update serializes the message correctly as proper JSON. (LOG-1775)

  • Before this update, the Elasticsearch Prometheus exporter plugin compiled index-level metrics using a high-cost query that impacted the Elasticsearch node performance. This update implements a lower-cost query that improves performance. (LOG-1970)

  • Before this update, Elasticsearch sometimes rejected messages when Log Forwarding was configured with multiple outputs. This happened because configuring one of the outputs modified message content to be a single message. With this update, Log Forwarding duplicates the messages for each output so that output-specific processing does not affect the other outputs. (LOG-1824)
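
The per-output duplication described for LOG-1824 can be illustrated with a short Python sketch. It is not the collector's code: the output names, the record fields, and the in-place processing step are hypothetical.

```python
import copy


def fan_out(record: dict, outputs: dict) -> dict:
    """Run each output's processing on its own deep copy of the record."""
    return {name: process(copy.deepcopy(record)) for name, process in outputs.items()}


def collapse_to_single_message(rec: dict) -> dict:
    # Hypothetical output-specific step that rewrites the record in place.
    rec["message"] = " ".join(rec.pop("lines"))
    return rec


def passthrough(rec: dict) -> dict:
    return rec


record = {"lines": ["first", "second"], "kubernetes": {"namespace_name": "demo"}}
results = fan_out(record, {"es-a": collapse_to_single_message, "es-b": passthrough})

print(results["es-a"])  # modified copy for the first output
print(results["es-b"])  # the second output still sees the original content
```

A deep copy is used rather than a shallow one so that nested fields are also isolated between outputs.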

OpenShift Logging 5.2.3

Bug fixes

  • Before this update, some alerts did not include a namespace label. This omission does not comply with the OpenShift Monitoring Team’s guidelines for writing alerting rules in OpenShift Container Platform. With this update, all the alerts in Elasticsearch Operator include a namespace label and follow all the guidelines for writing alerting rules in OpenShift Container Platform. (LOG-1857)

  • Before this update, a regression introduced in a prior release intentionally disabled JSON message parsing. This update re-enables JSON parsing. It also sets the log entry level based on the level field in the parsed JSON message or by using a regex to extract a match from a message field. (LOG-1759)

OpenShift Logging 5.2.2

Bug fixes

  • Before this update, the ClusterLogging custom resource (CR) applied the value of the totalLimitSize field to the Fluentd total_limit_size field, even if the required buffer space was not available. With this update, the CR applies the lesser of the two totalLimitSize or 'default' values to the Fluentd total_limit_size field, resolving the issue. (LOG-1738)

  • Before this update, a regression introduced in a prior release configuration caused the collector to flush its buffered messages before shutdown, creating a delay to the termination and restart of collector pods. With this update, Fluentd no longer flushes buffers at shutdown, resolving the issue. (LOG-1739)

  • Before this update, an issue in the bundle manifests prevented installation of the Elasticsearch Operator through OLM on OpenShift Container Platform 4.9. With this update, a correction to bundle manifests re-enables installation and upgrade in 4.9. (LOG-1780)
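
The "lesser of the two values" rule described for LOG-1738 can be sketched in Python as follows. The helper names and the example sizes are assumptions, as is the treatment of the k, m, g, and t suffixes as powers of 1024. How the Operator derives the 'default' value is not shown in these notes.

```python
# Assumed suffix multipliers (powers of 1024).
_UNITS = {"k": 1024, "m": 1024**2, "g": 1024**3, "t": 1024**4}


def to_bytes(size: str) -> int:
    """Convert a size string such as "32m" or "8g" to a number of bytes."""
    size = size.strip().lower()
    if size and size[-1] in _UNITS:
        return int(size[:-1]) * _UNITS[size[-1]]
    return int(size)


def effective_total_limit(requested: str, default: str) -> str:
    """Return whichever of the two sizes is smaller."""
    return requested if to_bytes(requested) <= to_bytes(default) else default


print(effective_total_limit("32m", "8g"))  # requested value fits, so it is used
print(effective_total_limit("16g", "8g"))  # requested value is too large, so the default wins
```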

OpenShift Logging 5.2.1

Bug fixes

  • Before this update, due to an issue in the release pipeline scripts, the value of the olm.skipRange field remained unchanged at 5.2.0 instead of reflecting the current release number. This update fixes the pipeline scripts to update the value of this field when the release numbers change. (LOG-1743)

CVEs

(None)

OpenShift Logging 5.2.0

New features and enhancements

  • With this update, you can forward log data to Amazon CloudWatch, which provides application and infrastructure monitoring. For more information, see Forwarding logs to Amazon CloudWatch. (LOG-1173)

  • With this update, you can forward log data to Loki, a horizontally scalable, highly available, multi-tenant log aggregation system. For more information, see Forwarding logs to Loki. (LOG-684)

  • With this update, if you use the Fluentd forward protocol to forward log data over a TLS-encrypted connection, now you can use a password-encrypted private key file and specify the passphrase in the Cluster Log Forwarder configuration. For more information, see Forwarding logs using the Fluentd forward protocol. (LOG-1525)

  • This enhancement enables you to use a username and password to authenticate a log forwarding connection to an external Elasticsearch instance. For example, if you cannot use mutual TLS (mTLS) because a third-party operates the Elasticsearch instance, you can use HTTP or HTTPS and set a secret that contains the username and password. For more information, see Forwarding logs to an external Elasticsearch instance. (LOG-1022)

  • With this update, you can collect OVN network policy audit logs for forwarding to a logging server. (LOG-1526)

  • By default, the data model introduced in OpenShift Container Platform 4.5 gave logs from different namespaces a single index in common. This change made it harder to see which namespaces produced the most logs.

    The current release adds namespace metrics to the Logging dashboard in the OpenShift Container Platform console. With these metrics, you can see which namespaces produce logs and how many logs each namespace produces for a given timestamp.

    To see these metrics, open the Administrator perspective in the OpenShift Container Platform web console, and navigate to Observe → Dashboards → Logging/Elasticsearch. (LOG-1680)

  • The current release, OpenShift Logging 5.2, enables two new metrics: For a given timestamp or duration, you can see the total logs produced or logged by individual containers, and the total logs collected by the collector. These metrics are labeled by namespace, pod, and container name so that you can see how many logs each namespace and pod collects and produces. (LOG-1213)

Bug fixes

  • Before this update, when the OpenShift Elasticsearch Operator created index management cronjobs, it added the POLICY_MAPPING environment variable twice, which caused the apiserver to report the duplication. This update fixes the issue so that the POLICY_MAPPING environment variable is set only once per cronjob, and there is no duplication for the apiserver to report. (LOG-1130)

  • Before this update, suspending an Elasticsearch cluster to zero nodes did not suspend the index-management cronjobs, which put these cronjobs into maximum backoff. Then, after unsuspending the Elasticsearch cluster, these cronjobs stayed halted because the maximum backoff had been reached. This update resolves the issue by suspending the cronjobs and the cluster. (LOG-1268)

  • Before this update, in the Logging dashboard in the OpenShift Container Platform console, the list of top 10 log-producing containers was missing the "chart namespace" label and provided the incorrect metric name, fluentd_input_status_total_bytes_logged. With this update, the chart shows the namespace label and the correct metric name, log_logged_bytes_total. (LOG-1271)

  • Before this update, if an index management cronjob terminated with an error, it did not report the error exit code: instead, its job status was "complete." This update resolves the issue by reporting the error exit codes of index management cronjobs that terminate with errors. (LOG-1273)

  • The priorityclasses.v1beta1.scheduling.k8s.io was removed in 1.22 and replaced by priorityclasses.v1.scheduling.k8s.io (v1beta1 was replaced by v1). Before this update, APIRemovedInNextReleaseInUse alerts were generated for priorityclasses because v1beta1 was still present. This update resolves the issue by replacing v1beta1 with v1. The alert is no longer generated. (LOG-1385)

  • Previously, the OpenShift Elasticsearch Operator and Red Hat OpenShift Logging Operator did not have the annotation that was required for them to appear in the OpenShift Container Platform web console list of Operators that can run in a disconnected environment. This update adds the operators.openshift.io/infrastructure-features: '["Disconnected"]' annotation to these two Operators so that they appear in the list of Operators that run in disconnected environments. (LOG-1420)

  • Before this update, Red Hat OpenShift Logging Operator pods were scheduled on CPU cores that were reserved for customer workloads on performance-optimized single-node clusters. With this update, cluster logging Operator pods are scheduled on the correct CPU cores. (LOG-1440)

  • Before this update, some log entries had unrecognized UTF-8 bytes, which caused Elasticsearch to reject the messages and block the entire buffered payload. With this update, rejected payloads drop the invalid log entries and resubmit the remaining entries to resolve the issue. (LOG-1499)

  • Before this update, the kibana-proxy pod sometimes entered the CrashLoopBackoff state and logged the following message: Invalid configuration: cookie_secret must be 16, 24, or 32 bytes to create an AES cipher when pass_access_token == true or cookie_refresh != 0, but is 29 bytes. The actual number of bytes could vary. With this update, the generation of the Kibana session secret has been corrected, and the kibana-proxy pod no longer enters a CrashLoopBackoff state due to this error. (LOG-1446)

  • Before this update, the AWS CloudWatch Fluentd plugin logged its AWS API calls to the Fluentd log at all log levels, consuming additional OpenShift Container Platform node resources. With this update, the AWS CloudWatch Fluentd plugin logs AWS API calls only at the "debug" and "trace" log levels. This way, at the default "warn" log level, Fluentd does not consume extra node resources. (LOG-1071)

  • Before this update, the Elasticsearch OpenDistro security plugin caused user index migrations to fail. This update resolves the issue by providing a newer version of the plugin. Now, index migrations proceed without errors. (LOG-1276)

  • Before this update, in the Logging dashboard in the OpenShift Container Platform console, the list of top 10 log-producing containers lacked data points. This update resolves the issue, and the dashboard displays all data points. (LOG-1353)

  • Before this update, if you were tuning the performance of the Fluentd log forwarder by adjusting the chunkLimitSize and totalLimitSize values, the "Setting queued_chunks_limit_size for each buffer to" message reported values that were too low. The current update fixes this issue so that this message reports the correct values. (LOG-1411)

  • Before this update, the Kibana OpenDistro security plugin caused user index migrations to fail. This update resolves the issue by providing a newer version of the plugin. Now, index migrations proceed without errors. (LOG-1558)

  • Before this update, using a namespace input filter prevented logs in that namespace from appearing in other inputs. With this update, logs are sent to all inputs that can accept them. (LOG-1570)

  • Before this update, a missing license file for the viaq/logerr dependency caused license scanners to abort without success. With this update, the viaq/logerr dependency is licensed under Apache 2.0 and the license scanners run successfully. (LOG-1590)

  • Before this update, an incorrect brew tag for curator5 within the elasticsearch-operator-bundle build pipeline caused the pull of an image pinned to a dummy SHA1. With this update, the build pipeline uses the logging-curator5-rhel8 reference for curator5, enabling index management cronjobs to pull the correct image from registry.redhat.io. (LOG-1624)

  • Before this update, an issue with the ServiceAccount permissions caused errors such as no permissions for [indices:admin/aliases/get]. With this update, a permission fix resolves the issue. (LOG-1657)

  • Before this update, the Custom Resource Definition (CRD) for the Red Hat OpenShift Logging Operator was missing the Loki output type, which caused the admission controller to reject the ClusterLogForwarder custom resource object. With this update, the CRD includes Loki as an output type so that administrators can configure ClusterLogForwarder to send logs to a Loki server. (LOG-1683)

  • Before this update, OpenShift Elasticsearch Operator reconciliation of the ServiceAccounts overwrote third-party-owned fields that contained secrets. This issue caused memory and CPU spikes due to frequent recreation of secrets. This update resolves the issue. Now, the OpenShift Elasticsearch Operator does not overwrite third-party-owned fields. (LOG-1714)

  • Before this update, in the ClusterLogging custom resource (CR) definition, if you specified a flush_interval value but did not set flush_mode to interval, the Red Hat OpenShift Logging Operator generated a Fluentd configuration. However, the Fluentd collector generated an error at runtime. With this update, the Red Hat OpenShift Logging Operator validates the ClusterLogging CR definition and only generates the Fluentd configuration if both fields are specified. (LOG-1723)
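
The validation described for LOG-1723 can be sketched in Python as follows. This is an illustration, not the Operator's validation code. The function name and the dictionary representation of the buffer settings are assumptions, and only the flush_interval and flush_mode names come from the note above.

```python
def validate_flush_settings(buffer: dict) -> list:
    """Return a list of validation errors for the given buffer settings."""
    errors = []
    # A flush_interval only makes sense when flush_mode is set to "interval".
    if "flush_interval" in buffer and buffer.get("flush_mode") != "interval":
        errors.append("flush_interval requires flush_mode to be set to interval")
    return errors


print(validate_flush_settings({"flush_interval": "5s"}))
print(validate_flush_settings({"flush_interval": "5s", "flush_mode": "interval"}))
```

Rejecting the combination before the configuration is generated surfaces the problem at validation time instead of as a collector runtime error.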

Known issues

  • If you forward logs to an external Elasticsearch server and then change a configured value in the pipeline secret, such as the username and password, the Fluentd forwarder loads the new secret but uses the old value to connect to an external Elasticsearch server. This issue happens because the Red Hat OpenShift Logging Operator does not currently monitor secrets for content changes. (LOG-1652)

    As a workaround, if you change the secret, you can force the Fluentd pods to redeploy by entering:

    $ oc delete pod -l component=collector

Deprecated and removed features

Some features available in previous releases have been deprecated or removed.

Deprecated functionality is still included in OpenShift Logging and continues to be supported; however, it will be removed in a future release of this product and is not recommended for new deployments.

Forwarding logs using the legacy Fluentd and legacy syslog methods has been deprecated

From OpenShift Container Platform 4.6 to the present, forwarding logs by using the following legacy methods has been deprecated and will be removed in a future release:

  • Forwarding logs using the legacy Fluentd method

  • Forwarding logs using the legacy syslog method

Instead, use the following non-legacy methods: