Logging 5.9 - Logging | Observability

Logging is provided as an installable component, with a distinct release cycle from the core OKD. The Red Hat OpenShift Container Platform Life Cycle Policy outlines release compatibility.

The stable channel only provides updates to the most recent release of logging. To continue receiving updates for prior releases, you must change your subscription channel to stable-x.y, where x.y represents the major and minor version of logging you have installed. For example, stable-5.7.

Logging 5.9.9

This release includes RHBA-2024:10049.

Bug fixes

Before this update, upgrades to version 6.0 failed with errors if a Log File Metric Exporter instance was present. This update fixes the issue, enabling upgrades to proceed smoothly without errors. (LOG-6201)
Before this update, Loki did not correctly load some configurations, which caused issues when using Alibaba Cloud or IBM Cloud object storage. This update fixes the configuration-loading code in Loki, resolving the issue. (LOG-6293)

CVEs

CVE-2024-6119

Logging 5.9.8

This release includes OpenShift Logging Bug Fix Release 5.9.8.

Bug fixes

Before this update, the Loki Operator failed to add the default namespace label to all AlertingRule resources, which caused the User-Workload-Monitoring Alertmanager to skip routing these alerts. This update adds the rule namespace as a label to all alerting and recording rules, resolving the issue and restoring proper alert routing in Alertmanager. (LOG-6181)
Before this update, the LokiStack ruler component view did not initialize properly, causing an invalid field error when the ruler component was disabled. This update ensures that the component view initializes with an empty value, resolving the issue. (LOG-6183)
Before this update, an LF character in the vector.toml file under the ES authentication configuration caused the collector pods to crash. This update removes the newline characters from the username and password fields, resolving the issue. (LOG-6206)
Before this update, it was possible to set the .containerLimit.maxRecordsPerSecond parameter in the ClusterLogForwarder custom resource to 0, which could lead to an exception during Vector’s startup. With this update, the configuration is validated before being applied, and any invalid values (less than or equal to zero) are rejected. (LOG-6214)

CVEs

Logging 5.9.7

This release includes OpenShift Logging Bug Fix Release 5.9.7.

Bug fixes

Before this update, the clusterlogforwarder.spec.outputs.http.timeout parameter was not applied to the Fluentd configuration when Fluentd was used as the collector type, causing HTTP timeouts to be misconfigured. With this update, the clusterlogforwarder.spec.outputs.http.timeout parameter is now correctly applied, ensuring Fluentd honors the specified timeout and handles HTTP connections according to the user’s configuration. (LOG-6125)
Before this update, the TLS section was added without verifying the broker URL schema, resulting in SSL connection errors if the URLs did not start with tls. With this update, the TLS section is now added only if the broker URLs start with tls, preventing SSL connection errors. (LOG-6041)

CVEs

For detailed information on Red Hat security ratings, review Severity ratings.

Logging 5.9.6

This release includes OpenShift Logging Bug Fix Release 5.9.6.

Bug fixes

Before this update, the collector deployment ignored secret changes, causing receivers to reject logs. With this update, the system rolls out a new pod when there is a change in the secret value, ensuring that the collector reloads the updated secrets. (LOG-5525)
Before this update, the Vector could not correctly parse field values that included a single dollar sign ($). With this update, field values with a single dollar sign are automatically changed to two dollar signs ($$), ensuring proper parsing by the Vector. (LOG-5602)
Before this update, the drop filter could not handle non-string values (e.g., .responseStatus.code: 403). With this update, the drop filter now works properly with these values. (LOG-5815)
Before this update, the collector used the default settings to collect audit logs, without handling the backload from output receivers. With this update, the process for collecting audit logs has been improved to better manage file handling and log reading efficiency. (LOG-5866)
Before this update, the must-gather tool failed on clusters with non-AMD64 architectures such as Azure Resource Manager (ARM) or PowerPC. With this update, the tool now detects the cluster architecture at runtime and uses architecture-independent paths and dependencies. The detection allows must-gather to run smoothly on platforms like ARM and PowerPC. (LOG-5997)
Before this update, the log level was set using a mix of structured and unstructured keywords that were unclear. With this update, the log level follows a clear, documented order, starting with structured keywords. (LOG-6016)
Before this update, multiple unnamed pipelines writing to the default output in the ClusterLogForwarder caused a validation error due to duplicate auto-generated names. With this update, the pipeline names are now generated without duplicates. (LOG-6033)
Before this update, the collector pods did not have the PreferredScheduling annotation. With this update, the PreferredScheduling annotation is added to the collector daemonset. (LOG-6023)

CVEs

Logging 5.9.5

This release includes OpenShift Logging Bug Fix Release 5.9.5

Bug Fixes

Before this update, duplicate conditions in the LokiStack resource status led to invalid metrics from the Loki Operator. With this update, the Operator removes duplicate conditions from the status. (LOG-5855)
Before this update, the Loki Operator did not trigger alerts when it dropped log events due to validation failures. With this update, the Loki Operator includes a new alert definition that triggers an alert if Loki drops log events due to validation failures. (LOG-5895)
Before this update, the Loki Operator overwrote user annotations on the LokiStack Route resource, causing customizations to drop. With this update, the Loki Operator no longer overwrites Route annotations, fixing the issue. (LOG-5945)

CVEs

None.

Logging 5.9.4

This release includes OpenShift Logging Bug Fix Release 5.9.4

Bug Fixes

Before this update, an incorrectly formatted timeout configuration caused the OCP plugin to crash. With this update, a validation prevents the crash and informs the user about the incorrect configuration. (LOG-5373)
Before this update, workloads with labels containing - caused an error in the collector when normalizing log entries. With this update, the configuration change ensures the collector uses the correct syntax. (LOG-5524)
Before this update, an issue prevented selecting pods that no longer existed, even if they had generated logs. With this update, this issue has been fixed, allowing selection of such pods. (LOG-5697)
Before this update, the Loki Operator would crash if the CredentialRequest specification was registered in an environment without the cloud-credentials-operator. With this update, the CredentialRequest specification only registers in environments that are cloud-credentials-operator enabled. (LOG-5701)
Before this update, the Logging Operator watched and processed all config maps across the cluster. With this update, the dashboard controller only watches the config map for the logging dashboard. (LOG-5702)
Before this update, the ClusterLogForwarder introduced an extra space in the message payload which did not follow the RFC3164 specification. With this update, the extra space has been removed, fixing the issue. (LOG-5707)
Before this update, removing the seeding for grafana-dashboard-cluster-logging as a part of (LOG-5308) broke new greenfield deployments without dashboards. With this update, the Logging Operator seeds the dashboard at the beginning and continues to update it for changes. (LOG-5747)
Before this update, LokiStack was missing a route for the Volume API causing the following error: 404 not found. With this update, LokiStack exposes the Volume API, resolving the issue. (LOG-5749)

CVEs

CVE-2024-24790

Logging 5.9.3

This release includes OpenShift Logging Bug Fix Release 5.9.3

Bug Fixes

Before this update, there was a delay in restarting Ingesters when configuring LokiStack, because the Loki Operator sets the write-ahead log replay_memory_ceiling to zero bytes for the 1x.demo size. With this update, the minimum value used for the replay_memory_ceiling has been increased to avoid delays. (LOG-5614)
Before this update, monitoring the Vector collector output buffer state was not possible. With this update, monitoring and alerting the Vector collector output buffer size is possible that improves observability capabilities and helps keep the system running optimally. (LOG-5586)

CVEs

Logging 5.9.2

This release includes OpenShift Logging Bug Fix Release 5.9.2

Bug Fixes

Before this update, changes to the Logging Operator caused an error due to an incorrect configuration in the ClusterLogForwarder CR. As a result, upgrades to logging deleted the daemonset collector. With this update, the Logging Operator re-creates collector daemonsets except when a Not authorized to collect error occurs. (LOG-4910)
Before this update, the rotated infrastructure log files were sent to the application index in some scenarios due to an incorrect configuration in the Vector log collector. With this update, the Vector log collector configuration avoids collecting any rotated infrastructure log files. (LOG-5156)
Before this update, the Logging Operator did not monitor changes to the grafana-dashboard-cluster-logging config map. With this update, the Logging Operator monitors changes in the ConfigMap objects, ensuring the system stays synchronized and responds effectively to config map modifications. (LOG-5308)
Before this update, an issue in the metrics collection code of the Logging Operator caused it to report stale telemetry metrics. With this update, the Logging Operator does not report stale telemetry metrics. (LOG-5426)
Before this change, the Fluentd out_http plugin ignored the no_proxy environment variable. With this update, the Fluentd patches the HTTP#start method of ruby to honor the no_proxy environment variable. (LOG-5466)

CVEs

Logging 5.9.1

This release includes OpenShift Logging Bug Fix Release 5.9.1

Enhancements

Before this update, the Loki Operator configured Loki to use path-based style access for the Amazon Simple Storage Service (S3), which has been deprecated. With this update, the Loki Operator defaults to virtual-host style without users needing to change their configuration. (LOG-5401)
Before this update, the Loki Operator did not validate the Amazon Simple Storage Service (S3) endpoint used in the storage secret. With this update, the validation process ensures the S3 endpoint is a valid S3 URL, and the LokiStack status updates to indicate any invalid URLs. (LOG-5395)

Bug Fixes

Before this update, a bug in LogQL parsing left out some line filters from the query. With this update, the parsing now includes all the line filters while keeping the original query unchanged. (LOG-5268)
Before this update, a prune filter without a defined pruneFilterSpec would cause a segfault. With this update, there is a validation error if a prune filter is without a defined puneFilterSpec. (LOG-5322)
Before this update, a drop filter without a defined dropTestsSpec would cause a segfault. With this update, there is a validation error if a prune filter is without a defined puneFilterSpec. (LOG-5323)
Before this update, the Loki Operator did not validate the Amazon Simple Storage Service (S3) endpoint URL format used in the storage secret. With this update, the S3 endpoint URL goes through a validation step that reflects on the status of the LokiStack. (LOG-5397)
Before this update, poorly formatted timestamp fields in audit log records led to WARN messages in Red Hat OpenShift Logging Operator logs. With this update, a remap transformation ensures that the timestamp field is properly formatted. (LOG-4672)
Before this update, the error message thrown while validating a ClusterLogForwarder resource name and namespace did not correspond to the correct error. With this update, the system checks if a ClusterLogForwarder resource with the same name exists in the same namespace. If not, it corresponds to the correct error. (LOG-5062)
Before this update, the validation feature for output config required a TLS URL, even for services such as Amazon CloudWatch or Google Cloud Logging where a URL is not needed by design. With this update, the validation logic for services without URLs are improved, and the error message are more informative. (LOG-5307)
Before this update, defining an infrastructure input type did not exclude logging workloads from the collection. With this update, the collection excludes logging services to avoid feedback loops. (LOG-5309)

CVEs

No CVEs.

Logging 5.9.0

This release includes OpenShift Logging Bug Fix Release 5.9.0

Removal notice

The Logging 5.9 release does not contain an updated version of the OpenShift Elasticsearch Operator. Instances of OpenShift Elasticsearch Operator from prior logging releases, remain supported until the EOL of the logging release. As an alternative to using the OpenShift Elasticsearch Operator to manage the default log storage, you can use the Loki Operator. For more information on the Logging lifecycle dates, see Platform Agnostic Operators.

Deprecation notice

In Logging 5.9, Fluentd, and Kibana are deprecated and are planned to be removed in Logging 6.0, which is expected to be shipped alongside a future release of OKD. Red Hat will provide critical and above CVE bug fixes and support for these components during the current release lifecycle, but these components will no longer receive feature enhancements. The Vector-based collector provided by the Red Hat OpenShift Logging Operator and LokiStack provided by the Loki Operator are the preferred Operators for log collection and storage. We encourage all users to adopt the Vector and Loki log stack, as this will be the stack that will be enhanced going forward.
In Logging 5.9, the Fields option for the Splunk output type was never implemented and is now deprecated. It will be removed in a future release.

Enhancements

Log Collection

This enhancement adds the ability to refine the process of log collection by using a workload’s metadata to drop or prune logs based on their content. Additionally, it allows the collection of infrastructure logs, such as journal or container logs, and audit logs, such as kube api or ovn logs, to only collect individual sources. (LOG-2155)
This enhancement introduces a new type of remote log receiver, the syslog receiver. You can configure it to expose a port over a network, allowing external systems to send syslog logs using compatible tools such as rsyslog. (LOG-3527)
With this update, the ClusterLogForwarder API now supports log forwarding to Azure Monitor Logs, giving users better monitoring abilities. This feature helps users to maintain optimal system performance and streamline the log analysis processes in Azure Monitor, which speeds up issue resolution and improves operational efficiency. (LOG-4605)
This enhancement improves collector resource utilization by deploying collectors as a deployment with two replicas. This occurs when the only input source defined in the ClusterLogForwarder custom resource (CR) is a receiver input instead of using a daemon set on all nodes. Additionally, collectors deployed in this manner do not mount the host file system. To use this enhancement, you need to annotate the ClusterLogForwarder CR with the logging.openshift.io/dev-preview-enable-collector-as-deployment annotation. (LOG-4779)
This enhancement introduces the capability for custom tenant configuration across all supported outputs, facilitating the organization of log records in a logical manner. However, it does not permit custom tenant configuration for logging managed storage. (LOG-4843)
With this update, the ClusterLogForwarder CR that specifies an application input with one or more infrastructure namespaces like default, openshift*, or kube*, now requires a service account with the collect-infrastructure-logs role. (LOG-4943)
This enhancement introduces the capability for tuning some output settings, such as compression, retry duration, and maximum payloads, to match the characteristics of the receiver. Additionally, this feature includes a delivery mode to allow administrators to choose between throughput and log durability. For example, the AtLeastOnce option configures minimal disk buffering of collected logs so that the collector can deliver those logs after a restart. (LOG-5026)
This enhancement adds three new Prometheus alerts, warning users about the deprecation of Elasticsearch, Fluentd, and Kibana. (LOG-5055)

Log Storage

This enhancement in LokiStack improves support for OTEL by using the new V13 object storage format and enabling automatic stream sharding by default. This also prepares the collector for future enhancements and configurations. (LOG-4538)
This enhancement introduces support for short-lived token workload identity federation with Azure and AWS log stores for STS enabled OKD 4.14 and later clusters. Local storage requires the addition of a CredentialMode: static annotation under spec.storage.secret in the LokiStack CR. (LOG-4540)
With this update, the validation of the Azure storage secret is now extended to give early warning for certain error conditions. (LOG-4571)
With this update, Loki now adds upstream and downstream support for GCP workload identity federation mechanism. This allows authenticated and authorized access to the corresponding object storage services. (LOG-4754)

Bug Fixes

Before this update, the logging must-gather could not collect any logs on a fips-enabled cluster. With this update, a new oc client is available in cluster-logging-rhel9-operator, and must-gather works properly on fips clusters. (LOG-4403)
Before this update, the LokiStack ruler pods could not format the IPv6 pod IP in HTTP URLs used for cross-pod communication. This issue caused querying rules and alerts through the Prometheus-compatible API to fail. With this update, the LokiStack ruler pods encapsulate the IPv6 pod IP in square brackets, resolving the problem. Now, querying rules and alerts through the Prometheus-compatible API works just like in IPv4 environments. (LOG-4709)
Before this fix, the YAML content from the logging must-gather was exported in a single line, making it unreadable. With this update, the YAML white spaces are preserved, ensuring that the file is properly formatted. (LOG-4792)
Before this update, when the ClusterLogForwarder CR was enabled, the Red Hat OpenShift Logging Operator could run into a nil pointer exception when ClusterLogging.Spec.Collection was nil. With this update, the issue is now resolved in the Red Hat OpenShift Logging Operator. (LOG-5006)
Before this update, in specific corner cases, replacing the ClusterLogForwarder CR status field caused the resourceVersion to constantly update due to changing timestamps in Status conditions. This condition led to an infinite reconciliation loop. With this update, all status conditions synchronize, so that timestamps remain unchanged if conditions stay the same. (LOG-5007)
Before this update, there was an internal buffering behavior to drop_newest to address high memory consumption by the collector resulting in significant log loss. With this update, the behavior reverts to using the collector defaults. (LOG-5123)
Before this update, the Loki Operator ServiceMonitor in the openshift-operators-redhat namespace used static token and CA files for authentication, causing errors in the Prometheus Operator in the User Workload Monitoring spec on the ServiceMonitor configuration. With this update, the Loki Operator ServiceMonitor in openshift-operators-redhat namespace now references a service account token secret by a LocalReference object. This approach allows the User Workload Monitoring spec in the Prometheus Operator to handle the Loki Operator ServiceMonitor successfully, enabling Prometheus to scrape the Loki Operator metrics. (LOG-5165)
Before this update, the configuration of the Loki Operator ServiceMonitor could match many Kubernetes services, resulting in the Loki Operator metrics being collected multiple times. With this update, the configuration of ServiceMonitor now only matches the dedicated metrics service. (LOG-5212)

Known Issues

None.