This is a cache of https://docs.okd.io/4.7/logging/cluster-logging-release-notes.html. It is a snapshot of the page at 2024-11-29T02:08:36.892+0000.
Release notes | Logging | OKD 4.7
×

Making open source more inclusive

Red Hat is committed to replacing problematic language in our code, documentation, and web properties. We are beginning with these four terms: master, slave, blacklist, and whitelist. Because of the enormity of this endeavor, these changes will be implemented gradually over several upcoming releases. For more details, see Red Hat CTO Chris Wright’s message.

Supported Versions

Table 1. OKD version support for Red Hat OpenShift Logging (RHOL)
4.7 4.8 4.9

RHOL 5.1

X

X

RHOL 5.2

X

X

X

RHOL 5.3

X

X

OpenShift Logging 5.1.0

New features and enhancements

OpenShift Logging 5.1 now supports OKD 4.7 and later running on:

  • IBM Power Systems

  • IBM Z and LinuxONE

This release adds improvements related to the following components and concepts.

  • As a cluster administrator, you can use Kubernetes pod labels to gather log data from an application and send it to a specific log store. You can gather log data by configuring the inputs[].application.selector.matchLabels element in the ClusterLogForwarder custom resource (CR) YAML file. You can also filter the gathered log data by namespace. (LOG-883)

  • This release adds the following new ElasticsearchNodeDiskWatermarkReached warnings to the OpenShift Elasticsearch Operator (EO):

    • Elasticsearch Node Disk Low Watermark Reached

    • Elasticsearch Node Disk High Watermark Reached

    • Elasticsearch Node Disk Flood Watermark Reached

    The alert applies the past several warnings when it predicts that an Elasticsearch node will reach the Disk Low Watermark, Disk High Watermark, or Disk Flood Stage Watermark thresholds in the next 6 hours. This warning period gives you time to respond before the node reaches the disk watermark thresholds. The warning messages also provide links to the troubleshooting steps, which you can follow to help mitigate the issue. The EO applies the past several hours of disk space data to a linear model to generate these warnings. (LOG-1100)

  • JSON logs can now be forwarded as JSON objects, rather than quoted strings, to either Red Hat’s managed Elasticsearch cluster or any of the other supported third-party systems. Additionally, you can now query individual fields from a JSON log message inside Kibana which increases the discoverability of specific logs. (LOG-785, LOG-1148)

Deprecated and removed features

Some features available in previous releases have been deprecated or removed.

Deprecated functionality is still included in OpenShift Logging and continues to be supported; however, it will be removed in a future release of this product and is not recommended for new deployments.

Elasticsearch Curator has been removed

With this update, the Elasticsearch Curator has been removed and is no longer supported. Elasticsearch Curator helped you curate or manage your indices on OpenShift Container Platform 4.4 and earlier. Instead of using Elasticsearch Curator, configure the log retention time.

Forwarding logs using the legacy Fluentd and legacy syslog methods have been deprecated

From OKD 4.6 to the present, forwarding logs by using the legacy Fluentd and legacy syslog methods have been deprecated and will be removed in a future release. Use the standard non-legacy methods instead.

Bug fixes

  • Before this update, the ClusterLogForwarder CR did not show the input[].selector element after it had been created. With this update, when you specify a selector in the ClusterLogForwarder CR, it remains. Fixing this bug was necessary for LOG-883, which enables using pod label selectors to forward application log data. (LOG-1338)

  • Before this update, an update in the cluster service version (CSV) accidentally introduced resources and limits for the OpenShift Elasticsearch Operator container. Under specific conditions, this caused an out-of-memory condition that terminated the Elasticsearch Operator pod. This update fixes the issue by removing the CSV resources and limits for the Operator container. The Operator gets scheduled without issues. (LOG-1254)

  • Before this update, forwarding logs to Kafka using chained certificates failed with the following error message:

    state=error: certificate verify failed (unable to get local issuer certificate)

    Logs could not be forwarded to a Kafka broker with a certificate signed by an intermediate CA. This happened because the Fluentd Kafka plug-in could only handle a single CA certificate supplied in the ca-bundle.crt entry of the corresponding secret. This update fixes the issue by enabling the Fluentd Kafka plug-in to handle multiple CA certificates supplied in the ca-bundle.crt entry of the corresponding secret. Now, logs can be forwarded to a Kafka broker with a certificate signed by an intermediate CA. (LOG-1218, LOG-1216)

  • Before this update, while under load, Elasticsearch responded to some requests with an HTTP 500 error, even though there was nothing wrong with the cluster. Retrying the request was successful. This update fixes the issue by updating the index management cron jobs to be more resilient when they encounter temporary HTTP 500 errors. The updated index management cron jobs will first retry a request multiple times before failing. (LOG-1215)

  • Before this update, if you did not set the .proxy value in the cluster installation configuration, and then configured a global proxy on the installed cluster, a bug prevented Fluentd from forwarding logs to Elasticsearch. To work around this issue, in the proxy or cluster configuration, set the no_proxy value to .svc.cluster.local so it skips internal traffic. This update fixes the proxy configuration issue. If you configure the global proxy after installing an OKD cluster, Fluentd forwards logs to Elasticsearch. (LOG-1187, BZ#1915448)

  • Before this update, the logging collector created more socket connections than necessary. With this update, the logging collector reuses the existing socket connection to send logs. (LOG-1186)

  • Before this update, if a cluster administrator tried to add or remove storage from an Elasticsearch cluster, the OpenShift Elasticsearch Operator (EO) incorrectly tried to upgrade the Elasticsearch cluster, displaying scheduledUpgrade: "True", shardAllocationEnabled: primaries, and change the volumes. With this update, the EO does not try to upgrade the Elasticsearch cluster.

    The EO status displays the following new status information to indicate when you have tried to make an unsupported change to the Elasticsearch storage that it has ignored:

    • StorageStructureChangeIgnored when you try to change between using ephemeral and persistent storage structures.

    • StorageClassNameChangeIgnored when you try to change the storage class name.

    • StorageSizeChangeIgnored when you try to change the storage size.

    If you configure the ClusterLogging custom resource (CR) to switch from ephemeral to persistent storage, the EO creates a persistent volume claim (PVC) but does not create a persistent volume (PV). To clear the StorageStructureChangeIgnored status, you must revert the change to the ClusterLogging CR and delete the persistent volume claim (PVC).

  • Before this update, if you redeployed a full Elasticsearch cluster, it got stuck in an unhealthy state, with one non-data node running and all other data nodes shut down. This issue happened because new certificates prevented the Elasticsearch Operator from scaling down the non-data nodes of the Elasticsearch cluster. With this update, Elasticsearch Operator can scale all the data and non-data nodes down and then back up again, so they load the new certificates. The Elasticsearch Operator can reach the new nodes after they load the new certificates. (LOG-1536)

OpenShift Logging 5.0.9

Bug fixes

This release includes the following bug fixes:

  • Before this update, some log entries had unrecognized UTF-8 bytes, which caused Elasticsearch to reject messages and block the entire buffered payload. This update resolves the issue: rejected payloads drop the invalid log entries and resubmit the remaining entries. (LOG-1574)

  • Before this update, editing the ClusterLogging custom resource (CR) did not apply the value of totalLimitSize to the Fluentd total_limit_size field, which limits the size of the buffer plugin instance. As a result, Fluentd applied the default values. With this update, the CR applies the value of totalLimitSize to the Fluentd total_limit_size field. Fluentd uses the value of the total_limit_size field or the default value, whichever is less. (LOG-1736)

OpenShift Logging 5.0.8

Bug fixes

This release also includes the following bug fixes:

  • Due to an issue in the release pipeline scripts, the value of the olm.skipRange field remained unchanged at 5.2.0 and was not updated when the z-stream number, 0, increased. The current release fixes the pipeline scripts to update the value of this field when the release numbers change. (LOG-1741)

OpenShift Logging 5.0.7

Bug fixes

This release also includes the following bug fixes:

  • LOG-1594 - Vendored viaq/logerr dependency is missing a license file

OpenShift Logging 5.0.6

Bug fixes

This release also includes the following bug fixes:

  • LOG-1451 - [1927249] fieldmanager.go:186] [SHOULD NOT HAPPEN] failed to update managedFields…​duplicate entries for key [name="POLICY_MAPPING"] (LOG-1451)

  • LOG-1537 - Full Cluster Cert Redeploy is broken when the ES clusters includes non-data nodes(LOG-1537)

  • LOG-1430 - eventrouter raising "Observed a panic: &runtime.TypeAssertionError" (LOG-1430)

  • LOG-1461 - The index management job status is always Completed even when there has an error in the job log. (LOG-1461)

  • LOG-1459 - Operators missing disconnected annotation (LOG-1459)

  • LOG-1572 - Bug 1981579: Fix built-in application behavior to collect all of logs (LOG-1572)

OpenShift Logging 5.0.5

Security fixes

  • gogo/protobuf: plugin/unmarshal/unmarshal.go lacks certain index validation. (CVE-2021-3121)

  • glib: integer overflow in g_bytes_new function on 64-bit platforms due to an implicit cast from 64 bits to 32 bits(CVE-2021-27219)

The following issues relate to the above CVEs:

  • BZ#1921650 gogo/protobuf: plugin/unmarshal/unmarshal.go lacks certain index validation(BZ#1921650)

  • LOG-1361 CVE-2021-3121 elasticsearch-operator-container: gogo/protobuf: plugin/unmarshal/unmarshal.go lacks certain index validation [openshift-logging-5](LOG-1361)

  • LOG-1362 CVE-2021-3121 elasticsearch-proxy-container: gogo/protobuf: plugin/unmarshal/unmarshal.go lacks certain index validation [openshift-logging-5](LOG-1362)

  • LOG-1363 CVE-2021-3121 logging-eventrouter-container: gogo/protobuf: plugin/unmarshal/unmarshal.go lacks certain index validation [openshift-logging-5](LOG-1363)

OpenShift Logging 5.0.4

Security fixes

  • gogo/protobuf: plugin/unmarshal/unmarshal.go lacks certain index validation. (CVE-2021-3121)

The following Jira issues contain the above CVEs:

  • LOG-1364 CVE-2021-3121 cluster-logging-operator-container: gogo/protobuf: plugin/unmarshal/unmarshal.go lacks certain index validation [openshift-logging-5]. (LOG-1364)

Bug fixes

This release also includes the following bug fixes:

  • LOG-1328 Port fix to 5.0.z for BZ-1945168. (LOG-1328)

OpenShift Logging 5.0.3

Security fixes

  • jackson-databind: arbitrary code execution in slf4j-ext class (CVE-2018-14718)

  • jackson-databind: arbitrary code execution in blaze-ds-opt and blaze-ds-core classes (CVE-2018-14719)

  • jackson-databind: exfiltration/XXE in some JDK classes (CVE-2018-14720)

  • jackson-databind: server-side request forgery (SSRF) in axis2-jaxws class (CVE-2018-14721)

  • jackson-databind: improper polymorphic deserialization in axis2-transport-jms class (CVE-2018-19360)

  • jackson-databind: improper polymorphic deserialization in openjpa class (CVE-2018-19361)

  • jackson-databind: improper polymorphic deserialization in jboss-common-core class (CVE-2018-19362)

  • jackson-databind: default typing mishandling leading to remote code execution (CVE-2019-14379)

  • jackson-databind: serialization gadgets in com.pastdev.httpcomponents.configuration.JndiConfiguration (CVE-2020-24750)

  • jackson-databind: mishandles the interaction between serialization gadgets and typing, related to org.apache.commons.dbcp2.datasources.PerUserPoolDataSource (CVE-2020-35490)

  • jackson-databind: mishandles the interaction between serialization gadgets and typing, related to org.apache.commons.dbcp2.datasources.SharedPoolDataSource (CVE-2020-35491)

  • jackson-databind: mishandles the interaction between serialization gadgets and typing, related to com.oracle.wls.shaded.org.apache.xalan.lib.sql.JNDIConnectionPool (CVE-2020-35728)

  • jackson-databind: mishandles the interaction between serialization gadgets and typing, related to oadd.org.apache.commons.dbcp.cpdsadapter.DriverAdapterCPDS (CVE-2020-36179)

  • jackson-databind: mishandles the interaction between serialization gadgets and typing, related to org.apache.commons.dbcp2.cpdsadapter.DriverAdapterCPDS (CVE-2020-36180)

  • jackson-databind: mishandles the interaction between serialization gadgets and typing, related to org.apache.tomcat.dbcp.dbcp.cpdsadapter.DriverAdapterCPDS (CVE-2020-36181)

  • jackson-databind: mishandles the interaction between serialization gadgets and typing, related to org.apache.tomcat.dbcp.dbcp2.cpdsadapter.DriverAdapterCPDS (CVE-2020-36182)

  • jackson-databind: mishandles the interaction between serialization gadgets and typing, related to org.docx4j.org.apache.xalan.lib.sql.JNDIConnectionPool (CVE-2020-36183)

  • jackson-databind: mishandles the interaction between serialization gadgets and typing, related to org.apache.tomcat.dbcp.dbcp2.datasources.PerUserPoolDataSource (CVE-2020-36184)

  • jackson-databind: mishandles the interaction between serialization gadgets and typing, related to org.apache.tomcat.dbcp.dbcp2.datasources.SharedPoolDataSource (CVE-2020-36185)

  • jackson-databind: mishandles the interaction between serialization gadgets and typing, related to org.apache.tomcat.dbcp.dbcp.datasources.PerUserPoolDataSource (CVE-2020-36186)

  • jackson-databind: mishandles the interaction between serialization gadgets and typing, related to org.apache.tomcat.dbcp.dbcp.datasources.SharedPoolDataSource (CVE-2020-36187)

  • jackson-databind: mishandles the interaction between serialization gadgets and typing, related to com.newrelic.agent.deps.ch.qos.logback.core.db.JNDIConnectionSource (CVE-2020-36188)

  • jackson-databind: mishandles the interaction between serialization gadgets and typing, related to com.newrelic.agent.deps.ch.qos.logback.core.db.DriverManagerConnectionSource (CVE-2020-36189)

  • jackson-databind: mishandles the interaction between serialization gadgets and typing, related to javax.swing (CVE-2021-20190)

  • golang: data race in certain net/http servers including ReverseProxy can lead to DoS (CVE-2020-15586)

  • golang: ReadUvarint and ReadVarint can read an unlimited number of bytes from invalid inputs (CVE-2020-16845)

  • OpenJDK: Incomplete enforcement of JAR signing disabled algorithms (Libraries, 8249906) (CVE-2021-2163)

The following Jira issues contain the above CVEs:

  • LOG-1234 CVE-2020-15586 CVE-2020-16845 openshift-eventrouter: various flaws [openshift-4]. (LOG-1234)

  • LOG-1243 CVE-2018-14718 CVE-2018-14719 CVE-2018-14720 CVE-2018-14721 CVE-2018-19360 CVE-2018-19361 CVE-2018-19362 CVE-2019-14379 CVE-2020-35490 CVE-2020-35491 CVE-2020-35728…​ logging-elasticsearch6-container: various flaws [openshift-logging-5.0]. (LOG-1243)

Bug fixes

This release also includes the following bug fixes:

  • LOG-1224 Release 5.0 - ClusterLogForwarder namespace-specific log forwarding does not work as expected. (LOG-1224)

  • LOG-1232 5.0 - Bug 1859004 - Sometimes the eventrouter couldn’t gather event logs. (LOG-1232)

  • LOG-1299 Release 5.0 - Forwarding logs to Kafka using Chained certificates fails with error "state=error: certificate verify failed (unable to get local issuer certificate)". (LOG-1299)

OpenShift Logging 5.0.2

Bug fixes

  • If you did not set .proxy in the cluster installation configuration, and then configured a global proxy on the installed cluster, a bug prevented Fluentd from forwarding logs to Elasticsearch. To work around this issue, in the proxy/cluster configuration, set no_proxy to .svc.cluster.local so it skips internal traffic. The current release fixes the proxy configuration issue. Now, if you configure the global proxy after installing an OpenShift cluster, Fluentd forwards logs to Elasticsearch. (LOG-1187)

  • Previously, forwarding logs to Kafka using chained certificates failed with error "state=error: certificate verify failed (unable to get local issuer certificate)." Logs could not be forwarded to a Kafka broker with a certificate signed by an intermediate CA. This happened because fluentd Kafka plugin could only handle a single CA certificate supplied in the ca-bundle.crt entry of the corresponding secret. The current release fixes this issue by enabling the fluentd Kafka plugin to handle multiple CA certificates supplied in the ca-bundle.crt entry of the corresponding secret. Now, logs can be forwarded to a Kafka broker with a certificate signed by an intermediate CA. (LOG-1216, LOG-1218)

  • Previously, an update in the cluster service version (CSV) accidentally introduced resources and limits for the OpenShift Elasticsearch operator container. Under specific conditions, this caused an out-of-memory condition that terminated the Elasticsearch operator pod. The current release fixes this issue by removing the CSV resources and limits for the operator container. Now, the operator gets scheduled without issues. (LOG-1254)

OpenShift Logging 5.0.1

Bug fixes

  • Previously, if you enabled legacy log forwarding, logs were not sent to managed storage. This issue occurred because the generated log forwarding configuration improperly chose between either log forwarding or legacy log forwarding. The current release fixes this issue. If the ClusterLogging CR defines a logstore, logs are sent to managed storage. Additionally, if legacy log forwarding is enabled, logs are sent to legacy log forwarding regardless of whether managed storage is enabled. (LOG-1172)

  • Previously, while under load, Elasticsearch responded to some requests with an HTTP 500 error, even though there was nothing wrong with the cluster. Retrying the request was successful. This release fixes the issue by updating the cron jobs to be more resilient when encountering temporary HTTP 500 errors. Now, they will retry a request multiple times first before failing. (LOG-1215)

OpenShift Logging 5.0.0

New features and enhancements

This release adds improvements related to the following concepts.

Cluster Logging becomes Red Hat OpenShift Logging

With this release, Cluster Logging becomes Red Hat OpenShift Logging 5.0.

Maximum five primary shards per index

With this release, the OpenShift Elasticsearch Operator (EO) sets the number of primary shards for an index between one and five, depending on the number of data nodes defined for a cluster.

Previously, the EO set the number of shards for an index to the number of data nodes. When an index in Elasticsearch was configured with a number of replicas, it created that many replicas for each primary shard, not per index. Therefore, as the index sharded, a greater number of replica shards existed in the cluster, which created a lot of overhead for the cluster to replicate and keep in sync.

Updated OpenShift Elasticsearch Operator name and maturity level

This release updates the display name of the OpenShift Elasticsearch Operator and operator maturity level. The new display name and clarified specific use for the OpenShift Elasticsearch Operator are updated in Operator Hub.

OpenShift Elasticsearch Operator reports on CSV success

This release adds reporting metrics to indicate that installing or upgrading the ClusterServiceVersion (CSV) object for the OpenShift Elasticsearch Operator was successful. Previously, there was no way to determine, or generate an alert, if the installing or upgrading the CSV failed. Now, an alert is provided as part of the OpenShift Elasticsearch Operator.

Reduce Elasticsearch pod certificate permission warnings

Previously, when the Elasticsearch pod started, it generated certificate permission warnings, which misled some users to troubleshoot their clusters. The current release fixes these permissions issues to reduce these types of notifications.

This release adds a link from the alerts that an Elasticsearch cluster generates to a page of explanations and troubleshooting steps for that alert.

New connection timeout for deletion jobs

The current release adds a connection timeout for deletion jobs, which helps prevent pods from occasionally hanging when they query Elasticsearch to delete indices. Now, if the underlying 'curl' call does not connect before the timeout period elapses, the timeout terminates the call.

Minimize updates to rollover index templates

With this enhancement, the OpenShift Elasticsearch Operator only updates its rollover index templates if they have different field values. Index templates have a higher priority than indices. When the template is updated, the cluster prioritizes distributing them over the index shards, impacting performance. To minimize Elasticsearch cluster operations, the operator only updates the templates when the number of primary shards or replica shards changes from what is currently configured.

Technology Preview features

Some features in this release are currently in Technology Preview. These experimental features are not intended for production use. Note the following scope of support on the Red Hat Customer Portal for these features:

In the table below, features are marked with the following statuses:

  • TP: Technology Preview

  • GA: General Availability

  • -: Not Available

Table 2. Technology Preview tracker
Feature OCP 4.5 OCP 4.6 Logging 5.0

Log forwarding

TP

GA

GA

Deprecated and removed features

Some features available in previous releases have been deprecated or removed.

Deprecated functionality is still included in OpenShift Logging and continues to be supported; however, it will be removed in a future release of this product and is not recommended for new deployments.

Elasticsearch Curator has been deprecated

The Elasticsearch Curator has been deprecated and will be removed in a future release. Elasticsearch Curator helped you curate or manage your indices on OpenShift Container Platform 4.4 and earlier. Instead of using Elasticsearch Curator, configure the log retention time.

Forwarding logs using the legacy Fluentd and legacy syslog methods have been deprecated

From OKD 4.6 to the present, forwarding logs by using the legacy Fluentd and legacy syslog methods have been deprecated and will be removed in a future release. Use the standard non-legacy methods instead.

Bug fixes

  • Previously, Elasticsearch rejected HTTP requests whose headers exceeded the default max header size, 8 KB. Now, the max header size is 128 KB, and Elasticsearch no longer rejects HTTP requests for exceeding the max header size. (BZ#1845293)

  • Previously, nodes did not recover from Pending status because a software bug did not correctly update their statuses in the Elasticsearch custom resource (CR). The current release fixes this issue, so the nodes can recover when their status is Pending. (BZ#1887357)

  • Previously, when the Cluster Logging Operator (CLO) scaled down the number of Elasticsearch nodes in the clusterlogging CR to three nodes, it omitted previously-created nodes that had unique IDs. The OpenShift Elasticsearch Operator rejected the update because it has safeguards that prevent nodes with unique IDs from being removed. Now, when the CLO scales down the number of nodes and updates the Elasticsearch CR, it marks nodes with unique IDs as count 0 instead of omitting them. As a result, users can scale down their cluster to 3 nodes by using the clusterlogging CR. (BZ#1879150)

In OpenShift Logging 5.0 and later, the Cluster Logging Operator is called Red Hat OpenShift Logging Operator.

  • Previously, the Fluentd collector pod went into a crash loop when the ClusterLogForwarder had an incorrectly-configured secret. The current release fixes this issue. Now, the ClusterLogForwarder validates the secrets and reports any errors in its status field. As a result, it does not cause the Fluentd collector pod to crash. (BZ#1888943)

  • Previously, if you updated the Kibana resource configuration in the clusterlogging instance to resource{}, the resulting nil map caused a panic and changed the status of the OpenShift Elasticsearch Operator to CrashLoopBackOff. The current release fixes this issue by initializing the map. (BZ#1889573)

  • Previously, the fluentd collector pod went into a crash loop when the ClusterLogForwarder had multiple outputs using the same secret. The current release fixes this issue. Now, multiple outputs can share a secret. (BZ#1890072)

  • Previously, if you deleted a Kibana route, the Cluster Logging Operator (CLO) could not recover or recreate it. Now, the CLO watches the route, and if you delete the route, the OpenShift Elasticsearch Operator can reconcile or recreate it. (BZ#1890825)

  • Previously, the Cluster Logging Operator (CLO) would attempt to reconcile the Elasticsearch resource, which depended upon the Red Hat-provided Elastic Custom Resource Definition (CRD). Attempts to list an unknown kind caused the CLO to exit its reconciliation loop. This happened because the CLO tried to reconcile all of its managed resources whether they were defined or not. The current release fixes this issue. The CLO only reconciles types provided by the OpenShift Elasticsearch Operator if a user defines managed storage. As a result, users can create collector-only deployments of cluster logging by deploying the CLO. (BZ#1891738)

  • Previously, because of an LF GA syslog implementation for RFC 3164, logs sent to remote syslog were not compatible with the legacy behavior. The current release fixes this issue. AddLogSource adds details about log’s source details to the "message" field. Now, logs sent to remote syslog are compatible with the legacy behavior. (BZ#1891886)

  • Previously, the Elasticsearch rollover pods failed with a resource_already_exists_exception error. Within the Elasticsearch rollover API, when the next index was created, the *-write alias was not updated to point to it. As a result, the next time the rollover API endpoint was triggered for that particular index, it received an error that the resource already existed.

    The current release fixes this issue. Now, when a rollover occurs in the indexmanagement cronjobs, if a new index was created, it verifies that the alias points to the new index. This behavior prevents the error. If the cluster is already receiving this error, a cronjob fixes the issue so that subsequent runs work as expected. Now, performing rollovers no longer produces the exception. (BZ#1893992)

  • Previously, Fluent stopped sending logs even though the logging stack seemed functional. Logs were not shipped to an endpoint for an extended period even when an endpoint came back up. This happened if the max backoff time was too long and the endpoint was down. The current release fixes this issue by lowering the max backoff time, so the logs are shipped sooner. (BZ#1894634)

  • Previously, omitting the Storage size of the Elasticsearch node caused panic in the OpenShift Elasticsearch Operator code. This panic appeared in the logs as: Observed a panic: "invalid memory address or nil pointer dereference" The panic happened because although Storage size is a required field, the software didn’t check for it. The current release fixes this issue, so there is no panic if the storage size is omitted. Instead, the storage defaults to ephemeral storage and generates a log message for the user. (BZ#1899589)

  • Previously, elasticsearch-rollover and elasticsearch-delete pods remained in the Invalid JSON: or ValueError: No JSON object could be decoded error states. This exception was raised because there was no exception handler for invalid JSON input. The current release fixes this issue by providing a handler for invalid JSON input. As a result, the handler outputs an error message instead of an exception traceback, and the elasticsearch-rollover and elasticsearch-delete jobs do not remain those error states. (BZ#1899905)

  • Previously, when deploying Fluentd as a stand-alone, a Kibana pod was created even if the value of replicas was 0. This happened because Kibana defaulted to 1 pod even when there were no Elasticsearch nodes. The current release fixes this. Now, a Kibana only defaults to 1 when there are one or more Elasticsearch nodes. (BZ#1901424)

  • Previously, if you deleted the secret, it was not recreated. Even though the certificates were on a disk local to the operator, they weren’t rewritten because they hadn’t changed. That is, certificates were only written if they changed. The current release fixes this issue. It rewrites the secret if the certificate changes or is not found. Now, if you delete the master-certs, they are replaced. (BZ#1901869)

  • Previously, if a cluster had multiple custom resources with the same name, the resource would get selected alphabetically when not fully qualified with the API group. As a result, if you installed both Red Hat’s OpenShift Elasticsearch Operator alongside the OpenShift Elasticsearch Operator, you would see failures when collected data via a must-gather report. The current release fixes this issue by ensuring must-gathers now use the full API group when gathering information about the cluster’s custom resources. (BZ#1897731)

  • An earlier bug fix to address issues related to certificate generation introduced an error. Trying to read the certificates caused them to be regenerated because they were recognized as missing. This, in turn, triggered the OpenShift Elasticsearch Operator to perform a rolling upgrade on the Elasticsearch cluster and, potentially, to have mismatched certificates. This bug was caused by the operator incorrectly writing certificates to the working directory. The current release fixes this issue. Now the operator consistently reads and writes certificates to the same working directory, and the certificates are only regenerated if needed. (BZ#1905910)

  • Previously, queries to the root endpoint to retrieve the Elasticsearch version received a 403 response. The 403 response broke any services that used this endpoint in prior releases. This error happened because non-administrative users did not have the monitor permission required to query the root endpoint and retrieve the Elasticsearch version. Now, non-administrative users can query the root endpoint for the deployed version of Elasticsearch. (BZ#1906765)

  • Previously, in some bulk insertion situations, the Elasticsearch proxy timed out connections between fluentd and Elasticsearch. As a result, fluentd failed to deliver messages and logged a Server returned nothing (no headers, no data) error. The current release fixes this issue: It increases the default HTTP read and write timeouts in the Elasticsearch proxy from five seconds to one minute. It also provides command-line options in the Elasticsearch proxy to control HTTP timeouts in the field. (BZ#1908707)

  • Previously, in some cases, the {ProductName}/Elasticsearch dashboard was missing from the OKD monitoring dashboard because the dashboard configuration resource referred to a different namespace owner and caused the OKD to garbage-collect that resource. Now, the ownership reference is removed from the OpenShift Elasticsearch Operator reconciler configuration, and the logging dashboard appears in the console. (BZ#1910259)

  • Previously, the code that uses environment variables to replace values in the Kibana configuration file did not consider commented lines. This prevented users from overriding the default value of server.maxPayloadBytes. The current release fixes this issue by uncommenting the default value of server.maxPayloadByteswithin. Now, users can override the value by using environment variables, as documented. (BZ#1918876)

  • Previously, the Kibana log level was increased not to suppress instructions to delete indices that failed to migrate, which also caused the display of GET requests at the INFO level that contained the Kibana user’s email address and OAuth token. The current release fixes this issue by masking these fields, so the Kibana logs do not display them. (BZ#1925081)

Known issues

  • Fluentd pods with the ruby-kafka-1.1.0 and fluent-plugin-kafka-0.13.1 gems are not compatible with Apache Kafka version 0.10.1.0.

    As a result, log forwarding to Kafka fails with a message: error_class=Kafka::DeliveryFailed error="Failed to send messages to flux-openshift-v4/1"

    The ruby-kafka-0.7 gem dropped support for Kafka 0.10 in favor of native support for Kafka 0.11. The ruby-kafka-1.0.0 gem added support for Kafka 2.3 and 2.4. The current version of OpenShift Logging tests and therefore supports Kafka version 2.4.1.

    To work around this issue, upgrade to a supported version of Apache Kafka.