About log collection and forwarding - Logging | Observability | Red Hat OpenShift Service on AWS


The Red Hat OpenShift Logging Operator deploys a collector based on the ClusterLogForwarder resource specification. There are two collector options supported by this Operator: the legacy Fluentd collector, and the Vector collector.

Fluentd is deprecated and is planned to be removed in a future release. Red Hat provides bug fixes and support for this feature during the current release lifecycle, but this feature no longer receives enhancements. Use Vector as the alternative to Fluentd.

Log collection

The log collector is a daemon set that deploys a pod to each Red Hat OpenShift Service on AWS node to collect container and node logs.

By default, the log collector uses the following sources:

journald for system and infrastructure log messages from the operating system, the container runtime, and Red Hat OpenShift Service on AWS.
/var/log/containers/*.log for all container logs.

If you configure the log collector to collect audit logs, it collects them from /var/log/audit/audit.log.

The log collector collects the logs from these sources and forwards them internally or externally depending on your logging configuration.


Log collector types

Vector is a log collector offered as an alternative to Fluentd for logging.

You can configure which logging collector type your cluster uses by modifying the ClusterLogging custom resource (CR) collection spec:

Example ClusterLogging CR that configures Vector as the collector

apiVersion: logging.openshift.io/v1
kind: ClusterLogging
metadata:
  name: instance
  namespace: openshift-logging
spec:
  collection:
    logs:
      type: vector
      vector: {}
# ...

Log collection limitations

The container runtimes provide minimal information to identify the source of log messages: project, pod name, and container ID. This information is not sufficient to uniquely identify the source of the logs. If a pod with a given name and project is deleted before the log collector begins processing its logs, information from the API server, such as labels and annotations, might not be available. There might not be a way to distinguish the log messages from a similarly named pod and project or trace the logs to their source. This limitation means that log collection and normalization are considered best effort.

The available container runtimes provide minimal information to identify the source of log messages and do not guarantee unique individual log messages or that these messages can be traced to their source.


Log collector features by type

Table 1. Log Sources
Feature	Fluentd	Vector
App container logs	✓	✓
App-specific routing	✓	✓
App-specific routing by namespace	✓	✓
Infra container logs	✓	✓
Infra journal logs	✓	✓
Kube API audit logs	✓	✓
OpenShift API audit logs	✓	✓
Open Virtual Network (OVN) audit logs	✓	✓

Table 2. Authorization and Authentication
Feature	Fluentd	Vector
Elasticsearch certificates	✓	✓
Elasticsearch username / password	✓	✓
Amazon Cloudwatch keys	✓	✓
Amazon Cloudwatch STS	✓	✓
Kafka certificates	✓	✓
Kafka username / password	✓	✓
Kafka SASL	✓	✓
Loki bearer token	✓	✓

Table 3. Normalizations and Transformations
Feature	Fluentd	Vector
Viaq data model - app	✓	✓
Viaq data model - infra	✓	✓
Viaq data model - infra(journal)	✓	✓
Viaq data model - Linux audit	✓	✓
Viaq data model - kube-apiserver audit	✓	✓
Viaq data model - OpenShift API audit	✓	✓
Viaq data model - OVN	✓	✓
Loglevel Normalization	✓	✓
JSON parsing	✓	✓
Structured Index	✓	✓
Multiline error detection	✓	✓
Multicontainer / split indices	✓	✓
Flatten labels	✓	✓
CLF static labels	✓	✓

Table 4. Tuning
Feature	Fluentd	Vector
Fluentd readlinelimit	✓
Fluentd buffer	✓
- chunklimitsize	✓
- totallimitsize	✓
- overflowaction	✓
- flushthreadcount	✓
- flushmode	✓
- flushinterval	✓
- retrywait	✓
- retrytype	✓
- retrymaxinterval	✓
- retrytimeout	✓

Table 5. Visibility
Feature	Fluentd	Vector
Metrics	✓	✓
Dashboard	✓	✓
Alerts	✓	✓

Table 6. Miscellaneous
Feature	Fluentd	Vector
Global proxy support	✓	✓
x86 support	✓	✓
ARM support	✓	✓
IBM Power® support	✓	✓
IBM Z® support	✓	✓
IPv6 support	✓	✓
Log event buffering	✓
Disconnected Cluster	✓	✓


Collector outputs

The following collector outputs are supported:

Table 7. Supported outputs
Feature	Fluentd	Vector
Elasticsearch v6-v8	✓	✓
Fluent forward	✓
Syslog RFC3164	✓	✓ (Logging 5.7+)
Syslog RFC5424	✓	✓ (Logging 5.7+)
Kafka	✓	✓
Amazon Cloudwatch	✓	✓
Amazon Cloudwatch STS	✓	✓
Loki	✓	✓
HTTP	✓	✓ (Logging 5.7+)
Google Cloud Logging	✓	✓
Splunk		✓ (Logging 5.6+)

Log forwarding

Administrators can create ClusterLogForwarder resources that specify which logs are collected, how they are transformed, and where they are forwarded to.

ClusterLogForwarder resources can be used to forward container, infrastructure, and audit logs to specific endpoints within or outside of a cluster. Transport Layer Security (TLS) is supported so that log forwarders can be configured to send logs securely.

Administrators can also authorize RBAC permissions that define which service accounts and users can access and forward which types of logs.
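As a rough illustration of how a ClusterLogForwarder ties log types to destinations, the following spec fragment forwards application and audit logs to an external endpoint over HTTPS. It is a sketch only: the output name, pipeline name, URL, and secret name are hypothetical placeholders, and the resource metadata is omitted deliberately because the allowed names and namespaces depend on the implementation in use.

```yaml
# Fragment of a ClusterLogForwarder spec (metadata omitted).
# All names and the URL below are hypothetical placeholders.
spec:
  outputs:
  - name: external-es                  # hypothetical output name
    type: elasticsearch
    url: https://elasticsearch.example.com:9200   # placeholder endpoint
    secret:
      name: external-es-tls            # hypothetical secret holding TLS material
  pipelines:
  - name: app-and-audit                # hypothetical pipeline name
    inputRefs:                         # which log types to collect
    - application
    - audit
    outputRefs:                        # where to send them
    - external-es
```

The pipeline is the piece that connects the two halves: each entry in inputRefs names a log type, and each entry in outputRefs must match the name of a defined output.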

Log forwarding implementations

There are two log forwarding implementations available: the legacy implementation, and the multi log forwarder feature.

Only the Vector collector is supported for use with the multi log forwarder feature. The Fluentd collector can only be used with legacy implementations.


Legacy implementation

In legacy implementations, you can only use one log forwarder in your cluster. The ClusterLogForwarder resource in this mode must be named instance, and must be created in the openshift-logging namespace. The ClusterLogForwarder resource also requires a corresponding ClusterLogging resource named instance in the openshift-logging namespace.
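The naming constraints above can be sketched as a pair of manifests. This is an illustrative example under stated assumptions, not a recommended configuration: the Loki output name and URL are hypothetical, and the ClusterLogging resource simply reuses the Vector collector settings shown earlier on this page.

```yaml
# Legacy implementation: both resources are named "instance"
# and both live in the openshift-logging namespace.
apiVersion: logging.openshift.io/v1
kind: ClusterLogging
metadata:
  name: instance
  namespace: openshift-logging
spec:
  collection:
    type: vector
---
apiVersion: logging.openshift.io/v1
kind: ClusterLogForwarder
metadata:
  name: instance                  # required name in legacy mode
  namespace: openshift-logging    # required namespace in legacy mode
spec:
  outputs:
  - name: remote-loki             # hypothetical output name
    type: loki
    url: https://loki.example.com:3100   # placeholder endpoint
  pipelines:
  - name: infra-to-loki           # hypothetical pipeline name
    inputRefs:
    - infrastructure
    outputRefs:
    - remote-loki
```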

Multi log forwarder feature

The multi log forwarder feature is available in logging 5.8 and later, and provides the following functionality:

Administrators can control which users are allowed to define log collection and which logs they are allowed to collect.
Users who have the required permissions are able to specify additional log collection configurations.
Administrators who are migrating from the deprecated Fluentd collector to the Vector collector can deploy a new log forwarder separately from their existing deployment. The existing and new log forwarders can operate simultaneously while workloads are being migrated.

In multi log forwarder implementations, you are not required to create a corresponding ClusterLogging resource for your ClusterLogForwarder resource. You can create multiple ClusterLogForwarder resources using any name, in any namespace, with the following exceptions:

You cannot create a ClusterLogForwarder resource named instance in the openshift-logging namespace, because this is reserved for a log forwarder that supports the legacy workflow using the Fluentd collector.
You cannot create a ClusterLogForwarder resource named collector in the openshift-logging namespace, because this is reserved for the collector.
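A minimal sketch of a multi log forwarder resource might look like the following. The forwarder name, namespace, service account name, and output details are all hypothetical; the example assumes that the service account already exists with the required cluster roles bound to it, and that the Operator watches the chosen namespace.

```yaml
apiVersion: logging.openshift.io/v1
kind: ClusterLogForwarder
metadata:
  name: app-forwarder            # hypothetical; must not be "instance" or "collector" in openshift-logging
  namespace: my-logging          # hypothetical namespace
spec:
  serviceAccountName: logcollector   # hypothetical service account used by the collector
  outputs:
  - name: remote-loki            # hypothetical output name
    type: loki
    url: https://loki.example.com:3100   # placeholder endpoint
  pipelines:
  - name: app-to-loki            # hypothetical pipeline name
    inputRefs:
    - application
    outputRefs:
    - remote-loki
```

Note that no ClusterLogging resource accompanies this forwarder; the serviceAccountName field is what links it to the permissions that govern which log types it can collect.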


Enabling the multi log forwarder feature for a cluster

To use the multi log forwarder feature, you must create a service account and cluster role bindings for that service account. You can then reference the service account in the ClusterLogForwarder resource to control access permissions.

To support multi log forwarding in namespaces other than the openshift-logging namespace, you must update the Red Hat OpenShift Logging Operator to watch all namespaces. This functionality is supported by default in new installations of Red Hat OpenShift Logging Operator version 5.8.


Authorizing log collection RBAC permissions

In logging 5.8 and later, the Red Hat OpenShift Logging Operator provides collect-audit-logs, collect-application-logs, and collect-infrastructure-logs cluster roles, which enable the collector to collect audit logs, application logs, and infrastructure logs respectively.

You can authorize RBAC permissions for log collection by binding the required cluster roles to a service account.

Prerequisites

The Red Hat OpenShift Logging Operator is installed in the openshift-logging namespace.
You have administrator permissions.

Procedure

Create a service account for the collector. If you want to write logs to storage that requires a token for authentication, you must include a token in the service account.

Bind the appropriate cluster roles to the service account:

Example binding command

$ oc adm policy add-cluster-role-to-user <cluster_role_name> system:serviceaccount:<namespace_name>:<service_account_name>
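The two steps of the procedure can also be expressed declaratively. The manifests below are a sketch of that approach, assuming a hypothetical service account named logcollector in a hypothetical my-logging namespace; the ClusterRoleBinding has the same effect as running the binding command for the collect-application-logs cluster role.

```yaml
# Step 1: service account for the collector (name and namespace are hypothetical).
apiVersion: v1
kind: ServiceAccount
metadata:
  name: logcollector
  namespace: my-logging
---
# Step 2: bind one of the Operator-provided cluster roles to that service account.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: logcollector-collect-application-logs   # hypothetical binding name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: collect-application-logs   # or collect-audit-logs, collect-infrastructure-logs
subjects:
- kind: ServiceAccount
  name: logcollector
  namespace: my-logging
```

Create one binding per log type that the forwarder needs to collect; a forwarder whose pipelines reference audit logs, for example, would also need a binding to collect-audit-logs.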

Additional resources

Using RBAC Authorization (Kubernetes documentation)