Configuring the distributed tracing platform - Distributed tracing installation | Distributed tracing

Deploying the distributed tracing default strategy from the web console
- Deploying the distributed tracing default strategy from the CLI
Deploying the distributed tracing production strategy from the web console
- Deploying the distributed tracing production strategy from the CLI
Deploying the distributed tracing streaming strategy from the web console
- Deploying the distributed tracing streaming strategy from the CLI
Validating your deployment
- Accessing the Jaeger console
Customizing your deployment
Injecting sidecars
- Automatically injecting sidecars
- Manually injecting sidecars

The Red Hat OpenShift distributed tracing platform Operator uses a custom resource definition (CRD) file that defines the architecture and configuration settings to be used when creating and deploying the distributed tracing platform resources. You can either install the default configuration or modify the file to better suit your business requirements.

Red Hat OpenShift distributed tracing platform has predefined deployment strategies. You specify a deployment strategy in the custom resource file. When you create a distributed tracing platform instance the Operator uses this configuration file to create the objects necessary for the deployment.

Jaeger custom resource file showing deployment strategy

apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: MyConfigFile
spec:
  strategy: production # Deployment strategy: allInOne, production, or streaming

The Red Hat OpenShift distributed tracing platform Operator currently supports the following deployment strategies:

allInOne (Default) - This strategy is intended for development, testing, and demo purposes; it is not intended for production use. The main backend components (Agent, Collector, and Query service) are all packaged into a single executable that is configured, by default, to use in-memory storage.

In-memory storage is not persistent, which means that if the distributed tracing platform instance shuts down, restarts, or is replaced, your trace data will be lost. In-memory storage also cannot be scaled, because each pod has its own memory. For persistent storage, you must use the production or streaming strategies, which use Elasticsearch as the default storage.

production - The production strategy is intended for production environments, where long-term storage of trace data is important and a more scalable and highly available architecture is required. Each of the backend components is therefore deployed separately. The Agent can be injected as a sidecar on the instrumented application. The Query and Collector services are configured with a supported storage type, currently Elasticsearch. Multiple instances of each of these components can be provisioned as required for performance and resilience purposes.
streaming - The streaming strategy is designed to augment the production strategy by providing a streaming capability that sits between the Collector and the Elasticsearch backend storage. This reduces the pressure on the backend storage under high load situations, and enables other trace post-processing capabilities to tap into the real-time span data directly from the streaming platform (AMQ Streams/Kafka).

The streaming strategy requires an additional Red Hat subscription for AMQ Streams.

The streaming deployment strategy is currently unsupported on IBM Z.

There are two ways to install and use Red Hat OpenShift distributed tracing: as part of a service mesh or as a standalone component. If you have installed distributed tracing as part of Red Hat OpenShift Service Mesh, you can perform basic configuration as part of the ServiceMeshControlPlane, but for complete control you should configure a Jaeger CR and then reference your distributed tracing configuration file in the ServiceMeshControlPlane.
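
As an illustration of the second approach, the following sketch shows a ServiceMeshControlPlane that points at an existing Jaeger custom resource. The resource names `basic` and `jaeger-production` are placeholders. The `spec.tracing.type` and `spec.addons.jaeger.name` fields are assumptions based on the Service Mesh 2.x `maistra.io/v2` API and are not stated in this section, so check the Service Mesh documentation for your version before relying on them.

```yaml
# Sketch only: field names assume the maistra.io/v2 API.
apiVersion: maistra.io/v2
kind: ServiceMeshControlPlane
metadata:
  name: basic                  # placeholder control plane name
  namespace: istio-system      # same namespace as the Jaeger CR
spec:
  tracing:
    type: Jaeger
  addons:
    jaeger:
      name: jaeger-production  # name of an existing Jaeger CR (placeholder)
```

With this arrangement, the detailed tracing settings would live in the Jaeger CR itself, and the control plane would only carry the reference.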

Deploying the distributed tracing default strategy from the web console

The custom resource definition (CRD) defines the configuration used when you deploy an instance of Red Hat OpenShift distributed tracing. The default CR is named jaeger-all-in-one-inmemory and is configured with minimal resources to ensure that you can successfully install it on a default OpenShift Container Platform installation. You can use this default configuration to create a Red Hat OpenShift distributed tracing platform instance that uses the allInOne deployment strategy, or you can define your own custom resource file.

In-memory storage is not persistent. If the Jaeger pod shuts down, restarts, or is replaced, your trace data will be lost. For persistent storage, you must use the production or streaming strategies, which use Elasticsearch as the default storage.

Prerequisites

The Red Hat OpenShift distributed tracing platform Operator has been installed.
You have reviewed the instructions for how to customize the deployment.
You have access to the cluster as a user with the cluster-admin role.

Procedure

Log in to the OpenShift Container Platform web console as a user with the cluster-admin role.
Create a new project, for example tracing-system.

If you are installing as part of Service Mesh, the distributed tracing resources must be installed in the same namespace as the ServiceMeshControlPlane resource, for example istio-system.
1. Navigate to Home → Projects.
2. Click Create Project.
3. Enter tracing-system in the Name field.
4. Click Create.
Navigate to Operators → Installed Operators.
If necessary, select tracing-system from the Project menu. You may have to wait a few moments for the Operators to be copied to the new project.
Click the Red Hat OpenShift distributed tracing platform Operator. On the Details tab, under Provided APIs, the Operator provides a single link.
Under Jaeger, click Create Instance.
On the Create Jaeger page, to install using the defaults, click Create to create the distributed tracing platform instance.
On the Jaegers page, click the name of the distributed tracing platform instance, for example, jaeger-all-in-one-inmemory.
On the Jaeger Details page, click the Resources tab. Wait until the pod has a status of "Running" before continuing.

Deploying the distributed tracing default strategy from the CLI

Follow this procedure to create an instance of distributed tracing platform from the command line.

Prerequisites

The Red Hat OpenShift distributed tracing platform Operator has been installed and verified.
You have reviewed the instructions for how to customize the deployment.
You have access to the OpenShift CLI (oc) that matches your OpenShift Container Platform version.
You have access to the cluster as a user with the cluster-admin role.

Procedure

Log in to the OpenShift Container Platform CLI as a user with the cluster-admin role.
```
$ oc login --username=<NAMEOFUSER> https://<HOSTNAME>:6443
```
Create a new project named tracing-system.
```
$ oc new-project tracing-system
```
Create a custom resource file named jaeger.yaml that contains the following text:
Example jaeger.yaml
```
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: jaeger-all-in-one-inmemory
```
Run the following command to deploy distributed tracing platform:
```
$ oc create -n tracing-system -f jaeger.yaml
```

Run the following command to watch the progress of the pods during the installation process:

$ oc get pods -n tracing-system -w

After the installation process has completed, you should see output similar to the following example:

NAME                                         READY   STATUS    RESTARTS   AGE
jaeger-all-in-one-inmemory-cdff7897b-qhfdx   2/2     Running   0          24s

Deploying the distributed tracing production strategy from the web console

The production deployment strategy is intended for production environments that require a more scalable and highly available architecture, and where long-term storage of trace data is important.

Prerequisites

The OpenShift Elasticsearch Operator has been installed.
The Red Hat OpenShift distributed tracing platform Operator has been installed.
You have reviewed the instructions for how to customize the deployment.
You have access to the cluster as a user with the cluster-admin role.

Procedure

Log in to the OpenShift Container Platform web console as a user with the cluster-admin role.
Create a new project, for example tracing-system.

If you are installing as part of Service Mesh, the distributed tracing resources must be installed in the same namespace as the ServiceMeshControlPlane resource, for example istio-system.
1. Navigate to Home → Projects.
2. Click Create Project.
3. Enter tracing-system in the Name field.
4. Click Create.
Navigate to Operators → Installed Operators.
If necessary, select tracing-system from the Project menu. You may have to wait a few moments for the Operators to be copied to the new project.
Click the Red Hat OpenShift distributed tracing platform Operator. On the Overview tab, under Provided APIs, the Operator provides a single link.
Under Jaeger, click Create Instance.

On the Create Jaeger page, replace the default all-in-one YAML text with your production YAML configuration, for example:

Example jaeger-production.yaml file with Elasticsearch

apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: jaeger-production
  namespace:
spec:
  strategy: production
  ingress:
    security: oauth-proxy
  storage:
    type: elasticsearch
    elasticsearch:
      nodeCount: 3
      redundancyPolicy: SingleRedundancy
    esIndexCleaner:
      enabled: true
      numberOfDays: 7
      schedule: "55 23 * * *"
    esRollover:
      schedule: '*/30 * * * *'

Click Create to create the distributed tracing platform instance.
On the Jaegers page, click the name of the distributed tracing platform instance, for example, jaeger-production.
On the Jaeger Details page, click the Resources tab. Wait until all the pods have a status of "Running" before continuing.

Deploying the distributed tracing production strategy from the CLI

Follow this procedure to create an instance of distributed tracing platform from the command line.

Prerequisites

The OpenShift Elasticsearch Operator has been installed.
The Red Hat OpenShift distributed tracing platform Operator has been installed.
You have reviewed the instructions for how to customize the deployment.
You have access to the OpenShift CLI (oc) that matches your OpenShift Container Platform version.
You have access to the cluster as a user with the cluster-admin role.

Procedure

Log in to the OpenShift Container Platform CLI as a user with the cluster-admin role.
```
$ oc login --username=<NAMEOFUSER> https://<HOSTNAME>:6443
```
Create a new project named tracing-system.
```
$ oc new-project tracing-system
```
Create a custom resource file named jaeger-production.yaml that contains the text of the example file in the previous procedure.
Run the following command to deploy distributed tracing platform:
```
$ oc create -n tracing-system -f jaeger-production.yaml
```

Run the following command to watch the progress of the pods during the installation process:

$ oc get pods -n tracing-system -w

After the installation process has completed, you should see output similar to the following example:

NAME                                                              READY   STATUS    RESTARTS   AGE
elasticsearch-cdm-jaegersystemjaegerproduction-1-6676cf568gwhlw   2/2     Running   0          10m
elasticsearch-cdm-jaegersystemjaegerproduction-2-bcd4c8bf5l6g6w   2/2     Running   0          10m
elasticsearch-cdm-jaegersystemjaegerproduction-3-844d6d9694hhst   2/2     Running   0          10m
jaeger-production-collector-94cd847d-jwjlj                        1/1     Running   3          8m32s
jaeger-production-query-5cbfbd499d-tv8zf                          3/3     Running   3          8m32s

Deploying the distributed tracing streaming strategy from the web console

The streaming deployment strategy is intended for production environments that require a more scalable and highly available architecture, and where long-term storage of trace data is important.

The streaming strategy provides a streaming capability that sits between the Collector and the Elasticsearch storage. This reduces the pressure on the storage under high load situations, and enables other trace post-processing capabilities to tap into the real-time span data directly from the Kafka streaming platform.

The streaming strategy requires an additional Red Hat subscription for AMQ Streams. If you do not have an AMQ Streams subscription, contact your sales representative for more information.

The streaming deployment strategy is currently unsupported on IBM Z.

Prerequisites

The AMQ Streams Operator has been installed. If you are using version 1.4.0 or higher, you can use self-provisioning. Otherwise, you must create the Kafka instance.
The Red Hat OpenShift distributed tracing platform Operator has been installed.
You have reviewed the instructions for how to customize the deployment.
You have access to the cluster as a user with the cluster-admin role.

Procedure

Log in to the OpenShift Container Platform web console as a user with the cluster-admin role.
Create a new project, for example tracing-system.

If you are installing as part of Service Mesh, the distributed tracing resources must be installed in the same namespace as the ServiceMeshControlPlane resource, for example istio-system.
1. Navigate to Home → Projects.
2. Click Create Project.
3. Enter tracing-system in the Name field.
4. Click Create.
Navigate to Operators → Installed Operators.
If necessary, select tracing-system from the Project menu. You may have to wait a few moments for the Operators to be copied to the new project.
Click the Red Hat OpenShift distributed tracing platform Operator. On the Overview tab, under Provided APIs, the Operator provides a single link.
Under Jaeger, click Create Instance.
On the Create Jaeger page, replace the default all-in-one YAML text with your streaming YAML configuration, for example:

Example jaeger-streaming.yaml file

apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: jaeger-streaming
spec:
  strategy: streaming
  collector:
    options:
      kafka:
        producer:
          topic: jaeger-spans
          # Note: If brokers are not defined, AMQ Streams 1.4.0+ will self-provision Kafka.
          brokers: my-cluster-kafka-brokers.kafka:9092
  storage:
    type: elasticsearch
  ingester:
    options:
      kafka:
        consumer:
          topic: jaeger-spans
          brokers: my-cluster-kafka-brokers.kafka:9092

Click Create to create the distributed tracing platform instance.
On the Jaegers page, click the name of the distributed tracing platform instance, for example, jaeger-streaming.
On the Jaeger Details page, click the Resources tab. Wait until all the pods have a status of "Running" before continuing.
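
As the comment in the example above notes, Kafka can be self-provisioned when no brokers are defined and AMQ Streams 1.4.0 or later is installed. A minimal sketch of that variant, assuming it is enough to omit both `brokers` entries from the same example and leave everything else unchanged, might look like this:

```yaml
# Sketch: streaming strategy with no brokers defined, so that the
# Operator self-provisions Kafka (requires AMQ Streams 1.4.0+).
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: jaeger-streaming
spec:
  strategy: streaming
  collector:
    options:
      kafka:
        producer:
          topic: jaeger-spans   # Collector writes spans to this topic
  storage:
    type: elasticsearch
  ingester:
    options:
      kafka:
        consumer:
          topic: jaeger-spans   # Ingester reads from the same topic
```

The producer and consumer topics are kept identical so that the Ingester reads what the Collector writes.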

Deploying the distributed tracing streaming strategy from the CLI

Follow this procedure to create an instance of distributed tracing platform from the command line.

Prerequisites

The AMQ Streams Operator has been installed. If you are using version 1.4.0 or higher, you can use self-provisioning. Otherwise, you must create the Kafka instance.
The Red Hat OpenShift distributed tracing platform Operator has been installed.
You have reviewed the instructions for how to customize the deployment.
You have access to the OpenShift CLI (oc) that matches your OpenShift Container Platform version.
You have access to the cluster as a user with the cluster-admin role.

Procedure

Log in to the OpenShift Container Platform CLI as a user with the cluster-admin role.
```
$ oc login --username=<NAMEOFUSER> https://<HOSTNAME>:6443
```
Create a new project named tracing-system.
```
$ oc new-project tracing-system
```
Create a custom resource file named jaeger-streaming.yaml that contains the text of the example file in the previous procedure.

Run the following command to deploy Jaeger:

$ oc create -n tracing-system -f jaeger-streaming.yaml

Run the following command to watch the progress of the pods during the installation process:

$ oc get pods -n tracing-system -w

After the installation process has completed, you should see output similar to the following example:

NAME                                                              READY   STATUS    RESTARTS   AGE
elasticsearch-cdm-jaegersystemjaegerstreaming-1-697b66d6fcztcnn   2/2     Running   0          5m40s
elasticsearch-cdm-jaegersystemjaegerstreaming-2-5f4b95c78b9gckz   2/2     Running   0          5m37s
elasticsearch-cdm-jaegersystemjaegerstreaming-3-7b6d964576nnz97   2/2     Running   0          5m5s
jaeger-streaming-collector-6f6db7f99f-rtcfm                       1/1     Running   0          80s
jaeger-streaming-entity-operator-6b6d67cc99-4lm9q                 3/3     Running   2          2m18s
jaeger-streaming-ingester-7d479847f8-5h8kc                        1/1     Running   0          80s
jaeger-streaming-kafka-0                                          2/2     Running   0          3m1s
jaeger-streaming-query-65bf5bb854-ncnc7                           3/3     Running   0          80s
jaeger-streaming-zookeeper-0                                      2/2     Running   0          3m39s

Validating your deployment

Accessing the Jaeger console

To access the Jaeger console you must have either Red Hat OpenShift Service Mesh or Red Hat OpenShift distributed tracing installed, and Red Hat OpenShift distributed tracing platform installed, configured, and deployed.

The installation process creates a route to access the Jaeger console.

If you know the URL for the Jaeger console, you can access it directly. If you do not know the URL, use the following directions.

Procedure from OpenShift console

Log in to the OpenShift Container Platform web console as a user with cluster-admin rights. If you use Red Hat OpenShift Dedicated, you must have an account with the dedicated-admin role.
Navigate to Networking → Routes.
On the Routes page, select the control plane project, for example tracing-system, from the Namespace menu.

The Location column displays the linked address for each route.
If necessary, use the filter to find the jaeger route. Click the route Location to launch the console.
Click Log In With OpenShift.

Procedure from the CLI

Log in to the OpenShift Container Platform CLI as a user with the cluster-admin role. If you use Red Hat OpenShift Dedicated, you must have an account with the dedicated-admin role.
```
$ oc login --username=<NAMEOFUSER> https://<HOSTNAME>:6443
```
To query for details of the route using the command line, enter the following command. In this example, tracing-system is the control plane namespace.
```
$ export JAEGER_URL=$(oc get route -n tracing-system jaeger -o jsonpath='{.spec.host}')
```
Launch a browser and navigate to https://<JAEGER_URL>, where <JAEGER_URL> is the route that you discovered in the previous step.
Log in using the same user name and password that you use to access the OpenShift Container Platform console.
If you have added services to the service mesh and have generated traces, you can use the filters and Find Traces button to search your trace data.

If you are validating the console installation, there is no trace data to display.

Customizing your deployment

Deployment best practices

Red Hat OpenShift distributed tracing platform instance names must be unique. If you run multiple instances and use sidecar-injected Agents, give each instance a unique name, and have the injection annotation explicitly specify the name of the instance that the tracing data should be reported to.
If you have a multitenant implementation and tenants are separated by namespaces, deploy a Red Hat OpenShift distributed tracing platform instance to each tenant namespace.
- Agent as a daemonset is not supported for multitenant installations or Red Hat OpenShift Dedicated. Agent as a sidecar is the only supported configuration for these use cases.
If you are installing distributed tracing as part of Red Hat OpenShift Service Mesh, the distributed tracing resources must be installed in the same namespace as the ServiceMeshControlPlane resource.
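
To make the first practice concrete, the sketch below shows a Deployment whose injection annotation names a specific instance. The annotation key `sidecar.jaegertracing.io/inject` comes from the upstream Jaeger Operator and is an assumption here, because this section does not spell it out. The application name, namespace, image, and the instance name `jaeger-tenant-a` are all hypothetical.

```yaml
# Sketch: request an Agent sidecar that reports to a named instance.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                       # hypothetical application
  namespace: tenant-a                # hypothetical tenant namespace
  annotations:
    # Value is the instance name rather than "true", so the target
    # is unambiguous when several instances exist.
    sidecar.jaegertracing.io/inject: "jaeger-tenant-a"
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: quay.io/example/my-app:latest   # placeholder image
```

In a multitenant setup, each tenant namespace would carry its own instance and its own annotated workloads.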

For information about configuring persistent storage, see Understanding persistent storage and the appropriate configuration topic for your chosen storage option.

Distributed tracing default configuration options

The Jaeger custom resource (CR) defines the architecture and settings to be used when creating the distributed tracing platform resources. You can modify these parameters to customize your distributed tracing platform implementation to your business needs.

Jaeger generic YAML example

apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: name
spec:
  strategy: <deployment_strategy>
  allInOne:
    options: {}
    resources: {}
  agent:
    options: {}
    resources: {}
  collector:
    options: {}
    resources: {}
  sampling:
    options: {}
  storage:
    type:
    options: {}
  query:
    options: {}
    resources: {}
  ingester:
    options: {}
    resources: {}
  options: {}

Table 1. Jaeger parameters
Parameter	Description	Values	Default value
`apiVersion:`	API version to use when creating the object.	`jaegertracing.io/v1`	`jaegertracing.io/v1`
`kind:`	Defines the kind of Kubernetes object to create.	`jaeger`	
`metadata:`	Data that helps uniquely identify the object, including a `name` string, `UID`, and optional `namespace`.	OpenShift Container Platform automatically generates the `UID` and completes the `namespace` with the name of the project where the object is created.	
`name:`	Name for the object.	The name of your distributed tracing platform instance.	`jaeger-all-in-one-inmemory`
`spec:`	Specification for the object to be created.	Contains all of the configuration parameters for your distributed tracing platform instance. When a common definition for all Jaeger components is required, it is defined under the `spec` node. When the definition relates to an individual component, it is placed under the `spec/<component>` node.	N/A
`strategy:`	Jaeger deployment strategy	`allInOne`, `production`, or `streaming`	`allInOne`
`allInOne:`	Because the `allInOne` image deploys the Agent, Collector, Query, Ingester, and Jaeger UI in a single pod, configuration for this deployment must nest component configuration under the `allInOne` parameter.		
`agent:`	Configuration options that define the Agent.		
`collector:`	Configuration options that define the Jaeger Collector.		
`sampling:`	Configuration options that define the sampling strategies for tracing.		
`storage:`	Configuration options that define the storage. All storage-related options must be placed under `storage`, rather than under the `allInOne` or other component options.		
`query:`	Configuration options that define the Query service.		
`ingester:`	Configuration options that define the Ingester service.		

The following example YAML is the minimum required to create a Red Hat OpenShift distributed tracing platform deployment using the default settings.

Example minimum required dist-tracing-all-in-one.yaml

apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: jaeger-all-in-one-inmemory

Jaeger Collector configuration options

The Jaeger Collector is the component responsible for receiving the spans that were captured by the tracer and writing them to persistent Elasticsearch storage when using the production strategy, or to AMQ Streams when using the streaming strategy.

The Collectors are stateless and thus many instances of Jaeger Collector can be run in parallel. Collectors require almost no configuration, except for the location of the Elasticsearch cluster.

Table 2. Parameters used by the Operator to define the Jaeger Collector
Parameter	Description	Values
collector: replicas:	Specifies the number of Collector replicas to create.	Integer, for example, `5`

Table 3. Configuration parameters passed to the Collector
Parameter	Description	Values
spec: collector: options: {}	Configuration options that define the Jaeger Collector.
options: collector: num-workers:	The number of workers pulling from the queue.	Integer, for example, `50`
options: collector: queue-size:	The size of the Collector queue.	Integer, for example, `2000`
options: kafka: producer: topic: jaeger-spans	The `topic` parameter identifies the Kafka configuration used by the Collector to produce the messages, and the Ingester to consume the messages.	Label for the producer.
options: kafka: producer: brokers: my-cluster-kafka-brokers.kafka:9092	Identifies the Kafka configuration used by the Collector to produce the messages. If brokers are not specified, and you have AMQ Streams 1.4.0+ installed, the Red Hat OpenShift distributed tracing platform Operator will self-provision Kafka.
options: log-level:	Logging level for the Collector.	Possible values: `debug`, `info`, `warn`, `error`, `fatal`, `panic`.
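
The Collector parameters from Tables 2 and 3 could be combined in a single custom resource along the following lines. This is a sketch rather than a tuned configuration: the instance name `jaeger-collector-example` is hypothetical, and the values are simply the example values given in the tables.

```yaml
# Sketch: Collector settings assembled from Tables 2 and 3.
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: jaeger-collector-example   # hypothetical instance name
spec:
  strategy: production
  collector:
    replicas: 5                    # Operator-level setting (Table 2)
    options:                       # passed through to the Collector (Table 3)
      collector:
        num-workers: 50            # workers pulling from the queue
        queue-size: 2000           # size of the Collector queue
      log-level: info
  storage:
    type: elasticsearch
```

Note that `replicas` sits directly under `collector`, while the tuning values sit under `collector.options`.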

Distributed tracing sampling configuration options

The Red Hat OpenShift distributed tracing platform Operator can be used to define sampling strategies that will be supplied to tracers that have been configured to use a remote sampler.

While all traces are generated, only a few are sampled. Sampling a trace marks the trace for further processing and storage.

This is not relevant if a trace was started by the Envoy proxy, as the sampling decision is made there. The Jaeger sampling decision is only relevant when the trace is started by an application using the client.

When a service receives a request that contains no trace context, the client starts a new trace, assigns it a random trace ID, and makes a sampling decision based on the currently installed sampling strategy. The sampling decision propagates to all subsequent requests in the trace so that other services are not making the sampling decision again.

Distributed tracing platform libraries support the following samplers:

Probabilistic - The sampler makes a random sampling decision with the probability of sampling equal to the value of the sampling.param property. For example, using sampling.param=0.1 samples approximately 1 in 10 traces.
Rate Limiting - The sampler uses a leaky bucket rate limiter to ensure that traces are sampled with a certain constant rate. For example, using sampling.param=2.0 samples requests with the rate of 2 traces per second.
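
As a small illustration of the rate limiting sampler, a default strategy using it could be written as in the sketch below. The instance name `with-ratelimiting` is hypothetical, and the sketch assumes that the `param` value of 2.0 from the description above maps directly onto the `param` key under `default_strategy`.

```yaml
# Sketch: rate limiting as the default sampling strategy.
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: with-ratelimiting   # hypothetical instance name
spec:
  sampling:
    options:
      default_strategy:
        type: ratelimiting
        param: 2.0          # roughly 2 traces per second
```

For a probabilistic strategy, the same `param` key would instead hold a probability such as 0.1.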

Table 4. Jaeger sampling options
Parameter	Description	Values	Default value
spec: sampling: options: {} default_strategy: service_strategies:	Configuration options that define the sampling strategies for tracing.		If you do not provide configuration, the Collectors will return the default probabilistic sampling policy with 0.001 (0.1%) probability for all services.
default_strategy: type: service_strategies: type:	Sampling strategy to use. See descriptions above.	Valid values are `probabilistic` and `ratelimiting`.	`probabilistic`
default_strategy: param: service_strategies: param:	Parameters for the selected sampling strategy.	Decimal and integer values (0, .1, 1, 10)	1

This example defines a default sampling strategy that is probabilistic, with a 50% chance of the trace instances being sampled.

Probabilistic sampling example

apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: with-sampling
spec:
  sampling:
    options:
      default_strategy:
        type: probabilistic
        param: 0.5
      service_strategies:
        - service: alpha
          type: probabilistic
          param: 0.8
          operation_strategies:
            - operation: op1
              type: probabilistic
              param: 0.2
            - operation: op2
              type: probabilistic
              param: 0.4
        - service: beta
          type: ratelimiting
          param: 5

If there are no user-supplied configurations, the distributed tracing platform uses the following settings:

Default sampling

spec:
  sampling:
    options:
      default_strategy:
        type: probabilistic
        param: 1

Distributed tracing storage configuration options

You configure storage for the Collector, Ingester, and Query services under spec.storage. Multiple instances of each of these components can be provisioned as required for performance and resilience purposes.

Table 5. General storage parameters used by the Red Hat OpenShift distributed tracing platform Operator to define distributed tracing storage
Parameter	Description	Values	Default value
spec: storage: type:	Type of storage to use for the deployment.	`memory` or `elasticsearch`. Memory storage is only appropriate for development, testing, demonstrations, and proof of concept environments as the data does not persist if the pod is shut down. For production environments distributed tracing platform supports Elasticsearch for persistent storage.	`memory`
storage: secretname:	Name of the secret, for example `tracing-secret`.		N/A
storage: options: {}	Configuration options that define the storage.

Table 6. Elasticsearch index cleaner parameters
Parameter	Description	Values	Default value
storage: esIndexCleaner: enabled:	When using Elasticsearch storage, by default a job is created to clean old traces from the index. This parameter enables or disables the index cleaner job.	`true`/ `false`	`true`
storage: esIndexCleaner: numberOfDays:	Number of days to wait before deleting an index.	Integer value	`7`
storage: esIndexCleaner: schedule:	Defines the schedule for how often to clean the Elasticsearch index.	Cron expression	"55 23 * * *"

Auto-provisioning an Elasticsearch instance

When you deploy a Jaeger custom resource, the Red Hat OpenShift distributed tracing platform Operator uses the OpenShift Elasticsearch Operator to create an Elasticsearch cluster based on the configuration provided in the storage section of the custom resource file. The Red Hat OpenShift distributed tracing platform Operator will provision Elasticsearch if the following configurations are set:

spec.storage.type is set to elasticsearch
spec.storage.elasticsearch.doNotProvision is set to false
spec.storage.options.es.server-urls is not defined, that is, there is no connection to an Elasticsearch instance that was not provisioned by the Red Hat Elasticsearch Operator.

When provisioning Elasticsearch, the Red Hat OpenShift distributed tracing platform Operator sets the Elasticsearch custom resource name to the value of spec.storage.elasticsearch.name from the Jaeger custom resource. If you do not specify a value for spec.storage.elasticsearch.name, the Operator uses elasticsearch.
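
Put together, the three conditions above correspond to a custom resource shaped roughly like the following sketch. The instance name `jaeger-production` and the Elasticsearch name `tracing-es` are placeholders; the field paths are the ones named in the conditions.

```yaml
# Sketch: a Jaeger CR that satisfies the auto-provisioning conditions.
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: jaeger-production
spec:
  strategy: production
  storage:
    type: elasticsearch         # condition 1
    elasticsearch:
      doNotProvision: false     # condition 2
      name: tracing-es          # optional; placeholder Elasticsearch CR name
    # Condition 3: no options.es.server-urls entry is present, so the
    # instance is not pointed at an externally managed Elasticsearch.
```

Leaving out `name` would, per the paragraph above, result in an Elasticsearch custom resource called `elasticsearch`.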

Restrictions

You can have only one distributed tracing platform with self-provisioned Elasticsearch instance per namespace. The Elasticsearch cluster is meant to be dedicated for a single distributed tracing platform instance.
There can be only one Elasticsearch per namespace.

If you already have installed Elasticsearch as part of OpenShift Logging, the Red Hat OpenShift distributed tracing platform Operator can use the installed OpenShift Elasticsearch Operator to provision storage.

The following configuration parameters are for a self-provisioned Elasticsearch instance, that is, an instance created by the Red Hat OpenShift distributed tracing platform Operator using the OpenShift Elasticsearch Operator. You specify configuration options for self-provisioned Elasticsearch under spec.storage.elasticsearch in your configuration file.

Table 7. Elasticsearch resource configuration parameters
Parameter	Description	Values	Default value
elasticsearch: properties: doNotProvision:	Use to specify whether or not an Elasticsearch instance should be provisioned by the Red Hat OpenShift distributed tracing platform Operator.	`true`/`false`	`true`
elasticsearch: properties: name:	Name of the Elasticsearch instance. The Red Hat OpenShift distributed tracing platform Operator uses the Elasticsearch instance specified in this parameter to connect to Elasticsearch.	string	`elasticsearch`
elasticsearch: nodeCount:	Number of Elasticsearch nodes. For high availability, use at least 3 nodes. Do not use 2 nodes, because a “split brain” problem can occur.	Integer value. For example, Proof of concept = 1, Minimum deployment = 3	3
elasticsearch: resources: requests: cpu:	Number of central processing units for requests, based on your environment’s configuration.	Specified in cores or millicores, for example, 200m, 0.5, 1. For example, Proof of concept = 500m, Minimum deployment = 1	1
elasticsearch: resources: requests: memory:	Available memory for requests, based on your environment’s configuration.	Specified in bytes, for example, 200Ki, 50Mi, 5Gi. For example, Proof of concept = 1Gi, Minimum deployment = 16Gi*	16Gi
elasticsearch: resources: limits: cpu:	Limit on number of central processing units, based on your environment’s configuration.	Specified in cores or millicores, for example, 200m, 0.5, 1. For example, Proof of concept = 500m, Minimum deployment = 1
elasticsearch: resources: limits: memory:	Available memory limit based on your environment’s configuration.	Specified in bytes, for example, 200Ki, 50Mi, 5Gi. For example, Proof of concept = 1Gi, Minimum deployment = 16Gi*
elasticsearch: redundancyPolicy:	Data replication policy defines how Elasticsearch shards are replicated across data nodes in the cluster. If not specified, the Red Hat OpenShift distributed tracing platform Operator automatically determines the most appropriate replication based on number of nodes.	`ZeroRedundancy` (no replica shards), `SingleRedundancy` (one replica shard), `MultipleRedundancy` (each index is spread over half of the Data nodes), `FullRedundancy` (each index is fully replicated on every Data node in the cluster).
elasticsearch: useCertManagement:	Use to specify whether or not distributed tracing platform should use the certificate management feature of the Red Hat Elasticsearch Operator. This feature was added to logging subsystem for Red Hat OpenShift 5.2 in OpenShift Container Platform 4.7 and is the preferred setting for new Jaeger deployments.	`true`/`false`	`true`
*Each Elasticsearch node can operate with a lower memory setting, though this is not recommended for production deployments. For production use, allocate at least 16Gi to each pod by default, and preferably as much as you can, up to 64Gi per pod.

Production storage example

apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: simple-prod
spec:
  strategy: production
  storage:
    type: elasticsearch
    elasticsearch:
      nodeCount: 3
      resources:
        requests:
          cpu: 1
          memory: 16Gi
        limits:
          memory: 16Gi

Storage example with persistent storage:

apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: simple-prod
spec:
  strategy: production
  storage:
    type: elasticsearch
    elasticsearch:
      nodeCount: 1
      storage: (1)
        storageClassName: gp2
        size: 5Gi
      resources:
        requests:
          cpu: 200m
          memory: 4Gi
        limits:
          memory: 4Gi
      redundancyPolicy: ZeroRedundancy

1 Persistent storage configuration. In this case, AWS gp2 with a size of 5Gi. When no value is specified, distributed tracing platform uses emptyDir. The OpenShift Elasticsearch Operator provisions a PersistentVolumeClaim and a PersistentVolume, which are not removed when the distributed tracing platform instance is removed. You can mount the same volumes if you create a distributed tracing platform instance with the same name and namespace.

Connecting to an existing Elasticsearch instance

You can use an existing Elasticsearch cluster for storage with distributed tracing. An existing Elasticsearch cluster, also known as an external Elasticsearch instance, is an instance that was not installed by the Red Hat OpenShift distributed tracing platform Operator or by the Red Hat Elasticsearch Operator.

When you deploy a Jaeger custom resource, the Red Hat OpenShift distributed tracing platform Operator will not provision Elasticsearch if the following configurations are set:

spec.storage.elasticsearch.doNotProvision is set to true
spec.storage.options.es.server-urls has a value
spec.storage.elasticsearch.name has a value, or if the Elasticsearch instance name is elasticsearch.

The Red Hat OpenShift distributed tracing platform Operator uses the Elasticsearch instance specified in spec.storage.elasticsearch.name to connect to Elasticsearch.

Restrictions

You cannot share or reuse an OpenShift Container Platform logging Elasticsearch instance with distributed tracing platform. The Elasticsearch cluster is meant to be dedicated to a single distributed tracing platform instance.

Red Hat does not provide support for your external Elasticsearch instance. You can review the tested integrations matrix on the Customer Portal.

The following configuration parameters are for an already existing Elasticsearch instance, also known as an external Elasticsearch instance. In this case, you specify configuration options for Elasticsearch under spec:storage:options:es in your custom resource file.

Table 8. General ES configuration parameters
Parameter	Description	Values	Default value
es: server-urls:	URL of the Elasticsearch instance.	The fully-qualified domain name of the Elasticsearch server.	`http://elasticsearch.<namespace>.svc:9200`
es: max-doc-count:	The maximum document count to return from an Elasticsearch query. This will also apply to aggregations. If you set both `es.max-doc-count` and `es.max-num-spans`, Elasticsearch will use the smaller value of the two.		10000
es: max-num-spans:	[Deprecated - Will be removed in a future release, use `es.max-doc-count` instead.] The maximum number of spans to fetch at a time, per query, in Elasticsearch. If you set both `es.max-num-spans` and `es.max-doc-count`, Elasticsearch will use the smaller value of the two.		10000
es: max-span-age:	The maximum lookback for spans in Elasticsearch.		72h0m0s
es: sniffer:	The sniffer configuration for Elasticsearch. The client uses the sniffing process to find all nodes automatically. Disabled by default.	`true`/ `false`	`false`
es: sniffer-tls-enabled:	Option to enable TLS when sniffing an Elasticsearch Cluster. The client uses the sniffing process to find all nodes automatically. Disabled by default.	`true`/ `false`	`false`
es: timeout:	Timeout used for queries. When set to zero there is no timeout.		0s
es: username:	The username required by Elasticsearch. The basic authentication also loads CA if it is specified. See also `es.password`.
es: password:	The password required by Elasticsearch. See also `es.username`.
es: version:	The major Elasticsearch version. If not specified, the value will be auto-detected from Elasticsearch.		0
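
As a sketch of how the general parameters in Table 8 fit into a custom resource, the following example nests them under spec.storage.options.es. The server URL is reused from the external Elasticsearch example in this document, and the tuning values are illustrative assumptions, not recommendations.

```yaml
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: simple-prod
spec:
  strategy: production
  storage:
    type: elasticsearch
    options:
      es:
        server-urls: https://quickstart-es-http.default.svc:9200
        max-doc-count: 5000   # illustrative; the default is 10000
        max-span-age: 168h    # illustrative; the default is 72h0m0s
        timeout: 15s          # illustrative; the default 0s means no timeout
```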

Table 9. ES data replication parameters
Parameter	Description	Values	Default value
es: num-replicas:	The number of replicas per index in Elasticsearch.		1
es: num-shards:	The number of shards per index in Elasticsearch.		5

Table 10. ES index configuration parameters
Parameter	Description	Values	Default value
es: create-index-templates:	Automatically create index templates at application startup when set to `true`. When templates are installed manually, set to `false`.	`true`/ `false`	`true`
es: index-prefix:	Optional prefix for distributed tracing platform indices. For example, setting this to "production" creates indices named "production-tracing-*".

Table 11. ES bulk processor configuration parameters
Parameter	Description	Default value
es: bulk: actions:	The number of requests that can be added to the queue before the bulk processor decides to commit updates to disk.	1000
es: bulk: flush-interval:	A `time.Duration` after which bulk requests are committed, regardless of other thresholds. To disable the bulk processor flush interval, set this to zero.	200ms
es: bulk: size:	The number of bytes that the bulk requests can take up before the bulk processor decides to commit updates to disk.	5000000
es: bulk: workers:	The number of workers that are able to receive and commit bulk requests to Elasticsearch.	1
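
The bulk processor parameters in Table 11 might be set as in the following sketch, assuming the same nesting under spec.storage.options.es that the other examples in this document use. All values are illustrative assumptions chosen only to show the expected types.

```yaml
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: simple-prod
spec:
  strategy: production
  storage:
    type: elasticsearch
    options:
      es:
        server-urls: https://quickstart-es-http.default.svc:9200
        bulk:
          actions: 500          # commit after this many queued requests
          flush-interval: 1s    # commit at least this often
          size: 10000000        # commit once queued requests reach this many bytes
          workers: 2            # number of workers committing bulk requests
```

Whichever threshold is reached first (actions, size, or flush-interval) triggers the commit, so the three values are usually tuned together.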

Table 12. ES TLS configuration parameters
Parameter	Description	Values	Default value
es: tls: ca:	Path to a TLS Certification Authority (CA) file used to verify the remote servers.		Will use the system truststore by default.
es: tls: cert:	Path to a TLS Certificate file, used to identify this process to the remote servers.
es: tls: enabled:	Enable transport layer security (TLS) when talking to the remote servers. Disabled by default.	`true`/ `false`	`false`
es: tls: key:	Path to a TLS Private Key file, used to identify this process to the remote servers.
es: tls: server-name:	Override the expected TLS server name in the certificate of the remote servers.
es: token-file:	Path to a file containing the bearer token. This flag also loads the Certification Authority (CA) file if it is specified.

Table 13. ES archive configuration parameters
Parameter	Description	Values	Default value
es-archive: bulk: actions:	The number of requests that can be added to the queue before the bulk processor decides to commit updates to disk.		0
es-archive: bulk: flush-interval:	A `time.Duration` after which bulk requests are committed, regardless of other thresholds. To disable the bulk processor flush interval, set this to zero.		0s
es-archive: bulk: size:	The number of bytes that the bulk requests can take up before the bulk processor decides to commit updates to disk.		0
es-archive: bulk: workers:	The number of workers that are able to receive and commit bulk requests to Elasticsearch.		0
es-archive: create-index-templates:	Automatically create index templates at application startup when set to `true`. When templates are installed manually, set to `false`.	`true`/ `false`	`false`
es-archive: enabled:	Enable extra storage.	`true`/ `false`	`false`
es-archive: index-prefix:	Optional prefix for distributed tracing platform indices. For example, setting this to "production" creates indices named "production-tracing-*".
es-archive: max-doc-count:	The maximum document count to return from an Elasticsearch query. This will also apply to aggregations.		0
es-archive: max-num-spans:	[Deprecated - Will be removed in a future release, use `es-archive.max-doc-count` instead.] The maximum number of spans to fetch at a time, per query, in Elasticsearch.		0
es-archive: max-span-age:	The maximum lookback for spans in Elasticsearch.		0s
es-archive: num-replicas:	The number of replicas per index in Elasticsearch.		0
es-archive: num-shards:	The number of shards per index in Elasticsearch.		0
es-archive: password:	The password required by Elasticsearch. See also `es-archive.username`.
es-archive: server-urls:	The comma-separated list of Elasticsearch servers. Must be specified as fully qualified URLs, for example, `http://localhost:9200`.
es-archive: sniffer:	The sniffer configuration for Elasticsearch. The client uses the sniffing process to find all nodes automatically. Disabled by default.	`true`/ `false`	`false`
es-archive: sniffer-tls-enabled:	Option to enable TLS when sniffing an Elasticsearch Cluster. The client uses the sniffing process to find all nodes automatically. Disabled by default.	`true`/ `false`	`false`
es-archive: timeout:	Timeout used for queries. When set to zero there is no timeout.		0s
es-archive: tls: ca:	Path to a TLS Certification Authority (CA) file used to verify the remote servers.		Will use the system truststore by default.
es-archive: tls: cert:	Path to a TLS Certificate file, used to identify this process to the remote servers.
es-archive: tls: enabled:	Enable transport layer security (TLS) when talking to the remote servers. Disabled by default.	`true`/ `false`	`false`
es-archive: tls: key:	Path to a TLS Private Key file, used to identify this process to the remote servers.
es-archive: tls: server-name:	Override the expected TLS server name in the certificate of the remote servers.
es-archive: token-file:	Path to a file containing the bearer token. This flag also loads the Certification Authority (CA) file if it is specified.
es-archive: username:	The username required by Elasticsearch. The basic authentication also loads CA if it is specified. See also `es-archive.password`.
es-archive: version:	The major Elasticsearch version. If not specified, the value will be auto-detected from Elasticsearch.		0
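
The archive parameters in Table 13 could be combined with the primary storage options as in the following sketch. It assumes that es-archive is specified next to es under spec.storage.options. The archive server URL and index prefix are hypothetical.

```yaml
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: simple-prod
spec:
  strategy: production
  storage:
    type: elasticsearch
    options:
      es:
        server-urls: https://quickstart-es-http.default.svc:9200
      es-archive:
        enabled: true                                          # archive storage is disabled by default
        server-urls: https://archive-es-http.default.svc:9200  # hypothetical archive cluster
        index-prefix: my-archive                               # hypothetical prefix
```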

Storage example with volume mounts

apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: simple-prod
spec:
  strategy: production
  storage:
    type: elasticsearch
    options:
      es:
        server-urls: https://quickstart-es-http.default.svc:9200
        index-prefix: my-prefix
        tls:
          ca: /es/certificates/ca.crt
    secretName: tracing-secret
  volumeMounts:
    - name: certificates
      mountPath: /es/certificates/
      readOnly: true
  volumes:
    - name: certificates
      secret:
        secretName: quickstart-es-http-certs-public

The following example shows a Jaeger CR using an external Elasticsearch cluster with TLS CA certificate mounted from a volume and user/password stored in a secret.

External Elasticsearch example:

apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: simple-prod
spec:
  strategy: production
  storage:
    type: elasticsearch
    options:
      es:
        server-urls: https://quickstart-es-http.default.svc:9200 (1)
        index-prefix: my-prefix
        tls: (2)
          ca: /es/certificates/ca.crt
    secretName: tracing-secret (3)
  volumeMounts: (4)
    - name: certificates
      mountPath: /es/certificates/
      readOnly: true
  volumes:
    - name: certificates
      secret:
        secretName: quickstart-es-http-certs-public

1	URL to Elasticsearch service running in default namespace.
2	TLS configuration. In this case only CA certificate, but it can also contain es.tls.key and es.tls.cert when using mutual TLS.
3	Secret that defines the environment variables ES_PASSWORD and ES_USERNAME. You can create it by running kubectl create secret generic tracing-secret --from-literal=ES_PASSWORD=changeme --from-literal=ES_USERNAME=elastic
4	Volume mounts and volumes which are mounted into all storage components.
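
The secret referenced by callout 3 can also be expressed declaratively. The following manifest is a sketch of what the kubectl create secret generic command produces, using the same placeholder credentials. Replace them before applying it.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: tracing-secret
type: Opaque          # the type that "kubectl create secret generic" creates
stringData:           # plain-text values; the API server stores them base64-encoded
  ES_USERNAME: elastic
  ES_PASSWORD: changeme
```

Create the secret in the same namespace as the Jaeger custom resource so that spec.storage.secretName can resolve it.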

Managing certificates with Elasticsearch

You can create and manage certificates using the Red Hat Elasticsearch Operator. Managing certificates using the Red Hat Elasticsearch Operator also lets you use a single Elasticsearch cluster with multiple Jaeger Collectors.

Managing certificates with Elasticsearch is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production.

These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process. For more information about the support scope of Red Hat Technology Preview features, see https://access.redhat.com/support/offerings/techpreview/.

Starting with version 2.4, the Red Hat OpenShift distributed tracing platform Operator delegates certificate creation to the Red Hat Elasticsearch Operator by using the following annotations in the Elasticsearch custom resource:

logging.openshift.io/elasticsearch-cert-management: "true"
logging.openshift.io/elasticsearch-cert.jaeger-<shared-es-node-name>: "user.jaeger"
logging.openshift.io/elasticsearch-cert.curator-<shared-es-node-name>: "system.logging.curator"

Where the <shared-es-node-name> is the name of the Elasticsearch node. For example, if you create an Elasticsearch node named custom-es, your custom resource might look like the following example.

Example Elasticsearch CR showing annotations

apiVersion: logging.openshift.io/v1
kind: Elasticsearch
metadata:
  annotations:
    logging.openshift.io/elasticsearch-cert-management: "true"
    logging.openshift.io/elasticsearch-cert.jaeger-custom-es: "user.jaeger"
    logging.openshift.io/elasticsearch-cert.curator-custom-es: "system.logging.curator"
  name: custom-es
spec:
  managementState: Managed
  nodeSpec:
    resources:
      limits:
        memory: 16Gi
      requests:
        cpu: 1
        memory: 16Gi
  nodes:
    - nodeCount: 3
      proxyResources: {}
      resources: {}
      roles:
        - master
        - client
        - data
      storage: {}
  redundancyPolicy: ZeroRedundancy

Prerequisites

OpenShift Container Platform 4.7
logging subsystem for Red Hat OpenShift 5.2
The Elasticsearch node and the Jaeger instances must be deployed in the same namespace. For example, tracing-system.

You enable certificate management by setting spec.storage.elasticsearch.useCertManagement to true in the Jaeger custom resource.

Example showing useCertManagement

apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: jaeger-prod
spec:
  strategy: production
  storage:
    type: elasticsearch
    elasticsearch:
      name: custom-es
      doNotProvision: true
      useCertManagement: true

The Red Hat OpenShift distributed tracing platform Operator sets the Elasticsearch custom resource name to the value of spec.storage.elasticsearch.name from the Jaeger custom resource when provisioning Elasticsearch.

The certificates are provisioned by the Red Hat Elasticsearch Operator and the Red Hat OpenShift distributed tracing platform Operator injects the certificates.

Query configuration options

Query is a service that retrieves traces from storage and hosts the user interface to display them.

Table 14. Parameters used by the Red Hat OpenShift distributed tracing platform Operator to define Query
Parameter	Description	Values	Default value
spec: query: replicas:	Specifies the number of Query replicas to create.	Integer, for example, `2`

Table 15. Configuration parameters passed to Query
Parameter	Description	Values
spec: query: options: {}	Configuration options that define the Query service.
options: log-level:	Logging level for Query.	Possible values: `debug`, `info`, `warn`, `error`, `fatal`, `panic`.
options: query: base-path:	The base path for all jaeger-query HTTP routes can be set to a non-root value, for example, `/jaeger` would cause all UI URLs to start with `/jaeger`. This can be useful when running jaeger-query behind a reverse proxy.	/<path>

Sample Query configuration

apiVersion: jaegertracing.io/v1
kind: "Jaeger"
metadata:
  name: "my-jaeger"
spec:
  strategy: allInOne
  allInOne:
    options:
      log-level: debug
      query:
        base-path: /jaeger
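
For a deployment where Query runs as its own component, the parameters from Tables 14 and 15 might be combined as in the following sketch. It assumes the production strategy with self-provisioned Elasticsearch left at its defaults. The replica count and log level are illustrative.

```yaml
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: my-jaeger
spec:
  strategy: production
  query:
    replicas: 2             # spec.query.replicas from Table 14
    options:
      log-level: info       # one of the values listed in Table 15
      query:
        base-path: /jaeger  # UI URLs start with /jaeger
  storage:
    type: elasticsearch
```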

Ingester configuration options

Ingester is a service that reads from a Kafka topic and writes to the Elasticsearch storage backend. If you are using the allInOne or production deployment strategies, you do not need to configure the Ingester service.

Table 16. Jaeger parameters passed to the Ingester
Parameter	Description	Values
spec: ingester: options: {}	Configuration options that define the Ingester service.
options: deadlockInterval:	Specifies the interval, in seconds or minutes, that the Ingester must wait for a message before terminating. The deadlock interval is disabled by default (set to `0`), to avoid terminating the Ingester when no messages arrive during system initialization.	Minutes and seconds, for example, `1m0s`. Default value is `0`.
options: kafka: consumer: topic:	The `topic` parameter identifies the Kafka configuration used by the collector to produce the messages, and the Ingester to consume the messages.	Label for the consumer. For example, `jaeger-spans`.
options: kafka: consumer: brokers:	Identifies the Kafka configuration used by the Ingester to consume the messages.	Label for the broker, for example, `my-cluster-kafka-brokers.kafka:9092`.
options: log-level:	Logging level for the Ingester.	Possible values: `debug`, `info`, `warn`, `error`, `fatal`, `dpanic`, `panic`.

Streaming Collector and Ingester example

apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: simple-streaming
spec:
  strategy: streaming
  collector:
    options:
      kafka:
        producer:
          topic: jaeger-spans
          brokers: my-cluster-kafka-brokers.kafka:9092
  ingester:
    options:
      kafka:
        consumer:
          topic: jaeger-spans
          brokers: my-cluster-kafka-brokers.kafka:9092
      ingester:
        deadlockInterval: 5s
  storage:
    type: elasticsearch
    options:
      es:
        server-urls: http://elasticsearch:9200

Injecting sidecars

Red Hat OpenShift distributed tracing platform relies on a proxy sidecar within the application’s pod to provide the agent. The Red Hat OpenShift distributed tracing platform Operator can inject Agent sidecars into Deployment workloads. You can enable automatic sidecar injection or manage it manually.

Automatically injecting sidecars

The Red Hat OpenShift distributed tracing platform Operator can inject Jaeger Agent sidecars into Deployment workloads. To enable automatic injection of sidecars, add the sidecar.jaegertracing.io/inject annotation, set to either the string true or the name of the distributed tracing platform instance returned by running $ oc get jaegers. When you specify true, there must be only one distributed tracing platform instance in the same namespace as the deployment. Otherwise, the Operator cannot determine which instance to use. A specific distributed tracing platform instance name on a deployment takes precedence over true applied on its namespace.

The following snippet shows a simple application into which the Operator injects a sidecar, with the agent pointing to the single distributed tracing platform instance available in the same namespace:

Automatic sidecar injection example

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  annotations:
    "sidecar.jaegertracing.io/inject": "true" (1)
spec:
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: myapp
        image: acme/myapp:myversion

1	Set to either the string `true` or to the Jaeger instance name.

When the sidecar is injected, the agent can then be accessed at its default location on localhost.
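
When more than one distributed tracing platform instance exists in the namespace, the annotation can name the instance instead of using true. The following sketch assumes an instance called simple-prod, which is a hypothetical name. Use a name returned by oc get jaegers.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  annotations:
    # Hypothetical instance name; replace with one listed by "oc get jaegers".
    sidecar.jaegertracing.io/inject: "simple-prod"
spec:
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: myapp
        image: acme/myapp:myversion
```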

Manually injecting sidecars

The Red Hat OpenShift distributed tracing platform Operator can automatically inject Jaeger Agent sidecars only into Deployment workloads. For controller types other than Deployments, such as StatefulSets and DaemonSets, you can manually define the Jaeger agent sidecar in your specification.

The following snippet shows the manual definition you can include in your containers section for a Jaeger agent sidecar:

Sidecar definition example for a StatefulSet

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: example-statefulset
  namespace: example-ns
  labels:
    app: example-app
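spec:
  serviceName: example-statefulset # assumed name of the governing headless Service; adjust to your environment
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
    spec: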
      containers:
        - name: example-app
          image: acme/myapp:myversion
          ports:
            - containerPort: 8080
              protocol: TCP
        - name: jaeger-agent
          image: registry.redhat.io/distributed-tracing/jaeger-agent-rhel7:<version>
          # The agent version must match the Operator version
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 5775
              name: zk-compact-trft
              protocol: UDP
            - containerPort: 5778
              name: config-rest
              protocol: TCP
            - containerPort: 6831
              name: jg-compact-trft
              protocol: UDP
            - containerPort: 6832
              name: jg-binary-trft
              protocol: UDP
            - containerPort: 14271
              name: admin-http
              protocol: TCP
          args:
            - --reporter.grpc.host-port=dns:///jaeger-collector-headless.example-ns:14250
            - --reporter.type=grpc

The agent can then be accessed at its default location on localhost.
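
If the application's tracing client needs the agent address spelled out, one possible approach is to set it through environment variables on the application container. The fragment below is a sketch that assumes a Jaeger client library that reads JAEGER_AGENT_HOST and JAEGER_AGENT_PORT. Check your client's documentation, because many clients already default to these values.

```yaml
# Fragment of spec.template.spec.containers for the application container.
- name: example-app
  image: acme/myapp:myversion
  env:
    - name: JAEGER_AGENT_HOST   # assumed client setting; the sidecar shares the pod's network
      value: localhost
    - name: JAEGER_AGENT_PORT   # compact Thrift UDP port exposed by the sidecar above
      value: "6831"
```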
