Tracing | Serverless | OpenShift Container Platform 4.8

Distributed tracing overview
Using Red Hat OpenShift distributed tracing to enable distributed tracing
Using Jaeger to enable distributed tracing
Additional resources

Distributed tracing records the path of a request through the various services that make up an application. It is used to tie information about different units of work together, to understand a whole chain of events in a distributed transaction. The units of work might be executed in different processes or hosts.

Distributed tracing overview

As a service owner, you can use distributed tracing to instrument your services to gather insights into your service architecture. You can use distributed tracing for monitoring, network profiling, and troubleshooting the interaction between components in modern, cloud-native, microservices-based applications.

With distributed tracing you can perform the following functions:

Monitor distributed transactions
Optimize performance and latency
Perform root cause analysis

Red Hat OpenShift distributed tracing consists of two main components:

Red Hat OpenShift distributed tracing platform - This component is based on the open source Jaeger project.
Red Hat OpenShift distributed tracing data collection - This component is based on the open source OpenTelemetry project.

Both of these components are based on the vendor-neutral OpenTracing APIs and instrumentation.

Using Red Hat OpenShift distributed tracing to enable distributed tracing

Red Hat OpenShift distributed tracing is made up of several components that work together to collect, store, and display tracing data. You can use Red Hat OpenShift distributed tracing with OpenShift Serverless to monitor and troubleshoot serverless applications.

Prerequisites

You have access to an OpenShift Container Platform account with cluster administrator access.
You have not yet installed the OpenShift Serverless Operator, Knative Serving, and Knative Eventing. These must be installed after the Red Hat OpenShift distributed tracing installation.
You have installed Red Hat OpenShift distributed tracing by following the OpenShift Container Platform "Installing distributed tracing" documentation.
You have installed the OpenShift CLI (oc).
You have created a project or have access to a project with the appropriate roles and permissions to create applications and other workloads in OpenShift Container Platform.

Procedure

Create an OpenTelemetryCollector custom resource (CR):

Example OpenTelemetryCollector CR

apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: cluster-collector
  namespace: <namespace>
spec:
  mode: deployment
  config: |
    receivers:
      zipkin:
    processors:
    exporters:
      jaeger:
        endpoint: jaeger-all-in-one-inmemory-collector-headless.tracing-system.svc:14250
        tls:
          ca_file: "/var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt"
      logging:
    service:
      pipelines:
        traces:
          receivers: [zipkin]
          processors: []
          exporters: [jaeger, logging]
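
The CR above still has to be applied to the cluster. The following is a minimal sketch of that step, assuming the manifest is saved to a local file; the helper name `apply_cr` and the file name `collector.yaml` are hypothetical and do not come from the product documentation.

```shell
# Hypothetical helper: apply a CR manifest file with the OpenShift CLI.
# Usage: apply_cr <manifest-file>
apply_cr() {
  local manifest="$1"
  # Fail early with a clear message if the file is missing.
  if [ ! -f "$manifest" ]; then
    echo "manifest not found: $manifest" >&2
    return 1
  fi
  # `oc apply -f` creates the resource, or updates it if it already exists.
  oc apply -f "$manifest"
}

# Example (the file name is an assumption):
# apply_cr collector.yaml
```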

Verify that you have two pods running in the namespace where Red Hat OpenShift distributed tracing is installed:

$ oc get pods -n <namespace>

Example output

NAME                                          READY   STATUS    RESTARTS   AGE
cluster-collector-collector-85c766b5c-b5g99   1/1     Running   0          5m56s
jaeger-all-in-one-inmemory-ccbc9df4b-ndkl5    2/2     Running   0          15m
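
If you prefer to script this check, the pod listing can be reduced to a count. This is a sketch only; `running_pods` is a hypothetical helper, and it assumes the default column layout shown above, where STATUS is the third column.

```shell
# Hypothetical helper: count pods whose STATUS column is "Running".
# Usage: running_pods <namespace>
running_pods() {
  # --no-headers drops the column header so that every line is a pod.
  oc get pods -n "$1" --no-headers |
    awk '$3 == "Running" { n++ } END { print n + 0 }'
}

# Example: running_pods <namespace>
```

For the example output above, the helper would print `2`.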

Verify that the following headless services have been created:

$ oc get svc -n <namespace> | grep headless

Example output

cluster-collector-collector-headless            ClusterIP   None             <none>        9411/TCP                                 7m28s
jaeger-all-in-one-inmemory-collector-headless   ClusterIP   None             <none>        9411/TCP,14250/TCP,14267/TCP,14268/TCP   16m

These services are used to configure Jaeger, Knative Serving, and Knative Eventing. The name of the Jaeger service may vary.
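
The Knative CRs in the following steps reference the collector service through a `zipkin-endpoint` URL. As an illustration of how that URL is assembled from the service name, the namespace, port 9411, and the `/api/v2/spans` path, consider this sketch; `zipkin_endpoint` is a hypothetical helper, and the `tracing-system` namespace is taken from the examples in this procedure.

```shell
# Hypothetical helper: build the Zipkin span endpoint for a headless service.
# Usage: zipkin_endpoint <service-name> <namespace>
zipkin_endpoint() {
  printf 'http://%s.%s.svc:9411/api/v2/spans\n' "$1" "$2"
}

zipkin_endpoint cluster-collector-collector-headless tracing-system
# prints http://cluster-collector-collector-headless.tracing-system.svc:9411/api/v2/spans
```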

Install the OpenShift Serverless Operator by following the "Installing the OpenShift Serverless Operator" documentation.

Install Knative Serving by creating the following KnativeServing CR:

Example KnativeServing CR

apiVersion: operator.knative.dev/v1beta1
kind: KnativeServing
metadata:
  name: knative-serving
  namespace: knative-serving
spec:
  config:
    tracing:
      backend: "zipkin"
      zipkin-endpoint: "http://cluster-collector-collector-headless.tracing-system.svc:9411/api/v2/spans"
      debug: "false"
      sample-rate: "0.1" (1)

1	The `sample-rate` defines sampling probability. Using `sample-rate: "0.1"` means that 1 in 10 traces are sampled.
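
To make the sampling arithmetic concrete, the following sketch estimates how many traces to expect for a given number of requests. It assumes that each request starts exactly one trace; `expected_sampled` is a hypothetical helper, not part of any CLI.

```shell
# Hypothetical helper: expected number of sampled traces.
# Usage: expected_sampled <sample-rate> <request-count>
expected_sampled() {
  awk -v rate="$1" -v n="$2" 'BEGIN { printf "%.0f\n", rate * n }'
}

expected_sampled 0.1 1000   # prints 100
```

Because sampling is probabilistic, the real count varies around this value, so a handful of requests at a low rate might produce no traces at all.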

Install Knative Eventing by creating the following KnativeEventing CR:

Example KnativeEventing CR

apiVersion: operator.knative.dev/v1beta1
kind: KnativeEventing
metadata:
  name: knative-eventing
  namespace: knative-eventing
spec:
  config:
    tracing:
      backend: "zipkin"
      zipkin-endpoint: "http://cluster-collector-collector-headless.tracing-system.svc:9411/api/v2/spans"
      debug: "false"
      sample-rate: "0.1" (1)

1	The `sample-rate` defines sampling probability. Using `sample-rate: "0.1"` means that 1 in 10 traces are sampled.

Create a Knative service:

Example service

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: helloworld-go
spec:
  template:
    metadata:
      labels:
        app: helloworld-go
      annotations:
        autoscaling.knative.dev/minScale: "1"
        autoscaling.knative.dev/target: "1"
    spec:
      containers:
      - image: quay.io/openshift-knative/helloworld:v1.2
        imagePullPolicy: Always
        resources:
          requests:
            cpu: "200m"
        env:
        - name: TARGET
          value: "Go Sample v1"

Make some requests to the service:

Example HTTPS request

```
$ curl https://helloworld-go.example.com
```
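
With a sample rate of `0.1`, a single request might not be sampled, so it can help to send several. The loop below is a sketch under that assumption; `send_requests` is a hypothetical helper, and the URL is the placeholder from the example request.

```shell
# Hypothetical helper: send N requests and print one HTTP status code per line.
# Usage: send_requests <url> <count>
send_requests() {
  local url="$1" count="$2" i=0
  while [ "$i" -lt "$count" ]; do
    # -s: quiet, -o /dev/null: discard the body, -w: print only the status code.
    curl -s -o /dev/null -w '%{http_code}\n' "$url"
    i=$((i + 1))
  done
}

# Example: send_requests https://helloworld-go.example.com 20
```
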

Get the URL for the Jaeger web console:

Example command

```
$ oc get route jaeger-all-in-one-inmemory -o jsonpath='{.spec.host}' -n <namespace>
```

You can now examine traces by using the Jaeger console.

Using Jaeger to enable distributed tracing

If you do not want to install all of the components of Red Hat OpenShift distributed tracing, you can still use distributed tracing on OpenShift Container Platform with OpenShift Serverless. To do this, you must install and configure Jaeger as a standalone integration.

Prerequisites

You have access to an OpenShift Container Platform account with cluster administrator access.
You have installed the OpenShift Serverless Operator, Knative Serving, and Knative Eventing.
You have installed the Red Hat OpenShift distributed tracing platform Operator.
You have installed the OpenShift CLI (oc).
You have created a project or have access to a project with the appropriate roles and permissions to create applications and other workloads in OpenShift Container Platform.

Procedure

Create and apply a Jaeger custom resource (CR) that contains the following:

Jaeger CR

apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: jaeger
  namespace: default
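
The namespace chosen here determines the `jaeger-collector` endpoint used in the later steps, so it can be convenient to generate the manifest from a parameter. This is a sketch only; `jaeger_cr` is a hypothetical helper that prints the CR shown above with the namespace substituted.

```shell
# Hypothetical helper: print the Jaeger CR for a given namespace.
# Usage: jaeger_cr <namespace>
jaeger_cr() {
  cat <<EOF
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: jaeger
  namespace: $1
EOF
}

# Review the manifest, then pipe it to the cluster:
# jaeger_cr default | oc apply -f -
```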

Enable tracing for Knative Serving by editing the KnativeServing CR and adding a YAML configuration for tracing:

Tracing YAML example for Serving

apiVersion: operator.knative.dev/v1beta1
kind: KnativeServing
metadata:
  name: knative-serving
  namespace: knative-serving
spec:
  config:
    tracing:
      sample-rate: "0.1" (1)
      backend: zipkin (2)
      zipkin-endpoint: "http://jaeger-collector.default.svc.cluster.local:9411/api/v2/spans" (3)
      debug: "false" (4)

1	The `sample-rate` defines sampling probability. Using `sample-rate: "0.1"` means that 1 in 10 traces are sampled.
2	`backend` must be set to `zipkin`.
3	The `zipkin-endpoint` must point to your `jaeger-collector` service endpoint. To get this endpoint, substitute the namespace where the Jaeger CR is applied.
4	Debugging should be set to `false`. Enabling debug mode by setting `debug: "true"` allows all spans to be sent to the server, bypassing sampling.
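
As an alternative to editing the CR by hand, the same settings could be applied as a merge patch. The sketch below builds the patch body with the Jaeger namespace substituted into the endpoint; `tracing_patch` is a hypothetical helper, and the `oc patch` invocation in the comment is an assumption about how you might apply it, not a documented step.

```shell
# Hypothetical helper: build a JSON merge patch with the tracing settings.
# Usage: tracing_patch <jaeger-namespace> <sample-rate>
tracing_patch() {
  local endpoint="http://jaeger-collector.$1.svc.cluster.local:9411/api/v2/spans"
  printf '{"spec":{"config":{"tracing":{"backend":"zipkin","zipkin-endpoint":"%s","sample-rate":"%s","debug":"false"}}}}\n' \
    "$endpoint" "$2"
}

# Possible usage (assumption):
# oc patch knativeserving.operator.knative.dev/knative-serving -n knative-serving \
#   --type merge -p "$(tracing_patch default 0.1)"
```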

Enable tracing for Knative Eventing by editing the KnativeEventing CR:

Tracing YAML example for Eventing

apiVersion: operator.knative.dev/v1beta1
kind: KnativeEventing
metadata:
  name: knative-eventing
  namespace: knative-eventing
spec:
  config:
    tracing:
      sample-rate: "0.1" (1)
      backend: zipkin (2)
      zipkin-endpoint: "http://jaeger-collector.default.svc.cluster.local:9411/api/v2/spans" (3)
      debug: "false" (4)

1	The `sample-rate` defines sampling probability. Using `sample-rate: "0.1"` means that 1 in 10 traces are sampled.
2	Set `backend` to `zipkin`.
3	Point the `zipkin-endpoint` to your `jaeger-collector` service endpoint. To get this endpoint, substitute the namespace where the Jaeger CR is applied.
4	Debugging should be set to `false`. Enabling debug mode by setting `debug: "true"` allows all spans to be sent to the server, bypassing sampling.
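
After editing either CR, you might want to read the stored tracing settings back to confirm that they were saved. A minimal sketch, assuming the resource names used in this procedure; `show_tracing` is a hypothetical helper around `oc get` with a JSONPath expression.

```shell
# Hypothetical helper: print the tracing settings stored in a Knative operator CR.
# Usage: show_tracing <resource> <name> <namespace>
show_tracing() {
  oc get "$1" "$2" -n "$3" -o jsonpath='{.spec.config.tracing}'
}

# Example:
# show_tracing knativeeventing.operator.knative.dev knative-eventing knative-eventing
```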

Verification

You can access the Jaeger web console to see tracing data by using the jaeger route.

Get the jaeger route’s hostname by entering the following command:

$ oc get route jaeger -n default

Example output

NAME     HOST/PORT                         PATH   SERVICES       PORT    TERMINATION   WILDCARD
jaeger   jaeger-default.apps.example.com          jaeger-query   <all>   reencrypt     None

Open the endpoint address in your browser to view the console.
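
To get a URL you can paste directly into a browser, the route host can be extracted with a JSONPath expression instead of reading it from the table. This is a sketch; `route_url` is a hypothetical helper, and the `https://` scheme is an assumption based on the `reencrypt` termination shown in the example output.

```shell
# Hypothetical helper: print the HTTPS URL of a route.
# Usage: route_url <route-name> <namespace>
route_url() {
  local host
  # Stop if the route cannot be read, rather than printing a broken URL.
  host="$(oc get route "$1" -n "$2" -o jsonpath='{.spec.host}')" || return 1
  printf 'https://%s\n' "$host"
}

# Example: route_url jaeger default
```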
