Metrics, logs, and traces

Discovering console addresses
Accessing the Kiali console
Viewing service mesh data in the Kiali console
Distributed tracing
Accessing the Jaeger console
Accessing the Grafana console
Accessing the Prometheus console
Integrating with user-workload monitoring
Additional resources

Once you have added your application to the mesh, you can observe the data flow through your application. If you do not have your own application installed, you can see how observability works in Red Hat OpenShift service Mesh by installing the Bookinfo sample application.

Discovering console addresses

Red Hat OpenShift service Mesh provides the following consoles to view your service mesh data:

Kiali console - Kiali is the management console for Red Hat OpenShift service Mesh.
Jaeger console - Jaeger is the management console for Red Hat OpenShift distributed tracing platform.
Grafana console - Grafana provides mesh administrators with advanced query and metrics analysis and dashboards for Istio data. Optionally, Grafana can be used to analyze service mesh metrics.
Prometheus console - Red Hat OpenShift service Mesh uses Prometheus to store telemetry information from services.

When you install the service Mesh control plane, it automatically generates routes for each of the installed components. Once you have the route address, you can access the Kiali, Jaeger, Prometheus, or Grafana console to view and manage your service mesh data.

Prerequisite

The component must be enabled and installed. For example, if you did not install distributed tracing, you will not be able to access the Jaeger console.

Procedure from OpenShift console

Log in to the OpenShift Container Platform web console as a user with cluster-admin rights. If you use Red Hat OpenShift Dedicated, you must have an account with the dedicated-admin role.
Navigate to Networking → Routes.
On the Routes page, select the service Mesh control plane project, for example istio-system, from the Namespace menu.

The Location column displays the linked address for each route.
If necessary, use the filter to find the component console whose route you want to access. Click the route Location to launch the console.
Click Log In With OpenShift.

Procedure from the CLI

Log in to the OpenShift Container Platform CLI as a user with the cluster-admin role. If you use Red Hat OpenShift Dedicated, you must have an account with the dedicated-admin role.
```
$ oc login --username=<NAMEOFUSER> https://<HOSTNAME>:6443
```
Switch to the service Mesh control plane project. In this example, istio-system is the service Mesh control plane project. Run the following command:
```
$ oc project istio-system
```

To get the routes for the various Red Hat OpenShift service Mesh consoles, run the folowing command:

$ oc get routes

This command returns the URLs for the Kiali, Jaeger, Prometheus, and Grafana web consoles, and any other routes in your service mesh. You should see output similar to the following:

NAME                    HOST/PORT                         serviceS              PORT    TERMINATION
bookinfo-gateway        bookinfo-gateway-yourcompany.com  istio-ingressgateway          http2
grafana                 grafana-yourcompany.com           grafana               <all>   reencrypt/Redirect
istio-ingressgateway    istio-ingress-yourcompany.com     istio-ingressgateway  8080
jaeger                  jaeger-yourcompany.com            jaeger-query          <all>   reencrypt
kiali                   kiali-yourcompany.com             kiali                 20001   reencrypt/Redirect
prometheus              prometheus-yourcompany.com        prometheus            <all>   reencrypt/Redirect

Copy the URL for the console you want to access from the HOST/PORT column into a browser to open the console.
Click Log In With OpenShift.

Accessing the Kiali console

You can view your application’s topology, health, and metrics in the Kiali console. If your service is experiencing problems, the Kiali console lets you view the data flow through your service. You can view insights about the mesh components at different levels, including abstract applications, services, and workloads. Kiali also provides an interactive graph view of your namespace in real time.

To access the Kiali console you must have Red Hat OpenShift service Mesh installed, Kiali installed and configured.

The installation process creates a route to access the Kiali console.

If you know the URL for the Kiali console, you can access it directly. If you do not know the URL, use the following directions.

Procedure for administrators

Log in to the OpenShift Container Platform web console with an administrator role.
Click Home → Projects.
On the Projects page, if necessary, use the filter to find the name of your project.
Click the name of your project, for example, bookinfo.
On the Project details page, in the Launcher section, click the Kiali link.
Log in to the Kiali console with the same user name and password that you use to access the OpenShift Container Platform console.

When you first log in to the Kiali Console, you see the Overview page which displays all the namespaces in your service mesh that you have permission to view.

If you are validating the console installation and namespaces have not yet been added to the mesh, there might not be any data to display other than istio-system.

Procedure for developers

Log in to the OpenShift Container Platform web console with a developer role.
Click Project.
On the Project Details page, if necessary, use the filter to find the name of your project.
Click the name of your project, for example, bookinfo.
On the Project page, in the Launcher section, click the Kiali link.
Click Log In With OpenShift.

Viewing service mesh data in the Kiali console

The Kiali Graph offers a powerful visualization of your mesh traffic. The topology combines real-time request traffic with your Istio configuration information to present immediate insight into the behavior of your service mesh, letting you quickly pinpoint issues. Multiple Graph Types let you visualize traffic as a high-level service topology, a low-level workload topology, or as an application-level topology.

There are several graphs to choose from:

The App graph shows an aggregate workload for all applications that are labeled the same.
The service graph shows a node for each service in your mesh but excludes all applications and workloads from the graph. It provides a high level view and aggregates all traffic for defined services.
The Versioned App graph shows a node for each version of an application. All versions of an application are grouped together.
The Workload graph shows a node for each workload in your service mesh. This graph does not require you to use the application and version labels. If your application does not use version labels, use this the graph.

Graph nodes are decorated with a variety of information, pointing out various route routing options like virtual services and service entries, as well as special configuration like fault-injection and circuit breakers. It can identify mTLS issues, latency issues, error traffic and more. The Graph is highly configurable, can show traffic animation, and has powerful Find and Hide abilities.

Click the Legend button to view information about the shapes, colors, arrows, and badges displayed in the graph.

To view a summary of metrics, select any node or edge in the graph to display its metric details in the summary details panel.

Changing graph layouts in Kiali

The layout for the Kiali graph can render differently depending on your application architecture and the data to display. For example, the number of graph nodes and their interactions can determine how the Kiali graph is rendered. Because it is not possible to create a single layout that renders nicely for every situation, Kiali offers a choice of several different layouts.

Prerequisites

If you do not have your own application installed, install the Bookinfo sample application. Then generate traffic for the Bookinfo application by entering the following command several times.
```
$ curl "http://$GATEWAY_URL/productpage"
```
This command simulates a user visiting the productpage microservice of the application.

Procedure

Launch the Kiali console.
Click Log In With OpenShift.
In Kiali console, click Graph to view a namespace graph.
From the Namespace menu, select your application namespace, for example, bookinfo.
To choose a different graph layout, do either or both of the following:
- Select different graph data groupings from the menu at the top of the graph.
  - App graph
  - service graph
  - Versioned App graph (default)
  - Workload graph
- Select a different graph layout from the Legend at the bottom of the graph.
  - Layout default dagre
  - Layout 1 cose-bilkent
  - Layout 2 cola

Viewing logs in the Kiali console

You can view logs for your workloads in the Kiali console. The Workload Detail page includes a Logs tab which displays a unified logs view that displays both application and proxy logs. You can select how often you want the log display in Kiali to be refreshed.

To change the logging level on the logs displayed in Kiali, you change the logging configuration for the workload or the proxy.

Prerequisites

service Mesh installed and configured.
Kiali installed and configured.
The address for the Kiali console.
Application or Bookinfo sample application added to the mesh.

Procedure

Launch the Kiali console.
Click Log In With OpenShift.

The Kiali Overview page displays namespaces that have been added to the mesh that you have permissions to view.
Click Workloads.
On the Workloads page, select the project from the Namespace menu.
If necessary, use the filter to find the workload whose logs you want to view. Click the workload Name. For example, click ratings-v1.
On the Workload Details page, click the Logs tab to view the logs for the workload.

If you do not see any log entries, you may need to adjust either the Time Range or the Refresh interval.

Viewing metrics in the Kiali console

You can view inbound and outbound metrics for your applications, workloads, and services in the Kiali console. The Detail pages include the following tabs:

inbound Application metrics
outbound Application metrics
inbound Workload metrics
outbound Workload metrics
inbound service metrics

These tabs display predefined metrics dashboards, tailored to the relevant application, workload or service level. The application and workload detail views show request and response metrics such as volume, duration, size, or TCP traffic. The service detail view shows request and response metrics for inbound traffic only.

Kiali lets you customize the charts by choosing the charted dimensions. Kiali can also present metrics reported by either source or destination proxy metrics. And for troubleshooting, Kiali can overlay trace spans on the metrics.

Prerequisites

service Mesh installed and configured.
Kiali installed and configured.
The address for the Kiali console.
(Optional) Distributed tracing installed and configured.

Procedure

Launch the Kiali console.
Click Log In With OpenShift.

The Kiali Overview page displays namespaces that have been added to the mesh that you have permissions to view.
Click either Applications, Workloads, or services.
On the Applications, Workloads, or services page, select the project from the Namespace menu.
If necessary, use the filter to find the application, workload, or service whose logs you want to view. Click the Name.
On the Application Detail, Workload Details, or service Details page, click either the Inbound Metrics or Outbound Metrics tab to view the metrics.

Distributed tracing

Distributed tracing is the process of tracking the performance of individual services in an application by tracing the path of the service calls in the application. Each time a user takes action in an application, a request is executed that might require many services to interact to produce a response. The path of this request is called a distributed transaction.

Red Hat OpenShift service Mesh uses Red Hat OpenShift distributed tracing platform to allow developers to view call flows in a microservice application.

Configuring the Red Hat OpenShift distributed tracing platform (Tempo) and the Red Hat build of OpenTelemetry

You can expose tracing data to the Red Hat OpenShift distributed tracing platform (Tempo) by appending a named element and the opentelemetry provider to the spec.meshConfig.extensionProviders specification in the serviceMeshControlPlane. Then, a telemetry custom resource configures Istio proxies to collect trace spans and send them to the OpenTelemetry Collector endpoint.

You can create a Red Hat build of OpenTelemetry instance in a mesh namespace and configure it to send tracing data to a tracing platform backend service.

Prerequisites

You created a TempoStack instance using the Red Hat Tempo Operator in the tracing-system namespace. For more information, see "Installing Red Hat OpenShift distributed tracing platform (Tempo)".
You installed the Red Hat build of OpenTelemetry Operator in either the recommended namespace or the openshift-operators namespace. For more information, see "Installing the Red Hat build of OpenTelemetry".
If using Red Hat OpenShift service Mesh 2.5 or earlier, set the spec.tracing.type parameter of the serviceMeshControlPlane resource to None so tracing data can be sent to the OpenTelemetry Collector.

Procedure

Create an OpenTelemetry Collector instance in a mesh namespace. This example uses the bookinfo namespace:

Example OpenTelemetry Collector configuration

apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: otel
  namespace: bookinfo  (1)
spec:
  mode: deployment
  config: |
    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: 0.0.0.0:4317
    exporters:
      otlp:
        endpoint: "tempo-sample-distributor.tracing-system.svc.cluster.local:4317" (2)
        tls:
          insecure: true
    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: []
          exporters: [otlp]

1	Include the namespace in the `serviceMeshMemberRoll` member list.
2	In this example, a TempoStack instance is running in the `tracing-system` namespace. You do not have to include the TempoStack namespace, such as`tracing-system`, in the `serviceMeshMemberRoll` member list.

Create a single instance of the OpenTelemetry Collector in one of the serviceMeshMemberRoll member namespaces.
You can add an otel-collector as a part of the mesh by adding sidecar.istio.io/inject: 'true' to the OpenTelemetryCollector resource.

Check the otel-collector pod log and verify that the pod is running:
Example otel-collector pod log check
```
$ oc logs -n bookinfo  -l app.kubernetes.io/name=otel-collector
```

Create or update an existing serviceMeshControlPlane custom resource (CR) in the istio-system namespace:

Example SMCP custom resource

kind: serviceMeshControlPlane
apiVersion: maistra.io/v2
metadata:
  name: basic
  namespace: istio-system
spec:
  addons:
    grafana:
      enabled: false
    kiali:
      enabled: true
    prometheus:
      enabled: true
  meshConfig:
    extensionProviders:
      - name: otel
        opentelemetry:
          port: 4317
          service: otel-collector.bookinfo.svc.cluster.local
  policy:
    type: Istiod
  telemetry:
    type: Istiod
  version: v2.6

When upgrading from SMCP 2.5 to 2.6, set the spec.tracing.type parameter to None:

Example SMCP spec.tracing.type parameter

spec:
  tracing:
    type: None

Create a Telemetry resource in the istio-system namespace:

Example Telemetry resource

apiVersion: telemetry.istio.io/v1alpha1
kind: Telemetry
metadata:
  name: mesh-default
  namespace: istio-system
spec:
  tracing:
  - providers:
    - name: otel
    randomSamplingPercentage: 100

Verify the istiod log.

Configure the Kiali resource specification to enable a Kiali workload traces dashboard. You can use the dashboard to view tracing query results.

Example Kiali resource

apiVersion: kiali.io/v1alpha1
kind: Kiali
# ...
spec:
  external_services:
    tracing:
      query_timeout: 30 (1)
      enabled: true
      in_cluster_url: 'http://tempo-sample-query-frontend.tracing-system.svc.cluster.local:16685'
      url: '[Tempo query frontend Route url]'
      use_grpc: true (2)

1	The default `query_timeout` integer value is 30 seconds. If you set the value to greater than 30 seconds, you must update `.spec.server.write_timeout` in the Kiali CR and add the annotation `haproxy.router.openshift.io/timeout=50s` to the Kiali route. Both `.spec.server.write_timeout` and `haproxy.router.openshift.io/timeout=` must be greater than `query_timeout`.
2	If you are not using the default HTTP or gRPC port, replace the `in_cluster_url:` port with your custom port.

Kiali 1.73 uses the Jaeger Query API, which causes a longer response time depending on Tempo resource limits. If you see a Could not fetch spans error message in the Kiali UI, then check your Tempo configuration or reduce the limit per query in Kiali.

Send requests to your application.
Verify the istiod pod logs and the otel-collector pod logs.

Configuring the `OpenTelemetryCollector` in a mTLS encrypted service Mesh member namespace

All traffic is TLS encrypted when you enable service Mesh dataPlane mTLS encryption.

To enable the mesh to communicate with the OpenTelemetryCollector service, disable the TLS trafficPolicy by applying a DestinationRule for the OpenTelemetryCollector service:

Example DestinationRule Tempo CR

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: otel-disable-tls
spec:
  host: "otel-collector.bookinfo.svc.cluster.local"
  trafficPolicy:
    tls:
      mode: DISABLE

Configuring the Red Hat OpenShift distributed tracing platform (Tempo) in a mTLS encrypted service Mesh member namespace

You don’t need this additional DestinationRule configuration if you created a TempoStack instance in a namespace that is not a service Mesh member namespace.

All traffic is TLS encrypted when you enable service Mesh dataPlane mTLS encryption and you create a TempoStack instance in a service Mesh member namespace such as tracing-system-mtls. This encryption is not expected from the Tempo distributed service and returns a TLS error.

To fix the TLS error, disable the TLS trafficPolicy by applying a DestinationRule for Tempo and Kiali:

Example DestinationRule Tempo

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: tempo
  namespace: tracing-system-mtls
spec:
  host: "*.tracing-system-mtls.svc.cluster.local"
  trafficPolicy:
    tls:
      mode: DISABLE

Example DestinationRule Kiali

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: kiali
  namespace: istio-system
spec:
  host: kiali.istio-system.svc.cluster.local
  trafficPolicy:
    tls:
      mode: DISABLE

Connecting an existing distributed tracing Jaeger instance

If you already have an existing Red Hat OpenShift distributed tracing platform (Jaeger) instance in OpenShift Container Platform, you can configure your serviceMeshControlPlane resource to use that instance for distributed tracing platform.

Starting with Red Hat OpenShift service Mesh 2.5, Red Hat OpenShift distributed tracing platform (Jaeger) and OpenShift Elasticsearch Operator are deprecated and will be removed in a future release. Red Hat will provide bug fixes and support for these features during the current release lifecycle, but these features will no longer receive enhancements and will be removed. As an alternative to Red Hat OpenShift distributed tracing platform (Jaeger), you can use Red Hat OpenShift distributed tracing platform (Tempo) instead.

Prerequisites

Red Hat OpenShift distributed tracing platform instance installed and configured.

Procedure

In the OpenShift Container Platform web console, click Operators → Installed Operators.
Click the Project menu and select the project where you installed the service Mesh control plane, for example istio-system.
Click the Red Hat OpenShift service Mesh Operator. In the Istio service Mesh Control Plane column, click the name of your serviceMeshControlPlane resource, for example basic.
Add the name of your distributed tracing platform (Jaeger) instance to the serviceMeshControlPlane.
1. Click the YAML tab.
2. Add the name of your distributed tracing platform (Jaeger) instance to spec.addons.jaeger.name in your serviceMeshControlPlane resource. In the following example, distr-tracing-production is the name of the distributed tracing platform (Jaeger) instance.
  Example distributed tracing configuration
  spec: addons: jaeger: name: distr-tracing-production
3. Click Save.
Click Reload to verify the serviceMeshControlPlane resource was configured correctly.

Adjusting the sampling rate

A trace is an execution path between services in the service mesh. A trace is comprised of one or more spans. A span is a logical unit of work that has a name, start time, and duration. The sampling rate determines how often a trace is persisted.

The Envoy proxy sampling rate is set to sample 100% of traces in your service mesh by default. A high sampling rate consumes cluster resources and performance but is useful when debugging issues. Before you deploy Red Hat OpenShift service Mesh in production, set the value to a smaller proportion of traces. For example, set spec.tracing.sampling to 100 to sample 1% of traces.

Configure the Envoy proxy sampling rate as a scaled integer representing 0.01% increments.

In a basic installation, spec.tracing.sampling is set to 10000, which samples 100% of traces. For example:

Setting the value to 10 samples 0.1% of traces.
Setting the value to 500 samples 5% of traces.

The Envoy proxy sampling rate applies for applications that are available to a service Mesh, and use the Envoy proxy. This sampling rate determines how much data the Envoy proxy collects and tracks.

The Jaeger remote sampling rate applies to applications that are external to the service Mesh, and do not use the Envoy proxy, such as a database. This sampling rate determines how much data the distributed tracing system collects and stores. For more information, see Distributed tracing configuration options.

Procedure

In the OpenShift Container Platform web console, click Operators → Installed Operators.
Click the Project menu and select the project where you installed the control plane, for example istio-system.
Click the Red Hat OpenShift service Mesh Operator. In the Istio service Mesh Control Plane column, click the name of your serviceMeshControlPlane resource, for example basic.
To adjust the sampling rate, set a different value for spec.tracing.sampling.
1. Click the YAML tab.
2. Set the value for spec.tracing.sampling in your serviceMeshControlPlane resource. In the following example, set it to 100.
  Jaeger sampling example
  spec: tracing: sampling: 100
3. Click Save.
Click Reload to verify the serviceMeshControlPlane resource was configured correctly.

Accessing the Jaeger console

To access the Jaeger console you must have Red Hat OpenShift service Mesh installed, Red Hat OpenShift distributed tracing platform (Jaeger) installed and configured.

The installation process creates a route to access the Jaeger console.

If you know the URL for the Jaeger console, you can access it directly. If you do not know the URL, use the following directions.

Starting with Red Hat OpenShift service Mesh 2.5, Red Hat OpenShift distributed tracing platform (Jaeger) and OpenShift Elasticsearch Operator have been deprecated and will be removed in a future release. Red Hat will provide bug fixes and support for this feature during the current release lifecycle, but this feature will no longer receive enhancements and will be removed. As an alternative to Red Hat OpenShift distributed tracing platform (Jaeger), you can use Red Hat OpenShift distributed tracing platform (Tempo) instead.

Procedure from OpenShift console

Log in to the OpenShift Container Platform web console as a user with cluster-admin rights. If you use Red Hat OpenShift Dedicated, you must have an account with the dedicated-admin role.
Navigate to Networking → Routes.
On the Routes page, select the service Mesh control plane project, for example istio-system, from the Namespace menu.

The Location column displays the linked address for each route.
If necessary, use the filter to find the jaeger route. Click the route Location to launch the console.
Click Log In With OpenShift.

Procedure from Kiali console

Launch the Kiali console.
Click Distributed Tracing in the left navigation pane.
Click Log In With OpenShift.

Procedure from the CLI

Log in to the OpenShift Container Platform CLI as a user with the cluster-admin role. If you use Red Hat OpenShift Dedicated, you must have an account with the dedicated-admin role.
```
$ oc login --username=<NAMEOFUSER> https://<HOSTNAME>:6443
```
To query for details of the route using the command line, enter the following command. In this example, istio-system is the service Mesh control plane namespace.
```
$ oc get route -n istio-system jaeger -o jsonpath='{.spec.host}'
```
Launch a browser and navigate to https://<JAEGER_URL>, where <JAEGER_URL> is the route that you discovered in the previous step.
Log in using the same user name and password that you use to access the OpenShift Container Platform console.
If you have added services to the service mesh and have generated traces, you can use the filters and Find Traces button to search your trace data.

If you are validating the console installation, there is no trace data to display.

For more information about configuring Jaeger, see the distributed tracing documentation.

Accessing the Grafana console

Grafana is an analytics tool you can use to view, query, and analyze your service mesh metrics. In this example, istio-system is the service Mesh control plane namespace. To access Grafana, do the following:

Procedure

Log in to the OpenShift Container Platform web console.
Click the Project menu and select the project where you installed the service Mesh control plane, for example istio-system.
Click Routes.
Click the link in the Location column for the Grafana row.
Log in to the Grafana console with your OpenShift Container Platform credentials.

Accessing the Prometheus console

Prometheus is a monitoring and alerting tool that you can use to collect multi-dimensional data about your microservices. In this example, istio-system is the service Mesh control plane namespace.

Procedure

Log in to the OpenShift Container Platform web console.
Click the Project menu and select the project where you installed the service Mesh control plane, for example istio-system.
Click Routes.
Click the link in the Location column for the Prometheus row.
Log in to the Prometheus console with your OpenShift Container Platform credentials.

Integrating with user-workload monitoring

By default, Red Hat OpenShift service Mesh (OSSM) installs the service Mesh control plane (SMCP) with a dedicated instance of Prometheus for collecting metrics from a mesh. However, production systems need more advanced monitoring systems, like OpenShift Container Platform monitoring for user-defined projects.

The following steps show how to integrate service Mesh with user-workload monitoring.

Prerequisites

User-workload monitoring is enabled.
Red Hat OpenShift service Mesh Operator 2.4 is installed.
Kiali Operator 1.65 is installed.

Procedure

Grant the cluster-monitoring-view role to the Kiali service Account:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kiali-monitoring-rbac
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-monitoring-view
subjects:
- kind: serviceAccount
  name: kiali-service-account
  namespace: istio-system

Configure Kiali for user-workload monitoring:

apiVersion: kiali.io/v1alpha1
kind: Kiali
metadata:
  name: kiali-user-workload-monitoring
  namespace: istio-system
spec:
  external_services:
    prometheus:
      auth:
        type: bearer
        use_kiali_token: true
      query_scope:
        mesh_id: "basic-istio-system"
      thanos_proxy:
        enabled: true
      url: https://thanos-querier.openshift-monitoring.svc.cluster.local:9091

If you use Istio Operator 2.4, use this configuration to configure Kiali for user-workload monitoring:

apiVersion: kiali.io/v1alpha1
kind: Kiali
metadata:
  name: kiali-user-workload-monitoring
  namespace: istio-system
spec:
  external_services:
    istio:
      config_map_name: istio-<smcp-name>
      istio_sidecar_injector_config_map_name: istio-sidecar-injector-<smcp-name>
      istiod_deployment_name: istiod-<smcp-name>
      url_service_version: 'http://istiod-<smcp-name>.istio-system:15014/version'
    prometheus:
      auth:
        token: secret:thanos-querier-web-token:token
        type: bearer
        use_kiali_token: false
      query_scope:
        mesh_id: "basic-istio-system"
      thanos_proxy:
        enabled: true
      url: https://thanos-querier.openshift-monitoring.svc.cluster.local:9091
  version: v1.65

Configure the SMCP for external Prometheus:

apiVersion: maistra.io/v2
kind: serviceMeshControlPlane
metadata:
  name: basic
  namespace: istio-system
spec:
  addons:
    prometheus:
      enabled: false (1)
    grafana:
      enabled: false (2)
    kiali:
      name: kiali-user-workload-monitoring
  meshConfig:
    extensionProviders:
    - name: prometheus
      prometheus: {}

1	Disable the default Prometheus instance provided by OSSM.
2	Disable Grafana. It is not supported with an external Prometheus instance.

Apply a custom network policy to allow ingress traffic from the monitoring namespace:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: user-workload-access
  namespace: istio-system (1)
spec:
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          network.openshift.io/policy-group: monitoring
  podSelector: {}
  policyTypes:
  - Ingress

1	The custom network policy must be applied to all namespaces.

Apply a Telemetry object to enable traffic metrics in Istio proxies:

apiVersion: telemetry.istio.io/v1alpha1
kind: Telemetry
metadata:
  name: enable-prometheus-metrics
  namespace: istio-system (1)
spec:
  selector: (2)
    matchLabels:
      app: bookinfo
  metrics:
  - providers:
    - name: prometheus

1	A `Telemetry` object created in the control plane namespace applies to all workloads in a mesh. To apply telemetry to only one namespace, create the object in the target namespace.
2	Optional: Setting the `selector.matchLabels` spec applies the `Telemetry` object to specific workloads in the target namespace.

Apply a serviceMonitor object to monitor the Istio control plane:

apiVersion: monitoring.coreos.com/v1
kind: serviceMonitor
metadata:
  name: istiod-monitor
  namespace: istio-system (1)
spec:
  targetLabels:
  - app
  selector:
    matchLabels:
      istio: pilot
  endpoints:
  - port: http-monitoring
    interval: 30s
    relabelings:
    - action: replace
      replacement: "basic-istio-system" (2)
      targetLabel: mesh_id

1	Create this `serviceMonitor` object in the Istio control plane namespace because it monitors the Istiod service. In this example, the namespace is `istio-system`.
2	The string `"basic-istio-system"` is a combination of the SMCP name and its namespace, but any label can be used as long as it is unique for every mesh using user workload monitoring in the cluster. The `spec.prometheus.query_scope` of the Kiali resource configured in Step 2 needs to match this value.

If there is only one mesh using user-workload monitoring, then both the mesh_id relabeling and the spec.prometheus.query_scope field in the Kiali resource are optional (but the query_scope field given here should be removed if the mesh_id label is removed).

If multiple mesh instances on the cluster might use user-workload monitoring, then both the mesh_id relabelings and the spec.prometheus.query_scope field in the Kiali resource are required. This ensures that Kiali only sees metrics from its associated mesh.

If you are not deploying Kiali, you can still apply mesh_id relabeling so that metrics from different meshes can be distinguished from one another.

Apply a PodMonitor object to collect metrics from Istio proxies:

apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: istio-proxies-monitor
  namespace: istio-system (1)
spec:
  selector:
    matchExpressions:
    - key: istio-prometheus-ignore
      operator: DoesNotExist
  podMetricsEndpoints:
  - path: /stats/prometheus
    interval: 30s
    relabelings:
    - action: keep
      sourceLabels: [__meta_kubernetes_pod_container_name]
      regex: "istio-proxy"
    - action: keep
      sourceLabels: [__meta_kubernetes_pod_annotationpresent_prometheus_io_scrape]
    - action: replace
      regex: (\d+);(([A-Fa-f0-9]{1,4}::?){1,7}[A-Fa-f0-9]{1,4})
      replacement: '[$2]:$1'
      sourceLabels: [__meta_kubernetes_pod_annotation_prometheus_io_port,
      __meta_kubernetes_pod_ip]
      targetLabel: __address__
    - action: replace
      regex: (\d+);((([0-9]+?)(\.|$)){4})
      replacement: $2:$1
      sourceLabels: [__meta_kubernetes_pod_annotation_prometheus_io_port,
      __meta_kubernetes_pod_ip]
      targetLabel: __address__
    - action: labeldrop
      regex: "__meta_kubernetes_pod_label_(.+)"
    - sourceLabels: [__meta_kubernetes_namespace]
      action: replace
      targetLabel: namespace
    - sourceLabels: [__meta_kubernetes_pod_name]
      action: replace
      targetLabel: pod_name
    - action: replace
      replacement: "basic-istio-system" (2)
      targetLabel: mesh_id

1	Since OpenShift Container Platform monitoring ignores the `namespaceSelector` spec in `serviceMonitor` and `PodMonitor` objects, you must apply the `PodMonitor` object in all mesh namespaces, including the control plane namespace.
2	The string `"basic-istio-system"` is a combination of the SMCP name and its namespace, but any label can be used as long as it is unique for every mesh using user workload monitoring in the cluster. The `spec.prometheus.query_scope` of the Kiali resource configured in Step 2 needs to match this value.

If you are not deploying Kiali, you can still apply mesh_id relabeling so that metrics from different meshes can be distinguished from one another.

Open the OpenShift Container Platform web console, and check that metrics are visible.

Discovering console addresses

Accessing the Kiali console

Viewing service mesh data in the Kiali console

Changing graph layouts in Kiali

Viewing logs in the Kiali console

Viewing metrics in the Kiali console

Distributed tracing

Configuring the Red Hat OpenShift distributed tracing platform (Tempo) and the Red Hat build of OpenTelemetry

Configuring the OpenTelemetryCollector in a mTLS encrypted service Mesh member namespace

Configuring the Red Hat OpenShift distributed tracing platform (Tempo) in a mTLS encrypted service Mesh member namespace

Connecting an existing distributed tracing Jaeger instance

Adjusting the sampling rate

Accessing the Jaeger console

Accessing the Grafana console

Accessing the Prometheus console

Integrating with user-workload monitoring

Additional resources

Configuring the `OpenTelemetryCollector` in a mTLS encrypted service Mesh member namespace