This is a cache of https://docs.openshift.com/container-platform/4.11/service_mesh/v2x/ossm-observability.html. It is a snapshot of the page at 2024-11-27T15:49:01.817+0000.
Metrics, logs, and traces - Service Mesh 2.x | Service Mesh | OpenShift Container Platform 4.11
×

Discovering console addresses

Red Hat OpenShift Service Mesh provides the following consoles to view your service mesh data:

  • Kiali console - Kiali is the management console for Red Hat OpenShift Service Mesh.

  • Jaeger console - Jaeger is the management console for Red Hat OpenShift distributed tracing platform.

  • Grafana console - Grafana provides mesh administrators with advanced query and metrics analysis and dashboards for Istio data. Optionally, Grafana can be used to analyze service mesh metrics.

  • Prometheus console - Red Hat OpenShift Service Mesh uses Prometheus to store telemetry information from services.

When you install the Service Mesh control plane, it automatically generates routes for each of the installed components. Once you have the route address, you can access the Kiali, Jaeger, Prometheus, or Grafana console to view and manage your service mesh data.

Prerequisite
  • The component must be enabled and installed. For example, if you did not install distributed tracing, you will not be able to access the Jaeger console.

Procedure from OpenShift console
  1. Log in to the OpenShift Container Platform web console as a user with cluster-admin rights. If you use Red Hat OpenShift Dedicated, you must have an account with the dedicated-admin role.

  2. Navigate to NetworkingRoutes.

  3. On the Routes page, select the Service Mesh control plane project, for example istio-system, from the Namespace menu.

    The Location column displays the linked address for each route.

  4. If necessary, use the filter to find the component console whose route you want to access. Click the route Location to launch the console.

  5. Click Log In With OpenShift.

Procedure from the CLI
  1. Log in to the OpenShift Container Platform CLI as a user with the cluster-admin role. If you use Red Hat OpenShift Dedicated, you must have an account with the dedicated-admin role.

    $ oc login --username=<NAMEOFuser> https://<HOSTNAME>:6443
  2. Switch to the Service Mesh control plane project. In this example, istio-system is the Service Mesh control plane project. Run the following command:

    $ oc project istio-system
  3. To get the routes for the various Red Hat OpenShift Service Mesh consoles, run the folowing command:

    $ oc get routes

    This command returns the URLs for the Kiali, Jaeger, Prometheus, and Grafana web consoles, and any other routes in your service mesh. You should see output similar to the following:

    NAME                    HOST/PORT                         SERVICES              PORT    TERMINATION
    bookinfo-gateway        bookinfo-gateway-yourcompany.com  istio-ingressgateway          http2
    grafana                 grafana-yourcompany.com           grafana               <all>   reencrypt/Redirect
    istio-ingressgateway    istio-ingress-yourcompany.com     istio-ingressgateway  8080
    jaeger                  jaeger-yourcompany.com            jaeger-query          <all>   reencrypt
    kiali                   kiali-yourcompany.com             kiali                 20001   reencrypt/Redirect
    prometheus              prometheus-yourcompany.com        prometheus            <all>   reencrypt/Redirect
  4. Copy the URL for the console you want to access from the HOST/PORT column into a browser to open the console.

  5. Click Log In With OpenShift.

Accessing the Kiali console

You can view your application’s topology, health, and metrics in the Kiali console. If your service is experiencing problems, the Kiali console lets you view the data flow through your service. You can view insights about the mesh components at different levels, including abstract applications, services, and workloads. Kiali also provides an interactive graph view of your namespace in real time.

To access the Kiali console you must have Red Hat OpenShift Service Mesh installed, Kiali installed and configured.

The installation process creates a route to access the Kiali console.

If you know the URL for the Kiali console, you can access it directly. If you do not know the URL, use the following directions.

Procedure for administrators
  1. Log in to the OpenShift Container Platform web console with an administrator role.

  2. Click HomeProjects.

  3. On the Projects page, if necessary, use the filter to find the name of your project.

  4. Click the name of your project, for example, bookinfo.

  5. On the Project details page, in the Launcher section, click the Kiali link.

  6. Log in to the Kiali console with the same user name and password that you use to access the OpenShift Container Platform console.

    When you first log in to the Kiali Console, you see the Overview page which displays all the namespaces in your service mesh that you have permission to view.

    If you are validating the console installation and namespaces have not yet been added to the mesh, there might not be any data to display other than istio-system.

Procedure for developers
  1. Log in to the OpenShift Container Platform web console with a developer role.

  2. Click Project.

  3. On the Project Details page, if necessary, use the filter to find the name of your project.

  4. Click the name of your project, for example, bookinfo.

  5. On the Project page, in the Launcher section, click the Kiali link.

  6. Click Log In With OpenShift.

Viewing service mesh data in the Kiali console

The Kiali Graph offers a powerful visualization of your mesh traffic. The topology combines real-time request traffic with your Istio configuration information to present immediate insight into the behavior of your service mesh, letting you quickly pinpoint issues. Multiple Graph Types let you visualize traffic as a high-level service topology, a low-level workload topology, or as an application-level topology.

There are several graphs to choose from:

  • The App graph shows an aggregate workload for all applications that are labeled the same.

  • The Service graph shows a node for each service in your mesh but excludes all applications and workloads from the graph. It provides a high level view and aggregates all traffic for defined services.

  • The Versioned App graph shows a node for each version of an application. All versions of an application are grouped together.

  • The Workload graph shows a node for each workload in your service mesh. This graph does not require you to use the application and version labels. If your application does not use version labels, use this the graph.

Graph nodes are decorated with a variety of information, pointing out various route routing options like virtual services and service entries, as well as special configuration like fault-injection and circuit breakers. It can identify mTLS issues, latency issues, error traffic and more. The Graph is highly configurable, can show traffic animation, and has powerful Find and Hide abilities.

Click the Legend button to view information about the shapes, colors, arrows, and badges displayed in the graph.

To view a summary of metrics, select any node or edge in the graph to display its metric details in the summary details panel.

Changing graph layouts in Kiali

The layout for the Kiali graph can render differently depending on your application architecture and the data to display. For example, the number of graph nodes and their interactions can determine how the Kiali graph is rendered. Because it is not possible to create a single layout that renders nicely for every situation, Kiali offers a choice of several different layouts.

Prerequisites
  • If you do not have your own application installed, install the Bookinfo sample application. Then generate traffic for the Bookinfo application by entering the following command several times.

    $ curl "http://$GATEWAY_URL/productpage"

    This command simulates a user visiting the productpage microservice of the application.

Procedure
  1. Launch the Kiali console.

  2. Click Log In With OpenShift.

  3. In Kiali console, click Graph to view a namespace graph.

  4. From the Namespace menu, select your application namespace, for example, bookinfo.

  5. To choose a different graph layout, do either or both of the following:

    • Select different graph data groupings from the menu at the top of the graph.

      • App graph

      • Service graph

      • Versioned App graph (default)

      • Workload graph

    • Select a different graph layout from the Legend at the bottom of the graph.

      • Layout default dagre

      • Layout 1 cose-bilkent

      • Layout 2 cola

Viewing logs in the Kiali console

You can view logs for your workloads in the Kiali console. The Workload Detail page includes a Logs tab which displays a unified logs view that displays both application and proxy logs. You can select how often you want the log display in Kiali to be refreshed.

To change the logging level on the logs displayed in Kiali, you change the logging configuration for the workload or the proxy.

Prerequisites
  • Service Mesh installed and configured.

  • Kiali installed and configured.

  • The address for the Kiali console.

  • Application or Bookinfo sample application added to the mesh.

Procedure
  1. Launch the Kiali console.

  2. Click Log In With OpenShift.

    The Kiali Overview page displays namespaces that have been added to the mesh that you have permissions to view.

  3. Click Workloads.

  4. On the Workloads page, select the project from the Namespace menu.

  5. If necessary, use the filter to find the workload whose logs you want to view. Click the workload Name. For example, click ratings-v1.

  6. On the Workload Details page, click the Logs tab to view the logs for the workload.

If you do not see any log entries, you may need to adjust either the Time Range or the Refresh interval.

Viewing metrics in the Kiali console

You can view inbound and outbound metrics for your applications, workloads, and services in the Kiali console. The Detail pages include the following tabs:

  • inbound Application metrics

  • outbound Application metrics

  • inbound Workload metrics

  • outbound Workload metrics

  • inbound Service metrics

These tabs display predefined metrics dashboards, tailored to the relevant application, workload or service level. The application and workload detail views show request and response metrics such as volume, duration, size, or TCP traffic. The service detail view shows request and response metrics for inbound traffic only.

Kiali lets you customize the charts by choosing the charted dimensions. Kiali can also present metrics reported by either source or destination proxy metrics. And for troubleshooting, Kiali can overlay trace spans on the metrics.

Prerequisites
  • Service Mesh installed and configured.

  • Kiali installed and configured.

  • The address for the Kiali console.

  • (Optional) Distributed tracing installed and configured.

Procedure
  1. Launch the Kiali console.

  2. Click Log In With OpenShift.

    The Kiali Overview page displays namespaces that have been added to the mesh that you have permissions to view.

  3. Click either Applications, Workloads, or Services.

  4. On the Applications, Workloads, or Services page, select the project from the Namespace menu.

  5. If necessary, use the filter to find the application, workload, or service whose logs you want to view. Click the Name.

  6. On the Application Detail, Workload Details, or Service Details page, click either the Inbound Metrics or Outbound Metrics tab to view the metrics.

Distributed tracing

Distributed tracing is the process of tracking the performance of individual services in an application by tracing the path of the service calls in the application. Each time a user takes action in an application, a request is executed that might require many services to interact to produce a response. The path of this request is called a distributed transaction.

Red Hat OpenShift Service Mesh uses Red Hat OpenShift distributed tracing platform to allow developers to view call flows in a microservice application.

Connecting an existing distributed tracing instance

If you already have an existing Red Hat OpenShift distributed tracing platform (Jaeger) instance in OpenShift Container Platform, you can configure your ServiceMeshControlPlane resource to use that instance for distributed tracing platform.

Prerequisites
  • Red Hat OpenShift distributed tracing platform instance installed and configured.

Procedure
  1. In the OpenShift Container Platform web console, click OperatorsInstalled Operators.

  2. Click the Project menu and select the project where you installed the Service Mesh control plane, for example istio-system.

  3. Click the Red Hat OpenShift Service Mesh Operator. In the Istio Service Mesh Control Plane column, click the name of your ServiceMeshControlPlane resource, for example basic.

  4. Add the name of your distributed tracing platform (Jaeger) instance to the ServiceMeshControlPlane.

    1. Click the YAML tab.

    2. Add the name of your distributed tracing platform (Jaeger) instance to spec.addons.jaeger.name in your ServiceMeshControlPlane resource. In the following example, distr-tracing-production is the name of the distributed tracing platform (Jaeger) instance.

      Example distributed tracing configuration
      spec:
        addons:
          jaeger:
            name: distr-tracing-production
    3. Click Save.

  5. Click Reload to verify the ServiceMeshControlPlane resource was configured correctly.

Adjusting the sampling rate

A trace is an execution path between services in the service mesh. A trace is comprised of one or more spans. A span is a logical unit of work that has a name, start time, and duration. The sampling rate determines how often a trace is persisted.

The Envoy proxy sampling rate is set to sample 100% of traces in your service mesh by default. A high sampling rate consumes cluster resources and performance but is useful when debugging issues. Before you deploy Red Hat OpenShift Service Mesh in production, set the value to a smaller proportion of traces. For example, set spec.tracing.sampling to 100 to sample 1% of traces.

Configure the Envoy proxy sampling rate as a scaled integer representing 0.01% increments.

In a basic installation, spec.tracing.sampling is set to 10000, which samples 100% of traces. For example:

  • Setting the value to 10 samples 0.1% of traces.

  • Setting the value to 500 samples 5% of traces.

The Envoy proxy sampling rate applies for applications that are available to a Service Mesh, and use the Envoy proxy. This sampling rate determines how much data the Envoy proxy collects and tracks.

The Jaeger remote sampling rate applies to applications that are external to the Service Mesh, and do not use the Envoy proxy, such as a database. This sampling rate determines how much data the distributed tracing system collects and stores. For more information, see Distributed tracing configuration options.

Procedure
  1. In the OpenShift Container Platform web console, click OperatorsInstalled Operators.

  2. Click the Project menu and select the project where you installed the control plane, for example istio-system.

  3. Click the Red Hat OpenShift Service Mesh Operator. In the Istio Service Mesh Control Plane column, click the name of your ServiceMeshControlPlane resource, for example basic.

  4. To adjust the sampling rate, set a different value for spec.tracing.sampling.

    1. Click the YAML tab.

    2. Set the value for spec.tracing.sampling in your ServiceMeshControlPlane resource. In the following example, set it to 100.

      Jaeger sampling example
      spec:
        tracing:
          sampling: 100
    3. Click Save.

  5. Click Reload to verify the ServiceMeshControlPlane resource was configured correctly.

Accessing the Jaeger console

To access the Jaeger console you must have Red Hat OpenShift Service Mesh installed, Red Hat OpenShift distributed tracing platform (Jaeger) installed and configured.

The installation process creates a route to access the Jaeger console.

If you know the URL for the Jaeger console, you can access it directly. If you do not know the URL, use the following directions.

Procedure from OpenShift console
  1. Log in to the OpenShift Container Platform web console as a user with cluster-admin rights. If you use Red Hat OpenShift Dedicated, you must have an account with the dedicated-admin role.

  2. Navigate to NetworkingRoutes.

  3. On the Routes page, select the Service Mesh control plane project, for example istio-system, from the Namespace menu.

    The Location column displays the linked address for each route.

  4. If necessary, use the filter to find the jaeger route. Click the route Location to launch the console.

  5. Click Log In With OpenShift.

Procedure from Kiali console
  1. Launch the Kiali console.

  2. Click Distributed Tracing in the left navigation pane.

  3. Click Log In With OpenShift.

Procedure from the CLI
  1. Log in to the OpenShift Container Platform CLI as a user with the cluster-admin role. If you use Red Hat OpenShift Dedicated, you must have an account with the dedicated-admin role.

    $ oc login --username=<NAMEOFuser> https://<HOSTNAME>:6443
  2. To query for details of the route using the command line, enter the following command. In this example, istio-system is the Service Mesh control plane namespace.

    $ export JAEGER_URL=$(oc get route -n istio-system jaeger -o jsonpath='{.spec.host}')
  3. Launch a browser and navigate to https://<JAEGER_URL>, where <JAEGER_URL> is the route that you discovered in the previous step.

  4. Log in using the same user name and password that you use to access the OpenShift Container Platform console.

  5. If you have added services to the service mesh and have generated traces, you can use the filters and Find Traces button to search your trace data.

    If you are validating the console installation, there is no trace data to display.

For more information about configuring Jaeger, see the distributed tracing documentation.

Accessing the Grafana console

Grafana is an analytics tool you can use to view, query, and analyze your service mesh metrics. In this example, istio-system is the Service Mesh control plane namespace. To access Grafana, do the following:

Procedure
  1. Log in to the OpenShift Container Platform web console.

  2. Click the Project menu and select the project where you installed the Service Mesh control plane, for example istio-system.

  3. Click Routes.

  4. Click the link in the Location column for the Grafana row.

  5. Log in to the Grafana console with your OpenShift Container Platform credentials.

Accessing the Prometheus console

Prometheus is a monitoring and alerting tool that you can use to collect multi-dimensional data about your microservices. In this example, istio-system is the Service Mesh control plane namespace.

Procedure
  1. Log in to the OpenShift Container Platform web console.

  2. Click the Project menu and select the project where you installed the Service Mesh control plane, for example istio-system.

  3. Click Routes.

  4. Click the link in the Location column for the Prometheus row.

  5. Log in to the Prometheus console with your OpenShift Container Platform credentials.

Integrating with user-workload monitoring

By default, Red Hat OpenShift Service Mesh (OSSM) installs the Service Mesh control plane (SMCP) with a dedicated instance of Prometheus for collecting metrics from a mesh. However, production systems need more advanced monitoring systems, like OpenShift Container Platform monitoring for user-defined projects.

The following steps show how to integrate Service Mesh with user-workload monitoring.

Prerequisites
  • user-workload monitoring is enabled.

  • Red Hat OpenShift Service Mesh Operator 2.4 is installed.

  • Kiali Operator 1.65 is installed.

Procedure
  1. Create a token to Thanos for Kiali by running the following commands:

    1. Set the SECRET environment variable by running the following command:

      $ SECRET=`oc get secret -n openshift-user-workload-monitoring |
       grep  prometheus-user-workload-token | head -n 1 | awk '{print $1 }'`
    2. Set the TOKEN environment variable by running the following command:

      $ TOKEN=`oc get secret $SECRET -n openshift-user-workload-monitoring -o jsonpath='{.data.token}' | base64 -d`
    3. Create a token to Thanos for Kiali by running the following command:

      $ oc create secret generic thanos-querier-web-token -n istio-system --from-literal=token=$TOKEN
  2. Configure Kiali for user-workload monitoring:

    apiVersion: kiali.io/v1alpha1
    kind: Kiali
    metadata:
      name: kiali-user-workload-monitoring
      namespace: istio-system
    spec:
      external_services:
        istio:
          url_service_version: 'http://istiod-basic.istio-system:15014/version'
          config_map_name: istio-basic (1)
        prometheus:
          auth:
            token: secret:thanos-querier-web-token:token
            type: bearer
            use_kiali_token: false
          query_scope:
            mesh_id: "basic-istio-system"
          thanos_proxy:
            enabled: true
          url: https://thanos-querier.openshift-monitoring.svc.cluster.local:9091
      version: v1.65
    1 Set the ServiceMeshControlPlane name prefixed with istio-.
  3. Configure the SMCP for external Prometheus:

    apiVersion: maistra.io/v2
    kind: ServiceMeshControlPlane
    metadata:
      name: basic
      namespace: istio-system
    spec:
      addons:
        prometheus:
          enabled: false (1)
        grafana:
          enabled: false (2)
        kiali:
          name: kiali-user-workload-monitoring
      meshConfig:
        extensionProviders:
        - name: prometheus
          prometheus: {}
    1 Disable the default Prometheus instance provided by OSSM.
    2 Disable Grafana. It is not supported with an external Prometheus instance.
  4. Apply a custom network policy to allow ingress traffic from the monitoring namespace:

    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: user-workload-access
      namespace: bookinfo (1)
    spec:
      ingress:
      - from:
        - namespaceSelector:
            matchLabels:
              network.openshift.io/policy-group: monitoring
      podSelector: {}
      policyTypes:
      - Ingress
    1 The custom network policy must be applied to all namespaces.
  5. Apply a Telemetry object to enable traffic metrics in Istio proxies:

    apiVersion: telemetry.istio.io/v1alpha1
    kind: Telemetry
    metadata:
      name: enable-prometheus-metrics
      namespace: istio-system (1)
    spec:
      selector: (2)
        matchLabels:
          app: bookinfo
      metrics:
      - providers:
        - name: prometheus
    1 A Telemetry object created in the control plane namespace applies to all workloads in a mesh. To apply telemetry to only one namespace, create the object in the target namespace.
    2 Optional: Setting the selector.matchLabels spec applies the Telemetry object to specific workloads in the target namespace.
  6. Apply a ServiceMonitor object to monitor the Istio control plane:

    apiVersion: monitoring.coreos.com/v1
    kind: ServiceMonitor
    metadata:
      name: istiod-monitor
      namespace: istio-system (1)
    spec:
      targetLabels:
      - app
      selector:
        matchLabels:
          istio: pilot
      endpoints:
      - port: http-monitoring
        interval: 30s
        relabelings:
        - action: replace
          replacement: "basic-istio-system" (2)
          targetLabel: mesh_id
    1 Create this ServiceMonitor object in the Istio control plane namespace because it monitors the Istiod service. In this example, the namespace is istio-system.
    2 The string "basic-istio-system" is a combination of the SMCP name and its namespace, but any label can be used as long as it is unique for every mesh using user workload monitoring in the cluster. The spec.prometheus.query_scope of the Kiali resource configured in Step 2 needs to match this value.

    If there is only one mesh using user-workload monitoring, then both the mesh_id relabeling and the spec.prometheus.query_scope field in the Kiali resource are optional (but the query_scope field given here should be removed if the mesh_id label is removed).

    If there might be multiple meshes using user-workload monitoring, then both the mesh_id relabelings and the spec.prometheus.query_scope field in the Kiali resource are required so that Kiali only sees metrics from its associated mesh. If Kiali is not being deployed, applying the mesh_id relabeling is still recommended so that metrics from different meshes can be distinguished from one another.

  7. Apply a PodMonitor object to collect metrics from Istio proxies:

    apiVersion: monitoring.coreos.com/v1
    kind: PodMonitor
    metadata:
      name: istio-proxies-monitor
      namespace: istio-system (1)
    spec:
      selector:
        matchExpressions:
        - key: istio-prometheus-ignore
          operator: DoesNotExist
      podMetricsEndpoints:
      - path: /stats/prometheus
        interval: 30s
        relabelings:
        - action: keep
          sourceLabels: [__meta_kubernetes_pod_container_name]
          regex: "istio-proxy"
        - action: keep
          sourceLabels: [__meta_kubernetes_pod_annotationpresent_prometheus_io_scrape]
        - action: replace
          regex: (\d+);(([A-Fa-f0-9]{1,4}::?){1,7}[A-Fa-f0-9]{1,4})
          replacement: '[$2]:$1'
          sourceLabels: [__meta_kubernetes_pod_annotation_prometheus_io_port,
          __meta_kubernetes_pod_ip]
          targetLabel: __address__
        - action: replace
          regex: (\d+);((([0-9]+?)(\.|$)){4})
          replacement: $2:$1
          sourceLabels: [__meta_kubernetes_pod_annotation_prometheus_io_port,
          __meta_kubernetes_pod_ip]
          targetLabel: __address__
        - action: labeldrop
          regex: "__meta_kubernetes_pod_label_(.+)"
        - sourceLabels: [__meta_kubernetes_namespace]
          action: replace
          targetLabel: namespace
        - sourceLabels: [__meta_kubernetes_pod_name]
          action: replace
          targetLabel: pod_name
        - action: replace
          replacement: "basic-istio-system" (2)
          targetLabel: mesh_id
    1 Since OpenShift Container Platform monitoring ignores the namespaceSelector spec in ServiceMonitor and PodMonitor objects, you must apply the PodMonitor object in all mesh namespaces, including the control plane namespace.
    2 The string "basic-istio-system" is a combination of the SMCP name and its namespace, but any label can be used as long as it is unique for every mesh using user workload monitoring in the cluster. The spec.prometheus.query_scope of the Kiali resource configured in Step 2 needs to match this value.

    If there is only one mesh using user-workload monitoring, then both the mesh_id relabeling and the spec.prometheus.query_scope field in the Kiali resource are optional (but the query_scope field given here should be removed if the mesh_id label is removed).

    If there might be multiple meshes using user-workload monitoring, then both the mesh_id relabelings and the spec.prometheus.query_scope field in the Kiali resource are required so that Kiali only sees metrics from its associated mesh. If Kiali is not being deployed, applying the mesh_id relabeling is still recommended so that metrics from different meshes can be distinguished from one another.

  8. Open the OpenShift Container Platform web console, and check that metrics are visible.