This section describes how to identify and resolve common problems in Red Hat OpenShift Service Mesh. Use the following sections to help troubleshoot and debug problems when deploying Red Hat OpenShift Service Mesh on OpenShift Container Platform.
To understand which version of Red Hat OpenShift Service Mesh you have deployed on your system, you need to understand how each of the component versions is managed.
Operator version - The most current Operator version is 2.2.3. The Operator version number indicates only the version of the currently installed Operator. Because the Red Hat OpenShift Service Mesh Operator supports multiple versions of the Service Mesh control plane, the version of the Operator does not determine the version of your deployed ServiceMeshControlPlane resources.
Upgrading to the latest Operator version automatically applies patch updates, but does not automatically upgrade your Service Mesh control plane to the latest minor version.
ServiceMeshControlPlane version - The ServiceMeshControlPlane version determines which version of Red Hat OpenShift Service Mesh you are using. The value of the spec.version field in the ServiceMeshControlPlane resource controls the architecture and configuration settings that are used to install and deploy Red Hat OpenShift Service Mesh. When you create the Service Mesh control plane, you can set the version in one of two ways:
To configure in the Form View, select the version from the Control Plane Version menu.
To configure in the YAML View, set the value for spec.version in the YAML file.
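For example, a minimal ServiceMeshControlPlane excerpt that sets the version in the YAML View might look like the following, where the resource name, namespace, and version value are illustrative:
apiVersion: maistra.io/v2
kind: ServiceMeshControlPlane
metadata:
  name: basic
  namespace: istio-system
spec:
  version: v2.2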
Operator Lifecycle Manager (OLM) does not manage Service Mesh control plane upgrades, so the version number of your Operator and your ServiceMeshControlPlane (SMCP) might not match, unless you have manually upgraded your SMCP.
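For example, you can compare the installed Operator version with the version of your SMCP from the command line. This sketch assumes that the Operators are installed in the openshift-operators namespace and the SMCP is deployed in the istio-system namespace:
$ oc get csv -n openshift-operators
$ oc get smcp -n istio-system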
In addition to the information in this section, be sure to review the following topics:
When you install the Red Hat OpenShift Service Mesh Operators, OpenShift automatically creates the following objects as part of a successful Operator installation:
config maps
custom resource definitions
deployments
pods
replica sets
roles
role bindings
secrets
service accounts
services
You can verify that the Operator pods are available and running by using the OpenShift Container Platform console.
Navigate to Workloads → Pods.
Select the openshift-operators namespace.
Verify that the following pods exist and have a status of Running:
istio-operator
jaeger-operator
kiali-operator
Select the openshift-operators-redhat namespace.
Verify that the elasticsearch-operator pod exists and has a status of Running.
Verify that the Operator pods are available and running in the openshift-operators namespace with the following command:
$ oc get pods -n openshift-operators
NAME READY STATUS RESTARTS AGE
istio-operator-bb49787db-zgr87 1/1 Running 0 15s
jaeger-operator-7d5c4f57d8-9xphf 1/1 Running 0 2m42s
kiali-operator-f9c8d84f4-7xh2v 1/1 Running 0 64s
Verify the Elasticsearch operator with the following command:
$ oc get pods -n openshift-operators-redhat
NAME READY STATUS RESTARTS AGE
elasticsearch-operator-d4f59b968-796vq 1/1 Running 0 15s
If you experience Operator issues:
Verify your Operator subscription status, for example by using the commands shown after this list.
Verify that you did not install a community version of the Operator, instead of the supported Red Hat version.
Verify that you have the cluster-admin role to install Red Hat OpenShift Service Mesh.
Check for any errors in the Operator pod logs if the issue is related to installation of Operators.
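For example, you can review the Operator subscriptions and the installed ClusterServiceVersions from the command line. This assumes the Operators were installed in the default openshift-operators namespace:
$ oc get subscriptions -n openshift-operators
$ oc get csv -n openshift-operators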
You can install Operators only through the OpenShift console; the OperatorHub is not accessible from the command line.
You can view Operator logs by using the oc logs command. Red Hat may request logs to help resolve support cases.
To view Operator pod logs, enter the command:
$ oc logs -n openshift-operators <podName>
For example,
$ oc logs -n openshift-operators istio-operator-bb49787db-zgr87
The Service Mesh control plane is composed of Istiod, which consolidates several previous control plane components (Citadel, Galley, Pilot) into a single binary. Deploying the ServiceMeshControlPlane also creates the other components that make up Red Hat OpenShift Service Mesh as described in the architecture topic.
When you create the Service Mesh control plane, the Service Mesh Operator uses the parameters that you have specified in the ServiceMeshControlPlane resource file to do the following:
Creates the Istio components and deploys the following pods:
istiod
istio-ingressgateway
istio-egressgateway
grafana
prometheus
wasm-cacher
Calls the Kiali Operator to create the Kiali deployment based on the configuration in either the SMCP or the Kiali custom resource.
You view the Kiali components under the Kiali Operator, not the Service Mesh Operator.
Calls the Red Hat OpenShift distributed tracing platform Operator to create distributed tracing platform components based on configuration in either the SMCP or the Jaeger custom resource.
You view the Jaeger components under the Red Hat OpenShift distributed tracing platform Operator and the Elasticsearch components under the Red Hat Elasticsearch Operator, not the Service Mesh Operator.
You can verify the Service Mesh control plane installation in the OpenShift Container Platform web console.
Navigate to Operators → Installed Operators.
Select the <istio-system> namespace.
Select the Red Hat OpenShift Service Mesh Operator.
Click the Istio Service Mesh Control Plane tab.
Click the name of your control plane, for example basic.
To view the resources created by the deployment, click the Resources tab. You can use the filter to narrow your view, for example, to check that all the pods have a status of Running.
If the SMCP status indicates any problems, check the status: output in the YAML file for more information.
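If you prefer the command line, you can inspect the same status information by retrieving the SMCP resource as YAML. This example assumes a control plane named basic in the istio-system namespace:
$ oc get smcp basic -n istio-system -o yaml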
Navigate back to Operators → Installed Operators.
Select the OpenShift Elasticsearch Operator.
Click the Elasticsearch tab.
Click the name of the deployment, for example elasticsearch.
To view the resources created by the deployment, click the Resources tab.
If the Status column indicates any problems, check the status: output on the YAML tab for more information.
Navigate back to Operators → Installed Operators.
Select the Red Hat OpenShift distributed tracing platform Operator.
Click the Jaeger tab.
Click the name of your deployment, for example jaeger.
To view the resources created by the deployment, click the Resources tab.
If the Status column indicates any problems, check the status: output on the YAML tab for more information.
Navigate to Operators → Installed Operators.
Select the Kiali Operator.
Click the Kiali tab.
Click the name of your deployment, for example kiali.
To view the resources created by the deployment, click the Resources tab.
If the Status column indicates any problems, check the status: output on the YAML tab for more information.
Run the following command to see if the Service Mesh control plane pods are available and running, where istio-system is the namespace where you installed the SMCP.
$ oc get pods -n istio-system
NAME READY STATUS RESTARTS AGE
grafana-6776785cfc-6fz7t 2/2 Running 0 102s
istio-egressgateway-5f49dd99-l9ppq 1/1 Running 0 103s
istio-ingressgateway-6dc885c48-jjd8r 1/1 Running 0 103s
istiod-basic-6c9cc55998-wg4zq 1/1 Running 0 2m14s
jaeger-6865d5d8bf-zrfss 2/2 Running 0 100s
kiali-579799fbb7-8mwc8 1/1 Running 0 46s
prometheus-5c579dfb-6qhjk 2/2 Running 0 115s
wasm-cacher-basic-5b99bfcddb-m775l 1/1 Running 0 86s
Check the status of the Service Mesh control plane deployment by using the following command. Replace istio-system with the namespace where you deployed the SMCP.
$ oc get smcp -n <istio-system>
The installation has finished successfully when the STATUS column is ComponentsReady.
NAME READY STATUS PROFILES VERSION AGE
basic 10/10 ComponentsReady ["default"] 2.1.3 4m2s
If you have modified and redeployed your Service Mesh control plane, the status should read UpdateSuccessful.
NAME READY STATUS TEMPLATE VERSION AGE
basic-install 10/10 UpdateSuccessful default v1.1 3d16h
If the SMCP status indicates anything other than ComponentsReady, check the status: output in the SMCP resource for more information.
$ oc describe smcp <smcp-name> -n <controlplane-namespace>
$ oc describe smcp basic -n istio-system
Check the status of the Jaeger deployment with the following command, where istio-system is the namespace where you deployed the SMCP.
$ oc get jaeger -n <istio-system>
NAME STATUS VERSION STRATEGY STORAGE AGE
jaeger Running 1.30.0 allinone memory 15m
Check the status of the Kiali deployment with the following command, where istio-system is the namespace where you deployed the SMCP.
$ oc get kiali -n <istio-system>
NAME AGE
kiali 15m
You can view your application’s topology, health, and metrics in the Kiali console. If your service is experiencing problems, the Kiali console lets you view the data flow through your service. You can view insights about the mesh components at different levels, including abstract applications, services, and workloads. Kiali also provides an interactive graph view of your namespace in real time.
To access the Kiali console, you must have Red Hat OpenShift Service Mesh installed, and Kiali installed and configured.
The installation process creates a route to access the Kiali console.
If you know the URL for the Kiali console, you can access it directly. If you do not know the URL, use the following directions.
Log in to the OpenShift Container Platform web console with an administrator role.
Click Home → Projects.
On the Projects page, if necessary, use the filter to find the name of your project.
Click the name of your project, for example, bookinfo.
On the Project details page, in the Launcher section, click the Kiali link.
Log in to the Kiali console with the same user name and password that you use to access the OpenShift Container Platform console.
When you first log in to the Kiali console, you see the Overview page, which displays all the namespaces in your service mesh that you have permission to view.
If you are validating the console installation and namespaces have not yet been added to the mesh, there might not be any data to display other than istio-system.
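If you prefer the command line, you can also look up the Kiali console URL directly from its route. This assumes the route is named kiali and the control plane namespace is istio-system:
$ oc get route kiali -n istio-system -o jsonpath='{.spec.host}'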
Log in to the OpenShift Container Platform web console with a developer role.
Click Project.
On the Project Details page, if necessary, use the filter to find the name of your project.
Click the name of your project, for example, bookinfo.
On the Project page, in the Launcher section, click the Kiali link.
Click Log In With OpenShift.
To access the Jaeger console, you must have Red Hat OpenShift Service Mesh installed, and Red Hat OpenShift distributed tracing platform installed and configured.
The installation process creates a route to access the Jaeger console.
If you know the URL for the Jaeger console, you can access it directly. If you do not know the URL, use the following directions.
Log in to the OpenShift Container Platform web console as a user with cluster-admin rights. If you use Red Hat OpenShift Dedicated, you must have an account with the dedicated-admin role.
Navigate to Networking → Routes.
On the Routes page, select the Service Mesh control plane project, for example istio-system, from the Namespace menu.
The Location column displays the linked address for each route.
If necessary, use the filter to find the jaeger route. Click the route Location to launch the console.
Click Log In With OpenShift.
Launch the Kiali console.
Click Distributed Tracing in the left navigation pane.
Click Log In With OpenShift.
Log in to the OpenShift Container Platform CLI as a user with the cluster-admin role. If you use Red Hat OpenShift Dedicated, you must have an account with the dedicated-admin role.
$ oc login --username=<NAMEOFUSER> https://<HOSTNAME>:6443
To query for details of the route using the command line, enter the following command. In this example, istio-system is the Service Mesh control plane namespace.
$ export JAEGER_URL=$(oc get route -n istio-system jaeger -o jsonpath='{.spec.host}')
Launch a browser and navigate to https://<JAEGER_URL>, where <JAEGER_URL> is the route that you discovered in the previous step.
Log in using the same user name and password that you use to access the OpenShift Container Platform console.
If you have added services to the service mesh and have generated traces, you can use the filters and Find Traces button to search your trace data.
If you are validating the console installation, there is no trace data to display.
If you are experiencing issues while deploying the Service Mesh control plane:
Ensure that the ServiceMeshControlPlane resource is installed in a project that is separate from your services and Operators. This documentation uses the istio-system project as an example, but you can deploy your control plane in any project as long as it is separate from the project that contains your Operators and services.
Ensure that the ServiceMeshControlPlane and Jaeger custom resources are deployed in the same project. For example, use the istio-system project for both.
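For example, you can confirm that both resources exist in the same project with the following command, using istio-system here as the example namespace:
$ oc get smcp,jaeger -n istio-system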
The data plane is a set of intelligent proxies that intercept and control all inbound and outbound network communications between services in the service mesh.
Red Hat OpenShift Service Mesh relies on a proxy sidecar within the application’s pod to provide service mesh capabilities to the application.
Red Hat OpenShift Service Mesh does not automatically inject proxy sidecars into pods. You must opt in to sidecar injection.
Check to see if automatic injection is enabled in the Deployment for your application. If automatic injection for the Envoy proxy is enabled, there should be a sidecar.istio.io/inject:"true" annotation in the Deployment resource under spec.template.metadata.annotations.
Check to see if automatic injection is enabled in the Deployment for your application. If automatic injection for the Jaeger agent is enabled, there should be a sidecar.jaegertracing.io/inject:"true" annotation in the Deployment resource.
For more information about sidecar injection, see Enabling automatic injection.
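As an illustration, a minimal Deployment excerpt with automatic Envoy proxy injection enabled might look like the following, where the Deployment name, labels, and image are placeholders:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      annotations:
        sidecar.istio.io/inject: "true" # opts this workload in to Envoy sidecar injection
      labels:
        app: my-app
    spec:
      containers:
      - name: my-app
        image: quay.io/example/my-app:latest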
The Envoy proxy intercepts all inbound and outbound traffic for all services in the service mesh. Envoy also collects and reports telemetry on the service mesh. Envoy is deployed as a sidecar to the relevant service in the same pod.
Envoy access logs are useful in diagnosing traffic failures and flows, and help with end-to-end traffic flow analysis.
To enable access logging for all istio-proxy containers, edit the ServiceMeshControlPlane (SMCP) object to add a file name for the logging output.
Log in to the OpenShift Container Platform CLI as a user with the cluster-admin role. Enter the following command. Then, enter your username and password when prompted.
$ oc login --username=<NAMEOFUSER> https://<HOSTNAME>:6443
Change to the project where you installed the Service Mesh control plane, for example istio-system.
$ oc project istio-system
Edit the ServiceMeshControlPlane file.
$ oc edit smcp <smcp_name>
As shown in the following example, use name to specify the file name for the proxy log. If you do not specify a value for name, no log entries will be written.
spec:
proxy:
accessLogging:
file:
name: /dev/stdout #file name
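After the change is applied, you can view the access log entries from the istio-proxy container of any sidecar-injected pod, for example, where the pod name and namespace are placeholders for your own values:
$ oc logs <pod-name> -c istio-proxy -n <namespace>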
For more information about troubleshooting pod issues, see Investigating pod issues.
If you experience difficulty with a procedure described in this documentation, or with OpenShift Container Platform in general, visit the Red Hat Customer Portal. From the Customer Portal, you can:
Search or browse through the Red Hat Knowledgebase of articles and solutions relating to Red Hat products.
Submit a support case to Red Hat Support.
Access other product documentation.
To identify issues with your cluster, you can use Insights in OpenShift Cluster Manager. Insights provides details about issues and, if available, information on how to solve a problem.
If you have a suggestion for improving this documentation or have found an error, submit a Jira issue for the most relevant documentation component. Please provide specific details, such as the section name and OpenShift Container Platform version.
The Red Hat Knowledgebase provides rich content aimed at helping you make the most of Red Hat’s products and technologies. The Red Hat Knowledgebase consists of articles, product documentation, and videos outlining best practices on installing, configuring, and using Red Hat products. In addition, you can search for solutions to known issues, each providing concise root cause descriptions and remedial steps.
In the event of an OpenShift Container Platform issue, you can perform an initial search to determine if a solution already exists within the Red Hat Knowledgebase.
You have a Red Hat Customer Portal account.
Log in to the Red Hat Customer Portal.
In the main Red Hat Customer Portal search field, input keywords and strings relating to the problem, including:
OpenShift Container Platform components (such as etcd)
Related procedure (such as installation)
Warnings, error messages, and other outputs related to explicit failures
Click Search.
Select the OpenShift Container Platform product filter.
Select the Knowledgebase content type filter.
The oc adm must-gather CLI command collects the information from your cluster that is most likely needed for debugging issues, including:
Resource definitions
Service logs
By default, the oc adm must-gather command uses the default plug-in image and writes into ./must-gather.local.
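For example, running the command without arguments collects the default data set:
$ oc adm must-gather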
Alternatively, you can collect specific information by running the command with the appropriate arguments as described in the following sections:
To collect data related to one or more specific features, use the --image argument with an image, as listed in a following section.
For example:
$ oc adm must-gather --image=registry.redhat.io/container-native-virtualization/cnv-must-gather-rhel8:v4.9.0
To collect the audit logs, use the -- /usr/bin/gather_audit_logs argument, as described in a following section.
For example:
$ oc adm must-gather -- /usr/bin/gather_audit_logs
Audit logs are not collected as part of the default set of information to reduce the size of the files.
When you run oc adm must-gather, a new pod with a random name is created in a new project on the cluster. The data is collected on that pod and saved in a new directory that starts with must-gather.local. This directory is created in the current working directory.
For example:
NAMESPACE NAME READY STATUS RESTARTS AGE
...
openshift-must-gather-5drcj must-gather-bklx4 2/2 Running 0 72s
openshift-must-gather-5drcj must-gather-s8sdh 2/2 Running 0 72s
...
You can use the oc adm must-gather CLI command to collect information about your cluster, including features and objects associated with Red Hat OpenShift Service Mesh.
Access to the cluster as a user with the cluster-admin role.
The OpenShift Container Platform CLI (oc) installed.
To collect Red Hat OpenShift Service Mesh data with must-gather, you must specify the Red Hat OpenShift Service Mesh image.
$ oc adm must-gather --image=registry.redhat.io/openshift-service-mesh/istio-must-gather-rhel8
To collect Red Hat OpenShift Service Mesh data for a specific Service Mesh control plane namespace with must-gather, you must specify the Red Hat OpenShift Service Mesh image and namespace. In this example, replace <namespace> with your Service Mesh control plane namespace, such as istio-system.
$ oc adm must-gather --image=registry.redhat.io/openshift-service-mesh/istio-must-gather-rhel8 gather <namespace>
For prompt support, supply diagnostic information for both OpenShift Container Platform and Red Hat OpenShift Service Mesh.
You have installed the OpenShift CLI (oc).
You have a Red Hat Customer Portal account.
You have access to OpenShift Cluster Manager.
Log in to the Red Hat Customer Portal and select SUPPORT CASES → Open a case.
Select the appropriate category for your issue (such as Defect / Bug), product (OpenShift Container Platform), and product version (4.6, if this is not already autofilled).
Review the list of suggested Red Hat Knowledgebase solutions for a potential match against the problem that is being reported. If the suggested articles do not address the issue, click Continue.
Enter a concise but descriptive problem summary and further details about the symptoms being experienced, as well as your expectations.
Review the updated list of suggested Red Hat Knowledgebase solutions for a potential match against the problem that is being reported. The list is refined as you provide more information during the case creation process. If the suggested articles do not address the issue, click Continue.
Ensure that the account information presented is as expected, and if not, amend accordingly.
Check that the autofilled OpenShift Container Platform Cluster ID is correct. If it is not, manually obtain your cluster ID.
To manually obtain your cluster ID using the OpenShift Container Platform web console:
Navigate to Home → Dashboards → Overview.
Find the value in the Cluster ID field of the Details section.
Alternatively, it is possible to open a new support case through the OpenShift Container Platform web console and have your cluster ID autofilled.
From the toolbar, navigate to (?) Help → Open Support Case.
The Cluster ID value is autofilled.
To obtain your cluster ID using the OpenShift CLI (oc), run the following command:
$ oc get clusterversion -o jsonpath='{.items[].spec.clusterID}{"\n"}'
Complete the following questions where prompted and then click Continue:
Where are you experiencing the behavior? What environment?
When does the behavior occur? Frequency? Repeatedly? At certain times?
What information can you provide around time-frames and the business impact?
Upload relevant diagnostic data files and click Continue. It is recommended to include data gathered using the oc adm must-gather command as a starting point, plus any issue-specific data that is not collected by that command.
Input relevant case management details and click Continue.
Preview the case details and click Submit.