Troubleshooting tips and frequently asked questions

Review these troubleshooting tips and frequently asked questions to help you resolve issues that you might encounter.

The following troubleshooting content describes the main components of Open Horizon and how to investigate the included interfaces to determine the state of the system.

Troubleshooting tools

Many interfaces that are included with Open Horizon provide information that can be used to diagnose problems. This information is available through the management console, HTTP REST APIs, and a Linux shell tool, hzn.

On an edge node you might need to troubleshoot host issues, Horizon software issues, Docker issues, or issues in your configuration or the code in service containers. Edge node host issues are beyond the scope of this document. If you need to troubleshoot Docker issues, you can use many Docker commands and interfaces. For more information, see the Docker documentation.

If the service containers you are running use Apache Kafka for messaging, you can manually connect to the Kafka streams for Open Horizon to diagnose problems. You can either subscribe to a message topic to observe what was received by Apache Kafka, or you can publish to a message topic to simulate messages from another device. The kafkacat Linux command (renamed kcat in recent releases) is one way to publish or subscribe to Apache Kafka. Use the most recent version of this tool. Apache Kafka also provides graphical web pages that you can use to access some information.
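The subscribe and publish operations described above can be sketched with kafkacat as follows. This is a minimal sketch, not a definitive setup: the broker address, the topic name, and the function names kafka_watch and kafka_send are placeholders introduced here, and the authentication settings that a secured cluster needs (passed with -X options) are omitted.

```shell
# Placeholders for this sketch; replace with your own broker list and topic.
KAFKA_BROKERS="${KAFKA_BROKERS:-broker.example.com:9093}"
KAFKA_TOPIC="${KAFKA_TOPIC:-example-topic}"

# Subscribe: consume (-C) from the topic, starting at the end of the log
# (-o end) so that only newly arriving messages are shown.
kafka_watch() {
  kafkacat -C -b "$KAFKA_BROKERS" -t "$KAFKA_TOPIC" -o end
}

# Publish: produce (-P) one message, passed as the first argument, to
# simulate a message from another device.
kafka_send() {
  printf '%s\n' "$1" | kafkacat -P -b "$KAFKA_BROKERS" -t "$KAFKA_TOPIC"
}
```

Run kafka_watch in one terminal and kafka_send '{"test": true}' in another to confirm that messages reach the topic. If your installation provides kcat instead of kafkacat, substitute the command name.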

On any edge node where Horizon is installed, use the hzn command to debug issues with the local Horizon agent and the remote Horizon exchange. Internally, the hzn command interacts with the provided HTTP REST APIs. The hzn command simplifies access and provides a better user experience than the REST APIs themselves. The hzn command often provides more descriptive text in its output, and it includes a built-in online help system. Use the help system to find out which commands to use and to check command syntax and arguments. To view this help information, run the hzn --help or hzn <subcommand> --help commands.
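As a starting point, the following sketch groups a few read-only hzn subcommands into a single function. The function name hzn_snapshot is hypothetical and is not part of Horizon. The subcommands shown are assumed to be available in your installed release; confirm them with hzn --help.

```shell
# hzn_snapshot: print a quick, read-only summary of the local agent state.
# The function name is illustrative; it is not part of Horizon.
hzn_snapshot() {
  if ! command -v hzn >/dev/null 2>&1; then
    echo "hzn is not installed on this host" >&2
    return 1
  fi
  hzn version          # CLI and agent versions
  hzn node list        # node registration and configuration state
  hzn agreement list   # agreements that the agent currently holds
  hzn eventlog list    # recent agent events
}
```

Run hzn_snapshot on the edge node and compare the output with what you expect for that node. None of these subcommands changes the node configuration.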

On edge nodes where Horizon packages are not supported or installed, you can directly interact with the underlying HTTP REST APIs instead. For example, you can use the curl utility or other REST API CLI utilities. You can also write a program in a language that supports REST queries.

For example, run the curl utility to check the status of your edge node:

curl localhost:8510/status
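For scripted checks, the same request can be wrapped so that it fails clearly when the agent does not respond. This is a minimal sketch: AGENT_URL, agent_endpoint, and agent_get are names introduced here for illustration, and the default address is taken from the command above.

```shell
# AGENT_URL is a variable introduced for this sketch; it defaults to the
# address used in the curl command above.
AGENT_URL="${AGENT_URL:-http://localhost:8510}"

# agent_endpoint PATH: print the full URL for an agent API path.
# A trailing slash on AGENT_URL and a leading slash on PATH are tolerated.
agent_endpoint() {
  printf '%s/%s\n' "${AGENT_URL%/}" "${1#/}"
}

# agent_get PATH: fetch an agent API path. The call returns a nonzero
# exit code on HTTP errors or if the request takes longer than 5 seconds.
agent_get() {
  curl --silent --show-error --fail --max-time 5 "$(agent_endpoint "$1")"
}
```

For example, agent_get status requests the same resource as the command above, and a nonzero exit code indicates that the agent did not respond successfully.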

Troubleshooting tips

To help troubleshoot specific issues, review the questions about your system state and the associated tips for the following topics. Each question includes a description of why it is relevant to troubleshooting your system. Some questions also include tips or a detailed guide that explain how to obtain the related information for your system.

These questions are based on the nature of debugging issues and are related to different environments. For example, when troubleshooting issues on an edge node, you might need complete access to and control of the node, which can increase your capability to collect and view information.

Troubleshooting tips: Review the common issues that you might encounter when you use Open Horizon.

Open Horizon risks and resolution

Although Open Horizon creates unique opportunities, it also presents challenges. For example, it extends beyond the physical boundaries of the cloud data center, which can expose security, addressability, management, ownership, and compliance issues. More importantly, it multiplies the scaling issues of cloud-based management techniques.

Edge networks increase the number of compute nodes by an order of magnitude. Edge gateways increase that number by another order of magnitude. Edge devices increase it by a further 3 to 4 orders of magnitude. If DevOps (continuous delivery and continuous deployment) is critical to managing a hyper-scale cloud infrastructure, then zero-ops (operations with no human intervention) is critical to managing at the massive scale that Open Horizon represents.

It is critical to deploy, update, monitor, and recover the edge compute space without human intervention. All of these activities and processes must be:

Fully automated
Capable of independent decision-making about work allocation
Able to recognize and recover from changing conditions without intervention

All of these activities must be secure, traceable, and defensible.