Apache Ignite exposes metrics in JMX and OpenCensus formats, so you can monitor clusters with a broad range of tools, including Zabbix, Prometheus, Grafana, and AppDynamics.
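As a rough illustration of the JMX side, the sketch below lists MBeans registered in the local JVM and prints their readable attributes using only the standard `javax.management` API. It assumes an Ignite node is running embedded in the same JVM and that its MBeans are registered under the `org.apache` domain; the class name `JmxMetricsSketch` and the default pattern are illustrative, so adjust the pattern to whatever your deployment actually registers.

```java
import java.lang.management.ManagementFactory;
import java.util.Set;
import java.util.TreeSet;
import javax.management.MBeanAttributeInfo;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class JmxMetricsSketch {
    /** Returns the canonical names of all local MBeans matching the pattern, sorted. */
    static Set<String> listMBeans(String pattern) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        Set<String> names = new TreeSet<>();
        for (ObjectName name : server.queryNames(new ObjectName(pattern), null)) {
            names.add(name.getCanonicalName());
        }
        return names;
    }

    /** Prints every readable attribute of the given MBean. */
    static void printAttributes(String objectName) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName(objectName);
        for (MBeanAttributeInfo attr : server.getMBeanInfo(name).getAttributes()) {
            if (!attr.isReadable()) {
                continue;
            }
            try {
                Object value = server.getAttribute(name, attr.getName());
                System.out.println("  " + attr.getName() + " = " + value);
            } catch (Exception e) {
                // Some attributes are not supported on every platform or node state.
                System.out.println("  " + attr.getName() + " = <unavailable>");
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // Assumption: Ignite MBeans live under the "org.apache" domain.
        String pattern = args.length > 0 ? args[0] : "org.apache:*";
        for (String name : listMBeans(pattern)) {
            System.out.println(name);
            printAttributes(name);
        }
    }
}
```

To inspect a node running in a separate process, the same queries can be issued over a remote JMX connection, provided remote JMX access is enabled on that node's JVM.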
In addition, the tools listed below were developed specifically for managing and monitoring Ignite clusters.
The Ignite community officially supports two command-line tools for cluster management and monitoring:
- Visor Command Line tool provides basic statistics about cluster nodes, caches, and compute tasks. It also lets you manage the size of your cluster by starting or stopping nodes.
- Control Script is an advanced command-line utility that can change the baseline topology, activate and deactivate the cluster, perform consistency checks of your data and indexes, and detect long-running or hanging transactions.
GridGain Control Center is a management and monitoring tool for Apache Ignite that supports the following:
- Monitor the state of the cluster with customizable dashboards
- Define custom alerts to track and react to any of over 200 cluster, node, and storage metrics
- Execute and optimize SQL queries, and monitor commands that are already running
- Perform OpenCensus-based root cause analysis with visual debugging of API calls as they execute across the nodes of the cluster
- Take full, incremental, and continuous cluster backups to enable disaster recovery in the event of data loss or corruption
Datadog is a general-purpose monitoring service that integrates with Apache Ignite natively to provide the following capabilities:
- Collect and visualize metrics from your Ignite nodes on an out-of-the-box dashboard
- Track memory usage across the nodes, including detailed garbage collection activity
- Use the built-in health check for Ignite to create an alert that notifies you when a node goes offline