Author: Nikita Zubarev, Senior DevOps and Lead Infrastructure Specialist at HOSTKEY
In the previous article we covered how to set up metric collection and display alerts for monitoring services with Prometheus. Along the way, the question arose of how to monitor individual services on Linux servers. For example, one of our earlier articles described a case of using oVirt web consoles with Apache Guacamole. Let's look at how to monitor such services on Linux machines.
Guacamole is a cross-platform remote desktop gateway that requires Apache Tomcat and guacd to run, so these are the two services we will monitor.
Open the systemd unit file created for node_exporter:
/usr/lib/systemd/system/node_exporter.service
[Unit]
Description=Prometheus exporter for machine metrics, written in Go with pluggable metric collectors.
Documentation=https://github.com/prometheus/node_exporter
After=network.target
[Service]
EnvironmentFile=-/etc/default/node_exporter
User=root
ExecStart=/usr/bin/node_exporter $NODE_EXPORTER_OPTS
Restart=on-failure
RestartSec=5s
[Install]
WantedBy=multi-user.target
By default, $NODE_EXPORTER_OPTS is loaded from the file:
/etc/default/node_exporter
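If necessary, we can monitor individual services by enabling the systemd collector and adding the collector.systemd.unit-whitelist option (newer node_exporter releases rename this option to collector.systemd.unit-include):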
NODE_EXPORTER_OPTS="--collector.systemd --collector.systemd.unit-whitelist=(tomcat|guacd).service"
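To get a feel for which units the pattern selects, here is a minimal Python sketch. It assumes the exporter matches the pattern against the whole unit name (anchored matching), and the function name and the list of unit names are hypothetical, chosen only for illustration. Note that the unescaped dot in the pattern matches any character, not just a literal dot.

```python
import re

# Pattern taken from the NODE_EXPORTER_OPTS line above.
UNIT_PATTERN = r"(tomcat|guacd).service"


def unit_is_selected(unit_name: str, pattern: str = UNIT_PATTERN) -> bool:
    """Return True if the whole unit name matches the pattern.

    Assumption: the exporter anchors the pattern to the full unit name,
    which re.fullmatch imitates here.
    """
    return re.fullmatch(pattern, unit_name) is not None


if __name__ == "__main__":
    # Hypothetical unit names used only to illustrate the matching.
    for unit in ["tomcat.service", "guacd.service", "sshd.service", "tomcat9.service"]:
        print(f"{unit}: {unit_is_selected(unit)}")
```

If stricter matching is wanted, the dot can be escaped in the pattern, keeping in mind how the environment file treats backslashes.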
After restarting the service, node_exporter exposes the node_systemd_unit_state metric for these units:
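The metric is exposed in the exporter's plain-text output as node_systemd_unit_state lines, one per unit and state. The Python sketch below shows one way to read them and report whether each unit is active. The sample text is illustrative rather than captured from a real server, and the function name is hypothetical.

```python
import re
from typing import Dict

# Illustrative sample only; real exporter output lists more states and labels.
SAMPLE_METRICS = """\
# TYPE node_systemd_unit_state gauge
node_systemd_unit_state{name="guacd.service",state="active"} 1
node_systemd_unit_state{name="guacd.service",state="failed"} 0
node_systemd_unit_state{name="tomcat.service",state="active"} 0
node_systemd_unit_state{name="tomcat.service",state="failed"} 1
"""

# One metric sample: name{labels} value
LINE_RE = re.compile(r"^node_systemd_unit_state\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$")
# One label pair inside the braces: key="value"
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def active_units(metrics_text: str) -> Dict[str, bool]:
    """Map each unit name to True if its state="active" sample equals 1."""
    result: Dict[str, bool] = {}
    for line in metrics_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # comments and unrelated metrics are skipped
        labels = dict(LABEL_RE.findall(match.group("labels")))
        if labels.get("state") == "active":
            result[labels.get("name", "")] = float(match.group("value")) == 1.0
    return result


if __name__ == "__main__":
    for unit, is_active in active_units(SAMPLE_METRICS).items():
        print(f"{unit}: {'active' if is_active else 'NOT active'}")
```

On a real host the same function could be fed the text fetched from the exporter's /metrics endpoint, assuming it listens on the default port 9100.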
All that remains is to describe the alert rules:
- alert: guacd_service
  expr: node_systemd_unit_state{name="guacd.service",state="active"} == 0
  for: 1s
  annotations:
    summary: "Instance {{ $labels.instance }} is down"
    description: "{{ $labels.instance }} of job {{ $labels.job }} is down."
- alert: tomcat_service
  expr: node_systemd_unit_state{name="tomcat.service",state="active"} == 0
  for: 1s
  annotations:
    summary: "Instance {{ $labels.instance }} is down"
    description: "{{ $labels.instance }} of job {{ $labels.job }} is down."
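Since the two rules differ only in the service name, the same text could be generated for a longer list of services. The Python sketch below is one possible way to do that. The template reproduces the rules above, while the function name and the idea of generating the rules are assumptions, not part of the original setup.

```python
# Template reproducing the alert rules above; %(alert)s and %(unit)s are
# filled in per service. The {{ ... }} parts are left for Prometheus to expand.
RULE_TEMPLATE = """\
- alert: %(alert)s
  expr: node_systemd_unit_state{name="%(unit)s",state="active"} == 0
  for: 1s
  annotations:
    summary: "Instance {{ $labels.instance }} is down"
    description: "{{ $labels.instance }} of job {{ $labels.job }} is down."
"""


def render_rules(services):
    """Return alert rule text with one rule per service name (hypothetical helper)."""
    return "".join(
        RULE_TEMPLATE % {"alert": f"{svc}_service", "unit": f"{svc}.service"}
        for svc in services
    )


if __name__ == "__main__":
    print(render_rules(["guacd", "tomcat"]), end="")
```

The generated text is meant to be placed in the rules list of an existing rule group, so it should be indented to match that file.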
An example of an alert on the Alertmanager dashboard in Grafana:
Conclusions
In this short article, we looked at how to monitor services on Linux machines. This approach lets you monitor any system service and greatly simplifies server administration. In the next article, we will explain how to avoid displaying the basic parameters that the exporter returns.