31.10.2022

Monitoring Linux Services with Prometheus

Author: Nikita Zubarev, Senior DevOps and Lead Infrastructure Specialist at Hostkey

In the previous article, we covered how to set up metric collection and display alerts for monitoring services with Prometheus. Along the way, the question arose of how to monitor individual services on Linux servers. For example, one of our earlier articles described accessing oVirt web consoles through Apache Guacamole. Let's look at how to organize the monitoring of such services on Linux machines, using Guacamole as the example.

Guacamole is a cross-platform remote desktop gateway that requires Apache Tomcat and Guacd to run.

Open the systemd unit file created for node_exporter:

/usr/lib/systemd/system/node_exporter.service

[Unit]
Description=Prometheus exporter for machine metrics, written in Go with pluggable metric collectors.
Documentation=https://github.com/prometheus/node_exporter
After=network.target

[Service]
EnvironmentFile=-/etc/default/node_exporter
User=root
ExecStart=/usr/bin/node_exporter $NODE_EXPORTER_OPTS
Restart=on-failure
RestartSec=5s

[Install]
WantedBy=multi-user.target

By default, $NODE_EXPORTER_OPTS is loaded from the file:
/etc/default/node_exporter


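If necessary, we can monitor individual services by enabling the systemd collector and adding the collector.systemd.unit-whitelist option (newer node_exporter releases name the same option collector.systemd.unit-include). Make sure the line is not commented out and that there are no nested quotes inside the value:

NODE_EXPORTER_OPTS="--collector.systemd --collector.systemd.unit-whitelist=(tomcat|guacd).service"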
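To see which unit names a pattern like this selects, here is a small Python sketch. It assumes the collector matches the pattern against the full unit name (anchored at both ends). Python's re module stands in for Go's regexp here, and every unit name other than tomcat.service and guacd.service is made up for illustration.

```python
import re

# Pattern taken from the NODE_EXPORTER_OPTS value.
UNIT_PATTERN = r"(tomcat|guacd).service"


def selected_units(pattern, unit_names):
    """Return the unit names that the pattern matches in full."""
    compiled = re.compile(pattern)
    return [name for name in unit_names if compiled.fullmatch(name)]


if __name__ == "__main__":
    # Hypothetical unit names, for illustration only.
    units = ["tomcat.service", "guacd.service", "sshd.service", "tomcat9.service"]
    print(selected_units(UNIT_PATTERN, units))
```

Note that the unescaped dot matches any single character, so a name such as guacd-service would also pass. In regular expression terms, `\.service` is the stricter form.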

After restarting the service, node_exporter exposes the node_systemd_unit_state metric for the selected units.
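One way to confirm that the metric is present is to read the exporter's metrics page and pick out the node_systemd_unit_state lines. The sketch below is a minimal illustration, not part of the original setup: the URL assumes node_exporter on its default port 9100 on the local host, the sample text is invented, and real output may carry additional labels.

```python
import re
import urllib.request

# Assumed address: node_exporter on its default port on this host.
METRICS_URL = "http://localhost:9100/metrics"

# Illustrative sample of the exposition format, not captured from a real host.
SAMPLE = """\
# TYPE node_systemd_unit_state gauge
node_systemd_unit_state{name="guacd.service",state="active"} 1
node_systemd_unit_state{name="guacd.service",state="failed"} 0
node_systemd_unit_state{name="tomcat.service",state="active"} 0
"""

LINE_RE = re.compile(r'^node_systemd_unit_state\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_unit_states(text):
    """Map (unit name, state) to the sample value found in the metrics text."""
    states = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # comments and unrelated metrics are skipped
        labels = dict(LABEL_RE.findall(match.group("labels")))
        states[(labels.get("name"), labels.get("state"))] = float(match.group("value"))
    return states


def fetch_unit_states(url=METRICS_URL):
    """Download the metrics page and parse the systemd unit state samples."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return parse_unit_states(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Parse the sample; on a monitored host, call fetch_unit_states() instead.
    for (name, state), value in sorted(parse_unit_states(SAMPLE).items()):
        if state == "active":
            print(name, "active" if value == 1 else "not active")
```

A value of 1 for state="active" means the unit is currently active, and 0 means it is in some other state. That is the condition the alert rules check.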

All that remains is to describe the alert rules:

- alert: guacd_service
  expr: node_systemd_unit_state{name="guacd.service",state="active"} == 0
  for: 1s
  annotations:
    summary: "{{ $labels.name }} on {{ $labels.instance }} is not active"
    description: "{{ $labels.name }} on {{ $labels.instance }} of job {{ $labels.job }} is not active."
- alert: tomcat_service
  expr: node_systemd_unit_state{name="tomcat.service",state="active"} == 0
  for: 1s
  annotations:
    summary: "{{ $labels.name }} on {{ $labels.instance }} is not active"
    description: "{{ $labels.name }} on {{ $labels.instance }} of job {{ $labels.job }} is not active."
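Before relying on a rule, it can help to run its expression as an instant query and see which instances it returns. The sketch below uses the Prometheus HTTP API endpoint /api/v1/query. The server address and the sample payload are assumptions for illustration, and the helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumed Prometheus address; adjust to your server.
PROMETHEUS_URL = "http://localhost:9090"

# Same expression as in the guacd_service rule.
EXPR = 'node_systemd_unit_state{name="guacd.service",state="active"} == 0'


def build_query_url(base_url, expr):
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url}/api/v1/query?{urllib.parse.urlencode({'query': expr})}"


def matching_instances(payload):
    """Return the instance label of every series in an instant-query result."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    return [series["metric"].get("instance", "") for series in payload["data"]["result"]]


def query_instances(expr=EXPR, base_url=PROMETHEUS_URL):
    """Run the expression against a live Prometheus server."""
    with urllib.request.urlopen(build_query_url(base_url, expr), timeout=5) as response:
        return matching_instances(json.load(response))


if __name__ == "__main__":
    # Invented response shaped like an instant-query result, for illustration.
    sample = {
        "status": "success",
        "data": {
            "resultType": "vector",
            "result": [
                {
                    "metric": {"instance": "server1:9100", "job": "node",
                               "name": "guacd.service", "state": "active"},
                    "value": [1667210000, "0"],
                }
            ],
        },
    }
    print(build_query_url(PROMETHEUS_URL, EXPR))
    print(matching_instances(sample))
```

An empty result means no instance currently satisfies the expression, so the alert would not fire. Any returned instance is one where the service is not active.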

An example of an alert on the Alertmanager dashboard in Grafana:

Conclusions

In this short article, we looked at how to monitor services on Linux machines. This approach lets you monitor any system service and greatly simplifies server administration. In the next article, we will explain how to avoid displaying the basic parameters that the exporter returns.
