02.10.2023

Monitoring oVirt SSL Certificates

Rent dedicated and virtual servers with instant deployment in reliable TIER III class data centers in the Netherlands and the USA. Free protection against DDoS attacks included, and your server will be ready for work in as little as 15 minutes. 24/7 Customer Support.

Author: Stepan Vakheta, DevOps specialist at the Hostkey company

At Hostkey we use oVirt as our main virtualization platform. As the infrastructure grows to dozens and even hundreds of physical servers, keeping the platform reliable becomes critical. In this article, we briefly describe our approach to monitoring oVirt certificates.

In earlier articles, we covered monitoring with Prometheus + Alertmanager + Node Exporter, as well as HTTP and SSL checks with the Prometheus Blackbox Exporter.

Today we are going to talk about monitoring the certificates stored locally on the two main components of oVirt: oVirt Engine and oVirt Node. These certificates secure the communication between the two components.

  • The oVirt Engine is the central management component that controls all virtualization hosts, disk shares and virtual networks.
  • oVirt Node is a component installed on each individual host that manages all the resources of that host and the virtual machines running on it.

Depending on the architecture, oVirt nodes can be combined into clusters. In this case, it is important to maintain a high level of reliability of communication between system components.

Communication between the oVirt Engine and oVirt hosts is performed over an encrypted SSL connection based on the certificates of these components. Depending on the oVirt version, the validity period of these certificates may vary: before version 4.5 it was 398 days, and from version 4.5 it has been increased to 5 years.

It is important not to miss the next certificate renewal. Once the certificates expire, the Engine and the hosts can no longer communicate. Virtual machines become impossible to manage, and restoring normal operation takes considerable time.

The best solution to the problem is to prevent it from occurring in the first place. We collect the necessary metrics with SSL Exporter, which can take local files as its probe target. That is exactly what this task needs.

After installing and starting the exporter, define the probe targets for each system component. According to the documentation, the relevant certificates are stored in the following paths:

  • for ovirt-engine — /etc/pki/ovirt-engine;
  • for ovirt-host — /etc/pki/vdsm/ and /etc/pki/libvirt/.

The exporter can match and probe multiple files at once using glob patterns (via the doublestar package), which we use in our targets.
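Before pointing the exporter at a pattern, it can help to preview which files a recursive glob would pick up on a given host. The sketch below is only an approximation: it relies on Python's pathlib rather than the doublestar package, so edge-case matching may differ, and the helper name `preview_pem_files` is hypothetical.

```python
from pathlib import Path


def preview_pem_files(root: str) -> list[str]:
    """Return every *.pem file under root, at any depth, sorted by path.

    Approximates a pattern such as <root>/**/**.pem. This is Python's own
    matching, not the doublestar package that the exporter uses.
    """
    base = Path(root)
    if not base.is_dir():
        return []
    return sorted(str(p) for p in base.rglob("*.pem") if p.is_file())


if __name__ == "__main__":
    # Path taken from the list above; prints nothing if it does not exist.
    for path in preview_pem_files("/etc/pki/vdsm"):
        print(path)
```

Running it on an oVirt host against each of the directories listed above gives a quick check that the certificates of interest actually carry a .pem extension.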

Target parameter for the oVirt Engine:

http://<engine_address>:9219/probe?module=file&target=/etc/pki/ovirt-engine/**/**.pem

Target parameter for the oVirt Hosts:

http://<node_address>:9219/probe?module=file&target=/etc/pki/vdsm/**/**.pem
http://<node_address>:9219/probe?module=file&target=/etc/pki/libvirt/**/**.pem
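The probe URLs above can also be assembled programmatically, for example when checking many hosts by hand. This is a minimal sketch: `build_probe_url` is a hypothetical helper, and `node_address:9219` is the same placeholder used in the URLs above.

```python
from urllib.parse import urlencode


def build_probe_url(exporter: str, pattern: str, module: str = "file") -> str:
    """Build an exporter probe URL for a file glob pattern.

    exporter is host:port, e.g. "node_address:9219" (a placeholder).
    '/' and '*' are left unescaped so the URL stays readable.
    """
    query = urlencode({"module": module, "target": pattern}, safe="/*")
    return f"http://{exporter}/probe?{query}"


if __name__ == "__main__":
    for pattern in ("/etc/pki/vdsm/**/**.pem", "/etc/pki/libvirt/**/**.pem"):
        print(build_probe_url("node_address:9219", pattern))
```

The resulting string can be opened in a browser or fetched with any HTTP client to see the raw metrics before Prometheus is configured.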

[Figure: a sample of the metrics collected]

Next, describe the scrape configuration for Prometheus and add it to the Prometheus configuration file. For clarity, we split it into separate jobs by job_name, which makes the alerts easier to tell apart in the Alertmanager panel:

/etc/prometheus/prometheus.yml

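# These jobs belong under the scrape_configs: section.

# Certificates on the oVirt Engine
- job_name: ssl_file_engine
  metrics_path: /probe
  params:
    module:
      - file
    target:
      - /etc/pki/ovirt-engine/**/**.pem
  static_configs:
    - targets:
        # Placeholders: one entry per engine running the exporter
        - engine_address:9219
        - engine_address:9219

# VDSM certificates on the oVirt hosts
- job_name: ssl_file_vdsm_node
  metrics_path: /probe
  params:
    module:
      - file
    target:
      - /etc/pki/vdsm/**/**.pem
  static_configs:
    - targets:
        # Placeholders: one entry per host running the exporter
        - node_address:9219
        - node_address:9219

# libvirt certificates on the oVirt hosts
- job_name: ssl_file_libvirt_node
  metrics_path: /probe
  params:
    module:
      - file
    target:
      - /etc/pki/libvirt/**/**.pem
  static_configs:
    - targets:
        # Placeholders: one entry per host running the exporter
        - node_address:9219
        - node_address:9219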
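Since the three jobs differ only in name, glob pattern, and target list, they can be rendered from a small table instead of being maintained by hand. The following is a rough sketch under that assumption: `render_job` and `JOB_PATTERNS` are hypothetical names, and the host names in the usage block are made up.

```python
# Glob patterns from the jobs above, keyed by job_name.
JOB_PATTERNS = {
    "ssl_file_engine": "/etc/pki/ovirt-engine/**/**.pem",
    "ssl_file_vdsm_node": "/etc/pki/vdsm/**/**.pem",
    "ssl_file_libvirt_node": "/etc/pki/libvirt/**/**.pem",
}


def render_job(job_name: str, exporters: list[str]) -> str:
    """Render one scrape job as YAML text for the given exporter addresses."""
    lines = [
        f"- job_name: {job_name}",
        "  metrics_path: /probe",
        "  params:",
        "    module:",
        "      - file",
        "    target:",
        f"      - {JOB_PATTERNS[job_name]}",
        "  static_configs:",
        "    - targets:",
    ]
    lines.extend(f"        - {address}" for address in exporters)
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical host names; replace with the real exporter addresses.
    hosts = ["node1.example:9219", "node2.example:9219"]
    print(render_job("ssl_file_vdsm_node", hosts))
```

The output is plain text meant to be pasted under scrape_configs; it is worth validating the final file with Prometheus's own tooling before reloading.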

Next we need to describe a configuration file with rules for triggering alerts. We will be interested in the certificate expiration date.

Let's add a rule that will be triggered 70 days or less before the certificate expiration date.

ssl_file_engine.yml

groups:
  - name: ssl_file_engine
    rules:
      - alert: SSLCertExpiringSoon
        # True when less than 70 days (86400 s * 70) remain before expiry
        expr: ssl_file_cert_not_after{job="ssl_file_engine"} - time() < 86400 * 70
        for: 10m
        labels:
          severity: critical
        annotations:
          description: "SSL certificate will expire in {{ $value | humanizeDuration }} (instance {{ $labels.instance }}) (file {{ $labels.file }})"

ssl_file_libvirt_node.yml

groups:
  - name: ssl_file_libvirt_node
    rules:
      - alert: SSLCertExpiringSoon
        # True when less than 70 days (86400 s * 70) remain before expiry
        expr: ssl_file_cert_not_after{job="ssl_file_libvirt_node"} - time() < 86400 * 70
        for: 10m
        labels:
          severity: critical
        annotations:
          description: "SSL certificate will expire in {{ $value | humanizeDuration }} (instance {{ $labels.instance }}) (file {{ $labels.file }})"

ssl_file_vdsm_node.yml

groups:
  - name: ssl_file_vdsm_node
    rules:
      - alert: SSLCertExpiringSoon
        # True when less than 70 days (86400 s * 70) remain before expiry
        expr: ssl_file_cert_not_after{job="ssl_file_vdsm_node"} - time() < 86400 * 70
        for: 10m
        labels:
          severity: critical
        annotations:
          description: "SSL certificate will expire in {{ $value | humanizeDuration }} (instance {{ $labels.instance }}) (file {{ $labels.file }})"
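The arithmetic behind the alert expression is simple enough to check by hand. The sketch below mirrors it in Python purely for illustration; it is not part of the monitoring setup, and the function names are hypothetical.

```python
import time

SECONDS_PER_DAY = 86400
THRESHOLD_DAYS = 70  # same threshold as in the rules above


def seconds_left(not_after: float, now: float) -> float:
    """Mirror of `ssl_file_cert_not_after - time()` in the alert expression."""
    return not_after - now


def is_expiring_soon(not_after: float, now: float, days: int = THRESHOLD_DAYS) -> bool:
    """True when less than `days` days remain, as in `< 86400 * 70`."""
    return seconds_left(not_after, now) < SECONDS_PER_DAY * days


def days_until_first_alert(validity_days: int, days: int = THRESHOLD_DAYS) -> int:
    """Approximate days after issuance before the condition becomes true."""
    return max(validity_days - days, 0)


if __name__ == "__main__":
    now = time.time()
    not_after = now + 60 * SECONDS_PER_DAY  # hypothetical cert with 60 days left
    print(is_expiring_soon(not_after, now))
    print(days_until_first_alert(398))
```

For a certificate valid for 398 days, the condition becomes true about 328 days after issuance; the `for: 10m` clause then delays the alert until the condition has held for ten minutes.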

When a certificate has less than 70 days left and the condition has held for 10 minutes, the alert appears in the Alertmanager panel.

[Figure: the alert in the Alertmanager panel]

Monitoring in this way helps prevent failures caused by late renewal of SSL certificates and keeps the virtual infrastructure stable. With a few simple steps, you can avoid problems that would otherwise cause downtime for a large number of resources.
