29.11.2022

Hostkey Windows Infrastructure Monitoring Architecture

HOSTKEY
Rent dedicated and virtual servers with instant deployment in reliable TIER III class data centers in the Netherlands and the USA. Free protection against DDoS attacks included, and your server will be ready for work in as little as 15 minutes. 24/7 Customer Support.

Author: Nikita Zubarev, Senior DevOps, HOSTKEY Lead Infrastructure Specialist

Earlier we talked about how Linux systems are monitored. Now let's look at how to set up monitoring for Windows Server.

In one of the previous articles, we discussed integrating FreeIPA with Active Directory. While developing the solution described in that article, we needed to collect metrics from the servers that make up our Windows infrastructure. In this article, we will look at how to set up monitoring for Windows servers using an existing system built on Prometheus + Grafana + Alertmanager. We will collect general system metrics: CPU, RAM and disk.
The metrics are exposed by the WMI exporter (renamed windows_exporter in later releases, which is why the metric names used in the rules in this article start with windows_). The exporter can also be used to monitor IIS sites and applications, network interfaces, services, and even local temperature.

Detailed information about the WMI exporter can be found on GitHub.
After installing the exporter, open the Windows "Services" panel (just type "Services" in the Windows search menu), find the WMI exporter in the list, and check that the service is running.

If the exporter is running, it serves metrics at http://localhost:9182/metrics, much as the Node Exporter does on Linux. All that remains is to enable metric collection in Prometheus, add a dashboard and set up your alerts:

Add to the scrape_configs section of prometheus.yml:

  - job_name: 'Windows_Infra'
    static_configs:
      - targets: ['localhost:9182']
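The localhost target only works when Prometheus runs on the Windows server itself. When Prometheus runs on a separate host, the job lists each Windows machine instead. The fragment below is a minimal sketch of that case: the hostnames, the scrape interval and the environment label are hypothetical placeholders, not values from the original setup.

```yaml
# prometheus.yml (fragment): scraping several Windows hosts.
# Hostnames, interval and label values are illustrative assumptions.
scrape_configs:
  - job_name: 'Windows_Infra'
    scrape_interval: 30s                  # assumed; omit to use the global interval
    static_configs:
      - targets:
          - 'win-srv-01.example.com:9182' # hypothetical host, default exporter port
          - 'win-srv-02.example.com:9182' # hypothetical host
        labels:
          environment: 'production'       # hypothetical label attached to every target
```

Keeping the job name as Windows_Infra means the alert rules that filter on job="Windows_Infra" apply to every host in the list.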

Add a Grafana dashboard that uses Prometheus as its data source. It shows a detailed summary of resources and provides a visual panel for each host.

What you have to do now is define the alerting rules in a file listed under rule_files in prometheus.yml:

groups:
  - name: Windows_Infra
    rules:
      # Fires when a target of the job cannot be scraped for 5 minutes
      - alert: InstanceDown
        expr: up{job="Windows_Infra"} == 0
        for: 5m
        labels:
          severity: page
        annotations:
          summary: "Instance {{ $labels.instance }} down"
          description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes."

      # Fires when a logical disk stays more than 90% full for 10 minutes
      - alert: DiskSpaceUsage
        expr: 100.0 - 100 * (windows_logical_disk_free_bytes{job="Windows_Infra"} / windows_logical_disk_size_bytes{job="Windows_Infra"}) > 90
        for: 10m
        labels:
          severity: high
        annotations:
          summary: "Disk Space Usage (instance {{ $labels.instance }})"
          description: "Disk space on the drive is more than 90% used\n  VALUE = {{ $value }}\n  LABELS: {{ $labels }}"
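Rules for the other two general metrics, CPU and RAM, can follow the same pattern. The group below is a sketch rather than the rules used in the original setup: the group name, alert names, thresholds and durations are assumptions, and the memory metric names depend on the exporter version, so check them against the /metrics output of your hosts before relying on them.

```yaml
# Hypothetical extra rule group; names and thresholds are illustrative.
groups:
  - name: Windows_Infra_Resources
    rules:
      # CPU usage = 100% minus the average idle share across all cores
      - alert: HighCpuUsage
        expr: 100 - (avg by (instance) (rate(windows_cpu_time_total{job="Windows_Infra", mode="idle"}[5m])) * 100) > 90
        for: 10m
        labels:
          severity: high
        annotations:
          summary: "High CPU usage (instance {{ $labels.instance }})"
          description: "CPU usage has been above 90% for 10 minutes\n  VALUE = {{ $value }}"

      # Memory usage = 100% minus the free share of physical memory.
      # Metric names here are from the os and cs collectors and may
      # differ in other exporter versions.
      - alert: HighMemoryUsage
        expr: 100 - 100 * (windows_os_physical_memory_free_bytes{job="Windows_Infra"} / windows_cs_physical_memory_bytes{job="Windows_Infra"}) > 90
        for: 10m
        labels:
          severity: high
        annotations:
          summary: "High memory usage (instance {{ $labels.instance }})"
          description: "Memory usage has been above 90% for 10 minutes\n  VALUE = {{ $value }}"
```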

Add any other necessary parameters in the same way. When Alertmanager is configured, notifications will automatically be sent to connected systems.
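For the rules to be evaluated and the alerts to reach Alertmanager, prometheus.yml has to reference both the rules file and the Alertmanager instance. A minimal sketch follows; the file path is hypothetical, and the target assumes Alertmanager runs on the same host as Prometheus on its default port 9093.

```yaml
# prometheus.yml (fragment): wiring rules and Alertmanager.
rule_files:
  - 'rules/windows_infra.yml'       # hypothetical path to the rules file

alerting:
  alertmanagers:
    - static_configs:
        - targets: ['localhost:9093'] # assumed address; 9093 is Alertmanager's default port
```

Before reloading Prometheus, the files can be validated with promtool check rules for the rules file and promtool check config for prometheus.yml.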

As you can see, monitoring Windows servers can be done easily with Prometheus and Grafana. This solution greatly simplifies managing a large fleet of machines and lets you quickly solve problems that arise when administering servers. In the following articles, we will consider the task of connecting Windows services.
