- Lab
- A Cloud Guru
Configuring Prometheus Alertmanager for High Availability
Prometheus Alertmanager handles the alerts that Prometheus fires, but a lone Alertmanager instance is a single point of failure. Alertmanager can instead run as a multi-instance cluster, so alert handling continues if one instance goes down. In this hands-on lab, you will make an existing single-instance Alertmanager setup highly available by adding a second instance.
-
Challenge
Configure the Two Alertmanager Instances to Form a Cluster
-
Log in to both the Prometheus Server and the Alertmanager 2 server.
-
On both servers, edit the Alertmanager unit file:
sudo vi /etc/systemd/system/alertmanager.service
-
Locate the ExecStart line. On each server, add the other server's private IP address using the --cluster.peer flag.
-
On the Prometheus Server:
ExecStart=/usr/local/bin/alertmanager --config.file /etc/alertmanager/alertmanager.yml --storage.path /var/lib/alertmanager/ --cluster.peer=10.0.1.102:9094
-
On the Alertmanager 2 server:
ExecStart=/usr/local/bin/alertmanager --config.file /etc/alertmanager/alertmanager.yml --storage.path /var/lib/alertmanager/ --cluster.peer=10.0.1.101:9094
-
On both servers, reload the unit file:
sudo systemctl daemon-reload
-
On the Prometheus Server, restart Alertmanager:
sudo systemctl restart alertmanager
-
On the Alertmanager 2 server, enable and start Alertmanager:
sudo systemctl enable alertmanager
sudo systemctl start alertmanager
-
Test your cluster setup by creating a silence on one instance and verifying it appears on the other instance. Access both instances in a browser:
http://<PUBLIC_IP>:9093
-
On one instance, click Silences and create a new silence.
-
Click Silences on the other instance and verify the silence you created appears.
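The cluster check above can also be approximated from a shell on either server. The following is a minimal sketch and not part of the lab itself: it assumes curl is installed, queries Alertmanager's /api/v2/status endpoint, and counts peers with a crude text match on the "address" field rather than a real JSON parser. The function names count_peers and peers_seen_by are made up for illustration.

```shell
#!/bin/sh
# Rough check of Alertmanager cluster membership via the v2 status API.
# Assumptions (not from the lab text): curl is installed, and the hosts
# queried are reachable on port 9093 from wherever this runs.

# Count peer entries in a /api/v2/status JSON document read from stdin.
# Each cluster peer is reported with an "address" field; this is a plain
# text match, not a JSON parser.
count_peers() {
    grep -o '"address":' | wc -l | tr -d ' '
}

# Print the number of peers reported by one Alertmanager instance.
# $1 is host:port, for example localhost:9093.
peers_seen_by() {
    curl -s "http://$1/api/v2/status" | count_peers
}

# Example (uncomment on a lab server):
# for am in localhost:9093 10.0.1.102:9093; do
#     echo "$am reports $(peers_seen_by "$am") peer(s)"
# done
```

Once the second instance has joined, the count reported by each instance should be higher than it was for the single-instance setup. A JSON tool such as jq would be a more robust way to read the response if it is available.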
-
Challenge
Configure Prometheus to Use Your Multi-Instance Alertmanager Setup
-
On the Prometheus Server, edit the Prometheus configuration file:
sudo vi /etc/prometheus/prometheus.yml
-
Add the new Alertmanager (10.0.1.102:9093) to the list of Alertmanager targets:
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      - localhost:9093
      - 10.0.1.102:9093
-
Restart Prometheus to reload the config:
sudo systemctl restart prometheus
-
Access the Prometheus server in a browser:
http://<PROMETHEUS_SERVER_PUBLIC_IP>:9090
-
Click Status > Runtime & Build Information.
-
Verify both of your Alertmanagers appear under the Alertmanagers section.
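The browser check above has a command-line counterpart. This is a hedged sketch rather than a lab step: it assumes curl is installed, that Prometheus answers on localhost:9090, and that the /api/v1/alertmanagers response arrives as compact single-line JSON with the active list before the dropped list. The function name active_alertmanagers is hypothetical.

```shell
#!/bin/sh
# List the Alertmanager endpoints that Prometheus currently treats as active.
# Assumptions (not from the lab text): curl is installed, and the JSON is
# compact, single-line, with "activeAlertmanagers" before "droppedAlertmanagers".

# Read an /api/v1/alertmanagers response on stdin and print one active
# Alertmanager URL per line. Text-based extraction, not a JSON parser.
active_alertmanagers() {
    sed 's/"droppedAlertmanagers".*//' \
        | grep -o '"url":"[^"]*"' \
        | cut -d'"' -f4
}

# Example (uncomment on the Prometheus Server):
# curl -s http://localhost:9090/api/v1/alertmanagers | active_alertmanagers
```

If the configuration change took effect, the output should contain one line for each of the two targets added to prometheus.yml.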