Managing Downtime with Ansible and Checkmk
Ansible, combined with the tribe29.checkmk collection, can automate scheduling and removing downtimes in Checkmk monitoring. This article walks through installing the collection, setting up your inventory, and writing and running playbooks that schedule and remove downtimes.
Installation
First, you need to install the Checkmk collection for Ansible:
ansible-galaxy collection install tribe29.checkmk
Documentation
To understand how to use the module, you can check its documentation:
ansible-doc tribe29.checkmk.downtime
Inventory Setup
Your Ansible inventory should list each server by the FQDN under which it is registered in Checkmk, for example:
[downtime]
server1.example.com
server2.example.com
server3.example.com
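Ansible also accepts inventories written in YAML. As a minimal sketch, the same downtime group could be expressed as follows, assuming you prefer to keep the inventory in a YAML file; the host names are the same placeholders as above:

```yaml
# YAML form of the INI inventory above: one group named "downtime".
all:
  children:
    downtime:
      hosts:
        server1.example.com:
        server2.example.com:
        server3.example.com:
```

Either format works with the playbooks in this article, because they only refer to the group name downtime.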
Playbook for Scheduling Downtime
Here’s a sample playbook to schedule downtime:
---
- hosts: downtime
  gather_facts: false
  # Ask for the Checkmk credentials at run time instead of storing them.
  vars_prompt:
    - name: checkmkuser
      prompt: "Enter your checkmk username:"
      private: no
    - name: checkmkpass
      prompt: "Enter your checkmk password:"
  tasks:
    - name: "Schedule host downtime."
      # The API call is made from the control node, once per inventory host.
      delegate_to: localhost
      tribe29.checkmk.downtime:
        server_url: "https://checkmk.example.com"
        site: "sitename"
        automation_user: "{{ checkmkuser }}"
        automation_secret: "{{ checkmkpass }}"
        host_name: "{{ inventory_hostname }}"
        comment: "patch"
        state: present
        # Quoted so that YAML keeps the timestamps as plain strings.
        start_time: "2024-09-21T17:30:00Z"
        end_time: "2024-09-21T23:30:00Z"
Execution
Before running the playbook, make sure that comment, start_time, and end_time are set correctly in it. Then run:
ansible-playbook -i inventory/downtime-test playbooks/downtime_tribe29_time.yml
The playbook prompts for the Checkmk username and password during execution.
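Editing the playbook before every run is easy to get wrong, so one option is to move the comment and the time window into play variables. The following is only a sketch under stated assumptions: the variable names downtime_comment, downtime_start, and downtime_end are hypothetical and not part of the collection, and the extra assert task simply compares the two timestamps as strings, which works only while both use the same UTC format shown here.

```yaml
---
- hosts: downtime
  gather_facts: false
  vars:
    # Hypothetical variable names; the defaults mirror the sample playbook.
    downtime_comment: "patch"
    downtime_start: "2024-09-21T17:30:00Z"
    downtime_end: "2024-09-21T23:30:00Z"
  vars_prompt:
    - name: checkmkuser
      prompt: "Enter your checkmk username:"
      private: no
    - name: checkmkpass
      prompt: "Enter your checkmk password:"
  tasks:
    - name: "Check that the downtime window ends after it starts."
      delegate_to: localhost
      run_once: true
      ansible.builtin.assert:
        that:
          # Same-format UTC timestamps sort correctly as plain strings.
          - downtime_start < downtime_end
        fail_msg: "downtime_end must be later than downtime_start"

    - name: "Schedule host downtime."
      delegate_to: localhost
      tribe29.checkmk.downtime:
        server_url: "https://checkmk.example.com"
        site: "sitename"
        automation_user: "{{ checkmkuser }}"
        automation_secret: "{{ checkmkpass }}"
        host_name: "{{ inventory_hostname }}"
        comment: "{{ downtime_comment }}"
        state: present
        start_time: "{{ downtime_start }}"
        end_time: "{{ downtime_end }}"
```

With this layout the window can be overridden per run through the standard --extra-vars (-e) option of ansible-playbook, for example -e "downtime_start=2024-10-05T17:30:00Z downtime_end=2024-10-05T23:30:00Z", because extra vars take precedence over play vars.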
Playbook for Removing Downtime
To remove previously scheduled downtime with a specific comment:
---
- hosts: downtime
  gather_facts: false
  vars_prompt:
    - name: checkmkuser
      prompt: "Enter your checkmk username:"
      private: no
    - name: checkmkpass
      prompt: "Enter your checkmk password:"
  tasks:
    - name: "Remove scheduled host downtime."
      delegate_to: localhost
      tribe29.checkmk.downtime:
        server_url: "https://checkmk.example.com"
        site: "sitename"
        automation_user: "{{ checkmkuser }}"
        automation_secret: "{{ checkmkpass }}"
        host_name: "{{ inventory_hostname }}"
        # Must match the comment used when the downtime was scheduled.
        comment: "patch"
        state: absent
Conclusion
Using Ansible with the Checkmk module simplifies the management of server downtimes. This automation reduces human error and saves time, making it a valuable tool for system administrators.