Checkmk Downtime

Managing Downtime with Ansible and Checkmk

Ansible, combined with the tribe29.checkmk collection, can automate setting and removing downtimes in Checkmk. This article walks through installing the collection, setting up your inventory, and writing and running playbooks that schedule and remove downtime.

Installation

First, you need to install the Checkmk collection for Ansible:

ansible-galaxy collection install tribe29.checkmk
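
If you would rather track the dependency next to your playbooks than install it by hand, the collection can also be listed in a requirements file. This is only a minimal sketch; the file name requirements.yml is a common convention and an assumption here, not part of the setup described in this article.

```yaml
# requirements.yml (assumed file name, kept next to the playbooks)
collections:
  # Same collection that the ansible-galaxy command installs directly
  - name: tribe29.checkmk
```

With that file in place, `ansible-galaxy collection install -r requirements.yml` installs everything it lists.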

Documentation

To understand how to use the module, you can check its documentation:

ansible-doc tribe29.checkmk.downtime

Inventory Setup

Your Ansible inventory should list each server by the FQDN under which it is registered in Checkmk, because the playbooks pass the inventory hostname to Checkmk as the host name. For example:

[downtime]
server1.example.com
server2.example.com
server3.example.com
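
If your inventories are written in YAML rather than INI, the same group could look roughly like this. The file name is hypothetical; the hosts are the same three examples.

```yaml
# inventory/downtime.yml (hypothetical file name)
all:
  children:
    downtime:
      hosts:
        # Each name must match the host name registered in Checkmk
        server1.example.com:
        server2.example.com:
        server3.example.com:
```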

Playbook for Scheduling Downtime

Here’s a sample playbook to schedule downtime:

---
- hosts: downtime
  gather_facts: false
  vars_prompt:
    - name: checkmkuser
      prompt: "Enter your checkmk username:"
      private: no
    - name: checkmkpass
      prompt: "Enter your checkmk password:"

  tasks:
   - name: "Schedule host downtime."
     delegate_to: localhost
     tribe29.checkmk.downtime:
       server_url: "https://checkmk.example.com"
       site: "sitename"
       automation_user: "{{ checkmkuser }}"
       automation_secret: "{{ checkmkpass }}"
       host_name: "{{ inventory_hostname }}"
       comment: "patch"
       state: present
       start_time: "2024-09-21T17:30:00Z"
       end_time: "2024-09-21T23:30:00Z"
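
Fixed timestamps have to be edited for every maintenance window. As far as I know, the downtime module also accepts relative offsets through start_after and end_after, used in place of start_time and end_time. Treat the following as a sketch under that assumption and confirm the option names with ansible-doc for your installed version before relying on it.

```yaml
---
- hosts: downtime
  gather_facts: false
  vars_prompt:
    - name: checkmkuser
      prompt: "Enter your checkmk username:"
      private: no
    - name: checkmkpass
      prompt: "Enter your checkmk password:"

  tasks:
    - name: "Schedule host downtime relative to now."
      delegate_to: localhost
      tribe29.checkmk.downtime:
        server_url: "https://checkmk.example.com"
        site: "sitename"
        automation_user: "{{ checkmkuser }}"
        automation_secret: "{{ checkmkpass }}"
        host_name: "{{ inventory_hostname }}"
        comment: "patch"
        state: present
        # Assumed options: start 5 minutes from now, end 6 hours after the start
        start_after:
          minutes: 5
        end_after:
          hours: 6
```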

Execution

Before running the playbook, check that comment, start_time, and end_time in the playbook are set to the values you want. Then run it against your inventory:

ansible-playbook -i inventory/downtime-test playbooks/downtime_tribe29_time.yml

The playbook prompts for the Checkmk username and password when it runs.
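
Editing the playbook before every run is easy to get wrong. One possible way around that is to move the three values into play variables with defaults and override them on the command line. The variable names downtime_comment, downtime_start, and downtime_end are made up for this sketch.

```yaml
---
- hosts: downtime
  gather_facts: false
  vars:
    # Hypothetical variable names; override them with -e at run time
    downtime_comment: "patch"
    downtime_start: "2024-09-21T17:30:00Z"
    downtime_end: "2024-09-21T23:30:00Z"
  vars_prompt:
    - name: checkmkuser
      prompt: "Enter your checkmk username:"
      private: no
    - name: checkmkpass
      prompt: "Enter your checkmk password:"

  tasks:
    - name: "Schedule host downtime."
      delegate_to: localhost
      tribe29.checkmk.downtime:
        server_url: "https://checkmk.example.com"
        site: "sitename"
        automation_user: "{{ checkmkuser }}"
        automation_secret: "{{ checkmkpass }}"
        host_name: "{{ inventory_hostname }}"
        comment: "{{ downtime_comment }}"
        state: present
        start_time: "{{ downtime_start }}"
        end_time: "{{ downtime_end }}"
```

A run could then set the window without touching the file, for example by appending `-e downtime_comment=kernel-update -e downtime_start=2024-10-05T17:30:00Z -e downtime_end=2024-10-05T23:30:00Z` to the ansible-playbook command. Extra vars passed with -e take precedence over the defaults in vars.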

Playbook for Removing Downtime

To remove previously scheduled downtimes that carry a specific comment, use the same task with state set to absent:

---
- hosts: downtime
  gather_facts: false
  vars_prompt:
    - name: checkmkuser
      prompt: "Enter your checkmk username:"
      private: no
    - name: checkmkpass
      prompt: "Enter your checkmk password:"

  tasks:
   - name: "Remove scheduled host downtime."
     delegate_to: localhost
     tribe29.checkmk.downtime:
       server_url: "https://checkmk.example.com"
       site: "sitename"
       automation_user: "{{ checkmkuser }}"
       automation_secret: "{{ checkmkpass }}"
       host_name: "{{ inventory_hostname }}"
       comment: "patch"
       state: absent
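
The prompts get in the way if downtime should be removed from a scheduled or unattended job. A possible variation, sketched below, reads the credentials from environment variables on the control node instead. CHECKMK_USER and CHECKMK_SECRET are hypothetical variable names that you would export yourself before the run.

```yaml
---
- hosts: downtime
  gather_facts: false
  vars:
    # Hypothetical environment variables, read on the control node
    checkmkuser: "{{ lookup('ansible.builtin.env', 'CHECKMK_USER') }}"
    checkmkpass: "{{ lookup('ansible.builtin.env', 'CHECKMK_SECRET') }}"

  tasks:
    - name: "Remove scheduled host downtime."
      delegate_to: localhost
      tribe29.checkmk.downtime:
        server_url: "https://checkmk.example.com"
        site: "sitename"
        automation_user: "{{ checkmkuser }}"
        automation_secret: "{{ checkmkpass }}"
        host_name: "{{ inventory_hostname }}"
        comment: "patch"
        state: absent
```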

Conclusion

Using Ansible with the Checkmk module simplifies the management of server downtimes. This automation reduces human error and saves time, making it a valuable tool for system administrators.

Matinen.com

2025-01-10