Create a Nagios Docker Container and Use It to Monitor the Health of Scale-out POWER Systems

Nagios is an open source IT infrastructure monitoring tool that is very common in datacenters. The initial roll-out of Nagios in a datacenter is a bit of a challenge. However, with the introduction of the appliance delivery model, things are much simpler now.

There is also a Nagios Dockerfile available, which makes it extremely easy to set up Nagios in a Docker container.

In this article, I’ll show you how to extend the existing Nagios Dockerfile to add an IPMI-based health check and use it to monitor scale-out Power servers, whether they run PowerKVM or bare metal. You can use the same procedure to monitor both Intel and Power servers.

Setting up Nagios in a Docker Container

I’ll create a Nagios Docker container on one of my management nodes to check the health of my servers. The container uses the check_ipmi_sensor Nagios plugin to monitor server health over IPMI. You can find more details on the Nagios IPMI plugin on the Thomas Krenn wiki, which is maintained by the original author of the plugin.

You can get the Dockerfile that configures the IPMI-based health-check plugin in Nagios from my GitHub repository.

Steps:

1. Clone the GitHub repository that contains the Nagios Dockerfile and change to its directory

 $ git clone https://github.com/bpradipt/docker-nagios.git

 $ cd docker-nagios

 Modify remotehost.cfg to add the details of the hosts you want to monitor, along with their service definitions. In this example I have added my PowerKVM server.

 ‘address’ (the host IP) is the IP assigned to the installed operating system.

 ‘_ipmi_ip’ is the IP assigned to the management controller (FSP, BMC, RSA, etc.).

## HOST DEFINITIONS
define host{
     use                 linux-server              ; Name of host template to use
     host_name           powerkvm1
     alias               powerkvm1
     address             A.B.C.D                   ; host OS IP
     _ipmi_ip            W.X.Y.Z                   ; host management controller IP
}

## SERVICE DEFINITIONS
# Check all the sensors
define service{
     use                 generic-service
     host_name           powerkvm1
     service_description SENSOR_HEALTHCHECK
     check_command       check_ipmi_sensor!/etc/ipmi-config/ipmi.cfg 
}

You can also customize the service to check only specific sensors. For example:
# Check only temperature and voltage sensors
define service{
     use                 generic-service
     host_name           powerkvm1
     service_description VOLT_TEMP_HEALTHCHECK
     check_command       check_ipmi_sensor!/etc/ipmi-config/ipmi.cfg!142,144
}
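The sensor IDs in the second argument (142 and 144 above) are specific to each machine, so they have to be looked up first. The following is a minimal sketch of one way to do that, assuming the FreeIPMI `ipmi-sensors` tool prints its default table with the ID in the first column and the sensor type in the third. The function name `sensor_ids_for_types` is hypothetical and not part of the repository.

```shell
# Read an `ipmi-sensors` table on stdin and print a comma-separated list
# of the IDs whose Type column matches one of the requested types.
# Assumed input layout: ID | Name | Type | Reading | Units | Event
sensor_ids_for_types() {
    awk -F'|' -v types="$1" '
        BEGIN { n = split(types, want, ",") }
        NR > 1 {                                  # skip the header row
            id = $1; type = $3
            gsub(/^[ \t]+|[ \t]+$/, "", id)       # trim surrounding blanks
            gsub(/^[ \t]+|[ \t]+$/, "", type)
            for (i = 1; i <= n; i++)
                if (type == want[i])
                    ids = ids (ids == "" ? "" : ",") id
        }
        END { print ids }
    '
}
```

A possible invocation, with the management controller IP and config file path taken from the examples above, would be: ipmi-sensors -h W.X.Y.Z --config-file=/etc/ipmi-config/ipmi.cfg | sensor_ids_for_types "Temperature,Voltage"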

Similarly, you can add definitions to check the System Event Log (SEL), other sensors, and so on.

Modify commands_ipmi.cfg to add the command definitions that are actually used to fetch the data.
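As an illustration of what such a command definition can look like, the sketch below writes a sample to a scratch file. It is not the repository's actual commands_ipmi.cfg. It assumes the plugin is installed in the Nagios plugin directory ($USER1$), and it relies on the standard Nagios behaviour that a custom host variable `_ipmi_ip` is exposed as the macro `$_HOSTIPMI_IP$`. How the optional second argument (the sensor ID list) is passed to the plugin depends on the real definition and is not shown here.

```shell
# Write an illustrative command definition to a scratch file.
# The quoted 'EOF' keeps the Nagios $MACRO$ names from being expanded by the shell.
cat > commands_ipmi.cfg.sample <<'EOF'
# $_HOSTIPMI_IP$ expands to the host's custom _ipmi_ip variable.
# $ARG1$ is the FreeIPMI config file path given after the first "!".
define command{
     command_name        check_ipmi_sensor
     command_line        $USER1$/check_ipmi_sensor -H $_HOSTIPMI_IP$ -f $ARG1$
}
EOF
```

With this definition, the check runs against the management controller IP rather than the host OS IP, which is why the host definition carries both addresses.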

Additionally, modify ipmi.cfg to add the IPMI credentials and any other options required by the FreeIPMI tools.
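A minimal sketch of such a file is shown below, written to a scratch file so that nothing in the repository is overwritten. The keys follow the FreeIPMI configuration file format. The user name, password, and privilege level are placeholders, and the right values depend on the accounts configured on your management controller.

```shell
# Write a sample FreeIPMI config file with placeholder credentials.
cat > ipmi.cfg.sample <<'EOF'
# Placeholder values: replace with an account defined on your BMC/FSP.
username monitor
password changeme
privilege-level user
EOF

# The file holds a password, so restrict it to the owner.
chmod 600 ipmi.cfg.sample
```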

2. Build the container

$ sudo docker build --force-rm=true -t bpradipt/nagios_ipmi .

On successful completion, a Docker image will be created. Use the image ID to start the container.
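If the image ID has scrolled out of the build output, it can be looked up by tag. The sketch below assumes the image was tagged bpradipt/nagios_ipmi as in the build command above. The function name `nagios_image_id` is hypothetical.

```shell
# Tag used in the build step.
IMAGE="bpradipt/nagios_ipmi"

# Print the ID of the image carrying that tag (prints nothing if it does not exist).
nagios_image_id() {
    sudo docker images -q "$IMAGE" | head -n 1
}
```

Because the image is tagged, the tag itself can also be passed to docker run in place of the ID.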

3. Run the nagios container

$ sudo docker run -d -t -p 80:80 d164bc3c5904

This will start the container in the background, and the Nagios web interface will be accessible on port 80 of the Docker host.
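To confirm that the container is actually running, and to obtain the container ID needed for later commands, the running containers can be filtered by image. This is a small sketch that assumes the bpradipt/nagios_ipmi tag from the build step. The function name `nagios_container_ids` is hypothetical.

```shell
# Tag used in the build step.
IMAGE="bpradipt/nagios_ipmi"

# Print the IDs of running containers created from that image
# (prints nothing if no such container is running).
nagios_container_ids() {
    sudo docker ps -q --filter "ancestor=$IMAGE"
}
```

Empty output usually means the container exited; sudo docker ps -a lists stopped containers as well.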

4. Customizing the container (optional)

If you need to make further changes to the container or debug any Nagios issues, open a shell inside the running container with the following command:

$ sudo docker exec -it <container-id> bash

This will take you to the shell, where you can modify the configuration, install or remove packages, and so on.

To detach from the shell, press Ctrl-P followed by Ctrl-Q.

The following screenshots from the Nagios web page display the sensor data:

[Screenshot: nagios_2]

The above screenshot shows all the available sensors.

[Screenshot: nagios_1]

The above screenshot shows only the power and the temperature sensors.

Pradipta Kumar Banerjee

I'm a cloud and Linux/open source enthusiast with 16 years of industry experience at IBM. You can find more details about me on LinkedIn.
