Storage management using Nagios
Panagiotis Garefalakis
ICS-FORTH
Heraklion, Greece
pgaref@ics.forth.gr
• Storage management challenges

• What is Nagios

• Tutorial topics:
    - How to start a Nagios server
    - Writing storage service monitoring code
    - Monitoring local and remote storage
    - Event handling




• A key tool for actively monitoring the availability of devices and services.
• One of the most widely used open source network monitoring tools.
• Can monitor and manage thousands of devices and services.


define host{
     name                       generic-host
     notifications_enabled         1
     event_handler_enabled         1
     flap_detection_enabled        1
     process_perf_data             1
     retain_status_information     1
     retain_nonstatus_information 1
     check_command                 check-host-alive
     max_check_attempts            5
     notification_interval         60
     notification_period           24x7
     notification_options          d,r
     contact_groups                nobody
     register                      0
     }




define host{
     use              generic-host
     host_name        switch1
     alias            Core_switches
     address          192.168.1.2
     parents          router1
     contact_groups   switch_group
}




define service{
     name                             generic-service
     active_checks_enabled            1
     passive_checks_enabled           1
     parallelize_check                1
     obsess_over_service              1
     check_freshness                  0
     notifications_enabled            1
     event_handler_enabled            1
     flap_detection_enabled           1
     process_perf_data                1
     retain_status_information        1
     retain_nonstatus_information     1
     is_volatile                      0
     check_period                     24x7
     max_check_attempts               5
     normal_check_interval            5
     retry_check_interval             1
     notification_interval            60
     notification_period              24x7
     notification_options             c,r
     register                         0
     }



define service{
     host_name               switch1
     use                     generic-service
     service_description     PING
     check_command           check-host-alive
     max_check_attempts      5
     normal_check_interval   5
     notification_options    c,r,f
     contact_groups          switch_group
}




• Commands wrap the check scripts.
    define command{
            command_name    check-host-alive
            command_line    $USER1$/check_ping -H $HOSTADDRESS$ -w 99,99% -c 100,100% -p 1
            }
• Check scripts can be written in any language.
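
Since a check script can be written in any language, here is a minimal shell sketch of what one might look like. It relies on the standard plugin convention that the exit code carries the state (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN) and that the first line of output is the status text. The function name check_value and its three arguments are assumptions made for illustration; they are not part of Nagios.

```shell
#!/bin/sh
# Sketch of a Nagios-style check: compare a number against two thresholds.
# check_value is a hypothetical helper, not something shipped with Nagios.
# Plugin convention: 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN.
check_value() {
    if [ "$#" -ne 3 ]; then
        echo "UNKNOWN - usage: check_value VALUE WARN CRIT"
        return 3
    fi
    # Every argument must be a non-negative integer.
    for n in "$1" "$2" "$3"; do
        case "$n" in
            ''|*[!0-9]*)
                echo "UNKNOWN - '$n' is not a non-negative integer"
                return 3
                ;;
        esac
    done
    if [ "$1" -ge "$3" ]; then
        echo "CRITICAL - value $1 >= $3"
        return 2
    elif [ "$1" -ge "$2" ]; then
        echo "WARNING - value $1 >= $2"
        return 1
    fi
    echo "OK - value $1"
    return 0
}
```

A real plugin file would end with something like: check_value "$@"; exit $? so that Nagios sees the return value as the process exit code.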




• Manual install.

• Read the installation instructions (USB).
    - Installation commands: Ubuntu
    - Installation script: CentOS




• Locate the Nagios configuration files:
    /usr/local/nagios/etc/objects
• Open localhost.cfg (requires sudo access).
• Add the lines:
define service{
     use                              local-service   ; Name of service template to use
     host_name                       localhost
     service_description             DISK_TEST
     check_command                   check_local_disk!70%!20%!/dev/sda3
     }

• Validate: sudo /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

• Restart Nagios: sudo /etc/init.d/nagios restart
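
The check_command value packs the command name and its arguments into one string separated by "!"; Nagios hands the pieces to the command definition as $ARG1$, $ARG2$ and so on. The sketch below mimics that split in plain shell so the mapping is visible. split_check_command is a hypothetical helper written for this illustration; it is not a Nagios tool.

```shell
#!/bin/sh
# Illustration only: show how a check_command string maps to $ARGn$ macros.
# split_check_command is a hypothetical helper, not part of Nagios.
split_check_command() {
    old_ifs=$IFS
    IFS='!'
    set -f              # no filename globbing while the string is split
    set -- $1           # split on "!" into positional parameters
    set +f
    IFS=$old_ifs
    echo "command_name=$1"
    shift
    i=1
    for arg in "$@"; do
        echo "ARG$i=$arg"
        i=$((i + 1))
    done
}

split_check_command 'check_local_disk!70%!20%!/dev/sda3'
```

In the sample commands.cfg that ships with Nagios Core, check_local_disk is, as far as I know, defined as $USER1$/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$, so the service above would warn below 70% free space and go critical below 20% free space on /dev/sda3. Check your own commands.cfg to confirm.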


• Copy the given plugin "check_dir.sh" to the Nagios plugin directory: /usr/local/nagios/libexec
• Modify the commands.cfg file:
define command{
     command_name          check_dir
     command_line          $USER1$/check_dir.sh $ARG1$
     }

• Modify the localhost.cfg file:
define service{
     use                               local-service
     host_name                         localhost
     service_description               CHECK_LOG
     check_command                     check_dir!9999
     }
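
The contents of check_dir.sh are not shown on the slides, so the following is only a guess at the kind of logic such a plugin might contain: measure a directory with du and go CRITICAL when it exceeds a limit in KB. The function name check_dir_size, the two-argument form, and the choice of KB as the unit are all assumptions.

```shell
#!/bin/sh
# Hypothetical sketch, NOT the check_dir.sh handed out with the tutorial.
# Reports CRITICAL when a directory is larger than a limit given in KB.
check_dir_size() {
    dir=$1 limit_kb=$2
    case "$limit_kb" in
        ''|*[!0-9]*)
            echo "UNKNOWN - usage: check_dir_size DIR LIMIT_KB"
            return 3
            ;;
    esac
    if [ ! -d "$dir" ]; then
        echo "UNKNOWN - $dir is not a directory"
        return 3
    fi
    # du -sk prints "<size in KB><TAB><path>"; keep only the size.
    size_kb=$(du -sk "$dir" | cut -f1)
    if [ "$size_kb" -gt "$limit_kb" ]; then
        echo "CRITICAL - $dir uses ${size_kb}KB (limit ${limit_kb}KB)"
        return 2
    fi
    echo "OK - $dir uses ${size_kb}KB (limit ${limit_kb}KB)"
    return 0
}
```

Running the function by hand against a test directory before wiring it into commands.cfg makes it easier to see which threshold triggers which state.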



• The remote host IP is: 139.91.70.76
• Your IP has to be added to nrpe.cfg on the remote host before running the checks!
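
NRPE only answers hosts listed on the allowed_hosts line of nrpe.cfg, a comma-separated list of addresses. The sketch below shows one possible way to append an address to that line from the shell. allow_host is a hypothetical helper, and the sketch assumes the file already contains a single allowed_hosts= line; if there is none, the file is left unchanged.

```shell
#!/bin/sh
# allow_host is a hypothetical helper, not part of NRPE.
# It appends an address to the allowed_hosts line of an nrpe.cfg-style file.
allow_host() {
    cfg=$1 ip=$2
    # Escape the dots so the address is matched literally.
    ip_re=$(printf '%s' "$ip" | sed 's/\./\\./g')
    # Leave the file alone if the address is already listed.
    if grep -Eq "^allowed_hosts=(.*,)?${ip_re}(,.*)?\$" "$cfg"; then
        return 0
    fi
    # Append to the existing list; write to a temporary file first.
    sed "s/^allowed_hosts=.*/&,$ip/" "$cfg" > "$cfg.tmp" && mv "$cfg.tmp" "$cfg"
}
```

The NRPE service on the remote host typically has to be restarted before a change to nrpe.cfg takes effect.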




• Follow the instructions in "Enable NRPE Server-Ubuntu.txt" to install the NRPE server.
• You can skip the command and service definitions.
• Check your connection by running the following command with the IP address of the remote box you want to monitor. It should return "NRPE v2.8.1" if everything is working.
    - Command: /usr/lib/nagios/plugins/check_nrpe -H 139.91.70.76



• An NFS server is running on the remote host. A plugin for monitoring NFS, "check_nfsmount.pl", is included.
• We will modify the NRPE configuration on the server side so that check_nfs can be run remotely.
• Finally, test the command:
    - /usr/lib/nagios/plugins/check_nrpe -H 139.91.70.76 -c check_nfs
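
On the remote side, NRPE maps the name given after -c to a line of the form command[check_nfs]=<plugin path and arguments> in nrpe.cfg. The arguments of check_nfsmount.pl are not shown here, so the sketch below does not try to reproduce that plugin; it is a shell illustration of the kind of test an NFS mount check might perform, reading a mounts table in /proc/mounts format. check_nfs_mount and its optional second argument are assumptions.

```shell
#!/bin/sh
# Hypothetical sketch, NOT the check_nfsmount.pl plugin from the tutorial.
# Reports whether a mount point appears as an NFS mount in a mounts table.
check_nfs_mount() {
    mount_point=$1
    mounts_file=${2:-/proc/mounts}   # second argument exists mainly for testing
    if [ ! -r "$mounts_file" ]; then
        echo "UNKNOWN - cannot read $mounts_file"
        return 3
    fi
    # Fields: device, mount point, filesystem type, options, ...
    if awk -v mp="$mount_point" \
        '$2 == mp && $3 ~ /^nfs/ { found = 1 } END { exit !found }' \
        "$mounts_file"; then
        echo "OK - $mount_point is an NFS mount"
        return 0
    fi
    echo "CRITICAL - $mount_point is not mounted over NFS"
    return 2
}
```

The filesystem type test accepts both nfs and nfs4 entries, since both begin with "nfs".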


• Nagios can attempt to rectify a fault by running a script.
• Event handlers let us take action when something goes wrong.
    - Growing file example:
        - Print an error message
        - Compress the file
        - Truncate the file




• We want to react when files grow above a threshold.
• Copy myhandler.sh to libexec/eventhandlers
    - Make sure the nagios user has permission to run it!
• Add the following line to our service definition:
    event_handler my_eventhandler!$SERVICESTATE$ $STATETYPE$ $SERVICEATTEMPT$

• Finally, add the command:
define command{
         command_name my_eventhandler
         command_line $USER1$/eventhandlers/myhandler.sh $ARG1$
}
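
The handler script itself is not reproduced on the slides. As a rough sketch of the growing-file example (print an error, compress, truncate), something along these lines could work. It assumes the handler receives the service state, the state type and the attempt number as three separate words, which is what the unquoted $ARG1$ above expands to on the command line. The fourth argument (the file to act on) and the function name handle_growing_file are additions made for illustration.

```shell
#!/bin/sh
# Hypothetical sketch, NOT the myhandler.sh handed out with the tutorial.
# Arguments: service state, state type, attempt number, file to act on.
handle_growing_file() {
    state=$1 state_type=$2 attempt=$3 file=$4
    case "$state" in
        CRITICAL)
            # Act only once the problem is confirmed (HARD state).
            if [ "$state_type" = "HARD" ]; then
                echo "ERROR: $file exceeded its size limit (attempt $attempt)"
                # Keep a compressed, timestamped copy of the contents.
                gzip -c "$file" > "$file.$(date +%Y%m%d%H%M%S).gz"
                # Truncate the original in place.
                : > "$file"
            fi
            ;;
        *)
            # OK, WARNING, UNKNOWN: nothing to do in this sketch.
            ;;
    esac
    return 0
}
```

Truncating with ": > file" rather than deleting keeps the same file in place, which matters when another process still has it open for writing.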




• Nagios is a very useful tool that saves administrators time, but it can appear very complex when you first look at it.
• My advice is:
    - Install it on your test node (though this may well end up as your master server)
    - Run a few check scripts by hand to get a feel for them
    - Set up a simple config file that runs a few checks on the local host
    - Install nrpe on the host, and nrpe and nagios-plugins on a remote host
    - Run check_nrpe by hand to get it working, then add a couple of simple checks on the remote host
    - Now add hosts and services until you run out, then write some more




• http://www.nagios.org - Nagios web site
• http://sourceforge.net/projects/nagiosplug - Nagios plugins site
• http://www.nagiosexchange.org - Unofficial Nagios plugin site
• http://www.debianhelp.co.uk/nagios.htm - A Debian tutorial on Nagios
• http://www.nagios.com/ - Commercial Nagios support




