December 24, 2010

Nagged by Nagios

Filed under: Free and Open — jason @ 12:19 am

If you do any sort of network administration, you have surely heard of Nagios.  It has been around for a long time (it went by the name NetSaint years ago), and there is now a commercial version with more bells and whistles.  Nagios is our first line of defense for alerting us to problems on the network.

Where Cacti is a more complex tool for graphing traffic patterns, Nagios simply sends e-mails when something is down.  But don’t let me sell Nagios as being too simple.  You can set the time periods during which each device is monitored, and you can create different groups of people to be alerted for different items.  Nagios understands dependencies, so it won’t alert you about every switch in a building being down when it knows they all connect through that building’s main switch; you’ll only receive an alert about the main switch.  Nagios plugins allow you to monitor many different services such as POP, IMAP, SMTP and HTTP, as well as disk space, CPU load and other metrics on remote machines (the latter examples via ssh or a Windows service).  It also keeps some overall alert history and uptime statistics.
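To make the plugin idea a bit more concrete, here is a minimal sketch of a disk space check written in Python. It leans on the standard plugin convention that the exit code carries the state (0 for OK, 1 for WARNING, 2 for CRITICAL, 3 for UNKNOWN) and that the first line of output is the status text. The function names, the 80/90 percent thresholds and the message wording are my own assumptions for illustration, not anything that ships with Nagios.

```python
import shutil

# Exit codes that Nagios plugins conventionally return.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate(percent_used, warn=80.0, crit=90.0):
    """Map a disk usage percentage to a (state code, status line) pair.

    The 80/90 thresholds are illustrative defaults, not Nagios defaults.
    """
    if percent_used >= crit:
        return CRITICAL, f"DISK CRITICAL - {percent_used:.1f}% used"
    if percent_used >= warn:
        return WARNING, f"DISK WARNING - {percent_used:.1f}% used"
    return OK, f"DISK OK - {percent_used:.1f}% used"


def main(path="/"):
    """Check one filesystem, print a status line, return the state code."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        print(f"DISK UNKNOWN - {exc}")
        return UNKNOWN
    if usage.total == 0:
        print("DISK UNKNOWN - filesystem reports zero size")
        return UNKNOWN
    code, message = evaluate(100.0 * usage.used / usage.total)
    print(message)
    return code

# To run this as an actual plugin, finish the script with:
#     import sys; sys.exit(main())
```

Keeping the threshold logic in `evaluate` separate from the part that reads the disk makes it easy to try out with made-up numbers before pointing Nagios at it.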

[Screenshot: host list view in the Nagios console]

[Screenshot: event log view in the Nagios console]

The key to using Nagios effectively is crafting its configuration files.  While there are a few tools out there such as NConf to help you generate config files, you may be better off just writing them on your own.  Once you get a few under your belt, you can easily copy and paste or even write your own scripts to maintain them.  The online documentation has some decent examples to get you started, or your old pal Jason can send you some of his.
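As a rough idea of what "write your own scripts" might look like, here is a small Python sketch that turns a list of devices into Nagios host definitions. The `define host` block and the `use`, `host_name`, `alias`, `address` and `parents` directives are standard Nagios object syntax, and `parents` is what tells Nagios which upstream device a host hangs off of. Everything else is assumed for the example: the host names, the documentation-range IP addresses, and a template called `generic-host` that would need to exist in your own config to supply the remaining required settings.

```python
def render_host(name, address, parent=None, template="generic-host"):
    """Return one Nagios host object definition as text.

    `template` is assumed to be a host template already defined elsewhere
    in your configuration; the default name here is only a placeholder.
    """
    lines = [
        "define host {",
        f"    use        {template}",
        f"    host_name  {name}",
        f"    alias      {name}",
        f"    address    {address}",
    ]
    if parent:
        # Declaring the upstream device lets Nagios treat this host as
        # unreachable, rather than down, when the parent itself is down.
        lines.append(f"    parents    {parent}")
    lines.append("}")
    return "\n".join(lines)


# Hypothetical inventory: (host name, IP address, parent host or None).
HOSTS = [
    ("bldg-a-main", "192.0.2.1", None),
    ("bldg-a-sw1", "192.0.2.11", "bldg-a-main"),
    ("bldg-a-sw2", "192.0.2.12", "bldg-a-main"),
]


def render_all(hosts):
    """Join every host definition into one config file body."""
    return "\n\n".join(render_host(n, a, p) for n, a, p in hosts) + "\n"


if __name__ == "__main__":
    print(render_all(HOSTS), end="")
```

You would redirect the output into a .cfg file that your main Nagios config includes, then run the usual config check before reloading. In practice the inventory would more likely come from a spreadsheet export or a database than from a hard-coded list.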

Many IT shops use both Cacti and Nagios for “double barrel” network monitoring.  While Cacti can indeed provide some basic up/down notifications for network devices, the finer-grained configuration available in Nagios makes it the better fit for front-line alerting.  There are other monitoring tools out there, such as Zenoss, that try to combine the features of these two stalwarts, but the jury is still out for me on how they stack up.
