Network Monitoring with Nagios and MRTG
Novell Cool Solutions: Trench
By Gert-Jan de Boer
Posted: 30 Apr 2004
This article describes how we implemented Nagios and MRTG to monitor our NetWare and Windows servers, our network links, and more.
Nagios is a great monitoring tool and it's completely free too!
Nagios runs on Linux and uses a plugin system, which makes it highly customizable.
At our facility I have spent most of the last couple of months replacing our commercial monitoring system with Nagios. We use Nagios to monitor our WAN and LAN infrastructure. It started out monitoring just the performance and availability of our routers and switches, but it quickly began to grow. As I discovered new features and plugins for Nagios, my colleagues started asking whether I could monitor other things too. This resulted in a great deal of work, but working and learning with Nagios is fun!
Server and operating system
Because I am a great fan of Linux and don't want to use Windows anymore, it was clear from the start that our monitoring system would be built on Linux. I chose to run a Debian Unstable installation on an HP Netserver E800 with 1024 MB of RAM.
Our monitoring setup consists of two components:
1) Nagios for availability and server monitoring.
2) MRTG for performance monitoring.
I wrote a web-based shell around these two systems so that they look like one integrated system.
What do we monitor?
We monitor almost all the services in our WAN and LAN infrastructure. At the moment that means almost 200 services, and the number is still growing. Our company has four locations, each with its own administrators, and our location is responsible for the WAN infrastructure. The other locations are quickly starting to ask us to monitor their services too.
The services we monitor are:
Monitoring Server Room
We monitor our server room using a temperature sensor logger with a serial interface. This kit is easy to build. We then wrote an application that reads the temperature from the sensor. This program runs under both Linux and NetWare, so we don't have to set up Linux boxes at our remote offices. MRTG reads the temperature values and graphs them nicely.
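As a rough illustration of how a temperature reading can be handed to MRTG, here is a minimal Python sketch. It is not our actual program. It relies on the fact that an external script used as an MRTG target prints four lines: the first value, the second value, an uptime string, and a target name. The function name, the sample reading, and the target name "server-room" are assumptions made up for this example, and reading the serial sensor itself is left out.

```python
def mrtg_lines(temperature_c, uptime="unknown", target="server-room"):
    """Format one temperature reading as the four lines MRTG expects
    from an external script: value 1, value 2, uptime, target name."""
    # MRTG traditionally works with integer values, so round the reading.
    value = str(int(round(temperature_c)))
    # MRTG graphs two variables per target; with only one sensor,
    # the same value is reported for both.
    return [value, value, uptime, target]


if __name__ == "__main__":
    # 21.6 is a made-up reading; a real script would get it from the sensor.
    print("\n".join(mrtg_lines(21.6)))
```

In an MRTG configuration, a script like this would be referenced from a Target line that runs an external command in backticks.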
Monitoring WAN links
To monitor our WAN links we use both Nagios and MRTG. Nagios pings the remote locations every 10 minutes. If the ping response time exceeds 100 ms we receive a warning; if it exceeds 500 ms we get a critical error message. MRTG reads performance information from our remote routers via SNMP and graphs it, so we can see the exact bandwidth utilization of our links. Before we implemented this solution, our remote locations often asked us why their lines were slow or whether they needed a bandwidth upgrade. Now we can show them the graphs, and they can see for themselves that the problem isn't in the WAN link but in their own network.
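The warning and critical thresholds above map directly onto the standard Nagios plugin states (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN). The Python sketch below shows that mapping only; it is not the ping plugin we run. The function name and the choice to treat a missing reply as critical are assumptions for the example.

```python
# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def classify_rtt(rtt_ms, warn_ms=100.0, crit_ms=500.0):
    """Map a ping response time in milliseconds to a Nagios state.

    rtt_ms is None when no reply was received (assumed here to be critical).
    Returns a (state, message) tuple.
    """
    if rtt_ms is None:
        return CRITICAL, "PING CRITICAL - no reply"
    if rtt_ms > crit_ms:
        return CRITICAL, "PING CRITICAL - rtt %.1f ms" % rtt_ms
    if rtt_ms > warn_ms:
        return WARNING, "PING WARNING - rtt %.1f ms" % rtt_ms
    return OK, "PING OK - rtt %.1f ms" % rtt_ms


# Example with a made-up measurement of 250 ms:
state, message = classify_rtt(250.0)
print(message)
# A real plugin would finish with sys.exit(state) so Nagios sees the result.
```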
Nagios and NetWare monitoring
We monitor NetWare with Nagios through a plugin that pulls its information from a NetWare server with MRTGEXT.NLM loaded. MRTGEXT.NLM is a program that you load on the server and that collects statistics from it. You can then read those statistics with the nwstat plugin for Nagios or with MRTG. We use both! With Nagios we receive warnings when a server has abended or is out of time sync, and we can be notified if certain NLMs aren't loaded (great for monitoring Arcserve). We also receive warnings if the SYS: volume has less than 10% free space or if the average load over 15 minutes rises above 90%.
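To make the SYS: volume and load thresholds for NetWare servers concrete, here is a small Python sketch of the two comparisons. It only illustrates the threshold logic; it does not talk to MRTGEXT.NLM and is not the nwstat plugin. The function names and sample values are assumptions for the example.

```python
# Standard Nagios plugin return codes used in this sketch.
OK, WARNING = 0, 1


def check_volume_free(free_pct, warn_below=10.0):
    """Warn when the volume has less than warn_below percent free space."""
    if free_pct < warn_below:
        return WARNING, "SYS: WARNING - %.0f%% free" % free_pct
    return OK, "SYS: OK - %.0f%% free" % free_pct


def check_load(load_pct, warn_above=90.0):
    """Warn when the 15-minute average load is higher than warn_above percent."""
    if load_pct > warn_above:
        return WARNING, "LOAD WARNING - %.0f%% (15 min avg)" % load_pct
    return OK, "LOAD OK - %.0f%% (15 min avg)" % load_pct


# Made-up sample values: 7% free on SYS:, 35% average load.
for state, message in (check_volume_free(7.0), check_load(35.0)):
    print(state, message)
```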
With MRTG we graph and monitor the performance of our NetWare servers. All the statistics we can gather with Nagios are also graphed by MRTG, using the same MRTGEXT.NLM.
Monitoring Windows NT/2000 servers
We use Nagios to monitor our Windows servers. We ping them to check that they are still up, and we check that certain applications are still running. We had one application that would kill itself once in a while. Now we get an email message when that application is no longer running.
Nagios and MRTG give us essential insight into our network performance and availability. They have enabled us to respond quickly to errors in our network and, most importantly, to solve problems before anyone notices them! If you are thinking of implementing a monitoring system, you should take a look at these two great free programs. They are another great example of how the Open Source community can help you with network management.
If you have questions about Nagios or about our implementation please contact me. I am always available at GjdeBoer@rocfriesepoort.nl.
Gert-Jan de Boer
Network Administrator ROC Friese Poort
www.nagios.org - Nagios
http://people.ee.ethz.ch/~oetiker/webtools/mrtg/ - MRTG
http://forge.novell.com/modules/xfmod/project/?mrtgext - MRTGEXT.NLM