Using mdadm to send e-mail alerts for RAID failures

This document (7001034) is provided subject to the disclaimer at the end of this document.

Environment

Novell SUSE Linux Enterprise Desktop 10
Novell SUSE Linux Enterprise Server 10
Novell SUSE Linux Enterprise Server 10 Service Pack 1
Novell SUSE Linux Enterprise Desktop 10 Service Pack 1

Situation

Mdadm is a command line utility that can be used to create, manage, and monitor Linux software RAID devices.

This TID explains how to use mdadm to monitor and report problems with a software RAID configuration in SUSE Linux Enterprise (SLE). It does not explain how to set up software RAID; the steps below assume that the system already has an active software RAID configuration.

Steps for setting up e-mail alerting of errors with mdadm:

E-mail error alerting with mdadm can be accomplished in two ways:

  1. Using a command line directly
  2. Using the /etc/mdadm.conf file to specify an e-mail address

NOTE: E-mails are only sent when one of the following events occurs: Fail, FailSpare, DegradedArray, or TestMessage.

Specifying an e-mail address using the mdadm command line

Using the command line simply involves including the e-mail address in the command. The following explains the mdadm command and how to set it up so that it will load every time the system is started.

mdadm --monitor --scan --daemonize --mail=jdoe@somemail.com

The command can be added to /etc/init.d/boot.local so that it runs every time the system is started.

To verify that mdadm is running, type the following in a terminal window:

ps aux | grep mdadm
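
The plain ps and grep pipeline above usually also lists the grep command itself, which can be mistaken for a running monitor. The following is a minimal sketch of one common workaround; the helper name only_mdadm is hypothetical and not part of mdadm.

```shell
#!/bin/bash
# Print only real mdadm processes from a process listing read on stdin.
# The pattern '[m]dadm' matches the text "mdadm" but not the literal
# string "[m]dadm", so the grep command's own entry is filtered out.
only_mdadm() {
    grep '[m]dadm'
}

# Typical use:
#   ps aux | only_mdadm
```

If the monitor is running, its entry should show the --monitor option on the command line.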

Specifying an e-mail address using the mdadm.conf file

Using mdadm with the /etc/mdadm.conf file is very similar to the command line, except that the e-mail address is included in the mdadm.conf file. The following is an example of an mdadm.conf file:

#~~~~~~~~~~~~ Sample mdadm.conf file ~~~~~~~~~~~~~~~~~~~~~~~~

DEVICE partitions

ARRAY /dev/md0 level=raid1 UUID=1e60d34a:2900a5a6:016ce23d:edbe1177

ARRAY /dev/md1 level=raid1 UUID=b9db4840:b9f19361:ed0112d1:74f6071a

ARRAY /dev/md2 level=raid1 UUID=f6135aa0:dc21f04e:24d4c1e1:4fe7b596

MAILADDR jdoe@somemail.com

#~~~~~~~~~~~~ end of file ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The lines beginning with # were added for this documentation.


Because the e-mail address is read from the /etc/mdadm.conf file, the command line is reduced to the following:

mdadm --monitor --scan --daemonize

This command can be added to /etc/init.d/boot.local so that mdadm runs every time the system is started.

NOTE: It has been found that mdadm will not send an e-mail if the DEVICE partitions line does not exist in the /etc/mdadm.conf file. If the file is missing or incomplete, a new /etc/mdadm.conf file can be created with the following command, which overwrites any existing file:

mdadm --detail --scan > /etc/mdadm.conf

The MAILADDR line could then be added as well.
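
Before relying on the alerts, it can help to confirm that the configuration file contains the entries discussed above. The following is a minimal sketch, assuming that each keyword starts at the beginning of a line and is written in upper case as in the sample file; check_mdadm_conf is a hypothetical helper name, not an mdadm feature.

```shell
#!/bin/bash
# Report whether the entries used for e-mail alerting are present in an
# mdadm configuration file. Returns 0 if all are found, 1 otherwise.
# Assumption: keywords are upper case and start in the first column.
check_mdadm_conf() {
    local conf="${1:-/etc/mdadm.conf}"
    local keyword
    local missing=0
    for keyword in DEVICE ARRAY MAILADDR; do
        # A keyword counts as present when some line starts with it
        if grep -q "^${keyword}[[:space:]]" "$conf" 2>/dev/null; then
            echo "$keyword: found"
        else
            echo "$keyword: missing"
            missing=1
        fi
    done
    return "$missing"
}

# Typical use:
#   check_mdadm_conf /etc/mdadm.conf
```

A "missing" result for DEVICE or MAILADDR points to the line that has to be added by hand.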


Running an external program when an event occurs

Another option provided with the /etc/mdadm.conf file is to run an external application when an error is detected.

An example application could be something as simple as a script that causes a message to pop up on the screen when an event occurs. The following script is one example:

NOTE: The following script is for example purposes only and is NOT supported by Novell.


#!/bin/bash
#
# mdadm RAID health check
#
# mdadm passes the event name as $1 and the md device as $2;
# the script reports them through xmessage.
#
# Store the arguments in readable variable names
event=$1
device=$2
# Check the event and pop up a window with the appropriate message
case "$event" in
    Fail)
        xmessage "A failure has been detected on device $device"
        ;;
    FailSpare)
        xmessage "A failure has been detected on spare device $device"
        ;;
    DegradedArray)
        xmessage "A Degraded Array has been detected on device $device"
        ;;
    TestMessage)
        xmessage "A Test Message has been generated on device $device"
        ;;
esac
#~~~~~~~~~~~~~~~~~~~~~~~~~ End of Script ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To add an external program simply add the following line to the /etc/mdadm.conf file:

PROGRAM /etc/raid-events

Here, /etc/raid-events is the file that contains the script listed above. Ensure that the file is marked as executable.
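
On a system without a graphical session, a pop-up window may never be seen. The following is a minimal sketch of an alternative handler that writes to the system log instead. It assumes that mdadm calls the program with the event name, the md device and, for some events, a component device; the function name raid_event_message and the log tag raid-events are hypothetical. Like the script above, it is for example purposes only.

```shell
#!/bin/bash
# Build a one-line description of an mdadm event.
# $1 = event name, $2 = md device, $3 = component device (optional)
raid_event_message() {
    local event="$1" device="$2" component="${3:-}"
    local detail="${component:+ (component $component)}"
    case "$event" in
        Fail)          echo "Failure detected on $device$detail" ;;
        FailSpare)     echo "Spare failure detected on $device$detail" ;;
        DegradedArray) echo "Degraded array detected on $device" ;;
        TestMessage)   echo "Test message generated for $device" ;;
        *)             echo "Event $event reported on $device$detail" ;;
    esac
}

# When called with arguments, as mdadm would do, send the message to the
# system log. logger is the standard command-line syslog client.
if [ "$#" -ge 2 ]; then
    logger -t raid-events "$(raid_event_message "$@")"
fi
```

The handler can be tried by hand, for example by running it with the arguments TestMessage /dev/md0 and then looking for the raid-events tag in the system log.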


Testing the configuration to ensure that e-mails are sent

After everything has been set up, you can verify that the e-mail alerts are sent and received by running mdadm in test mode, as follows:

  1. Open a terminal window and type su to log in as root
  2. Type mdadm --monitor --scan --test
     Add the --mail parameter if /etc/mdadm.conf does not contain a MAILADDR line

An e-mail should be received for each array device listed in the /etc/mdadm.conf file.

If e-mails are not received, the /var/log/mail* files can be used to help debug why the failure occurred. The most common cause is that the e-mail address is being blocked by the receiving gateway.

Also ensure that postfix is installed on the system, because mdadm uses postfix to send out the e-mails.
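
One way to check the mail setup from a terminal is to run rpm -q postfix and to look for a sendmail-compatible command, which postfix normally provides. The following is a minimal sketch of the second check; find_mailer is a hypothetical helper, and the paths in the usage line are assumptions that may differ between systems.

```shell
#!/bin/bash
# Print the first executable file among the given candidate paths.
# Returns 1 if none of the candidates is executable.
find_mailer() {
    local candidate
    for candidate in "$@"; do
        if [ -x "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

# Typical use (candidate paths are assumptions):
#   find_mailer /usr/sbin/sendmail /usr/lib/sendmail || echo "no mailer found"
```

If no mailer is found, installing and starting postfix is the first thing to try before investigating the mail logs.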

Disclaimer

This Support Knowledgebase provides a valuable tool for NetIQ/Novell/SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.

  • Document ID: 7001034
  • Creation Date: 25-JUL-08
  • Modified Date: 27-APR-12
  • SUSE
    • SUSE Linux Enterprise Desktop
    • SUSE Linux Enterprise Server
