9.1 Monitoring Server Health

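Monitoring the health of your server can help prevent it from reaching a state in which users cannot access the server or the data on it. Monitoring your server’s health involves the following tasks:

  • Accessing the Health Monitor

  • Viewing the Health Monitor

  • Monitoring Overall Server Health or the Health of a Specific Item

  • Configuring the Items to Monitor

  • Configuring Email Notification for Server Health Status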

9.1.1 Accessing the Health Monitor

The Health Monitor page allows you to monitor your server's overall health, configure which items determine the server's overall health status, and configure which items you want to be notified about.

To access the Health Monitor page, click one of the following links in Novell Remote Manager:

  • Overall server health status indicator icon

  • Health Monitor icon in the header frame

  • Diagnose > Health Monitor link in the navigation frame

9.1.2 Viewing the Health Monitor

The Health Monitor page reports information about the operating system and the services that are running on it, as shown in Figure 9-1.

Figure 9-1 Novell Remote Manager Health Monitor

9.1.3 Monitoring Overall Server Health or the Health of a Specific Item

Using Novell Remote Manager, you can monitor the server’s overall health and the health of a specific item.

Overall Server Health Status

The server’s overall health is indicated by the color of the circle displayed next to the Server icon in the header frame for Novell Remote Manager. The following table lists and explains each health status that might be displayed.

Table 9-1 Server Health Status

Server Health Status   Explanation
---------------------  -----------
Good (green)           All parameters included in the server's health configuration list are good.
Suspect (yellow)       The status of one or more of the parameters included in the server's health configuration list is suspect or has a minor problem.
Bad (red)              The status of one or more of the parameters included in the server's health configuration list is bad or has a critical problem.
Lost connection        The connection to the server from Novell Remote Manager has been lost.

The server’s overall health is determined by items that are selected in the Include list on the detailed Health Monitor page as shown in Figure 9-1, Novell Remote Manager Health Monitor. By default, all items are selected. The items represent the processes that are loaded on the server.

If the status of any item that is selected in the Include list changes to yellow (suspect) or red (bad), the health status indicator in the header frame changes to indicate that there is a problem. If more than one item changes, the worst status among them determines the server’s overall status. When the status of all items returns to green (good), the health status indicator changes back to green (good).
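The worst-status rule described above can be pictured with a short Python sketch. It only illustrates the rule and is not Novell Remote Manager's actual implementation: the `Status` enum, the `overall_status` function, the tuple layout, and the item names are all hypothetical, and treating an empty Include list as good is an assumption.

```python
from enum import IntEnum


class Status(IntEnum):
    """Health states ordered from best to worst (hypothetical model)."""
    GOOD = 0     # green
    SUSPECT = 1  # yellow
    BAD = 2      # red


def overall_status(items):
    """Return the worst status among the items selected in the Include list.

    items: iterable of (name, status, included) tuples.
    Assumption: with nothing included, the overall status is GOOD.
    """
    included = [status for _name, status, is_included in items if is_included]
    return max(included, default=Status.GOOD)


# Hypothetical item names, for illustration only.
items = [
    ("item A", Status.GOOD, True),
    ("item B", Status.SUSPECT, True),
    ("item C", Status.BAD, False),  # deselected, so it is ignored
]
print(overall_status(items).name)  # SUSPECT
```

Because item C is not selected in the Include list, its bad status does not affect the result; selecting it would change the overall status to bad.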

Health Status Refresh Rate

The server’s health status, reported by the health status indicator, is updated every five seconds, but the graphic refreshes only if the status changes.

  • To modify the refresh rate, select a rate from the Page Refresh Rate drop-down menu, then click Begin Refresh. The selected refresh rate applies to this page only, and persists until you modify the value.

  • To stop refreshing the page, select Stop Refresh. The page does not refresh until you click Begin Refresh.

  • To begin refreshing after stopping, select Begin Refresh. The last used refresh rate is applied automatically when it begins.

Operating System Health

The Operating System table on the Health Monitor page shows the health status (green/good, yellow/suspect, or red/bad) for all known components of the operating system, as well as current, peak, and maximum values. When an item is not selected in the Include list, it is not included when determining the overall server health and the values for Status, Current, Peak, and Max are not displayed.

The following items in the Operating System table are key indicators of your server’s health:

IMPORTANT: You must click the Apply Settings button below the Operating System table to apply your changes to values in that table. If you leave the page without applying the changes, the settings return to their saved values.

You cannot change the thresholds for the Suspect and Critical values of these indicators. See the online help for each parameter to see the set thresholds.

Table 9-2 describes the information that is provided for each of the operating system components:

Table 9-2 Operating System Health Information

Status

    For specific details regarding the status indicator of an item, click the Information icon for that item.

Description

    A list of resources, processes, or items that can affect the health of your server. When you want to see the specific details or status for an item, click the description name for that item.

Current

    Represents the current value being reported for the item’s specific health status.

    For current memory, the value is the total amount of free memory that was available when the server was most recently polled.

Peak

    Represents the highest value reported for the item’s specific health status since the server was started.

    For peak memory, the peak value is the least amount of memory that has been available while Novell Remote Manager has been running. That is, it reports the amount of free memory that was available when the server’s memory usage peaked during the observed interval.

Max

    Represents the highest value possible for the item's specific health status.

    For maximum memory, the value is the total amount of memory in the system.

Info

    For specific details regarding the status indicator, settings, or meaning of an item, click the Information icon for that item.

Include

    When you want to include an item in determining the overall health status of your server, select the check box for that item. By default, all items are selected. When an item is not selected, it is not included when determining the overall server health, and its values for Status, Current, Peak, and Max are not displayed.

Notify

    When you want to be notified about the status of an item, select the check box for that item. You are notified when the status changes.

    Before you can receive notifications, you must also configure email addresses in the /etc/opt/novell/httpstkd.conf file. You can edit this file via the link provided on the Configuration Options page. After making these changes, restart httpstkd by executing the following command on the Linux server as the root user:

    /etc/init.d/novell-httpstkd restart
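The memory rows of Table 9-2 can be easy to misread, because Peak is the lowest amount of free memory seen rather than the highest. The following Python sketch models that reading under stated assumptions: the `FreeMemoryTracker` class, its attribute names, and the sample values are hypothetical and are not taken from Novell Remote Manager.

```python
class FreeMemoryTracker:
    """Hypothetical model of the Current, Peak, and Max memory values."""

    def __init__(self, total_bytes):
        self.maximum = total_bytes  # Max: total memory in the system
        self.current = None         # Current: free memory at the latest poll
        self.peak = None            # Peak: least free memory seen so far

    def record(self, free_bytes):
        """Record the free memory reported by one poll of the server."""
        self.current = free_bytes
        # Memory usage peaks when free memory is at its lowest.
        if self.peak is None or free_bytes < self.peak:
            self.peak = free_bytes


tracker = FreeMemoryTracker(total_bytes=8000)
for free in (5000, 2000, 6000):  # made-up poll results
    tracker.record(free)

print(tracker.current, tracker.peak, tracker.maximum)  # 6000 2000 8000
```

After the three polls, Current reflects the latest poll (6000), Peak reflects the poll with the least free memory (2000), and Max stays at the system total (8000).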

Services Health

The Services table on the Health Monitor page also shows the health status of the services installed on the server as well as their online or offline status. When a service is offline, the health status of the service is not included in the server’s overall health whether or not it is selected in the Include list.
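As a rough illustration of the rule above, the following Python sketch lists which services would count toward the overall health. The `Service` type, the `services_counted` function, and the service names are hypothetical; the sketch assumes only what the paragraph states, namely that an offline service is ignored even when it is selected in the Include list.

```python
from typing import NamedTuple


class Service(NamedTuple):
    """Hypothetical row of the Services table."""
    name: str
    online: bool
    included: bool


def services_counted(services):
    """Return names of services that count toward overall server health.

    A service counts only if it is online AND selected in the Include list.
    """
    return [s.name for s in services if s.online and s.included]


# Made-up service names, for illustration only.
services = [
    Service("service A", online=True, included=True),
    Service("service B", online=False, included=True),   # offline: ignored
    Service("service C", online=True, included=False),   # deselected: ignored
]
print(services_counted(services))  # ['service A']
```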

The mode indicates whether the service is running or stopped. To change the mode of a service, click the mode link for that service. The mode page for the service opens, where you can start, stop, or restart the service by clicking the applicable button.

You can modify the Include and Notify settings in the Services table by selecting and deselecting the check boxes in those columns, then clicking Apply Settings below the table.

IMPORTANT: You must click the Apply Settings button below the Services table to apply your changes to values in that table. If you leave the page without applying the changes, the settings return to their saved values.

9.1.4 Configuring the Items to Monitor

As stated in the previous section, the server’s overall health is determined by items that are selected in the Include list on the detailed Health Monitor page. By default, all of the items are selected.

Therefore, if a server has specific parameters that you know will cause a suspect or bad status, and you want to be notified only when other parameters change, you can remove those items from the Include and Notify lists by deselecting them and clicking Apply Settings. You must apply the settings for the Services items separately from the settings for the Operating System items.

9.1.5 Configuring Email Notification for Server Health Status

Rather than manually checking the status, you can configure Novell Remote Manager to send an email to notify you when the server’s health status changes to any value other than green (good).

  1. Select the Notify check box for the item on the Health Monitor page.

  2. Specify the required information for email notification in the /etc/opt/novell/httpstkd.conf file.

    You can edit this file via the link provided on the Configuration Options page.

  3. After changing the /etc/opt/novell/httpstkd.conf file, restart HTTPSTKD.

    Click the Restart HTTPSTKD button on the Configuration Options page or execute the following command in a console shell on the Linux server:

    rcnovell-httpstkd restart
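One way to read the notification condition in this section is sketched below in Python. This is an interpretation, not the product's documented logic: the `should_notify` function and the status constants are hypothetical, and the sketch assumes that an email is sent only when an item's Notify check box is selected and its status changes to a value other than green (good).

```python
GOOD, SUSPECT, BAD = "green", "yellow", "red"


def should_notify(previous, current, notify_checked):
    """Return True if a status change would trigger an email (assumed rule).

    Assumptions: Notify must be selected for the item, the status must
    actually change, and the new status must not be green (good).
    """
    return notify_checked and current != previous and current != GOOD


print(should_notify(GOOD, SUSPECT, True))   # True: changed to yellow
print(should_notify(SUSPECT, GOOD, True))   # False: returned to green
print(should_notify(GOOD, BAD, False))      # False: Notify not selected
```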