Cluster Resource Problems

Resources in a Comatose State
The following screen is an example of what you will see in the ConsoleOne® Cluster State View when a cluster resource is in a comatose state.


A comatose state indicates that the resource is not running properly and requires administrator intervention.

A resource goes into the comatose state when it encounters an error during the load or unload script execution or when the scripts do not complete within the load or unload script time-out period.

Comatose resources are almost always caused by an error or typo in the load or unload script or by interference in the cluster as a result of an administrator's manual intervention.

The best way to determine the cause is to watch the server console screen while you bring the resource online or take it offline. Any error messages or warnings that appear on the screen will most likely reveal the cause of the error.

Because an error or typo in a load or unload script is a common cause of comatose resources, you should first check the scripts to ensure everything is correct, especially volume names.
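The volume-name check described above can be partly automated before a resource is brought online. The following is a minimal sketch, not a Novell tool: the function name find_unknown_volumes, the sample script text, and the list of known volumes are all assumptions made for illustration. It assumes that volumes are mounted with lines of the form "mount <VOLUME> ...", and it only compares names against a list you supply.

```python
def find_unknown_volumes(script_text, known_volumes):
    """Return volume names used in 'mount' lines that are not in known_volumes.

    Hypothetical helper: it only inspects lines of the form 'mount <VOLUME> ...'
    and compares names case-insensitively.
    """
    known = {name.upper() for name in known_volumes}
    unknown = []
    for line in script_text.splitlines():
        words = line.split()
        # Skip blank lines and anything that is not a mount command.
        if len(words) >= 2 and words[0].lower() == "mount":
            volume = words[1].upper()
            if volume not in known:
                unknown.append(volume)
    return unknown


# Illustrative load script with a misspelled volume name (VOL11 instead of VOL1).
sample_script = """\
nss /poolactivate=POOL1
mount VOL11 VOLID=254
add secondary ipaddress 10.1.1.175
"""

print(find_unknown_volumes(sample_script, ["VOL1", "SYS"]))  # prints ['VOL11']
```

A check like this only catches names that do not match the list you give it; the server console remains the authoritative source for the actual error.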

Common Load Script Errors

The following screen is an example of what you will see on the server console screen when a volume name is incorrectly spelled in a load script.

The following screen is an example of what you will see on the server console screen when an IP address specified in the load script is already in use somewhere else.

The following screen is an example of what you will see on the server console screen when an IP address specified in the load script is not associated with a local binding. This means the secondary IP address cannot be added because no primary IP address with the same mask exists on the server.
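The local-binding requirement amounts to a subnet membership test: the secondary address must fall inside the network of some primary address bound on the server. The sketch below illustrates that test with Python's standard ipaddress module. The function name has_local_binding and the addresses are hypothetical, and the list of primary bindings is assumed to be gathered by hand from the server's configuration.

```python
import ipaddress


def has_local_binding(secondary_ip, primary_bindings):
    """Return True if secondary_ip lies in the network of any primary binding.

    primary_bindings is a list of 'address/mask' strings, for example
    '10.1.1.10/255.255.255.0' (hypothetical values).
    """
    address = ipaddress.ip_address(secondary_ip)
    for binding in primary_bindings:
        # ip_interface parses 'address/mask' and exposes the containing network.
        network = ipaddress.ip_interface(binding).network
        if address in network:
            return True
    return False


bindings = ["10.1.1.10/255.255.255.0"]
print(has_local_binding("10.1.1.175", bindings))  # prints True
print(has_local_binding("10.1.2.175", bindings))  # prints False
```

In the second call the address is outside the 10.1.1.0 network, which corresponds to the situation where the secondary IP address cannot be added.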

The following screen is an example of what you will see on the server console screen when load script commands are in the wrong order. In this case, a dependent module or volume was not available and script commands could not be completed.
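An ordering problem of this kind can be thought of as a prerequisite check: each command that depends on another must appear after it in the script. The following is a rough sketch under that assumption. The step labels (activate_pool, mount_volume, add_ip) and the prerequisite map are invented for illustration and do not correspond to any Novell-defined list of dependencies.

```python
def find_order_violations(steps, prerequisites):
    """Return (step, prerequisite) pairs where the prerequisite is missing
    or appears later in the script than the step that depends on it.

    steps: step labels in the order they appear in the script.
    prerequisites: dict mapping a step to the step that must precede it.
    Both are hypothetical inputs supplied by the administrator.
    """
    position = {step: index for index, step in enumerate(steps)}
    violations = []
    for step, required in prerequisites.items():
        if step not in position:
            continue  # The dependent step is not used in this script.
        if required not in position or position[required] > position[step]:
            violations.append((step, required))
    return violations


# A volume is mounted before its pool is activated.
script_order = ["mount_volume", "activate_pool", "add_ip"]
rules = {"mount_volume": "activate_pool"}

print(find_order_violations(script_order, rules))
# prints [('mount_volume', 'activate_pool')]
```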

Comatose resources caused by administrator intervention are most commonly due to an administrator manually executing portions of the resource load or unload script. You should use ConsoleOne, NetWare® Remote Manager, or the cluster command line interface (available with Novell Cluster Services™ 1.6 and later) to start or stop cluster resources. Treat the resource as a whole. Do not manually execute parts of resource load or unload scripts.

Resources in an NDS Sync State
The following screen is an example of what you will see in the ConsoleOne Cluster State View when a cluster resource is in an NDS® Sync state.


The most common cause of a resource going into and remaining in an NDS Sync state is the use of a number as the first character in the resource name. This problem does not exist in Novell Cluster Services version 1.01 Support Pack 2 and later.

When a number is used as the first character in a resource name, one of two things will occur:

When the resource never leaves the NDS Sync state, you will see continuous error messages similar to those shown in the following server console screen.

As an interesting test, create a cluster resource with the same name as another cluster resource, but add a number to the beginning of the cluster resource name. For example, you can create a cluster resource named DNS, and then create a second cluster resource named 2DNS. Both resources will go into a comatose state when they start, and you won't be able to bring the DNS resource online or take it offline until you delete the 2DNS Resource object.

There isn't much troubleshooting to do here; just remember not to use a number as the first character in a cluster resource name.
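If resource names are planned in advance, the naming rule is easy to check before any objects are created. The following is a small sketch of such a check; the function name starts_with_digit and the example names are hypothetical.

```python
def starts_with_digit(resource_name):
    """Return True if the first character of the name is a digit."""
    # Slicing avoids an IndexError on an empty name; ''.isdigit() is False.
    return resource_name[:1].isdigit()


planned_names = ["DNS", "2DNS", "WEB_SERVER"]
problem_names = [name for name in planned_names if starts_with_digit(name)]

print(problem_names)  # prints ['2DNS']
```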
