9.11 Preventing a Cluster Node Reboot after a Node Shutdown

If LAN connectivity is lost between a cluster node and the other nodes in the cluster, the other cluster nodes might automatically shut down the lost node. This is normal cluster operating behavior. It prevents the lost node from trying to load cluster resources when it can no longer detect the other cluster nodes. By default, cluster nodes are configured to reboot after an automatic shutdown.

On certain occasions, you might want to prevent a downed cluster node from rebooting so you can troubleshoot problems.

9.11.1 OES 2 SP2 with Patches and Later

Beginning in the OES 2 SP2 Maintenance Patch for May 2010, the Novell Cluster Services reboot behavior conforms to the kernel panic setting for the Linux operating system. By default, the kernel panic setting is set to not reboot after a node shutdown.

You can set the kernel panic behavior in the /etc/sysctl.conf file by adding a kernel.panic line. Set the value to 0 for no reboot after a node shutdown. Set the value to a positive integer to reboot the server after waiting the specified number of seconds. For information about sysctl, see the Linux man pages for sysctl and sysctl.conf.
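Before changing the setting, it can help to check which value the running kernel is using. The following is a minimal sketch, not part of the product. The describe_panic function name is hypothetical, and the sketch assumes that the running value can be read from /proc/sys/kernel/panic, the same file that the ldncs script writes to in earlier releases.

```shell
#!/bin/sh
# describe_panic VALUE
# Hypothetical helper: explain a kernel.panic value using the two cases
# covered in this section (0, or a positive number of seconds).
describe_panic() {
    case "$1" in
        0)
            echo "no reboot after a node shutdown" ;;
        ''|*[!0-9]*)
            # Anything that is not a plain non-negative integer is outside
            # the cases described in this section.
            echo "value not covered by this section: $1" ;;
        *)
            echo "reboot after $1 seconds" ;;
    esac
}

# Report the value currently in effect, if the file is readable.
if [ -r /proc/sys/kernel/panic ]; then
    describe_panic "$(cat /proc/sys/kernel/panic)"
fi
```

For example, describe_panic 60 prints "reboot after 60 seconds".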

  1. As the root user, open the /etc/sysctl.conf file in a text editor.

  2. If the kernel.panic token is not present, add it.

    kernel.panic = 0
    
  3. Set the kernel.panic value to 0 or to a positive integer value, depending on the desired behavior.

    • No Reboot: To prevent an automatic reboot after a node shutdown, set the kernel.panic token to 0. This allows the administrator to determine what caused the kernel panic condition before manually rebooting the server. This is the recommended setting.

      kernel.panic = 0
      
    • Reboot: To allow a cluster node to reboot automatically after a node shutdown, set the kernel.panic token to a positive integer value that represents the seconds to delay the reboot.

      kernel.panic = <seconds>
      

      For example, to wait 1 minute (60 seconds) before rebooting the server, specify the following:

      kernel.panic = 60
      
  4. Save your changes.
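The manual steps above can also be scripted. The following is a minimal sketch under stated assumptions, not a supported tool. The set_kernel_panic function name is hypothetical. It takes the configuration file path as an argument so that it can be tried on a copy first, and it replaces any existing kernel.panic line instead of adding a duplicate.

```shell
#!/bin/sh
# set_kernel_panic FILE SECONDS
# Hypothetical helper: make sure FILE contains exactly one
# "kernel.panic = SECONDS" line.
set_kernel_panic() {
    conf=$1
    seconds=$2

    # Accept only 0 or a positive integer, as described in the steps.
    case "$seconds" in
        ''|*[!0-9]*)
            echo "seconds must be 0 or a positive integer" >&2
            return 1 ;;
    esac

    # Create the file if it does not exist yet.
    touch "$conf" || return 1
    tmp=$(mktemp) || return 1

    # Copy every line except an existing kernel.panic assignment.
    # The pattern does not match other tokens such as kernel.panic_on_oops.
    grep -v -E '^[[:space:]]*kernel\.panic[[:space:]]*=' "$conf" > "$tmp"

    # Append the new setting, then write the result back in place so the
    # original file keeps its ownership and permissions.
    echo "kernel.panic = $seconds" >> "$tmp"
    cat "$tmp" > "$conf" && rm -f "$tmp"
}
```

As the root user, you could call set_kernel_panic /etc/sysctl.conf 0 for the recommended no-reboot setting, and then run sysctl -p to load the values from /etc/sysctl.conf into the running kernel.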

9.11.2 OES 2 SP2 Release Version and Earlier

In the OES 2 SP2 release version and earlier, you can modify the /opt/novell/ncs/bin/ldncs file for the cluster so that the server does not automatically reboot after a shutdown.

  1. Open the /opt/novell/ncs/bin/ldncs file in a text editor.

  2. Find the following line:

    echo -n $TOLERANCE > /proc/sys/kernel/panic
    
  3. Replace $TOLERANCE with a value of 0 so that the server does not automatically reboot after a shutdown.

  4. After editing the ldncs file, reboot the server for the change to take effect.
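The edit to the ldncs file can be sketched as a small script. This is an illustration only, under stated assumptions: the disable_ncs_reboot function name and the .bak backup copy are hypothetical additions, and the sketch assumes that the line appears in the file exactly as shown in Step 2.

```shell
#!/bin/sh
# disable_ncs_reboot FILE
# Hypothetical helper: in FILE, change
#     echo -n $TOLERANCE > /proc/sys/kernel/panic
# so that it writes 0 instead of $TOLERANCE.
disable_ncs_reboot() {
    script=$1

    # Keep a backup copy next to the original (an added precaution,
    # not part of the documented procedure).
    cp -p "$script" "$script.bak" || return 1
    tmp=$(mktemp) || return 1

    # Replace $TOLERANCE only where it is redirected to the panic file.
    sed 's|\$TOLERANCE\( > /proc/sys/kernel/panic\)|0\1|' "$script" > "$tmp"

    # Write the result back in place so the script keeps its permissions.
    cat "$tmp" > "$script" && rm -f "$tmp"
}
```

After a call such as disable_ncs_reboot followed by the path to the ldncs file, the line reads echo -n 0 > /proc/sys/kernel/panic. The server still must be rebooted for the change to take effect.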