
Cluster Resource Service Check and Preferred Node Failback via CRON and Server Console Conditional Commands

Novell Cool Solutions: Tip
By Jonathan Laudicina


Posted: 8 Jun 2006
 

Clustering rocks.

That said, I've never really been happy with the cluster resource "Failback" options of Auto, Manual or Disabled. You can review these Failback options here: http://www.novell.com/documentation/nw65/orionenu/data/h2mdblj1.html

Let me elaborate: I've always wanted a scheduler. A nice button that said, "Failback resource after my users go home," would have been ideal, but I would have settled for "Failback Auto given X time of day". What I've outlined here is a method for setting up a scheduled failback of cluster resources by leveraging NetWare conditional commands and CRON.

The pieces and parts...

commands leveraged:
%IF- NetWare's basic IF/THEN; check out HELP %IF
%ENV- NetWare's Global Environment Variables; check out %ENV, or HELP %ENV
CLUSTER MIGRATE - console command to migrate cluster resources

utilities leveraged:
CRON- the NetWare Scheduler; configured in SYS:ETC\crontab

assumed cluster Resource settings:
Start- set to AUTO
FailOver- set to AUTO
FailBack- set to DISABLED

The setup...

We'll assume 3 cluster nodes (FS1, FS2, FS3) in our cluster serving up 3 resources (APP1, APP2, APP3). I prefer that APP1 always run on FS1 because I've made certain configuration changes to FS1 especially for APP1's improved performance. If FS1 fails I want my APP1 resource to automatically FailOver to another node. I don't want it failing back automatically in the middle of the day; I don't ever want to see something like a bad fan causing my cluster node to reboot, take the resource, overheat, reboot, take the resource...and so on until I catch it.

To implement Failback via a schedule I define the RESOURCE as ON or OFF in my environment variables.

%env APP1=1 
%env APP1=0

These variables must be set in 3 places.

  1. In the autoexec.ncf of the preferred server append the line:
    %env APP1=0


  2. In the LOAD SCRIPT for the resource APP1 add the line:
    %env APP1=1


  3. In the UNLOAD SCRIPT for the resource APP1 add the line:
    %env APP1=0

Now, at any given time I can query FS1 for the variable of APP1. This is exactly what I need because FS1 is my preferred server for the APP1 resource. Knowing that FS1's 'ownership' of that resource is now loosely represented as a variable, I've got another way of checking the status of that resource.

Checking my Global Environment Variables, at the console of FS1, I execute:

%env

FS1 will echo back all of the Global Environment Variables, and one of these will be my APP1 variable (1 or 0) representing the APP1 resource. To review, after booting, FS1 will show APP1 as 0 (because we state it in the autoexec), after loading the APP1 resource FS1 will show this as 1 (because we state it in the LOAD SCRIPT), and after UNloading the APP1 resource, FS1 will show this as 0 (because we state it in the UNLOAD SCRIPT). Now we're ready to act on the ON/OFF, 0/1 information.

Here's where we use our conditional command, %IF.

The logic here is, if APP1 is not 1 - which means it's not on FS1 - then take it from wherever it is and load it on FS1.

%IF APP1!=1; then cmd SVCTAKE.NCF

...where SVCTAKE.NCF is the NCF that contains my CLUSTER MIGRATE command, which is:

CLUSTER MIGRATE APP1 FS1

Of course, if APP1 does equal 1, FS1 does nothing because he already owns that resource.

Now tie it all together...

CRON's our scheduler of choice. I wrap it all up with 2 CRON'd events, one at 1900 and one at 0600, and I call an NCF because I think it makes CRON 'neater'.

<CRONTAB file contents>
0 19 * * * svcchk.ncf
0 6 * * * svcchk.ncf
</CRONTAB file contents>

1900 or 0600 rolls around and CRON runs SVCCHK.NCF.

<SVCCHK.NCF file contents>
# if the environment variable is NOT set to 1, then it will call SVCTAKE.NCF
# if the environment variable IS set to 1, then it will do nothing
#
%if env APP1!=1; then cmd svctake.ncf 
#
</SVCCHK.NCF file contents>

and of course SVCTAKE.NCF does the migration work if it's called...

<SVCTAKE.NCF file contents>
# migrate resource from current location to this server
#
cluster migrate APP1 FS1
#
</SVCTAKE.NCF file contents>

That'll do it. The scheduled times are arbitrary and work for my shop; tune CRONTAB at will for your resource migration needs. If you set up all of your servers to migrate a resource back to their preferred server, I recommend an offset in your CRONTAB between migrations. In other words, don't have FS1, FS2 and FS3 all execute a CLUSTER MIGRATE command at 1900; more appropriate would be FS1 @1900, FS2 @1905 and FS3 @1910.
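
As a rough sketch of that offset for a second node, here is what FS2 might carry if APP2 were its preferred resource. The APP2-to-FS2 pairing and the 0605 morning slot are my assumptions for illustration; the file names simply reuse the ones above, on the assumption that each server keeps its own copies. APP2's LOAD and UNLOAD scripts and FS2's autoexec.ncf would also need the matching %env APP2=1 and %env APP2=0 lines.

<CRONTAB file contents on FS2>
5 19 * * * svcchk.ncf
5 6 * * * svcchk.ncf
</CRONTAB file contents on FS2>

<SVCCHK.NCF file contents on FS2>
# assumed example: APP2 prefers FS2
# if APP2 is NOT set to 1, call SVCTAKE.NCF; otherwise do nothing
#
%if env APP2!=1; then cmd svctake.ncf
#
</SVCCHK.NCF file contents on FS2>

<SVCTAKE.NCF file contents on FS2>
# assumed example: migrate APP2 from its current location to this server
#
cluster migrate APP2 FS2
#
</SVCTAKE.NCF file contents on FS2>

FS3 would follow the same pattern at 10 minutes past the hour.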

Good Luck.
-JL

