Cluster Resource Service Check and Preferred Node Failback via CRON and Server Console Conditional Commands
Novell Cool Solutions: Tip
By Jonathan Laudicina
Posted: 8 Jun 2006
I've never really been happy with the cluster resource "Failback" options of Auto, Manual or Disabled. You can review these Failback options here: http://www.novell.com/documentation/nw65/orionenu/data/h2mdblj1.html
Let me elaborate: I've always wanted a scheduler. A nice button that said, "Failback resource after my users go home," would have been ideal, but I would have settled for "Failback Auto given X time of day". What I've outlined here is a method for setting up a scheduled failback of cluster resources by leveraging NetWare conditional commands and CRON.
The pieces and parts...
%IF - NetWare's basic IF/THEN; check out HELP %IF
%ENV - NetWare's Global Environment Variables; check out %ENV, or HELP %ENV
CLUSTER MIGRATE - console command to migrate cluster resources
CRON - the NetWare Scheduler; configured in SYS:ETC\crontab
Assumed cluster resource settings:
Start- set to AUTO
FailOver- set to AUTO
FailBack- set to DISABLED
We'll assume 3 cluster nodes (FS1, FS2, FS3) in our cluster serving up 3 resources (APP1, APP2, APP3). I prefer that APP1 always run on FS1 because I've made certain configuration changes to FS1 specifically to improve APP1's performance. If FS1 fails, I want my APP1 resource to automatically FailOver to another node. I don't want it failing back automatically in the middle of the day; I don't ever want to see something like a bad fan causing my cluster node to reboot, take the resource, overheat, reboot, take the resource... and so on until I catch it.
To implement Failback via a schedule I define the RESOURCE as ON or OFF in my environment variables.
%env APP1=1
%env APP1=0
These variables must be set in 3 places.
- In the autoexec.ncf of the preferred server (FS1), append the line:
%env APP1=0
- In the LOAD SCRIPT for the resource APP1, add the line:
%env APP1=1
- In the UNLOAD SCRIPT for the resource APP1, add the line:
%env APP1=0
Now, at any given time I can query FS1 for the variable of APP1. This is exactly what I need because FS1 is my preferred server for the APP1 resource. Knowing that FS1's 'ownership' of that resource is now loosely represented as a variable, I've got another way of checking the status of that resource.
Checking my Global Environment Variables, at the console of FS1, I execute:
%env
FS1 will echo back all of the Global Environment Variables, and one of these will be my APP1 variable (1 or 0) representing the APP1 resource. To review: after booting, FS1 will show APP1 as 0 (because we state it in the autoexec.ncf); after loading the APP1 resource, FS1 will show it as 1 (because we state it in the LOAD SCRIPT); and after unloading the APP1 resource, FS1 will show it as 0 (because we state it in the UNLOAD SCRIPT). Now we're ready to act on the ON/OFF, 1/0 information.
Here's where we use our conditional command, %IF.
The logic here is: if APP1 is not 1, which means it's not on FS1, then take it from wherever it is and load it on FS1.
%IF env APP1!=1; then cmd SVCTAKE.NCF
...where SVCTAKE.NCF is the NCF that contains my CLUSTER MIGRATE command, which is:
CLUSTER MIGRATE APP1 FS1
Of course, if APP1 does equal 1, FS1 does nothing because he already owns that resource.
Now tie it all together...
CRON's our scheduler of choice. I wrap it all up with 2 CRON'd events, one at 1900 and one at 0600, and I call an NCF because I think it makes CRON 'neater'.
<CRONTAB file contents>
0 19 * * * svcchk.ncf
0 6 * * * svcchk.ncf
</CRONTAB file contents>
1900 or 0600 rolls around and CRON runs SVCCHK.NCF.
<SVCCHK.NCF file contents>
#
# if the environment variable is NOT set to 1, then it will call the SVCTAKE.NCF
# if the environment variable IS set to 1, then it will do nothing
#
%if env APP1!=1; then cmd svctake.ncf
#
</SVCCHK.NCF file contents>
And of course SVCTAKE.NCF does the migration work if it's called...
<SVCTAKE.NCF file contents>
# migrate resource from current location to this server
#
cluster migrate APP1 FS1
#
</SVCTAKE.NCF file contents>
That'll do it. The scheduled times are arbitrary and work for my shop; tune CRONTAB at will for your resource migration needs. If you set up all of your servers to migrate resources back to their preferred servers, I recommend an offset in your CRONTAB between migrations. In other words, don't have FS1, FS2 and FS3 all execute a CLUSTER MIGRATE command at 1900; more appropriate would be FS1 @1900, FS2 @1905 and FS3 @1910.
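As a rough sketch of that offset, the evening entries on the three nodes might look like the following. This assumes APP2 prefers FS2 and APP3 prefers FS3, which is only an illustration (the example above fixes a preferred node for APP1 alone), and that each server keeps its own SYS:ETC\crontab. The # lines simply label which server each entry belongs to and need not be copied into the file.

```
# FS1 SYS:ETC\crontab (preferred node for APP1)
0 19 * * * svcchk.ncf

# FS2 SYS:ETC\crontab (assumed preferred node for APP2)
5 19 * * * svcchk.ncf

# FS3 SYS:ETC\crontab (assumed preferred node for APP3)
10 19 * * * svcchk.ncf
```

Each server's own SVCCHK.NCF and SVCTAKE.NCF would then name that server's resource and node. On FS2, under the same assumption, the check would read "%if env APP2!=1; then cmd svctake.ncf" and the migrate line "cluster migrate APP2 FS2", with the matching APP2 variable set in FS2's autoexec.ncf and in APP2's LOAD and UNLOAD scripts.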