
Automated Disaster Recovery System

Novell Cool Solutions: Trench
By Chris Pratt


Updated: 4 Apr 2002
 

Chris Pratt has cooked up an ingenious way to capture everything necessary to gracefully recover from a disaster (if there is a graceful way to recover). His solution documents, in fantastic detail, the process he uses to back up pertinent information from scores of servers in numerous locations and keep the files at the ready, just in case.

Products:

  • NetWare 5.1
  • ArcServe 6.6
  • GroupWise 5.5

Solution:

  • Automated Disaster Recovery System

Tools Needed:

  • Cron.nlm
  • Toolbox.nlm
  • TrustBar.nlm
  • ServerMagic

Sample Code:

  • Crontab
  • Tbsys.NCF
  • Copycfg.NCF
  • Savecfg.NCF
  • Recovery.NCF

Description:
We are a large state agency with about 250 servers spread over 180 locations with 10,000+ users. We've created an automated process (as a disaster recovery tool) that is run by the Cron.nlm on a monthly basis. This process executes:

  1. A series of NCF files that use the TrustBar.nlm to back up trustees on all of our volumes.
  2. A Copycfg.NCF file that backs up all server-specific information to a single directory structure. This information includes startup.ncf, autoexec.ncf, the ETC directory, the ArcServe configuration files, and all the DOS boot (isn't there a movie by that name?) files that are specific to the type of server hardware we are running (see the Copycfg.NCF info below).
  3. A Savecfg.NCF file that runs after Copycfg.NCF has finished and copies all of the saved information to another server.

At this point we have a copy of everything that makes a server come up gracefully, saved to another server. The next part is to build a generic server (tempserver) in a generic tree (temptree) and apply all of your current service packs, patches and utilities (especially the backup software you use). We use ServerMagic, an imaging package by PowerQuest (the same people that make Drive Image Pro), to make an image of the fully patched, ready-to-go server; you should be able to use Ghost as well.

When (not if) you have a hardware failure that destroys your server (disk drive problems, RAID problems, fire, flood or theft), you repair or replace the hardware and perform the following steps:

  1. Remove the failed server from your NDS tree and clean up any replica rings that this server was part of.
  2. Copy the saved configuration information to a floppy diskette from the server that the Savecfg.NCF file copied it to.
  3. Use the server image software to place the "tempserver" image on the repaired/replaced hardware. (15 minutes for a 600 MB image).
  4. Use the Recovery.NCF file to restore the saved configuration information back to the server from your floppy diskette.
  5. Remove NDS from the 'tempserver' using NWCONFIG.NLM.
  6. Reboot the server. It will now come up with the original server's name and configuration.
  7. Use NWCONFIG to place the repaired server back into your tree.
    * At this point you have spent about 30 minutes on the rebuild.
  8. Use your backup software to restore any lost user data or applications.
  9. Use NDSMGR to replace any removed NDS replicas back onto the server.

We stage the server CD images out to our remote locations, and we have had users at remote sites perform the recovery while we gave them instructions over the phone. We usually recreate the CD image about twice a year as new service packs and patches come out.

**Crontab
# min. hour date month day-of-week action
# This will back up the trustee rights on your volumes
0 18 23 * * tbsys.ncf
0 18 24 * * tbdata1.ncf
0 19 1 * * tbdata2.ncf
0 19 2 * * tbdata3.ncf
0 19 3 * * tbdata4.ncf
0 19 4 * * tbdata5.ncf
# This will make copies of config files.
0 19 5 * * copycfg.ncf
0 19 6 * * savecfg.ncf
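
As a reading aid, the first entry above can be annotated field by field, following the header comment. This is only a sketch: it assumes Cron.nlm reads the five fields in the order the header lists them and treats `*` as "every", as Unix cron does.

```
# min  hour  date  month  day-of-week  action
#  0    18    23     *         *       tbsys.ncf
#  |    |     |      |         |       +-- NCF file to run
#  |    |     |      |         +---------- any day of the week
#  |    |     |      +-------------------- every month
#  |    |     +--------------------------- on the 23rd
#  |    +--------------------------------- at 18:00 ...
#  +-------------------------------------- ... on the hour
0 18 23 * * tbsys.ncf
```

Read this way, the trustee backups run one volume per evening on the 23rd, the 24th, and the 1st through the 4th, and the configuration copy follows on the 5th and 6th.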

NCF Files Used:
**************************************
**Tbsys.NCF --You will need an NCF for each volume on your server, or just one big one with all of the commands.***
Load trustbar.nlm sys: -b
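
The crontab also calls tbdata1.ncf through tbdata5.ncf, which are not listed in this article. A minimal sketch of what they might contain, assuming each one simply repeats the Tbsys.NCF command for the volume it is named after (the volume names DATA1 through DATA5 are taken from the trustee move commands in Copycfg.NCF):

```
# tbdata1.ncf (assumed contents, not shown in the original article)
load trustbar.nlm data1: -b

# tbdata2.ncf (assumed contents, not shown in the original article)
load trustbar.nlm data2: -b
```

The "one big one" alternative would put all of these load lines in a single NCF file. The article does not say how TrustBar behaves when several volumes are backed up back to back, so try that on a test server before relying on it.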

**Copycfg.NCF***
unload toolbox
load toolbox
delay 5000

#CLEAN UP THE OLD INFO
del sys:etc\recover\*.* /y/s

#BACKUP THE ETC DIRECTORY
copy sys:etc\*.* sys:etc\recover\etc\

#VARIOUS NCF AND CONFIGURATION FILES IN THE SYSTEM DIRECTORY
copy sys:system\ldrconag.ncf sys:etc\recover\system\
copy sys:system\autoexec.ncf sys:etc\recover\system\
copy sys:system\recovery.ncf sys:etc\recover\system\
copy sys:system\*.mta sys:etc\recover\system\
copy sys:system\*.poa sys:etc\recover\system\
copy sys:system\ast*.* sys:etc\recover\system\
copy sys:system\tbsys.ncf sys:etc\recover\system\
copy sys:system\tbdata1.ncf sys:etc\recover\system\
copy sys:system\tbdata2.ncf sys:etc\recover\system\
copy sys:system\tbdata3.ncf sys:etc\recover\system\
copy sys:system\tbdata4.ncf sys:etc\recover\system\
copy sys:system\tbdata5.ncf sys:etc\recover\system\
copy sys:system\timesync.cfg sys:etc\recover\system\
copy sys:system\copycfg.ncf sys:etc\recover\system\
copy sys:system\savecfg.ncf sys:etc\recover\system\
copy sys:system\duplcfg.ncf sys:etc\recover\system\
copy sys:system\rcsys.ncf sys:etc\recover\system\
copy sys:system\rcdata1.ncf sys:etc\recover\system\
copy sys:system\rcdata2.ncf sys:etc\recover\system\
copy sys:system\rcdata3.ncf sys:etc\recover\system\
copy sys:system\rcdata4.ncf sys:etc\recover\system\
copy sys:system\rcdata5.ncf sys:etc\recover\system\
copy c:\nwserver\startup.ncf sys:etc\recover\
copy sys:system\arcserve.ncf sys:etc\recover\

#DOS INFO FROM THE C: DRIVE
copy c:\network\net.cfg sys:etc\recover\network\
copy c:\network\net.bat sys:etc\recover\network\
copy c:\autoexec.bat sys:etc\recover\dos\
copy c:\config.sys sys:etc\recover\dos\

#THE ARCSERVE CONFIG AND LICENSE INFO
copy data1:arcserve.6\asconfig.ini sys:etc\recover\arcserve\
copy data1:arcserve.6\nlm\tapesvr.cfg sys:etc\recover\arcserve\nlm\
copy data1:arcserve.6\license\*.* sys:etc\recover\arcserve\license\

#THIS DELETES ALL OF THE LOG FILES OUT OF THE RECOVERY DIRECTORY
del sys:etc\recover\etc\*.log /y

#THIS MOVES THE TRUSTEE INFO TO THE RECOVERY DIRECTORY
move sys:\trustees.xml sys:etc\recover\trust\sys.xml
move data1:\trustees.xml sys:etc\recover\trust\data1.xml
move data2:\trustees.xml sys:etc\recover\trust\data2.xml
move data3:\trustees.xml sys:etc\recover\trust\data3.xml
move data4:\trustees.xml sys:etc\recover\trust\data4.xml
move data5:\trustees.xml sys:etc\recover\trust\d

**Savecfg.NCF***
#COPIES THE RECOVER DIRECTORY TO ANOTHER SERVER
unload toolbox
load toolbox
delay 5000
copy sys:etc\recover\*.* AnotherServer\data1:filelibr\novell\recovery\ThisServersName\ .user.your.context password -s

**Recovery.ncf***
#This program restores all the original server configurations from the recovery diskette.
unload toolbox
load toolbox
pause

#This will copy each file to its proper location
copy a:\system\ldrconag.ncf sys:\system\
copy a:\system\autoexec.ncf sys:\system\
copy a:\system\recovery.ncf sys:\system\
copy a:\system\*.mta sys:\system\
copy a:\system\*.poa sys:\system\
copy a:\startup.ncf c:\nwserver\
copy a:\network\net.bat c:\network\
copy a:\network\net.cfg c:\network\
copy a:\dos\autoexec.bat c:\
copy a:\dos\config.sys c:\
copy a:\etc\*.* sys:\etc\
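
Recovery.ncf as listed restores only some of the files that Copycfg.NCF saves. If you also want the remaining system files back, lines in the same pattern could be appended. This is a sketch and not part of the original file: it assumes the diskette mirrors the sys:etc\recover layout that Copycfg.NCF creates, and that all of these files fit on the diskette.

```
# Assumed additions to Recovery.ncf (not in the original article)
copy a:\system\timesync.cfg sys:\system\
copy a:\system\tbsys.ncf sys:\system\
copy a:\system\copycfg.ncf sys:\system\
copy a:\system\savecfg.ncf sys:\system\
# Copycfg.NCF saves arcserve.ncf to the top of the recover directory
copy a:\arcserve.ncf sys:\system\
```

Restoring copycfg.ncf and savecfg.ncf this way would mean the rebuilt server can take part in the next monthly run, provided the crontab itself is also back in place.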

If you have questions about this fantastic solution, you can contact Chris at (CPRATT@dot.state.tx.us).

