Using RSYNC to back up to a Hub Backup Server

Novell Cool Solutions: Trench
By Sam Morris

Posted: 22 Aug 2003
 

I was looking for detailed instructions for using RSYNC without BOMA to back up multiple servers to a hub backup server. I could find them nowhere. Hopefully this will help others. I have used this to replace a copy job that we used to have running once a day using a third-party vendor. The nlm would frequently abend, or would never finish syncing the data.

RSYNC copies data between two servers. For this documentation we'll refer to the servers as "fromserv" and "toserv". "Fromserv" is the server that holds the data being backed up. "Toserv" will typically be a hub backup server.

NOTE: rsync is case-sensitive!

FROMSERV SETUP

RSYNC Files

Make a directory off the root of SYS called rsync. Into the rsync directory, copy the rsync files:

  • rsync.nlm
  • rsyncd.motd
  • rsyncdn.nlm
  • rsyncst.nlm
  • syncgrp.ncf *
  • synci-k.ncf *
  • syncpriv.ncf *

(*These are specific to my environment. We have directories called "privdir" (user's home directory), and "groupdir" (shared among members of a department), and I: and K: drives, which hold applications. You can make ncf files called something relative to your environment, using these as a guide.)

The contents of the ncf files are as follows (the commands all go on a single line, except for synci-k, which syncs two different root directories):

SYNCGRP.NCF:
sys:rsync\rsync -vaz /Groupdir 167.132.19.99::KCMO0445 --volume=KCMO0445\VOL1: --no-blocking-io

SYNCPRIV.NCF:
sys:rsync\rsync -vaz /Privdir --exclude-from=/exclude.txt 167.132.19.99::KCMO0445 --volume=KCMO0445\VOL1: --no-blocking-io

SYNCI-K.NCF:
sys:rsync\rsync -vaz /Develop 167.132.19.99::KCMO-I-K --volume=KCMO0445\VOL1: --no-blocking-io
sys:rsync\rsync -vaz /Software 167.132.19.99::KCMO-I-K --volume=KCMO0445\VOL1: --no-blocking-io

As you can see the format of these files is basically the same:

<command> <options> <source files> <exclusion list> <target ip address (of toserv)> <source volume> <more options>

As further explanation, here is the format applied to the syncpriv.ncf above. In this example we are backing up vol1:\privdir\*.* on KCMO0445 to KCMO0238\vol1\kcmo\privdir; KCMO0238 has an IP address of 167.132.19.99. Each piece of the command line is followed by its meaning in parentheses:

  • sys:rsync\rsync (kicks off the rsync process)
  • -vaz (verbose, archive mode, compress)
  • /Privdir (back up everything in and under privdir)
  • --exclude-from=/exclude.txt (don't back up any filespec listed in the file sys:exclude.txt)
  • 167.132.19.99::KCMO0445 (send the data to the toserv at 167.132.19.99 and put it in the section defined as KCMO0445 in sys:etc/rsyncd.conf on the toserv)
  • --volume=KCMO0445\VOL1: (the source volume on the fromserv)
  • --no-blocking-io (more options)
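
To make the layout of these command lines concrete, here is a minimal Python sketch that assembles one from its parts. The function name and its parameters are hypothetical helpers for illustration only; the values in the example come from the syncpriv.ncf shown above.

```python
def build_rsync_command(source_dir, host, module, volume,
                        exclude_from=None, delete=False):
    """Assemble one rsync command line in the layout used by the ncf files."""
    parts = ["sys:rsync\\rsync", "-vaz", source_dir]
    if exclude_from:
        # Optional exclusion list, e.g. /exclude.txt in the root of SYS
        parts.append(f"--exclude-from={exclude_from}")
    # Target: IP address of toserv, then the section name in its rsyncd.conf
    parts.append(f"{host}::{module}")
    parts.append(f"--volume={volume}")
    parts.append("--no-blocking-io")
    if delete:
        # Only for an occasional grooming run
        parts.append("--delete")
    return " ".join(parts)


if __name__ == "__main__":
    print(build_rsync_command("/Privdir", "167.132.19.99", "KCMO0445",
                              "KCMO0445\\VOL1:", exclude_from="/exclude.txt"))
```

Run as a script, this prints the same single line that syncpriv.ncf contains, which is a quick way to check that every piece is in the expected position.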

CRONTAB Setup

You will need to modify sys:etc\crontab so that RSYNC keeps data up-to-date. In this example, the privdir and groupdir are updated each hour, and I and K once each day:

# Sync the privdir
1 * * * * sys:rsync\syncpriv.ncf

# Sync the Groupdir
31 * * * * sys:rsync\syncgrp.ncf

# Sync I and K
0 5 * * * sys:rsync\synci-k.ncf
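
When writing entries like these it is easy to get the minute field wrong: `0 5 * * *` fires once a day, while `* 5 * * *` would fire every minute of the 5 o'clock hour. The following Python sketch counts how often an entry fires per day. It assumes the standard five-field crontab meaning, looks only at the minute and hour fields, and handles only `*`, single numbers, and comma lists; the function names are hypothetical.

```python
def expand_field(field, low, high):
    """Expand one crontab field ('*', '5', or '1,31') into a set of integers."""
    if field == "*":
        return set(range(low, high + 1))
    return {int(value) for value in field.split(",")}


def runs_per_day(entry):
    """Count how often an entry fires in one day (minute and hour fields only)."""
    minute, hour = entry.split()[:2]
    return len(expand_field(minute, 0, 59)) * len(expand_field(hour, 0, 23))


if __name__ == "__main__":
    for line in ("1 * * * * sys:rsync\\syncpriv.ncf",
                 "0 5 * * * sys:rsync\\synci-k.ncf",
                 "* 5 * * * sys:rsync\\synci-k.ncf"):
        print(runs_per_day(line), line)
```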

Purging old data from the toserv

When users delete data on fromserv, it will not automatically get removed from toserv. Over time this will run toserv out of space. To *groom* the toserv and delete files that still exist on toserv but have been deleted on fromserv, simply append --delete to the end of the command line in the ncf files. For example, syncpriv.ncf would change from this:

sys:rsync\rsync -vaz /Privdir --exclude-from=/exclude.txt 167.132.19.99::KCMO0445 --volume=KCMO0445\VOL1: --no-blocking-io

to this:

sys:rsync\rsync -vaz /Privdir --exclude-from=/exclude.txt 167.132.19.99::KCMO0445 --volume=KCMO0445\VOL1: --no-blocking-io --delete

Make sure you only run this once (or as-needed), and then change it back. Also, I have noticed that for some reason it will not work on some servers. The RSYNC will still run, but the --delete statement will be ignored. This appears to be a bug.

A word about exclude.txt

Users will make backup copies of their C drive and other things on their H drive (home directory) that we do not want to sync. The exclude.txt file enables you to list the filespecs and directoryspecs that you do not want to sync. A portion of the exclude.txt file might look like this (note that these are case-sensitive):

TEMP/
Temp/
temp/
data.old/
backupLEL/
My Briefcase/
C drive backup/
C-DRIVE/
C DRIVE/
CD/
Notesdat-old-B/
old laptop files saved to H/
mp3/
*.bak

This file needs to reside in the root of the sys volume on the fromserv. It can reside elsewhere, but you will need to specify the location. In this example it's in the root of SYS.
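
Because the patterns are case-sensitive, "Temp/" and "temp/" are different entries, which is why the list carries several spellings. The Python sketch below illustrates this with a simplified matcher. It is only an approximation of rsync's exclude rules, written for illustration: a pattern ending in "/" is compared against directory names only, any other pattern against every path component. The function name is hypothetical and the paths are assumed to be file paths relative to the directory being synced.

```python
from fnmatch import fnmatchcase


def is_excluded(relative_path, patterns):
    """Return True if a component of relative_path matches an exclude pattern.

    Simplified, case-sensitive approximation of rsync's exclude handling.
    """
    parts = relative_path.strip("/").split("/")
    directories = parts[:-1]  # everything before the final (file) component
    for pattern in patterns:
        if pattern.endswith("/"):
            candidates, name = directories, pattern[:-1]
        else:
            candidates, name = parts, pattern
        if any(fnmatchcase(component, name) for component in candidates):
            return True
    return False


if __name__ == "__main__":
    patterns = ["TEMP/", "temp/", "mp3/", "*.bak"]
    for path in ("jsmith/temp/notes.txt", "jsmith/Temp/notes.txt",
                 "jsmith/report.bak", "jsmith/report.BAK"):
        print(path, is_excluded(path, patterns))
```

With this pattern list, the "Temp" directory and the ".BAK" file are not excluded, because neither spelling appears in the list.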

TOSERV SETUP

The hub backup may have only one server backing up to it, or it may have more. In this example we are using KCMO0238 as the hub backup server (or repository) which has data from KCMO0445, KCEP0257, MARY0171, and SEDA0490 being synced to it.

The rsync nlms, etc, need to be copied to the sys:system directory:

rsync.nlm
rsyncd.motd
rsyncdn.nlm
rsyncst.nlm
rsyncstp.ncf
rsyncstr.ncf

rsyncd.conf needs to be copied to the sys:etc directory and has to be modified. Here is the rsyncd.conf, with explanations on lines beginning with # JUST FOR THE AREAS THAT NEED TO BE CHANGED. If there is no comment, it is the same for all servers:

uid = nobody
gid = nobody
max connections = 0
syslog facility = local5
pid file = SYS:/rsync/rsyncd.pid
log file = SYS:/rsync/rsyncd.log
motd file = SYS:/rsync/rsyncd.motd

# [KCMO0445] is the "tag" KCMO0445 looks for when rsync runs
[KCMO0445]
# path on vol1 of KCMO0238 where KCMO0445 is to put its data
        path = VOL1:/kcmo
# just a comment
        comment = KCMO0445 backup area
        read only = no
        use chroot = no
        transfer logging = yes

# [SEDA0490] is the "tag" SEDA0490 looks for when rsync runs
[SEDA0490]
# path on vol1 of KCMO0238 where SEDA0490 is to put its data
        path = VOL1:/seda
# just a comment
        comment = SEDA0490 backup area
        read only = no
        use chroot = no
        transfer logging = yes

# [MARY0171] is the "tag" MARY0171 looks for when rsync runs
[MARY0171]
# path on vol1 of KCMO0238 where MARY0171 is to put its data
        path = VOL1:/Mary
# just a comment
        comment = MARY0171 backup area
        read only = no
        use chroot = no
        transfer logging = yes

# [KCEP0257] is the "tag" KCEP0257 looks for when rsync runs
[KCEP0257]
# path on vol1 of KCMO0238 where KCEP0257 is to put its data
        path = VOL1:/kcep
# just a comment
        comment = KCEP0257 backup area
        read only = no
        use chroot = no
        transfer logging = yes

# [KCMO-I-K] is the "tag" KCMO0445 looks for when rsync runs synci-k.ncf
[KCMO-I-K]
# path on vol1 of KCMO0238 where KCMO0445 is to put its I and K data
        path = VOL1:/
# just a comment
        comment = KCMO0445 I and K drive backup area
        read only = no
        use chroot = no
        transfer logging = yes
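
Since every per-server section has the same shape and only the tag, path, and comment change, the sections could be generated rather than typed by hand. This is a minimal Python sketch of that idea, not part of the rsync distribution; the function name is hypothetical, and the server names and paths in the example are the ones used in this article.

```python
def render_module(server, path):
    """Render one rsyncd.conf section for a fromserv as plain text."""
    lines = [
        f"[{server}]",                            # tag the fromserv asks for
        f"        path = {path}",                 # where its data lands on toserv
        f"        comment = {server} backup area",
        "        read only = no",
        "        use chroot = no",
        "        transfer logging = yes",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    servers = {"KCMO0445": "VOL1:/kcmo", "SEDA0490": "VOL1:/seda"}
    print("\n\n".join(render_module(name, path)
                      for name, path in servers.items()))
```

The output would still need the global settings (uid, gid, log file, and so on) placed ahead of it in sys:etc\rsyncd.conf.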

Also, it is important that you make a directory off the root of SYS on toserv called rsync. You do not need to put anything in this directory but it must exist.

Additional Comments

If you have more than one server backing up to a hub backup you should stagger the jobs so that they don't all start at the same time. RSYNC is extremely fast, so ten minutes between each job beginning should be adequate, once the server is synced up initially. (Only changed data will be re-synced.)
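
As an illustration of staggering, the Python sketch below assigns each fromserv a start minute ten minutes after the previous one and produces the crontab line to place on that server. The function name, the default ncf file, and the hourly schedule are assumptions made for the example.

```python
def staggered_entries(servers, ncf="sys:rsync\\syncpriv.ncf", gap_minutes=10):
    """Return (server, crontab line) pairs with start minutes gap_minutes apart."""
    if len(servers) * gap_minutes > 60:
        raise ValueError("too many servers to stagger within one hour at this gap")
    return [(server, f"{index * gap_minutes} * * * * {ncf}")
            for index, server in enumerate(servers)]


if __name__ == "__main__":
    for server, line in staggered_entries(
            ["KCMO0445", "KCEP0257", "MARY0171", "SEDA0490"]):
        print(f"{server}: {line}")
```

With a ten-minute gap this covers at most six servers per hour; beyond that, a smaller gap or a less frequent schedule would be needed.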

One note: You will see an error now and then about "couldn't write x number of bytes" or an I/O error, but most likely that's cosmetic (it is here anyhow.) It looks like it goes ahead and backs everything up. What I've noticed is that it will "error" but if there is something after the error that says "read x bytes, wrote x bytes" the backup was successful despite this error message.
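
That rule of thumb can be written down as a small check over captured rsync output. The Python sketch below is a rough illustration only: the exact wording of the error and summary lines is an assumption and may differ between rsync builds, so the patterns accept the byte counts in either order and would need adjusting against real output.

```python
import re

# Assumed wording, e.g. "wrote 1234 bytes  read 5678 bytes"; either order accepted.
SUMMARY = re.compile(
    r"(wrote|read)\s+\d+\s+bytes\W+(wrote|read)\s+\d+\s+bytes", re.IGNORECASE)
# Assumed wording of the messages described above.
ERROR = re.compile(r"error|couldn't write", re.IGNORECASE)


def backup_looks_complete(log_text):
    """Return True if a byte-count summary line appears after the last error
    line (or anywhere, if there were no error lines)."""
    lines = log_text.splitlines()
    last_error = max(
        (i for i, line in enumerate(lines) if ERROR.search(line)), default=-1)
    return any(SUMMARY.search(line) for line in lines[last_error + 1:])
```

A result of False only means no summary was found after the last error, so it is a prompt to look at the log, not proof that the backup failed.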

To help keep track and be sure if the backup completed without having to manually go check each server, I wrote up a kludge that helps me (I have about 150 servers to track, and don't have time to do it all manually). I'll give you a brief synopsis of what I do, and perhaps you can improve on it. (If you have any suggestions I'd love to hear them.)

Again, I call the server being backed up "fromserv" and the server that the data is being copied to the "toserv".

We'll pick just one of these fromservs to look at: KCEP0257. Here's the crontab I have running on KCEP0257 (recall that data is copied from KCEP0257 to KCMO0238):

# The next lines sync Privdir up to KCMO0238 every two hours,
# 5 minutes past the hour, between 6 am and 8 pm
5 6 * * * sys:rsync\syncpriv.ncf
5 8 * * * sys:rsync\syncpriv.ncf
5 10 * * * sys:rsync\syncpriv.ncf
5 12 * * * sys:rsync\syncpriv.ncf
5 14 * * * sys:rsync\syncpriv.ncf
5 16 * * * sys:rsync\syncpriv.ncf
5 18 * * * sys:rsync\syncpriv.ncf
5 20 * * * sys:rsync\syncpriv.ncf

# The next lines sync Groupdir up to KCMO0238 every two hours,
# 35 minutes past the hour, between 7 am and 8 pm
35 7 * * * sys:rsync\syncgrp.ncf
35 9 * * * sys:rsync\syncgrp.ncf
35 11 * * * sys:rsync\syncgrp.ncf
35 13 * * * sys:rsync\syncgrp.ncf
35 15 * * * sys:rsync\syncgrp.ncf
35 17 * * * sys:rsync\syncgrp.ncf
35 19 * * * sys:rsync\syncgrp.ncf

(In our environment, for the most part I'm only concerned with backing up vol1:\groupdir\ and vol1:\privdir\)

What I did was to manually sync these directories over and over (watching for the errors noted above where it seems to stop prematurely) until the job finished successfully. Then I enabled the crontab, which runs every other hour between 6 am and 8 pm for each of these two directories. This keeps them in sync, and since the jobs run so frequently, each one takes only a few minutes, sometimes only a few seconds.

So, to determine if they were successful, I created a file called 12345678.xyz and put it in vol1:groupdir and vol1:privdir on each of the fromservs. I flagged it di, ri, and ro. It's just a dummy file with a few letters in it (Mine says "This is a test file that is used to confirm backups ran. Please do not delete it.")

Then each morning at 2:00 am, I have toserv delete the files. Here's the crontab from the above toserv (KCMO0238):

# Delete the test files starting at 2:00 am. This file is used to 
# ensure the backups ran and were successful.
0 2 * * * del vol1:\kcmo\privdir\12345678.xyz
1 2 * * * del vol1:\kcmo\groupdir\12345678.xyz
2 2 * * * del vol1:\seda\privdir\12345678.xyz
3 2 * * * del vol1:\seda\groupdir\12345678.xyz
4 2 * * * del vol1:\mary\privdir\12345678.xyz
5 2 * * * del vol1:\mary\groupdir\12345678.xyz
8 2 * * * del vol1:\kcmo\software\12345678.xyz
9 2 * * * del vol1:\kcmo\develop\12345678.xyz

Obviously not all of the servers that back up to KCMO0238 are listed here, but you get the idea. One word of caution: Be sure to load toolbox in your autoexec.ncf file with the /nl switch on it so it won't prompt for a login. Also, in your autoexec.ncf file on toserv you will want to add the line rsyncstr so the RSYNC daemon starts.

So, each morning at 2:00 a.m. these files will be deleted, and then should be replaced the first time RSYNC runs the next morning. (Toolbox will ignore the di ri and ro flags on the file).

Next, I am having my tools guys write a service that will run on an NT server mid-morning (after the first few iterations of RSYNC have run) and check for the existence of the 12345678.xyz file in each of the directories on toserv. If a file does not exist, I'll know there is a problem with the RSYNC jobs. The service will send an email to all of my email accounts, at which point I can investigate.
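
The check itself is simple enough to sketch. The Python below is not the service described above, only an illustration of the idea: it assumes toserv's VOL1 is reachable from the checking machine under a hypothetical mount point, and it only reports the directories where the marker file is missing (sending the email is left out).

```python
from pathlib import Path

MARKER = "12345678.xyz"


def missing_markers(backup_dirs, marker=MARKER):
    """Return the directories (as strings) that do not contain the marker file."""
    return [str(d) for d in backup_dirs if not (Path(d) / marker).is_file()]


if __name__ == "__main__":
    # Hypothetical mount point for toserv's VOL1; adjust to the local mapping.
    root = Path("/mnt/kcmo0238/vol1")
    dirs = [root / "kcmo" / "privdir", root / "kcmo" / "groupdir"]
    for d in missing_markers(dirs):
        print(f"RSYNC marker missing in {d}")
```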

If you have any questions you may contact Sam at smorris@up.com

