
AppNote: RSYNC Q&A - An Open Discussion

Novell Cool Solutions: AppNote
By Gary Childers


Updated: 8 Dec 2005
 

"It's a potpourri of FAQ's dealing with RSYNC, containing a load of good information."

Kirk Coombs–
Product Specialist

Novell, Inc.

08 Dec 2005 - Updated with another question from an RSYNC user.

Table of Contents

Introduction
Q: Is RSYNC outside of Nterprise Branch Office supported by Novell technical support?
Q: What kind of logging is provided when using RSYNC for data synchronization?
Q: Do I have to have an rsync server on a network, or can I use the (Windows) client to copy/sync files remotely to another Windows XP computer?
Q: Can this be done between 2 different trees, and if so, how does the security work? I don't see a way to use the command line to log in to a different server and its tree.
Q: I am trying to sync data between two NetWare servers, but keep getting the messages 'read timeout: Error peer socket closed', 'error in socket IO'.
Q: Can RSYNC use SSL outside of Branch Office?
Q: Why am I getting my RSYNC server screen (and log file) filled with 'Error 10053 with FD_Close'?
Q: My rsyncd.log file is now 80+MB and growing! What can I do?
Q: I want to back up the Documents folders of several Windows laptops on a LAN using RSYNC. Do I have to have static IP addresses for each laptop?
Q: How can I schedule multiple rsyncs so they don't overlap each other?
Q: Can anyone post solutions/methods for REPORTING on RSYNC success/failure status?
Q: Is anybody aware of the maximum supported path/filename length, as I suspect a possible ABEND coming out of that situation?
Q: Can I run my RSYNC server on a Linux platform?
Q: If I'm using RSYNC to replicate my server data, then do I really need to do tape backup anymore?
Q: I need to copy one volume of about 900GB from one server to another, using RSYNC. I need to figure out how to sync groups of folders off the root of a volume.
Q: Is it possible to synchronize two directories with the same name to the same destination server?
Q: I am trying to get RSYNC 2.6.3 to run on NetWare 6 SP3. When I try to load RSYNC I get public symbol errors "Loader cannot find public symbol..."
Q: To distribute files to remote NetWare servers, do I need to copy the entire RSYNC folder to the remote servers or just the rsync.nlm and the include/exclude files?
Q: I can successfully synchronize my NetWare server to a Linux server with no problem, except that the ownership changes to root for everything. I'd like to be able to sync everything and retain all the ownership, permissions, and attributes. Is that possible?
Q: When I am replicating data between two NetWare servers (using SSL), occasionally I notice that the replication stops, and I see a file with a name like ".accounts.mdb.000150" in the destination folder. What is happening?
Q: Does RSYNC running on NBO and a CO support or at least replicate MAC volumes?
What Works for You

Introduction

My previous articles on the RSYNC utility for NetWare (see "The Complete Works of Gary") have generated a number of responses – and questions – that I have attempted to address in individual emails. I thought perhaps I could submit this article as an informal RSYNC "Q & A" to answer some of the most common questions that I receive.

Many NetWare systems administrators are excited about the opportunity to replace regular tape backups at some of their remote sites with a data synchronization solution that replicates the remote site data to a central office (see "Goodbye to Tape"). Indeed, this is the very opportunity that Novell saw when they ported the open-source RSYNC utility to the NetWare platform, specifically for Novell's Nterprise Branch Office product.

Since the utility is now available on the NetWare platform, many admins are asking, "Why not use it straight up, outside of Branch Office?" (see: "Using Rsync Outside of NBO"). So this is obviously a kind of grassroots movement, a "roll your own" solution, rather than an official Novell solution. But many people have found ways to use it quite successfully, and others are looking to jump on the same bandwagon.

Since every network is unique, and everyone's needs are slightly different, there may be no way to define any "right" or perfect way to use a utility like RSYNC. But here is an opportunity to share some of the experience I have in using RSYNC, and I'd like to also invite any comments or additions from readers to share any novel ways you might be using the utility.

Now on to some questions:

If you have other questions about RSYNC or would like to share your experiences using the utility, send them in here.

Q: "Is RSYNC outside of Nterprise Branch Office supported by Novell technical support?"

A: Unfortunately, no, not at this time. That's the price of a "roll your own" solution – you also end up having to "support your own", as well. But who knows, if there is an overwhelming interest in having RSYNC as a supported solution in Open Enterprise Server (both on NetWare and Linux), perhaps then we can also gain the benefit of official Novell tech support.

Q: "What kind of logging is provided when using RSYNC for data synchronization?"

A: Essentially, the RSYNC client by default operates in "quiet" mode, so it doesn't show any play-by-play activity, and only displays a brief summary at the end. No log files are created. When the client is run in verbose mode by supplying the "--verbose" or "-v" switch on the command line (see "Using RSYNC in Data Backup Solution"), then it displays the file-by-file transfer process, but still does not generate any log files.

To produce log files on the client side, use the "--log-file=" parameter in the command, such as:

SYS:/rsync/rsync -rav --volume=DATA: --log-file=sys:/rsync/rsync1.log /TEST RSYNCHOST::TEST --delete

This parameter, combined with the "-v" or "--verbose" parameter, gives you the file-by-file record of how the rsync client performed, which can be useful in troubleshooting. This also allows you to specify different log files for each command, if desired. Some admins like to run a separate RSYNC command after the replication job, just to forward the client-side log file to the central server, as a record of the job completion.

The RSYNC daemon, however, by default records all RSYNC sessions in its rsyncd.log file (SYS:\RSYNC\rsyncd.log). But (by default) this records only summary information for each RSYNC session. If you wish to see verbose information in the rsyncd.log file, then you will have to set the verbosity level in the RSYNC daemon's configuration file, rsyncd.conf (this by default resides in SYS:\ETC, but I prefer to move it to SYS:\RSYNC, and then specify that location in the RSYNC startup file). For each module (location) in the rsyncd.conf file, you can set 'transfer logging=yes' (or no).

Q: "Do I have to have an rsync server on a network, or can I use the (Windows) client to copy/sync files remotely to another Windows XP computer on the private (192.168.0.x) network?"

A: By design, the default way for RSYNC to work is in "client-server" mode. This allows the (active) client and (passive) server to work together in tandem to synchronize the files on both sides of the link. Previous and other versions of file synchronization utilities must rely on one computer (typically, the client) to have control of the file systems on both sides of the link, in order to perform the file comparisons and copy/update operations. This usually requires some type of mapped drive or other form of control so that the client can operate the server file system.

RSYNC thus gains some efficiencies and eliminates the need for drive mappings by operating as a client-server utility. That said, with the Windows RSYNC client (CWRSYNC, available from http://www.itefix.no), it is possible to operate the utility as a simple "client-client" utility, to synchronize files on different drives of the same computer, or to drives that are mapped to another computer. (see "AppNote: Using RSYNC with Windows"). This is terrific for synchronizing data on a workstation to a removable USB drive.

Q: "I read your 'Using RSYNC in Data Backup Solution' Article. I'm very interested. Can this be done between 2 different trees, and if so, how does the security work? I don't see a way to use the command line to log in to a different server and its tree."

A: Yes, RSYNC can occur between two eDir trees, since the utility is completely NDS-ignorant. It is solely a file synchronization utility that operates between two servers, without regard to NDS authentication or any file security whatsoever. The only built-in security measures are that the RSYNC client must reference a "module" in the rsyncd.conf file on the RSYNC server (analogous to a username), and that the module configuration can restrict access by IP address with a "hosts allow=" line.

That said, then, be careful how you use the utility. RSYNC was ported to NetWare explicitly for use with Novell's Branch Office (NBO) product, which uses SSL security to encrypt data transferred between the NBO server and the Central Office server (which, by design, are in separate eDir trees). Thus, it is recommended that RSYNC (outside of NBO) should only be used in a LAN or private WAN environment, not over public (Internet) connections where the data might be intercepted. If you need the added security of SSL encryption during data synchronization, then I'd recommend that you use Nterprise Branch Office, or configure RSYNC specifically to run using SSL.

NOTE: Because RSYNC pays no attention to the local server's file system security, the NetWare attributes and trustees do not get replicated with each file. In order to capture the trustee assignments with the files being replicated, you will need to run a separate script (prior to the replication) to record the trustee assignments to a text file, so that this information can be synchronized along with the data. If you need to restore data from the RSYNC synchronization, then you would have to restore the trustee assignments back again using this text file. See: "Using RSYNC in Data Backup Solution" for details.

Q: "I am trying to sync data between two NetWare servers, but keep getting the messages 'read timeout: Error peer socket closed', 'error in socket IO', and 'error starting client-server protocol' at the server running as the client, and no indications at all at the server (RSYNC daemon). What could be happening?"

A: What I suspect is happening in your case is that the RSYNC server (daemon) is running in SSL mode, and the RSYNC client is operating in non-SSL mode. If your rsyncstr.ncf file on the RSYNC server looks like this:

SYS:/rsync/rsync --progress --address=10.10.10.10 --port=873 --ssl --daemon --config=SYS:/rsync/rsyncd.conf

then you will need to remove the "--ssl" switch from the file to run in non-SSL mode.

Q: "I've been using RSYNC to backup a remote Branch Office server using SSL with no problems. Now I wish to do the same with a remote NW6.5 box, but am having trouble getting it to work via SSL. Can RSYNC use SSL outside of Branch Office?"

A: Many RSYNC users have trouble with the SSL option, for various reasons. In most cases that I have heard they don't really need the SSL encryption, so they just turn it off. But I have successfully gotten the SSL to work using the following method:

You will first need to export the Trusted Root Certificate on the central office (RSYNC daemon) server. Open ConsoleOne, find the SSL Certificate IP for that server, open the properties, Certificates tab, and click Export. Select No to export the private key, Next, select .DER format and name it root.der, Next, and Finish.

Once completed, copy the root.der file to the SYS:\ETC folders on both the sending and receiving servers. The RSYNC daemon can then be run using the "--ssl" parameter, such as:

sys:rsync/rsync --ssl --progress --address=10.10.0.10 --port=873 --daemon

And then the RSYNC client will also have to be run in SSL mode, such as:

rsync -rav --ssl --volume=DATA: /USERS/ NW65::USERS

The modes (SSL or non-SSL) will have to match, for the two servers to successfully replicate. If you need to use the same server in SSL mode, say for Nterprise Branch Office, and also in non-SSL mode, you can run two instances of the daemon, one SSL and one non-SSL, at the same time, on a secondary IP address. Read "Double SYNC -- Using RSYNC Simultaneously With and Without Branch Office for Data Backup" for more information on this.

Q: "Why am I getting my RSYNC server screen (and log file) filled with 'Error 10053 with FD_Close'?"

A: This is a problem that shows up after applying NW65SP3 to the server running the RSYNC daemon. It will constantly display an "Error 10053 with FD_Close" on the RSYNC screen, about every three minutes, even when no syncs are occurring. The RSYNC ping packet is causing this error.

The error is only cosmetic, but it is annoying, and it also fills up the rsyncd.log file with all that garbage. See TID 10097275. The RSYNC.NLM that supposedly fixes this error, version 2.6.3, is currently in Beta 3. But I'm running the Beta 3 RSYNC.NLM, and it still produces the error. At this point, I'd prefer to stick with NW65SP2 on the RSYNC server to avoid the error (or you can just put up with it, until the fix is released).

UPDATE: (10/20/2005) This issue is fixed in NW65SP4. See the updated TID 10097275. On the servers I am running, the RSYNC.NLM version still shows the identical version 2.6.2 and file date, but the error messages (thankfully) now are gone. This works as well, if the RSYNC files were downloaded separately from the Novell developer site (http://developer.novell.com/ndk/rsync.htm), so the fix is probably in other modules within the service pack.

Q: "There has got to be a better way to handle RSYNC logs. My rsyncd.log file is now 80+MB and growing! What can I do?"

A: RSYNC currently doesn't have a method to automatically recycle the log file. One quick and dirty way of managing the logging on the RSYNC server (daemon) is to set up a separate script to run daily (or weekly, or monthly, whatever suits you) to cycle the logs automatically for you. I use this COPYLOGS.NCF file, which uses TOOLBOX.NLM and runs daily, so that I have a separate file for each day of the week. At the end of the week, the Central Office server gets a full backup to tape, so the log files can run through the 7-day cycle again:

REM COPYLOGS.NCF
REM *** Increment RSYNC logs 1-7 ***

load toolbox.nlm
delay 3
echo Incrementing RSYNC logs ...
copy sys:\rsync\rsyncd6.log sys:\rsync\rsyncd7.log /q > sys:\null.txt
copy sys:\rsync\rsyncd5.log sys:\rsync\rsyncd6.log /q > sys:\null.txt
copy sys:\rsync\rsyncd4.log sys:\rsync\rsyncd5.log /q > sys:\null.txt
copy sys:\rsync\rsyncd3.log sys:\rsync\rsyncd4.log /q > sys:\null.txt
copy sys:\rsync\rsyncd2.log sys:\rsync\rsyncd3.log /q > sys:\null.txt
copy sys:\rsync\rsyncd1.log sys:\rsync\rsyncd2.log /q > sys:\null.txt
copy sys:\rsync\rsyncd.log sys:\rsync\rsyncd1.log /q > sys:\null.txt
copy sys:\rsync\empty.log sys:\rsync\rsyncd.log /q > sys:\null.txt
flag sys:\rsync\rsyncd.log -R > sys:\null.txt
unload toolbox.nlm

Obviously, you don't want to run the script during an active RSYNC session, so schedule it to run during non-replication hours. There is no need to stop and re-start the RSYNC daemon – when it begins to process an incoming RSYNC session, it will write log information to whatever rsyncd.log it finds in the SYS:\RSYNC directory. You will have to create the zero-byte (empty) "empty.log" file to start out (type EDIT SYS:\RSYNC\empty.log at the console, Yes to create the file, then exit EDIT). The rest of the log files here specified will be created by the script as time goes by.

Some people like to load TOOLBOX.NLM in their AUTOEXEC.NCF file, making it unnecessary to load and unload it here in this file. The "delay 3" command merely inserts a 3-second delay, so that TOOLBOX.NLM can fully load before the .NCF file attempts to execute the copy command. The "> sys:\null.txt" at the end of each copy command is just there to eliminate any output to the console screen, so that the .NCF file operates silently.

It is easy to use the built-in scheduler in NetWare Remote Manager (https://yourserverIP:8009) to schedule the COPYLOGS.NCF file to run at the time you wish to specify. Use Schedule Tasks in the (left) navigation pane to set up a task that runs COPYLOGS.NCF at the chosen time.

Q: "I want to back up the Documents folders of several Windows laptops on a LAN using RSYNC. Do I have to have static IP addresses for each laptop, and then do I have to specify a separate module in the rsyncd.conf file for each laptop?"

A: No, that's not necessary. You can specify one module (location) in the rsyncd.conf file for all the laptops, and then modify the RSYNC command on each PC to direct the file synchronization to a particular folder. For example, you can specify a module that looks like this:

[XPLAPTOPS]

path=DATA:/rsync/XPLaptops
comment=Data Backup for XP Laptops on the LAN
read only=no
use chroot=no
strict modes = no
transfer logging=yes
timeout=3600
use lfs=yes
hosts allow=10.1.0.0/24
hosts deny=*

The line "hosts allow=10.1.0.0/24" specifies the LAN address and subnet mask (class C or 255.255.255.0) - use the ones appropriate for your network IP subnet. This allows all nodes on your specified LAN to access this module in RSYNC communications. Then, in order to separate out the replicated files from each particular source, you would modify the RSYNC commands used on each laptop, as such:

User JaneDoe might use this command:

c:\cwrsync\rsync -rav /cygdrive/c/mydata RSYNC1::XPLAPTOPS/JaneDoe

Whereas user JoeSchmoe might use this command:

c:\cwrsync\rsync -rav /cygdrive/c/mydata RSYNC1::XPLAPTOPS/JoeSchmoe
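Before handing these commands out, it can help to confirm that each laptop's address really falls inside the range given on the "hosts allow=" line. Here is a minimal sketch in Python that only does the subnet arithmetic; it does not talk to the RSYNC daemon, and the two laptop addresses are made up for illustration.

```python
import ipaddress

# The subnet mirrors the "hosts allow=10.1.0.0/24" line in the module above.
ALLOWED_SUBNET = ipaddress.ip_network("10.1.0.0/24")


def is_allowed(client_ip: str) -> bool:
    """Return True if client_ip falls inside ALLOWED_SUBNET."""
    return ipaddress.ip_address(client_ip) in ALLOWED_SUBNET


if __name__ == "__main__":
    # Hypothetical laptop addresses, not taken from any real network.
    for ip in ("10.1.0.57", "10.1.1.57"):
        print(ip, "allowed" if is_allowed(ip) else "denied")
```

A /24 prefix corresponds to the 255.255.255.0 mask, so only addresses that share the first three octets with the network address pass the check.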

NOTE: The RSYNC command is case-sensitive, particularly on the module name (such as XPLAPTOPS above). In this example, the host name "RSYNC1" would need to be resolved either by DNS or the local HOSTS file. If your source or destination paths contain any spaces, such as "My Documents", then be sure to enclose the string within quotes, such as:

c:\cwrsync\rsync -rav "/cygdrive/c/Documents and Settings/JoeSchmoe/My Documents" RSYNC1::XPLAPTOPS/JoeSchmoe

Also note that the slashes make a difference: the source string "/mydata" will copy the /mydata folder itself, along with all files and sub-folders in it (assuming -r or -a is used), whereas the source string "/mydata/" (with a trailing slash) will copy only the files and sub-folders within the /mydata folder, without re-creating the mydata folder at the destination. The rsync command recognizes only the forward slash (/) in the source string, so get used to using forward slashes.
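The trailing-slash rule is easy to get backwards, so here is a small Python model of it. This is only an illustration of the rule as described above, not part of RSYNC; the function name and the file name "report.doc" are hypothetical.

```python
import posixpath


def landing_path(source: str, relative_file: str, dest: str) -> str:
    """Model where a file found under `source` ends up below `dest`.

    A source without a trailing slash contributes its own last folder name;
    a source with a trailing slash contributes only its contents.
    """
    if source.endswith("/"):
        return posixpath.join(dest, relative_file)
    folder = posixpath.basename(source)
    return posixpath.join(dest, folder, relative_file)


if __name__ == "__main__":
    print(landing_path("/cygdrive/c/mydata", "report.doc", "XPLAPTOPS/JaneDoe"))
    print(landing_path("/cygdrive/c/mydata/", "report.doc", "XPLAPTOPS/JaneDoe"))
```

The first call places the file under XPLAPTOPS/JaneDoe/mydata, the second directly under XPLAPTOPS/JaneDoe.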

Finally, note that the RSYNC daemon expects the path specified for the module ([XPLAPTOPS]) in the rsyncd.conf file (path=DATA:/rsync/XPLaptops) to exist already, so create the folder ahead of time. And when using the RSYNC command to specify a sub-folder in that module, as above (XPLAPTOPS/JaneDoe), also be sure to create it on the server in advance, to avoid a failure of the command.

For more on using RSYNC with Windows clients, see "AppNote: Using RSYNC with Windows".

Q: "How about some info on scheduling multiple rsyncs so they don't overlap each other?"

A: That will all happen on the client side of the equation, just as it does in Nterprise Branch Office. The most direct way I can think of is just to schedule the replications, which are executed via .NCF files, using NetWare Remote Manager at each client (sending) server, as in the example above. You will have to calculate how many sessions you have going each time period, about how long they take to complete, and then set the Start Time for each accordingly.

However, I hear reports of RSYNC users out there performing as many as 16 simultaneous synchronizations without any problem. Still, I would choose to space them out, if possible, to better utilize the available bandwidth, and also to keep the RSYNC server log entries from being interleaved, such as:

2005/08/24 20:11:39 [162] recv SERVER1 [10.1.0.4] SERVER1 ()
users/remote/JaneDoe/Documents/Word/Document1.doc 28672
2005/08/24 20:11:39 [164] recv SERVER2 [10.2.0.4] SERVER2 ()
users/remote/JoeSchmoe/Documents/Word/Document2.doc 19456
2005/08/24 20:11:39 [162] recv SERVER1 [10.1.0.4] SERVER1 ()
users/remote/JaneDoe/Documents/Word/Document2.doc 45928
2005/08/24 20:11:39 [164] recv SERVER2 [10.2.0.4] SERVER2 ()
users/remote/JoeSchmoe/Documents/Word/Document2.doc 25346

In this example excerpt from the rsyncd.log file, the bracketed numbers ([162] and [164]) after the time stamp refer to the RSYNC sessions that are currently being processed. This example shows that SERVER1 and SERVER2 are replicating at the same time, and the log entries are therefore being interleaved (here we are using verbose transfer logging).
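If the jobs cannot be spaced out, the interleaved entries can still be separated after the fact by grouping on the bracketed session number. The following Python sketch assumes the log looks like the excerpt above (a timestamped line, optionally followed by a wrapped line holding the path and size); the function name is my own, and the script would be run wherever a copy of rsyncd.log and a Python interpreter are available.

```python
import re
from collections import defaultdict

# Timestamp, bracketed session number, then the rest of the entry.
ENTRY = re.compile(r"^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \[(\d+)\] (.*)$")

# Sample copied from the rsyncd.log excerpt above.
SAMPLE_LOG = """\
2005/08/24 20:11:39 [162] recv SERVER1 [10.1.0.4] SERVER1 ()
users/remote/JaneDoe/Documents/Word/Document1.doc 28672
2005/08/24 20:11:39 [164] recv SERVER2 [10.2.0.4] SERVER2 ()
users/remote/JoeSchmoe/Documents/Word/Document2.doc 19456
2005/08/24 20:11:39 [162] recv SERVER1 [10.1.0.4] SERVER1 ()
users/remote/JaneDoe/Documents/Word/Document2.doc 45928
2005/08/24 20:11:39 [164] recv SERVER2 [10.2.0.4] SERVER2 ()
users/remote/JoeSchmoe/Documents/Word/Document2.doc 25346
"""


def group_by_session(lines):
    """Return {session_id: [entry, ...]}, keeping each session's own order."""
    sessions = defaultdict(list)
    current = None  # session id of the most recent timestamped line
    for raw in lines:
        line = raw.rstrip("\n")
        match = ENTRY.match(line)
        if match:
            current = match.group(2)
            sessions[current].append(line)
        elif current is not None and line.strip():
            # A wrapped line (path and size) belongs to the entry above it.
            sessions[current][-1] += " " + line.strip()
    return dict(sessions)


if __name__ == "__main__":
    for session, entries in group_by_session(SAMPLE_LOG.splitlines()).items():
        print("Session", session)
        for entry in entries:
            print("   ", entry)
```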

Q: "Can anyone post solutions/methods for REPORTING on Rsync success/failure status? It is running on 40+ servers here and I have no idea what is and is not being RSYNCed."

A: Keep in mind that RSYNC is a simple file synchronization utility, and that by itself it does not contain any advanced scheduling and reporting capabilities. As I mentioned at the beginning, at this point this is very much a "roll your own" solution.

The solution listed above for copying the logs (COPYLOGS.NCF) is one way to segregate the RSYNC server progress logging by date. Judicious timing of the RSYNC jobs, as above, can also help separate the log entries, and prevent their intermixing.

Also, as mentioned above with the client-side logging option, some admins schedule a second replication job after the main data replication, just to copy the client-side RSYNC log to go with the data just replicated. A simple RSYNCLOGS.NCF file might look like:

rsync -rav --volume=SYS: /rsync/rsync1.log RSYNC1::Site1/logs

However, I have also heard of some RSYNC users out there getting pretty fancy with using PERL or other scripts to parse out the rsyncd.log file, for a type of reporting functionality. Care to share that with us, anybody?
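To get the ball rolling, here is a minimal Python sketch of that kind of report. It assumes verbose transfer logging in the format of the rsyncd.log excerpt shown earlier, treats the first name after "recv" as the sending host, and takes a hypothetical list of servers that were expected to replicate. It can only report what was logged: a server that logged nothing is flagged, but the sketch cannot say why.

```python
import re
from collections import Counter

TIMESTAMP = re.compile(r"^\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2} ")
# operation, host, [ip], module, (user), path, size
TRANSFER = re.compile(r"\[\d+\] (recv|send) (\S+) \[[\d.]+\] \S+ \(.*?\) (.+) (\d+)$")

# Sample copied from the rsyncd.log excerpt earlier in this article.
SAMPLE_LOG = """\
2005/08/24 20:11:39 [162] recv SERVER1 [10.1.0.4] SERVER1 ()
users/remote/JaneDoe/Documents/Word/Document1.doc 28672
2005/08/24 20:11:39 [164] recv SERVER2 [10.2.0.4] SERVER2 ()
users/remote/JoeSchmoe/Documents/Word/Document2.doc 19456
2005/08/24 20:11:39 [162] recv SERVER1 [10.1.0.4] SERVER1 ()
users/remote/JaneDoe/Documents/Word/Document2.doc 45928
2005/08/24 20:11:39 [164] recv SERVER2 [10.2.0.4] SERVER2 ()
users/remote/JoeSchmoe/Documents/Word/Document2.doc 25346
"""


def summarize(lines):
    """Return (files_per_host, bytes_per_host) for the logged transfers."""
    entries = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if TIMESTAMP.match(line) or not entries:
            entries.append(line)
        else:
            entries[-1] += " " + line  # re-join a wrapped path/size line
    files, total_bytes = Counter(), Counter()
    for entry in entries:
        match = TRANSFER.search(entry)
        if match:
            _op, host, _path, size = match.groups()
            files[host] += 1
            total_bytes[host] += int(size)
    return files, total_bytes


def missing_hosts(expected, files):
    """Return the expected hosts that logged no transfers at all."""
    return [host for host in expected if files[host] == 0]


if __name__ == "__main__":
    files, total_bytes = summarize(SAMPLE_LOG.splitlines())
    for host in sorted(files):
        print(host, files[host], "files,", total_bytes[host], "bytes")
    # SERVER3 is a made-up name to show how a silent server is reported.
    print("No entries from:", missing_hosts(["SERVER1", "SERVER2", "SERVER3"], files))
```

Run against one day's log (for example, one of the rotated rsyncdN.log files), this gives a quick per-server count and a list of servers that were silent that day.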

Q: "Is anybody aware of the maximum supported path/filename length, as I suspect a possible ABEND coming out of that situation?"

A: I have run into situations where the RSYNC fails, due to excessively long path/filename combinations (greater than 512 characters). But I have also seen where I cannot copy or delete files on a NetWare server, just using the NetWare client, due to the same long path/filenames. In the RSYNC situation, the sync often simply fails with a cryptic error message – but I haven't seen any ABENDs that I can attribute to that cause.

On the other hand, I have experienced ABENDs on some older servers, when trying to RSYNC a very large number of files (in excess of 100,000). RSYNC has to build a file list, which is stored in RAM, before it can perform the file compare and synchronize operations.
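One way to find out whether a volume is in that territory is to count the files under each top-level folder before deciding how to split the job. This is a rough Python sketch, assuming the volume is reachable as an ordinary path (for instance through a mapped drive) from a machine with Python; the 100,000 threshold is simply the figure mentioned above, not a documented limit.

```python
import os

FILE_COUNT_WARNING = 100_000  # the rough figure mentioned above, not a hard limit


def count_files_by_top_folder(root):
    """Return {top_level_folder: file_count} for the folders directly under root."""
    counts = {}
    for entry in sorted(os.listdir(root)):
        path = os.path.join(root, entry)
        if os.path.isdir(path):
            counts[entry] = sum(len(files) for _dir, _subdirs, files in os.walk(path))
    return counts


if __name__ == "__main__":
    # "." is a placeholder; point this at the root of the volume to be synchronized.
    for folder, count in count_files_by_top_folder(".").items():
        flag = "  <-- consider splitting" if count > FILE_COUNT_WARNING else ""
        print(f"{folder}: {count} files{flag}")
```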

Sometimes the solution is just to break up the RSYNC job into smaller bytes (pun intended), and schedule them to run successively. Thus, instead of a monolithic RSYNC command that synchronizes an entire volume, such as:

#SYNCDATA.NCF
SYS:/rsync/rsync -rav --volume=DATA: / RSYNC1::SITE1 --delete

We could break it up into smaller commands to be run in succession, such as:

#SYNCUSERS.NCF
SYS:/rsync/rsync -rav --volume=DATA: /Users/ RSYNC1::SITE1/Users --delete

#SYNCAPPS.NCF
SYS:/rsync/rsync -rav --volume=DATA: /APPS/ RSYNC1::SITE1/APPS --delete

#SYNCSHARED.NCF
SYS:/rsync/rsync -rav --volume=DATA: /SHARED/ RSYNC1::SITE1/SHARED --delete

If we determined, in our baseline testing, that each job took roughly two hours to complete, then we might schedule the three jobs to launch two or more hours apart.
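The arithmetic for the start times is simple, but it is easy to slip when a schedule runs past midnight. A minimal Python sketch, assuming the two-hour baseline from the example and a hypothetical 15-minute safety margin and 8:00 PM first start:

```python
from datetime import datetime, timedelta


def staggered_starts(first_start, jobs, margin_minutes=0):
    """Return [(job_name, start_time)], each job starting after the previous
    job's baseline duration plus the margin."""
    schedule = []
    start = first_start
    for name, baseline_minutes in jobs:
        schedule.append((name, start))
        start += timedelta(minutes=baseline_minutes + margin_minutes)
    return schedule


if __name__ == "__main__":
    # Job names come from the example above; durations are the assumed baseline.
    jobs = [("SYNCUSERS.NCF", 120), ("SYNCAPPS.NCF", 120), ("SYNCSHARED.NCF", 120)]
    first = datetime(2005, 8, 24, 20, 0)  # hypothetical start of the backup window
    for name, start in staggered_starts(first, jobs, margin_minutes=15):
        print(name, start.strftime("%Y/%m/%d %H:%M"))
```

The resulting times are what you would enter as the Start Time for each scheduled task.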

Q: "Can I run my RSYNC server on a Linux platform?"

A: Well, of course you can. Remember, RSYNC was originally written for the Linux platform, and subsequently ported to NetWare, so the RSYNC utility feels right at home on a Linux box.

RSYNC is installed by default on the SUSE OES installation, so there is nothing extra to do there. The command line for launching the RSYNC daemon on the Linux server is almost identical to its NetWare variant:

rsync --progress --address=10.10.0.10 --port=873 --daemon --config=/etc/rsyncd.conf

The switches supplied here for "port=" and "config=" are at their default settings, and thus only need to be specified if you select anything other than the defaults. Of course, the "address=" will need to match your server IP address. By default the RSYNC configuration file, rsyncd.conf, lives in the /etc folder.

The rsyncd.conf file on the Linux platform is also almost identical to what we use on NetWare. Here is an example of a module in that file, which we might use for synchronizing data from Windows XP clients:

[XPBACKUP]

path = /data/rsync/clients
comment = XP client data sync area on SUSEOES server
read only = no
use chroot = no
timeout = 3600
transfer logging = yes
hosts allow=10.10.0.0/24
hosts deny=*

As mentioned before, make sure the path specified is created in advance, and that access rights are specified. Be mindful that Linux is case-sensitive, so it's really best to specify everything in lowercase. Note in this example that we have again set the "hosts allow=" to an entire subnet, to allow multiple clients to access this module. This would work well with the previous example to back up the Windows clients.

Q: "If I'm using RSYNC to replicate my server data, then do I really need to do tape backup anymore?"

A: This topic is discussed at greater length in the "Goodbye to Tape" article. Most people today who are using data replication strategies like RSYNC are replacing tape backup for remote site servers and individual workstations. But very few are entirely abandoning tape backup at this point.

The simplest reason is that data synchronization gives you one primary benefit – redundancy of your data. That way if we lose the primary data set, we have a backup copy to fall back on, so we're not out of business. And when the synchronized data set is in another physical location from the primary data set, there is a benefit in terms of disaster recovery.

But tape backup provides one benefit that is not usually provided by data synchronization strategies, and that is version archiving. Almost all network admins have been called upon to restore a particular version of a critical file or set of files, because the current version is corrupted, deleted, or otherwise rendered useless. Tape backup (if done properly) provides archival data sets, so that we can restore a version of a file that was known to be good on, let's say, Thursday two weeks ago.

Most people who are using data synchronization today are using it in combination with tape backup. That is, they synchronize data from remote servers and workstations to a central server, and subsequently perform tape backup of the same data on the central server. This gives them immediate access to current versions of files on the central server, and offline or near-line access to archived versions of the files via tape.

This, by the way, is precisely the model of data backup that Novell devised for their Nterprise Branch Office product. Tape backup is eliminated at the branch offices in favor of data replication. But the central office server should be backed up to tape.

But it can be argued that, with the ever-increasing capacities and declining costs of disk media, eventually we will use hard disks for our archival purposes as well. And some folks say we may have reached that point already.

Q: We are moving to a cluster and I need to copy one volume of about 900GB from one server to another, using RSYNC. I need to figure out how to sync groups of folders off the root of a volume.

A: Thanks to Lance for that question. Obviously, he wants to have two synchronized data sets, to prepare to cut over to a new file server. After the main file copy, RSYNC will only need to synchronize changes in files, so it becomes very efficient. After some more investigation, I found that his volume with user home directories was a flat structure like this:

HOME:      \User001
        \User002
        \User003
        .
        .
        .
        \User753

He just needs a way to break up the synchronization jobs into manageable chunks, rather than just one monolithic RSYNC job. And it certainly wouldn't make sense to have to run 753 separate RSYNC commands to sync each home folder individually.

Here is a way we found that he could do it. To separate out just a portion of the root folders for synchronization, such as User001 - User099, we could run a script such as this SYNCUSER0.NCF file on the sending server:

#SYNCUSER0.NCF
rsync -va --delete --log-file=sys:/rsync/rsyncuser0.log --volume=HOME: / Server1::USERS --include="User0*" --exclude="/*" --exclude="/*.*"

Using the "--include=User0*" parameter, this replicates only the folders that begin with "User0", which corresponds to User001 - User099. The "--exclude=" parameters exclude all other files ("/*.*") and folders ("/*") at the root of the specified volume. The "--log-file=" parameter provides a separate client-side log file for this command.
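The order of these parameters matters. Standard rsync checks each name against the include and exclude rules in the order given and acts on the first one that matches, which is why the include comes ahead of the excludes; I am assuming the NetWare port behaves the same way. The Python sketch below is a simplified model of that rule for names at the root of the volume only. It uses Python's fnmatch for the wildcards and is not rsync's own matching code.

```python
from fnmatch import fnmatchcase


def is_transferred(name, rules):
    """Apply first-match-wins filtering to a name at the root of the transfer.

    rules is a list of ("include" | "exclude", pattern) in command-line order.
    A leading "/" anchors a pattern at the root; since only root-level names
    are modelled here, it is simply stripped before matching.
    """
    for action, pattern in rules:
        if fnmatchcase(name, pattern.lstrip("/")):
            return action == "include"
    return True  # no rule matched: the name is included by default


# Same rules as SYNCUSER0.NCF above, in the same order.
INCLUDE_FIRST = [("include", "User0*"), ("exclude", "/*"), ("exclude", "/*.*")]
# The same rules with the excludes moved to the front, for comparison.
EXCLUDE_FIRST = [("exclude", "/*"), ("exclude", "/*.*"), ("include", "User0*")]

if __name__ == "__main__":
    for folder in ("User042", "User142"):
        print(folder, is_transferred(folder, INCLUDE_FIRST), is_transferred(folder, EXCLUDE_FIRST))
```

With the include first, User042 is transferred and User142 is not. With the excludes first, neither is, because "/*" matches everything at the root before the include is ever consulted.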

Once that completes, we can move on to the next set of folders by slightly modifying the RSYNC command. For example, to move on to the next 100 folders, User100 - User199, we run a script such as this SYNCUSER1.NCF on the sending server:

#SYNCUSER1.NCF
rsync -va --delete --log-file=sys:/rsync/rsyncuser1.log --volume=HOME: / Server1::USERS --include="User1*" --exclude="/*" --exclude="/*.*"

The only changes here are the name of the log file and the "User1*" argument in the include parameter. Then we can carry this out to all of the rest of the 753 folders, 100 at a time, by changing the "--include=" argument to "User2*" (200-299), "User3*" (300-399), and so on, until we reach "User7*". Thus we end up with 8 separate RSYNC jobs (0-7), averaging a little over 100GB each.

In this case, the folders were designated numerically, but the concept works alphabetically as well. The "--include=" arguments could just as easily have been "A*", "B*", "C*", etc.
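Rather than typing eight nearly identical command lines, they can be generated from the list of prefixes. This Python sketch only builds the text of each command, using the switches from the SYNCUSER examples above; the function name is my own, and the output would still need to be saved into the individual .NCF files and tested.

```python
def batch_commands(prefixes, volume="HOME:", module="Server1::USERS"):
    """Return one rsync command line per prefix, modelled on SYNCUSER0.NCF."""
    commands = []
    for index, prefix in enumerate(prefixes):
        commands.append(
            "rsync -va --delete "
            f"--log-file=sys:/rsync/rsyncuser{index}.log "
            f"--volume={volume} / {module} "
            f'--include="{prefix}*" --exclude="/*" --exclude="/*.*"'
        )
    return commands


if __name__ == "__main__":
    # User0 .. User7 covers User001 - User753 in the example above.
    for command in batch_commands([f"User{digit}" for digit in range(8)]):
        print(command)
```

For an alphabetical split, the prefix list would simply be the letters instead ("A", "B", "C", and so on), with the log files still numbered in order.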

NOTE: Pay attention to the syntax. At the very core of these commands (above) is:

rsync --volume=HOME: / Server1::USERS

All of the other switches or parameters (-va, --delete, --log-file=, --include=, --exclude=) govern how the synchronization will happen, and are optional. The "--volume=" parameter is a NetWare-specific option (not found in Linux versions of RSYNC) needed to specify on which NetWare volume to run the command. The actual source string is "/", meaning the root of the volume. The destination string is "Server1::USERS", where:

  • Server1 needs to be resolved by either DNS or the hosts file to the receiving server (RSYNC daemon),
  • the "::" specifies that RSYNC communication will occur via TCP (default port 873), and
  • USERS references the module in the RSYNC (daemon) server's rsyncd.conf file

The corresponding section in the rsyncd.conf file should look something like this:

[USERS]
path=HOME:/Users
comment=Data area for synchronizing migrated Users folders
read only=no
use chroot=no
strict modes = no
transfer logging=yes
timeout=3600
use lfs=yes
hosts allow=10.1.0.101
hosts deny=*

The RSYNC daemon expects that the specified path (path=HOME:/Users) already exists on the server, so create the folder ahead of time.

Also NOTE: Since rsync does not save the NetWare attributes on the files, you have to provide a way to capture and re-apply the NetWare trustee assignments of the synchronized files from one server to another. A fairly easy way to do this is to use TRUSTEE.NLM to capture and apply trustees. For example, you could run a script like this TRUSTUSERS.NCF:

#TRUSTUSERS.NCF
TRUSTEE SAVE HOME:\ HOME:\trustees.txt

Beware, I have not tried this on a 900 GB volume -- this could take a while to complete. If it is too big, you might have to break up the job into smaller chunks, like we did on the RSYNC commands. The resultant trustees.txt file should be copied to the destination server.

If the volume names are different on the two servers, such as "HOME:" on the source server, and "DATA:" on the destination server, you will have to use a search and replace utility on the trustees.txt file, since it includes the volume name in every single line (specifying the full path to each file). Be sure to try this operation on a single test directory, before trying to do it on an entire volume.
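
If you do not have a search and replace utility handy, the volume rename could be sketched in Python as shown below. The sample lines are made up and stand in for whatever TRUSTEE.NLM actually writes; the function name rename_volume is hypothetical. The only assumption about the file is the one stated above, that each line carries the volume name followed by a colon.

```python
import re


def rename_volume(text, old_volume, new_volume):
    """Replace 'OLD:' with 'NEW:' wherever it starts a path, ignoring case."""
    # The lookbehind keeps "HOME:" from matching inside a name like "MYHOME:".
    pattern = re.compile(
        r"(?<![A-Za-z0-9_])" + re.escape(old_volume) + ":", re.IGNORECASE
    )
    return pattern.sub(lambda match: new_volume + ":", text)


# Made-up sample lines; the real trustees.txt layout may differ.
sample = "\n".join([
    r"HOME:\Users\Admin  (rest of line)",
    r"home:\Users\Bob  (rest of line)",
])
print(rename_volume(sample, "HOME", "DATA"))
```

Run it against a copy of the file first, and compare a few lines by eye before restoring trustees from the result.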

The same TRUSTEE.NLM can be used to restore the NetWare trustee assignments to the synchronized files on the destination server. For example, using the same trustees.txt file we earlier created:

#RESTUSERS.NCF
TRUSTEE RESTORE HOME:\trustees.txt

Both of these sample TRUSTEE.NLM command files use the root of the volume as the starting point. Folders or sub-folders can also be specified, such as:

TRUSTEE SAVE HOME:\Users\Admin HOME:\Users\Admin\admin.txt

See TID2971887 for syntax and options of the TRUSTEE.NLM utility.

The solution presented here uses simple "--include=" and "--exclude=" parameters. To get a bit more granular with inclusions and exclusions, we would have to use the more powerful "--include-from=" and "--exclude-from=" parameters, which specify a list of files or file types in a simple text file to be used with the RSYNC command. For example, many people use this option to exclude unwanted file types (.mp3 files, etc.) from the data synchronization, such as:

rsync -rav --volume=DATA: users/ RSYNC1::SITE1 --delete --exclude-from=SYS:rsync/excludes.txt

Then the unwanted file types are listed in the excludes.txt file, such as:

#EXCLUDES.TXT
*.mp3
*.wma
*.rma
*.mov
*.vid
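
To preview which files a list like this would catch, here is a rough Python sketch. It is only an approximation: it uses Python's fnmatchcase on bare file names, which behaves like RSYNC for simple patterns such as "*.mp3" but does not reproduce all of the RSYNC pattern rules. Lines starting with "#" or ";" and blank lines are skipped, as the RSYNC man page describes for these files. The function names are invented for the example.

```python
from fnmatch import fnmatchcase

EXCLUDES_TXT = """\
#EXCLUDES.TXT
*.mp3
*.wma
*.rma
*.mov
*.vid
"""


def load_patterns(text):
    """Return the patterns in an exclude file, skipping blanks and comments."""
    patterns = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith(("#", ";")):
            patterns.append(line)
    return patterns


def is_excluded(filename, patterns):
    """True if the bare file name matches any pattern (case-sensitive)."""
    return any(fnmatchcase(filename, pattern) for pattern in patterns)


patterns = load_patterns(EXCLUDES_TXT)
for name in ["report.doc", "song.mp3", "clip.mov", "SONG.MP3"]:
    print(name, "excluded" if is_excluded(name, patterns) else "kept")
```

Note that the sketch matches case-sensitively, so "SONG.MP3" is not caught by "*.mp3" here; test how your own RSYNC version treats upper-case extensions before relying on a list like this.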

The same concept can be used for including certain file types and folders. For example, a command such as:

rsync -rav --volume=DATA: / RSYNC1::SITE1 --delete --exclude="/*" --exclude="/*.*" --include-from=SYS:rsync/include1.txt

combined with the specified include1.txt file that contains:

#INCLUDE1.TXT
ThisFolder
ThatFolder

This effectively tells the RSYNC utility to perform an rsync operation on the root of the DATA: volume to the location specified as SITE1 on the RSYNC1 server, delete files at the destination that are not in the source location, exclude all files and folders from the synchronization, but then include only the files/folders listed in the include1.txt file. Thus RSYNC will scan the DATA: volume at the root for files or folders named "ThisFolder" and "ThatFolder", and include them in the synchronization. The "-r" tells it to recurse into all subfolders of the included folders.

Whereas you can have multiple "--include=" and "--exclude=" parameters, each with one argument, you can have only one "--include-from=" and one "--exclude-from=" parameter, each with its accompanying file, per RSYNC command.

You can get pretty fancy with extensive lists of files to include and exclude, but then you have to be careful not to have conflicting arguments, or you will not end up with the synchronized data set that you thought you would have. For example, if your include file specifies a certain file or file type to include in the synchronization, but your exclude file specifies that it is to be excluded, you may find that it will not synchronize.
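
One way to catch conflicting arguments before running a job is to test a few sample file names against both lists. The Python sketch below does that with simple wildcard matching (fnmatchcase), which only approximates the RSYNC pattern rules; the lists and the function name find_conflicts are hypothetical.

```python
from fnmatch import fnmatchcase


def find_conflicts(filenames, include_patterns, exclude_patterns):
    """Return names matched by both an include and an exclude pattern."""
    conflicts = []
    for name in filenames:
        included = any(fnmatchcase(name, p) for p in include_patterns)
        excluded = any(fnmatchcase(name, p) for p in exclude_patterns)
        if included and excluded:
            conflicts.append(name)
    return conflicts


# Hypothetical lists: "*.mov" is wanted by one file and rejected by the other.
includes = ["*.doc", "*.mov"]
excludes = ["*.mp3", "*.mov"]
print(find_conflicts(["a.doc", "b.mp3", "c.mov"], includes, excludes))
```

Any name it reports is one where the two lists disagree, and is worth checking by hand on a test directory.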

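For information on the syntax and logic of the include and exclude options, see the RSYNC MAN page. Note that the man page describes the patterns as being checked in order, with the first matching pattern deciding the outcome, so try the order of your parameters on a small test directory first.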

Q: Is it possible to synchronize two directories with the same name to the same destination server? We want to synchronize data1:folder1\sub1\syncdata and data1:folder1\sub2\syncdata from one source server to many destination servers.

A: Thanks to Markus for this question. If I am interpreting this correctly, it appears that he wants to do two things: to synchronize two subfolders with the same name from server to server, and also to perform a one-to-many synchronization of the data sets in these two directories on a NetWare server to multiple other servers, using the RSYNC utility. (Note: server and folder names were changed to generic names.)

He is also currently using RSYNC to synchronize a folder on his NetWare server with a command such as:

rsync -rav --volume=data1: /folder1/sub1/Site1data RSYNC1::SITE1/sub1 --delete

The corresponding SITE1 module in the rsyncd.conf on RSYNC1 looks like:

[SITE1]
path=DATA1:/folder1
comment=
read only=no
use chroot=no
strict modes = no
transfer logging=no
timeout=3600
use lfs=no

This should yield a synchronized set of data on the RSYNC1 server at DATA1:/folder1/sub1/Site1data, with all the subfolders and files in the original data set on the sending server (RSYNC client).

For the first part of the question, the answer is yes, we can synchronize multiple folders, even with the same name, using RSYNC. To do this, we have to employ some include and exclude parameters, to limit the data we want to synchronize. In the question above, we specified the source data as the data1:folder1\sub1\syncdata and data1:folder1\sub2\syncdata folders, with all their files and subfolders. Let's say the directory structure looks something like this:

DATA1:\
    folder1\                dontsync.txt
        dontsync\           dontsync.txt
        sub1\               dontsync.txt
            dontsync\       dontsync.txt
            syncdata\       syncthis.txt
        sub2\               dontsync.txt
            dontsync\       dontsync.txt
            syncdata\       syncthis.txt

Then we wish to synchronize only the syncdata folders in sub1 and sub2, and their files, and nothing else. We would have to employ an RSYNC command using --include-from and --exclude-from parameters, such as this:

rsync -rav --volume=DATA: /folder1/ RSYNC1::SITE1 --delete --exclude-from=sys:/rsync/exclude1.txt --include-from=sys:/rsync/include1.txt

This should all be one command, with no line breaks. The accompanying ASCII files specified here then list the patterns for the files and folders we want to include or exclude:

#EXCLUDE1.TXT
/*
sub1/*
sub2/*

#INCLUDE1.TXT
sub1
sub1/syncdata
sub2
sub2/syncdata

Essentially, we are thus instructing the RSYNC utility to begin synchronizing recursively (-r) at the /folder1 level on the DATA: volume, to the location specified by the SITE1 module on the RSYNC1 server. Then while doing that, use the parameters specified in the EXCLUDE1.TXT and INCLUDE1.TXT files to do the following: exclude all folders and files (/*) at the root (in this case, folder1), and also at the sub1 and sub2 subdirectories; but then include the folders sub1, sub1/syncdata, sub2 and sub2/syncdata.

While the logic here seems a bit primitive, it does allow for a great deal of granularity. This is what allows us to specify the subdirectories with the identical names. In essence, the --include parameter really means "don't exclude". This is why we've specified to exclude everything, then went back to include the particular folders we wanted.

The source files and folders as listed above, after running this particular command, will yield the following data set on the target server:


DATA1:\
    folder1\
        sub1\
            syncdata\       syncthis.txt
        sub2\
            syncdata\       syncthis.txt

All of the "dontsync.txt" files and the "dontsync" folders were excluded by the "/*", "sub1/*" and "sub2/*" parameters in the EXCLUDE1.TXT file. Then the sub1, sub2 and both syncdata folders were "un-excluded" by the parameters in the INCLUDE1.TXT file.

The second part of the question touches upon the one-to-many issue. Here we want to synchronize a data set that is on one NetWare server to multiple other servers, using RSYNC.

This can be accomplished fairly easily, when the source server is running RSYNC in the daemon mode. That way, each target server, which will replicate that data set, can individually perform a "pull" of the data to their own file systems. For example, let's say we want to synchronize the data from DATA:\Folder1 on the source server, SOURCE1, to the same location on the target servers TARGET1, TARGET2 and TARGET3.

The SOURCE1 server can run RSYNC in daemon mode with this module in the rsyncd.conf file:


[SOURCEDATA]
   path=DATA:/Folder1
   comment=Distribution Data for Target Servers
   read only=yes
   use chroot=no
   strict modes = no
   transfer logging=no
   timeout=3600
   use lfs=yes
   hosts allow=10.1.0.101, 10.2.0.102, 10.3.0.103 
   hosts deny=*

Since RSYNC clients are not writing data to this location, we have set "read only=yes", and the "hosts allow" should correspond to the IP addresses of the target servers.

Then each target server could run an RSYNC command such as this:

rsync -rav SOURCE1::SOURCEDATA --volume=DATA: /Folder1/ --delete

By switching the order of the source and destination strings, this command would copy the contents from SOURCE1::SOURCEDATA – which translates to DATA:/Folder1 on the SOURCE1 server (running the RSYNC daemon) – to the target server's DATA:/Folder1. Each target server runs the identical command.

When we put together the commands for the includes/excludes, and the "pull" feature for the one-to-many replication, we could get a command such as:

rsync -rav --log-file=sys:/rsync/logs/syncpull.log SOURCE1::SOURCEDATA --volume=DATA: /Folder1/ --delete --exclude-from=sys:/rsync/exclude1.txt --include-from=sys:/rsync/include1.txt

This, combined with the correct include and exclude lists as specified, can do the job for filtering out just the data you specified, and performing the data pull from the client side for the one-to-many synchronization. This example also adds a client-side log file, which is useful for troubleshooting (here I created a SYS:\rsync\logs folder).
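
Long commands like this one are easy to break when pasting them into an .ncf file. As a sketch, the command can be kept as a list of arguments in Python and joined into a single line, with a check that no line break has crept in. The function name as_single_line is made up; the arguments are the ones from the command above.

```python
# Each element is one argument; joining with single spaces gives one line.
pull_args = [
    "rsync", "-rav",
    "--log-file=sys:/rsync/logs/syncpull.log",
    "SOURCE1::SOURCEDATA",
    "--volume=DATA:", "/Folder1/",
    "--delete",
    "--exclude-from=sys:/rsync/exclude1.txt",
    "--include-from=sys:/rsync/include1.txt",
]


def as_single_line(args):
    """Join arguments into one command line and verify it has no line breaks."""
    command = " ".join(args)
    if "\n" in command or "\r" in command:
        raise ValueError("command must not contain line breaks")
    return command


print(as_single_line(pull_args))
```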

NOTE: The RSYNC command is case-sensitive, so pay attention to that. Also note that the slashes make a difference: the source string "/mydata" will copy the /mydata folder, and all files and sub-folders in it (assuming -r or -a is used), whereas the source string "/mydata/" (with a trailing slash) will only copy the files and sub-folders within the /mydata folder.
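
The effect of the trailing slash can be modeled with a few lines of Python. This is a simplified model of the rule just described, not RSYNC's own code; the function name destination_path and the "dest" folder are hypothetical.

```python
import posixpath


def destination_path(source_arg, file_path, dest_root):
    """Where a source file would land at the destination (simplified model)."""
    if source_arg.endswith("/"):
        # Trailing slash: only the contents of the folder are sent.
        base = source_arg
    else:
        # No trailing slash: the folder itself is sent along with its contents.
        parent = posixpath.dirname(source_arg)
        base = parent if parent.endswith("/") else parent + "/"
    if not file_path.startswith(base):
        raise ValueError("file is not under the source path")
    return posixpath.join(dest_root, file_path[len(base):])


print(destination_path("/mydata", "/mydata/docs/a.txt", "dest"))
print(destination_path("/mydata/", "/mydata/docs/a.txt", "dest"))
```

Without the trailing slash the file lands under dest/mydata/, and with it the file lands directly under dest/.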

For more ideas on how to use RSYNC to distribute service packs or standard corporate documents, see: "Distribute Service Packs and Corporate Documents using RSYNC".

Q: I am trying to get RSYNC 2.6.3 to run on NetWare 6 SP3. When I try to load RSYNC I get public symbol errors "Loader cannot find public symbol..."

A: Be mindful that the RSYNC.NLM 2.6.3 to date is still in beta testing, not yet in the "released" category. Version 2.6.3 will not run on NetWare 6 without a LIBC update, formerly found in NWLIB6b (no longer available from Novell), now rolled into NW6SP5. Thus, I would recommend going ahead and updating your NetWare 6 SP3 server to SP5, available from (http://support.novell.com/servlet/downloadfile?file=/sec/pub/nw6sp5.exe/). This has the LIBC module listed as:

LIBC.NLM 7.05 1178427 06-23-2004 10:56AM

This brings to mind a caution to the readership at large. Sometimes we can get a little happy about trying out the latest software on our systems, and forget that little "results may vary according to your specific environment" caveat. Beta versions of any software are so called specifically because they haven't really been tested "out in the wild", but only in a limited lab environment. So be very careful about deploying any beta software on a production server (defined as one for which someone might be upset if it were to become unavailable).

I have had the RSYNC utility ABEND a few servers -- sometimes seriously -- either due to beta code, older hardware, too many files, etc. We should always test out any new software or new configurations in a lab environment -- test servers on a test network -- before bringing that software or configuration into the production environment. End of sermon.

The same issue with LIBC.NLM applies to NetWare 5.1 Servers, if you are running RSYNC on any older boxes. RSYNC Version 2.6.3 for NetWare, and even 2.6.2, will require at minimum NW51SP7, and preferably NW51SP8.

Q: I use rsync to distribute files to remote NetWare servers. Do I need to copy the entire rsync folder to the remote servers, or just the rsync.nlm and the include/exclude files?

A: Thanks to Siraj for this question. Actually, quite a few folks have asked similar questions. The quick answer is that only the "RSYNC Server" or daemon really needs or uses the accompanying files besides RSYNC.NLM -- the RSYNCDN.NLM, rsyncd.pid, rsyncd.motd, rsyncd.conf, rsyncstp.ncf and rsyncstr.ncf. The RSYNC client can run simply by using the single RSYNC.NLM.

The .ncf files are used to start and stop the RSYNC service (always modify them to suit your environment), and the RSYNCDN.NLM (referenced in rsyncstp.ncf) is used to shut down the running daemon. The rsyncd.conf file (hint: the "d" is for daemon) is referenced by the daemon for its configuration parameters, and includes modules for each location that will replicate with the RSYNC server. The rsyncd.motd file is merely a text "Message of the Day", which can be customized. The RSYNCST.NLM (RSYNC Status) is used only for Nterprise Branch Office.

On the other hand, any include and exclude files (if used) are optional and entirely customized to each environment -- these are used typically by RSYNC running in client mode (without the --daemon parameter). See other questions in this article for tips on how to use the include/exclude parameters.

As for default locations, on the NetWare platform all these files are located in SYS:\RSYNC, except for the rsyncd.conf file, located by default in SYS:\ETC (to maintain some parity with Linux, I suppose). However, all of these files can be placed anywhere you desire on a server volume, so long as you set the appropriate search paths, and configurations in the rsyncstr.ncf, rsyncstp.ncf and rsyncd.conf files.

Thus if you want to test out RSYNC version 2.6.3 (now in Beta 3), you can copy it (RSYNC.NLM only is included in the download) into SYS:\RSYNC.263, and copy all of the accompanying files from your SYS:\RSYNC folder into the new location. Then, be sure to modify the .ncf files and rsyncd.conf in that new folder to reference the proper location. To go back to your previous version, merely launch RSYNC from that (SYS:\RSYNC) location. There's really no install/uninstall to worry about.

If you are using RSYNC with SSL (see the SSL question in this article), the default location that RSYNC will look for the SSL certificates is in SYS:\ETC, and the default name is root.der for the exported SSLCertificateIP. In this case, we need the same certificate on both the client and the server (running the daemon). If the default SYS:\ETC location is not used, then the parameter "--certfile=" must be used (on each side).

A word on personal preferences: I prefer to move rsyncd.conf into the same SYS:\RSYNC folder with the rest of the files, just to keep all the files together for ease of editing, and it is also best when using multiple versions of the RSYNC utility (as above). But this requires modification of the rsyncstr.ncf file, to add the "--config=" parameter. If you want to set up RSYNC on a NetWare cluster (see "Double Sync"), then you end up having to do this anyway, for proper failover.

Another personal preference of mine is to create a SYS:\RSYNC\LOGS folder, and redirect the logs to that location (modifying the rsyncd.conf file). This keeps the logs together, separate from the other files, and can be handy when creating multiple, daily logs (see the question in this article about managing log files).

Finally, I noticed in your question you said that you are using RSYNC to distribute files out to remote servers (see "Distributing Service Packs"). I presume this means that you have a central server, running the RSYNC daemon, and each remote server connects using the RSYNC client to "pull" the updates back to themselves. This is the preferred method.

However, I am aware of at least one RSYNC user who runs the daemon on all the remote servers, and uses the RSYNC client to push the updates from a central server to each of the daemons (I'm not really sure why). If you are attempting to do this latter method, then, yes, you would need all of the accompanying configuration files to go with each remote daemon. But it is much simpler to just use one RSYNC daemon at the central site, and each client can just run one (identical) RSYNC command to pull the data to themselves.

Q: I can successfully synchronize my NetWare server to a Linux server with no problem, except that the ownership changes to root for everything. I'd like to be able to sync everything and retain all the ownership, permissions, and attributes. Is that possible?

A: I'm pretty sure that RSYNC alone is not going to do all that you want from it. You want to essentially copy files from one server platform (NetWare) to another platform (Linux), and retain the file owners, permissions and attributes. The problem is that because the NetWare metadata information only pertains to the NetWare (NDS) user accounts and the NetWare file system, any file owners, permissions, and attributes that could get transferred to the new server and new file system would be meaningless.

The RSYNC utility's ability to replicate file ownership, permissions, and attributes really only applies in Linux-to-Linux replications, at this stage of the game. NetWare-to-NetWare, Windows-to-NetWare, NetWare-to-Linux and Windows-to-Linux file replications all end up losing that information.

For NetWare-to-NetWare replications using RSYNC, we usually rely on using a utility such as TRUSTEE.NLM to save the NetWare trustee information for the files prior to replication, and then, if needed, apply the same trustee rights back to the replicated files. See the earlier answer in this article about copying a 900GB volume for more on saving trustee rights.

Bottom line, I believe there is no easy migration path for what you are doing. You might as well take the raw files over, with root ownership, and find a way to re-assign ownership and rights to the new (Samba) user accounts, either manually (ugh!) or by using some type of scripting.

On the NetWare side, you can use TRUSTEE.NLM to capture the NetWare trustee information to a text file. For example:

TRUSTEE SAVE DATA:\ DATA:\trustees.txt

That can give you a start. Then you would have to write a script that translates that information into another script, one that sets the same permissions for the Samba users on the Linux server. See TID2971887 for more information on TRUSTEE.NLM.

Q: When I am replicating data between two NetWare servers (using SSL), occasionally I notice that the replication stops, and I see a file with a name like ".accounts.mdb.000150" in the destination folder. What is happening?

A: I believe the problem you describe indicates some type of failure on the receiving side of the replication (the RSYNC daemon). The issue probably has nothing to do with whether your replication occurs using SSL or not. One quick way to prove this is to change the replication to non-SSL (at both the client and the daemon), and see if the results are the same, or different. See the SSL question in this article for more on using SSL for NetWare replications.

To elaborate, let me describe the replication process a little. When the RSYNC client initiates a synchronization job (the RSYNC command), it first contacts the server as specified in the command, and awaits a response. If the server is not running the RSYNC daemon, the command fails. If there is a mismatch between SSL and non-SSL between the client and daemon, the command fails. If the module specified in the command does not exist, or the directory for that module does not exist (as specified in the daemon's rsyncd.conf file), the command fails.

But when all conditions are correct, the client and server successfully complete the initial handshake, and file replication can begin to occur. The client builds a list, which is stored in RAM, of all the files specified in the source path to be replicated, and performs checksums on the files. This list is transferred to the daemon, and it then compares the file list with the files (if any) already in the destination location, to determine which files need to be copied or updated. Then that list is sent back to the client, and the file transfer commences.
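
The comparison step can be pictured with a very simplified Python sketch. It is not RSYNC's actual protocol, which works on blocks within files and uses its own checksums; it only shows the idea of comparing a source list against the destination to decide what needs to be sent. The two dictionaries stand in for the file systems, SHA-256 is chosen purely for illustration, and the function names are invented.

```python
import hashlib


def checksum(data):
    """Checksum of a file's bytes (SHA-256 chosen only for illustration)."""
    return hashlib.sha256(data).hexdigest()


def files_to_send(source_files, dest_files):
    """Return sorted paths that are missing or different at the destination."""
    needed = []
    for path, data in source_files.items():
        if path not in dest_files or checksum(dest_files[path]) != checksum(data):
            needed.append(path)
    return sorted(needed)


# In-memory stand-ins for the two file systems: {relative path: file bytes}.
source = {"a.txt": b"one", "b.txt": b"two", "c.txt": b"three"}
dest = {"a.txt": b"one", "b.txt": b"old"}
print(files_to_send(source, dest))
```

Here a.txt is identical on both sides and is skipped, b.txt differs, and c.txt is missing at the destination, so only the last two would be transferred.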

When the "-r" (recursive) or "-a" (archive) parameters are used on the client (as is normal), then the directory file structure of the source location will be created at the destination (if needed). Then the files are replicated to their corresponding directories at the destination. As the files are being transferred, a temporary file is created in the destination directory with a filename such as ".file1.doc.000150". Once the data transfer for that file is completed, the daemon will rename the temporary file with the original filename, such as "file1.doc", and restore its original (i.e., DOS) attributes.

For the transfer of small files, this process happens too fast for you to notice. For larger files, if you were to map a drive to the destination file location, you can actually watch the files being transferred to their target directories, with the added file extension. Refreshing the window, you could see the size of the temporary file increasing, until all of the data is transferred, and then the file is renamed with its original name and attributes.

If the client process is shut down or loses communication, the RSYNC daemon will delete the temporary file that it is building, and end that session with an error such as: "read_timeout: Error peer socket closed". But if the RSYNC daemon fails, then it may not be able to delete the temporary file that it is building. This is why I say that I believe that the receiving side may be failing, in your case.
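
To spot leftover temporary files after a failed job, a directory listing can be checked against the naming pattern seen in the examples here. The Python sketch below assumes that pattern (a leading dot, the original file name, then a dot and six digits) based only on the names quoted in this article; other RSYNC versions may name their temporary files differently. The function name find_leftovers is hypothetical.

```python
import re

# Assumed pattern, based only on the example names in this article:
# a leading dot, the original file name, then a dot and six digits.
TEMP_NAME = re.compile(r"^\.(?P<original>.+)\.\d{6}$")


def find_leftovers(filenames):
    """Map each apparent temporary file name to the original name it implies."""
    leftovers = {}
    for name in filenames:
        match = TEMP_NAME.match(name)
        if match:
            leftovers[name] = match.group("original")
    return leftovers


listing = ["file1.doc", ".accounts.mdb.000150", "notes.txt", ".file1.doc.000150"]
print(find_leftovers(listing))
```

A file that shows up here while no synchronization is running is a good hint as to which file the daemon was working on when it failed.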

I have seen RSYNC fail during file transfer for various reasons, from corrupted files, to excessively long filepath/filename issues (greater than 256 characters – but I believe this is fixed in the latest version of RSYNC for NetWare). If your synchronization always fails on the same file, and that file seems to be otherwise uncorrupted, sometimes just copying the file from the NetWare volume to a workstation, and then back to the NetWare volume will fix some problems with the file's metadata.
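
If you suspect long paths, a quick check of a path listing can narrow things down. The Python sketch below flags any path longer than the 256 characters mentioned above; treat that limit as an assumption to adjust for your RSYNC version. The sample paths and the function name overlong_paths are made up.

```python
MAX_PATH_CHARS = 256  # limit mentioned in this answer; treat it as an assumption


def overlong_paths(paths, limit=MAX_PATH_CHARS):
    """Return the paths whose total length exceeds the limit."""
    return [path for path in paths if len(path) > limit]


# Made-up sample paths: one short, one deliberately very long.
sample_paths = [
    "DATA:/Users/Admin/report.doc",
    "DATA:/" + "/".join(["very_long_folder_name"] * 15) + "/file.txt",
]
for path in overlong_paths(sample_paths):
    print(len(path), path[:40] + "...")
```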

Q: Does RSYNC running on NBO and a CO support or at least replicate Mac volumes? We would like to replace all of our NetWare servers with NBO servers, but many of them have Mac volumes on them.

A: Thanks to Arif for this question. Unfortunately, the quick answer is "no". While RSYNC (whether in or outside of the Nterprise Branch Office product) will appear to synchronize all the files on a Mac-enabled NetWare volume, the problem comes in when trying to restore those files back to the source server. This happens irrespective of whether you are running RSYNC on a NetWare or Linux platform.

The main problem is that RSYNC was not designed to handle the Macintosh file system's resource forks. The NetWare file system, with Mac-enabled volumes, was designed to handle them, but when those files are transferred using RSYNC, that type of extended attribute is lost in translation.

By the way, the NetWare extended attributes of files from a NetWare server are also lost during an RSYNC transfer. That is why NetWare trustee information has to be separately restored to files using TRUSTEE.NLM. See the earlier answer in this article about copying a 900GB volume for more on that.

Here is the official word from the NBO documentation:

Rsync Restored Macintosh Files Are Inaccessible
If Macintosh* files are transferred to the central office Rsync server and then restored to the appliance, these files will not be accessible and will not work properly for Macintosh users. Macintosh files should be backed up locally to a tape-based or other backup system.

See: www.novell.com/documentation/nbo2/pdfdoc/readme/readme.pdf

With that said, I am reading that the version of RSYNC that comes with Mac OS X 10.4 has the ability to include "extended attributes" of files (such as resource forks) when invoked with the -E switch. However, if the destination server (such as the NBO Central Office server running its RSYNC daemon) does not handle Mac resource forks on its end, then they end up not being stored at the destination. Thus you would have to RSYNC Mac volumes directly from the Mac to another Mac or some other system that handles these files.

See http://www.macgeekery.com/tips/the_new_resource_fork

What Works for You

Obviously, you have to do what works for you, in your own network environment. If you've read along this far in this article, it's apparent that you have some real interest in using data synchronization to either replace or augment traditional methods of data backup for your NetWare servers or Windows clients.

Since every network and every organization is going to be somehow different, with different requirements and considerations, I wouldn't dream of telling you exactly how you should do it in your own situation. But I hope here to spur your thinking with some ideas, and perhaps to also glean some other ideas and responses from readers. So put your two cents in – responses are welcomed!

