Chapter 16. File Synchronization

Table of Contents

16.1. Data Synchronization Software
16.2. Determining Factors for Selecting a Program
16.3. Introduction to InterMezzo
16.4. Introduction to Unison
16.5. Introduction to CVS
16.6. Introduction to mailsync

Abstract

Today, many people use several computers: one at home, one or more at the workplace, and possibly a laptop or PDA on the road. Many files are needed on all of these computers. You may want to work on any of them, modify the files there, and subsequently have the latest version of the data available on all computers.

16.1. Data Synchronization Software

Data synchronization is no problem for computers that are permanently linked by means of a fast network. In this case, use a network file system like NFS and store the files on a server, enabling all hosts to access the same data via the network. This approach is impossible if the network connection is poor or not permanent. When you are on the road with a laptop, you need to keep copies of all needed files on the local hard disk. However, it is then necessary to synchronize modified files. When you modify a file on one computer, make sure a copy of the file is updated on all other computers. For occasional copies, this can be done manually with scp or rsync. However, if many files are involved, the procedure can be complicated and requires great care to avoid errors, such as overwriting a new file with an old file.
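The risk of overwriting a new file with an old one can be illustrated with a small guard. The following is only a minimal sketch in Python, not part of scp or rsync. The function name copy_if_newer is hypothetical, and the sketch assumes that the modification times of both files are comparable (for example, both copies are on file systems reachable from one host and the clocks agree). rsync offers a similar safeguard with its --update option.

```python
import os
import shutil


def copy_if_newer(src, dst):
    """Copy src over dst only when dst is missing or older than src.

    Returns True if a copy was made, False if dst was left untouched.
    """
    if os.path.exists(dst) and os.path.getmtime(dst) >= os.path.getmtime(src):
        return False  # dst is at least as new as src; do not overwrite it
    shutil.copy2(src, dst)  # copy2 also carries over the modification time
    return True
```

Calling copy_if_newer("report.txt", "/mnt/laptop/report.txt") with hypothetical paths would then refuse to replace a more recent copy on the laptop.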

[Warning] Risk of Data Loss

Before you start managing your data with a synchronization system, you should be well acquainted with the program used and test its functionality. A backup is indispensable for important files.

The time-consuming and error-prone task of manually synchronizing data can be avoided by using one of the programs that employ various methods to automate this job. The following summaries are merely intended to convey a general understanding of how these programs work and how they can be used. If you plan to use them, read the program documentation.

16.1.1. InterMezzo

InterMezzo implements a file system that, like NFS, exchanges files over the network, but also stores local copies on the individual computers, so the files remain available even when the network connection is down. The local copies can be edited. All changes are recorded in a special log file. When the connection is restored, these changes are automatically forwarded and the files are synchronized. More information about InterMezzo is available in /usr/share/doc/packages/InterMezzo/InterMezzo-HOWTO.html, if the package is installed.
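The principle of logging changes while disconnected and forwarding them later can be sketched as follows. This is a toy model in Python, not InterMezzo's implementation: the class JournalingCache and its methods are hypothetical, plain dictionaries stand in for the server and the local disk, and conflicting changes on the server side are ignored.

```python
class JournalingCache:
    """Local copies of server files plus a log of offline changes."""

    def __init__(self, server):
        self.server = server        # dict standing in for the server's files
        self.local = dict(server)   # local copies, usable without a network
        self.journal = []           # changes made while disconnected
        self.online = True

    def write(self, path, data):
        """Always update the local copy; log the change when offline."""
        self.local[path] = data
        if self.online:
            self.server[path] = data
        else:
            self.journal.append((path, data))

    def reconnect(self):
        """Forward the logged changes to the server in their original order."""
        self.online = True
        for path, data in self.journal:
            self.server[path] = data
        self.journal.clear()
```

While online is False, writes only reach the local copy and the journal. After reconnect(), the server holds the same data as the local copy.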

16.1.2. Unison

Unison is not a network file system. Rather, the files are simply saved and edited locally. The program Unison can be executed manually to synchronize files. When the synchronization is performed for the first time, a database is created on the two hosts, containing checksums, time stamps, and permissions of the selected files. The next time it is executed, Unison can recognize which files were changed and propose the transmission from or to the other host. Usually all suggestions can be accepted.
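How such a database makes changes detectable can be sketched in a few lines of Python. This is an illustration of the idea only, not Unison's actual database format or algorithm. The function names snapshot and changed_paths are hypothetical, and SHA-256 is an arbitrary choice of checksum for this sketch.

```python
import hashlib
import os
import stat


def snapshot(root):
    """Map each file's relative path to (checksum, mtime, permission bits)."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            info = os.stat(full)
            with open(full, "rb") as handle:
                digest = hashlib.sha256(handle.read()).hexdigest()
            rel = os.path.relpath(full, root)
            state[rel] = (digest, info.st_mtime, stat.S_IMODE(info.st_mode))
    return state


def changed_paths(previous, current):
    """Return the sorted paths that were added, removed, or modified."""
    return sorted(
        path
        for path in previous.keys() | current.keys()
        if previous.get(path) != current.get(path)
    )
```

Comparing the snapshot stored at the previous run with a fresh one yields the candidates for transmission. Deciding the direction additionally requires the same comparison on the other host.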

16.1.3. CVS

CVS, which is mostly used for managing versions of program sources, can keep copies of files on multiple computers. Accordingly, it is also suitable for data synchronization.

CVS maintains a central repository on the server, in which not only the files but also changes to files are saved. Changes that are performed locally are committed to the repository and can be retrieved from other computers by means of an update. Both procedures must be initiated by the user.

CVS is very resilient to errors in the event that changes occur on several computers. The changes are merged and, if changes took place in the same lines, a conflict is reported. When a conflict occurs, the repository remains in a consistent state. The conflict is visible only on the client host, where it must be resolved.
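The merge rule described above (accept a change made on one side, report a conflict when both sides changed the same line) can be sketched as follows. This is a deliberately simplified illustration in Python, not the merge code used by CVS. The function merge_lines is hypothetical, and the sketch assumes that all three versions have the same number of lines, that is, lines were only edited in place. A real merge also handles inserted and deleted lines.

```python
def merge_lines(base, mine, theirs):
    """Three-way merge of equally long lists of lines.

    base is the common ancestor, mine the local version, theirs the
    version from the repository. Returns (merged, conflicts), where
    conflicts lists the indices of lines both sides changed differently.
    """
    if not (len(base) == len(mine) == len(theirs)):
        raise ValueError("this sketch only handles in-place line edits")
    merged, conflicts = [], []
    for index, (b, m, t) in enumerate(zip(base, mine, theirs)):
        if m == t:
            merged.append(m)        # both sides agree (or neither changed)
        elif m == b:
            merged.append(t)        # only the other side changed this line
        elif t == b:
            merged.append(m)        # only the local side changed this line
        else:
            merged.append(m)        # both changed: keep local text, flag it
            conflicts.append(index)
    return merged, conflicts
```

In this sketch the conflict is only reported in the result of the local merge. The repository version passed in as theirs is never modified, which mirrors the statement that the repository stays consistent.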

16.1.4. mailsync

In contrast to the synchronization tools covered in the previous sections, mailsync only synchronizes e-mails between mailboxes. The procedure can be applied to local mailbox files as well as to mailboxes on an IMAP server.

Based on the message ID contained in the e-mail header, the individual messages are either synchronized or deleted. Synchronization is possible between individual mailboxes and between mailbox hierarchies.
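The decision between copying and deleting a message can be sketched with sets of message IDs. The following Python code is an illustration under stated assumptions, not mailsync's implementation: the functions message_id and plan_sync are hypothetical, and the sketch assumes that the set of IDs present in both mailboxes after the previous run is available as known. A message missing from one mailbox is treated as deleted there if its ID is in known, and as new in the other mailbox if it is not.

```python
from email import message_from_string


def message_id(raw_message):
    """Extract the Message-ID header from a raw e-mail given as text."""
    return message_from_string(raw_message)["Message-ID"]


def plan_sync(known, box_a, box_b):
    """Decide per message ID whether to copy or delete.

    known: IDs that were in both mailboxes after the previous run.
    box_a, box_b: IDs currently present in each mailbox.
    """
    only_a = box_a - box_b
    only_b = box_b - box_a
    return {
        "copy_to_b": only_a - known,       # new in A since the last run
        "copy_to_a": only_b - known,       # new in B since the last run
        "delete_from_a": only_a & known,   # was in both, removed from B
        "delete_from_b": only_b & known,   # was in both, removed from A
    }
```

After the planned actions have been carried out, both mailboxes contain the same messages, and that common set becomes known for the next run.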