If you use open source software, you've probably heard of Amanda—it's the most widely used open source backup program. It was developed at the University of Maryland in 1991. Today more than 500,000 computers are protected by Amanda. (See Figure 1.)
With Amanda, you set up a single master server to back up Linux, UNIX, Mac OS X and Windows hosts to tape libraries, optical jukeboxes, RAID arrays and NAS devices.
This article provides a brief technical overview of Amanda to help you understand how it is different from other backup software. You can find up-to-date information about everything you need to know for deploying Amanda in production at wiki.zmanda.com. Another helpful collaboration tool for Amanda is forums.zmanda.com.
> Scalable Architecture with Non-proprietary Tools
The biggest advantage Amanda has over other backup software is that it does not use any proprietary formats. To move data from the client to tape or disk, Amanda uses standard OS utilities such as dump and tar, or open source tools such as GNUtar, smbtar and star. Because these utilities are standard, you can be confident they will always be available to you. For example, you can recover data even if Amanda isn't installed, which is invaluable in an emergency.
Amanda does not use any proprietary device drivers either. Any device supported by an OS works with Amanda. Support for many tape libraries provides truly hands-off and lights-out backup. If you can read and write to your tape drive and move tapes in your tape library with standard OS commands such as mt, Amanda will work with your tape library.
Amanda is designed as a client-server architecture. Each client program is instructed to write to standard output, which Amanda collects and transmits to the backup server. The architecture provides three benefits; it:
- ensures scalability from a single client and a standalone tape drive to networks with hundreds of clients and libraries with multiple tape drives and thousands of tapes
- allows all configuration to be done on the Amanda server. Once the initial configuration is done, you can add clients without worrying about breaking your tested backup procedures
- allows some CPU-intensive operations such as compression or encryption to be done on a client before the backup data is sent to the Amanda server.
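The client-to-server flow described above can be illustrated with a minimal Python sketch. It is not Amanda's code: the function names client_backup and server_collect are hypothetical, and Python's tarfile module stands in for the dump or tar program a real client would run. The client writes a compressed archive to whatever stream it is given (standard output in practice), and the server copies that byte stream into an image file.

```python
import tarfile


def client_backup(source_dir, out_stream, compress=True):
    """Client side: write a tar archive of source_dir to out_stream.

    Compression happens here, before any data leaves the client.
    The "w|..." modes write sequentially, so out_stream only needs
    a write() method (for example sys.stdout.buffer or a socket file).
    """
    mode = "w|gz" if compress else "w|"
    with tarfile.open(fileobj=out_stream, mode=mode) as archive:
        archive.add(source_dir, arcname=".")


def server_collect(in_stream, image_path, chunk_size=64 * 1024):
    """Server side: copy the incoming byte stream into a backup image.

    Returns the number of bytes stored.
    """
    total = 0
    with open(image_path, "wb") as image:
        while chunk := in_stream.read(chunk_size):
            image.write(chunk)
            total += len(chunk)
    return total
```

Because the image is an ordinary gzip-compressed tar archive in this sketch, any standard tar program can read it back. That is the same property the article attributes to Amanda's use of standard utilities.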
> Amanda Security
As in any client-server setup, only your trusted Amanda server should be able to communicate with Amanda clients. Amanda enforces this through the .amandahosts file, which lists the trusted hosts. For stronger data transport security and backup client authentication, Amanda can use OpenSSH.
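The idea of a trusted-host list can be sketched in a few lines of Python. This is an illustration only, not Amanda's parser. It assumes a simplified layout of one "hostname username" pair per line with "#" starting a comment; the host and user names are made up, and the Amanda documentation defines the actual .amandahosts syntax.

```python
def parse_trusted_hosts(text):
    """Return a set of (host, user) pairs from a simplified host list.

    Assumed layout (for illustration): "hostname username" per line,
    blank lines and text after "#" are ignored.
    """
    trusted = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # skip lines that do not name both host and user
        trusted.add((fields[0].lower(), fields[1]))
    return trusted


def is_trusted(trusted, host, user):
    """Check whether a connecting host/user pair is on the list."""
    return (host.lower(), user) in trusted


# Hypothetical example entries.
SAMPLE = """
# backup server allowed to contact this client
backup.example.com  amandabackup
"""

hosts = parse_trusted_hosts(SAMPLE)
print(is_trusted(hosts, "backup.example.com", "amandabackup"))  # True
print(is_trusted(hosts, "rogue.example.com", "amandabackup"))   # False
```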
To protect data on the backup media, Amanda encrypts backup data with symmetric or asymmetric encryption using either aespipe or gpg. Encryption can be done either on the server or on the client. Client-side encryption also protects data on the wire, which is important for backing up remote clients.
Amanda works with Security-Enhanced Linux; it also works well with firewalls between the Amanda server and its clients (as long as you select the ports during initial setup). Details on firewall setup are available at wiki.zmanda.com.
The flexibility of the security configurations allows Amanda to fit well into the security policies and processes of most IT environments, including organizations with strict security requirements.
> Backup Scheduling
Most backup products use the same approach to scheduling. You instruct the software to perform a full backup on Sunday (or the last day of the month, etc.) with different levels of incremental backups in between full backups. The problem is that such scheduling does not provide any load balancing. You have to make sure that enough resources are available to handle peak demand for backup server CPU, network and I/O during full backups. Because you perform full backups only occasionally, your resources are underutilized most of the time. Of course, you can achieve load balancing by instructing your backup software to distribute full backups among all clients throughout the week, but then you have to ensure that no change in your environment breaks your balancing scheme.
Amanda provides a unique approach to scheduling that optimizes load balancing of backups. Instead of giving Amanda exact scheduling instructions, you specify a few ground rules. For example, you might give Amanda the following rule: "Do at least one full backup within a 7-day period and do incremental backups on all other days."
For any rule you specify, Amanda finds an optimal combination of full and incremental backups for all clients. The goal is to make the total amount of backup data as small as possible, and the backup window consistent from one run to another. To find a balance, Amanda uses the following considerations:
- total amount of data to be backed up
- maximum time between full backups specified by you
- size of backup media available for each backup run.
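To make the balancing idea concrete, here is a minimal Python sketch of one way to spread full backups over a cycle. It is not Amanda's planner. It uses a simple greedy heuristic (largest client first, placed on the least-loaded day), ignores incremental backups and media size, and all client names and sizes are invented for the example.

```python
def plan_full_backups(full_sizes, cycle_days):
    """Assign each client's full backup to one day of the cycle.

    full_sizes: dict mapping client name -> size of a full backup.
    cycle_days: maximum number of days between full backups.
    Returns (plan, load): plan maps client -> day index, and load
    lists the total full-backup volume scheduled for each day.
    """
    load = [0] * cycle_days
    plan = {}
    # Place the largest backups first so the small ones fill the gaps.
    by_size = sorted(full_sizes.items(), key=lambda kv: kv[1], reverse=True)
    for client, size in by_size:
        day = min(range(cycle_days), key=lambda d: load[d])
        plan[client] = day
        load[day] += size
    return plan, load


# Hypothetical clients with full-backup sizes in GB, 3-day cycle.
sizes = {"web": 40, "db": 100, "mail": 60, "home": 80, "build": 20}
plan, load = plan_full_backups(sizes, cycle_days=3)
print(plan)  # {'db': 0, 'home': 1, 'mail': 2, 'web': 2, 'build': 1}
print(load)  # [100, 100, 100]
```

In this example every run carries the same volume of full backups, instead of 300 GB landing on a single day. A fuller model would also add each day's incremental data and check the per-run total against the capacity of the available media.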
When you decide on a length of time between full backups, you should consider that shorter cycles make restores easier as there are fewer incrementals, but they use more tape and require more time to back up. Longer cycles allow Amanda to spread the load over multiple tapes but may require more steps during a restore. More information about how to choose a balanced schedule depending on the amount of data, tape drive capacity, and so forth, is available at wiki.zmanda.com.
> Amanda Recovery
Two Amanda programs restore data: amrecover and amrestore. The first restores individual files through an interface that lets you browse the backup file index as of a given date and choose the files you want to restore. After you select the files, Amanda finds the right tape, brings the backup image over the network to the client and pipes the data into the appropriate restore program. For full file system recovery, amrestore retrieves whole file system images.
The Amanda tape format is deliberately simple, so in an emergency you can restore data without Amanda. The first file on a tape is a volume label. Each file after that contains one backup image written in 32 KB blocks. The first block contains text showing the commands needed to do a restore. The Amanda wiki (wiki.zmanda.com) provides instructions on how to recover data without Amanda being installed.
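The layout described above (one 32 KB text block followed by the backup image) is simple enough to handle with a few lines of code. The following Python sketch is an illustration under that assumption. The function name is hypothetical, and it treats the header as NUL-padded ASCII text, which is a guess about the padding rather than a statement about the real format.

```python
BLOCK_SIZE = 32 * 1024  # block size stated in the article


def split_backup_file(tape_file, image_out):
    """Separate the header block from the backup image.

    tape_file: binary stream positioned at the start of one backup file.
    image_out: binary stream that receives the raw backup image.
    Returns the header text, which names the commands for a restore.
    """
    header = tape_file.read(BLOCK_SIZE)
    # Assumption: the header text is padded with NUL bytes to 32 KB.
    text = header.rstrip(b"\0").decode("ascii", errors="replace")
    # Everything after the first block is the image itself.
    while chunk := tape_file.read(BLOCK_SIZE):
        image_out.write(chunk)
    return text
```

After reading the header text, you would pass the extracted image to whichever restore program the header names, for example by writing image_out to a file and running that program on it.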
> Amanda Enterprise 2.6
Zmanda Management Console (ZMC) is new in Amanda Enterprise 2.6. ZMC provides a browser-based UI with workflows that simplify initial configuration, adding new clients, verification of setup, scheduling, recovery and other day-to-day backup tasks. ZMC is integrated with Zmanda Network, so customers get timely updates from Zmanda and benefit from the collective knowledge of the Amanda community. (See Figure 2.)
Amanda provides all the benefits of open source such as flexibility, high quality code, security and low cost, and it fully addresses the backup and recovery needs of most enterprise users.