Introduction
The digital revolution has changed our lives in many ways. Our photo albums, music collections and bookshelves are now digital: binaries written to a disk. Our life and work are spread across a multitude of devices. Companies run development, staging and testing environments, with dozens of developers connected to them.
From this shift emerged two problems: how to preserve the integrity of our data, and how to keep it synchronized across all of our devices. Enter rsync, a tool that sets out to solve both and largely succeeds.
First released in 1996 and now distributed under the GPLv3 license, rsync is an open-source file transfer and synchronization tool written in C. By default it compares file sizes and modification times (mtime) to decide which files need to be synchronized, and uses a delta-transfer algorithm, which sends only the changed parts of a file, to minimize network bandwidth.
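To see the size-and-mtime check in action, here is a minimal sketch that runs rsync three times in a throwaway directory. The directory and file names are arbitrary, and -i (--itemize-changes) is used only to list what each run decides to transfer.

```shell
# Sandbox directories so nothing outside them is touched (names are arbitrary).
work=$(mktemp -d)
mkdir "$work/src" "$work/dest"
echo "first version" > "$work/src/notes.txt"

# First run: dest is empty, so notes.txt is transferred.
# The trailing slash on src/ copies the directory's contents, not the directory itself.
rsync -a "$work/src/" "$work/dest/"

# Second run: size and mtime match, so notes.txt should not be listed.
unchanged=$(rsync -ai "$work/src/" "$work/dest/")

# Change the file's size, then run again: now it is listed and transferred.
echo "second version, longer" > "$work/src/notes.txt"
changed=$(rsync -ai "$work/src/" "$work/dest/")

echo "unchanged run: '$unchanged'"
echo "changed run:   '$changed'"
```

The exact itemized output varies between rsync versions, so treat the printed strings as illustrative rather than canonical.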
Following the Unix philosophy, rsync is very lightweight, easily configurable and scriptable. It has reached great popularity and ubiquity. Pretty much every Linux distribution comes with rsync preinstalled; you can even find it on macOS out of the box!
Of course, it can be built from source as well; the source tarball is available from the rsync project website. For Windows users, though, some additional effort is required: it can be used with the Cygwin runtime, or under WSL.
Basic rsync Operations
$ rsync options source destination
The syntax is pretty straightforward. You call rsync, list the options and choose the source and destination. All options can be found in rsync's man pages. Let's take a look at the key options we'll be using:
- --archive or -a
  This option is an alias for -rlptgoD, which boils down to: recurse, and preserve as much as you can. This includes mtime, permissions and ownership, symlinks, etc. This option is almost always used. Hard links are not included, since tracking them down can be a costly operation; they can be explicitly requested with the -H option.
- --verbose or -v and --human-readable or -h
  The first option, very common with command line programs, makes rsync log the operations it performs to the standard output stream; the second prints sizes in a human-readable format. Both should be skipped in scripts, where the --quiet or -q option is preferred, as it suppresses non-error output.
- --compress or -z
  This option compresses data during network transfer to reduce bandwidth. Like --archive, it is used for almost every remote transfer; for purely local copies it only adds CPU overhead.
- --dry-run or -n
  This tells rsync to simulate the process and report what would be updated or deleted without actually doing anything. Useful for double-checking our arguments and when dealing with sensitive data.
- --update or -u
  This option tells rsync to skip files that are newer in the destination than in the source. We will take a deeper look at its effects in the examples below.
- --del
  This option (an alias for --delete-during) tells rsync to delete files in the destination that are not present in the source. Useful for cleaning up temporary files which we don't need anymore but which were backed up during a previous synchronization.
- --inplace
  By default, rsync writes each updated file to a temporary copy and then moves it into place over the old one. This option makes it write directly to the destination file instead, which avoids needing space for a second copy during the transfer. Useful in situations without much storage headroom, e.g. transferring to a flash drive.
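As a quick illustration of how --dry-run and --del interact, the sketch below previews a sync in a temporary sandbox and only then performs it. The file names are made up for the example.

```shell
work=$(mktemp -d)
mkdir "$work/src" "$work/dest"
echo "keep me" > "$work/src/report.txt"
echo "stale"   > "$work/dest/old.tmp"     # exists only in the destination

# -n (--dry-run) with -i lists what would happen; nothing is changed yet.
preview=$(rsync -ai --del -n "$work/src/" "$work/dest/")
echo "$preview"

# After the dry run the destination still holds only old.tmp.
after_dry=$(ls "$work/dest")
echo "after dry run: $after_dry"

# Real run: report.txt is copied and old.tmp is deleted.
rsync -a --del "$work/src/" "$work/dest/"
after_real=$(ls "$work/dest")
echo "after real run: $after_real"
```

Running the preview first is cheap insurance with --del, since a mistyped source or destination can otherwise remove files you meant to keep.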
rsync Examples
$ rsync -avz src dest
This is the most common way you'll be invoking rsync. Preserve and compress everything, while logging the operations.
For remote locations:
$ rsync -avz src user@destination:location
If you have an external hard disk you'd like to sync files to, but use two devices to edit them (say, editing photos on a laptop and a desktop), the -u flag comes in handy, as you might not want rsync to overwrite the more recent work on either of these with older versions:
$ rsync -avuz src /mnt/my_hdd
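The effect of -u can be checked safely in a sandbox. In this sketch the laptop and hdd directories and the photo.txt file are stand-ins invented for the example; the laptop copy is backdated so the copy on the disk is clearly the newer one.

```shell
work=$(mktemp -d)
mkdir "$work/laptop" "$work/hdd"
echo "old edit" > "$work/laptop/photo.txt"
echo "newer edit made elsewhere" > "$work/hdd/photo.txt"

# Backdate the laptop copy (POSIX touch -t format: YYYYMMDDhhmm).
touch -t 202001010000 "$work/laptop/photo.txt"

# Without -u the newer copy on the disk would be replaced; with -u it is skipped.
rsync -au "$work/laptop/" "$work/hdd/"
cat "$work/hdd/photo.txt"
```

Note that -u decides purely on modification time, so it relies on the clocks of the devices involved being reasonably accurate.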
rsync with Remote Hosts
Another thing rsync has going for it is that it excels at remote transfers. It uses ssh by default (other transports are available, including its own daemon protocol), so it works with your existing ssh keys. The only caveat is that rsync needs to be installed on both ends, but with such ubiquity, that is almost never an issue.
Performing a Full System Backup
Full system backups are common among *nix users, whether they are particularly worried about their data or just like experimenting with a fail-safe in place. We will be using two additional options, -A and -X, which preserve ACLs and extended attributes. Some file trees on our file system don't need to be backed up, so we will exclude them with the --exclude option. We will also use --del to remove from the backup any files we have deleted locally since the previous backup:
$ sudo rsync -aAXv --del --exclude={"/dev/*","/proc/*","/sys/*","/tmp/*","/run/*","/mnt/*","/media/*","/lost+found"} / dest
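The brace list in that command is expanded by bash, not by rsync, so it breaks if spaces are added between the items and does not work in shells without brace expansion. One alternative is to keep the patterns in a file and pass it with --exclude-from. The sketch below tries this on a small fake tree; the root and backup directories and the excludes.txt name are assumptions made for the example.

```shell
work=$(mktemp -d)
mkdir -p "$work/root/etc" "$work/root/tmp" "$work/root/proc" "$work/backup"
echo "config"  > "$work/root/etc/app.conf"
echo "scratch" > "$work/root/tmp/scratch.dat"
echo "fake"    > "$work/root/proc/status"

# One pattern per line. A leading / anchors the pattern at the top of the
# transferred tree; /tmp/* keeps the directory itself but skips its contents.
cat > "$work/excludes.txt" <<'EOF'
/tmp/*
/proc/*
EOF

rsync -a --exclude-from="$work/excludes.txt" "$work/root/" "$work/backup/"

# Show what ended up in the backup.
find "$work/backup" | sort
```

Keeping the directories but not their contents matters for a real system backup, because mount points such as /proc and /sys need to exist when the backup is restored.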
Daily Backups with rsync and crontab
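Using crontab, we can schedule daily, weekly, monthly or even variable time backups. Let's write a cron expression and schedule our rsync command to run every day at 00:00. The user field (root) is part of the /etc/crontab format, which also makes sudo unnecessary, and since cron runs commands with /bin/sh, each exclude is spelled out instead of relying on bash brace expansion:
0 0 * * * root rsync -aAXv --del --exclude="/dev/*" --exclude="/proc/*" --exclude="/sys/*" --exclude="/tmp/*" --exclude="/run/*" --exclude="/mnt/*" --exclude="/media/*" --exclude="/lost+found" / dest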
In your /etc directory, you can find multiple cron.* directories (cron.daily, cron.weekly and so on), corresponding to how often their contents should be run. To create a daily backup of our entire system, we can take the rsync command above (without the schedule and user fields), place it in a file called, let's say, daily_backup, and place that file inside /etc/cron.daily/. Avoid a .sh suffix: on Debian-based systems, files with a dot in their name are skipped in these directories.
The script will run at the time your system's cron configuration sets for daily jobs. Make sure it is executable and starts with a shebang line.
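A sketch of such a script is shown below. It is written to a temporary directory first so it can be inspected and syntax-checked before being installed. The /mnt/backup destination is an assumption, as is the dot-free daily_backup name, chosen because some run-parts implementations skip file names containing a dot.

```shell
work=$(mktemp -d)
script="$work/daily_backup"

# Write the script; the quoted 'EOF' keeps $DEST from being expanded here.
cat > "$script" <<'EOF'
#!/bin/sh
# Hypothetical daily backup script for /etc/cron.daily.
DEST="/mnt/backup"   # assumption: replace with your backup location

# -q instead of -v, so cron only reports errors.
rsync -aAXq --del \
  --exclude="/dev/*" --exclude="/proc/*" --exclude="/sys/*" \
  --exclude="/tmp/*" --exclude="/run/*" --exclude="/mnt/*" \
  --exclude="/media/*" --exclude="/lost+found" \
  / "$DEST"
EOF

# Make it executable and check the syntax without running it.
chmod +x "$script"
sh -n "$script" && echo "syntax OK: $script"

# To install it (as root), something like:
#   sudo install -m 755 "$script" /etc/cron.daily/daily_backup
```

The excludes are written out one by one because the script uses /bin/sh, where bash brace expansion is not guaranteed to be available.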
Conclusion
What's not to like? A minimal, lightweight command line tool that is simple to use, able to do a lot of things, easy to put into scripts, and fast at what it does. It's no surprise rsync has reached such broad popularity and availability. It does a pretty good job at the one thing everyone, aware of it or not, actually needs.