How to Use the Rsync Command in Unix

Introduction

The digital revolution has changed our lives in many ways. Nowadays, our picture albums, music collections, bookshelves etc. are all digital - binaries written on a disk. Our life and work are shared between a multitude of devices. Companies have development, staging, testing areas and dozens of developers connected to them.

From this shift in our life dynamics emerged two problems: how do we preserve the integrity of our data, and how do we keep it synchronized across all of our devices? Enter rsync, a tool that aims to solve both problems and actually succeeds.

First released in 1996, rsync is an open-source file transfer and synchronization tool written in C, currently distributed under the GPLv3 license. By default, it compares file sizes and modification times (mtime) to decide which files need to be synchronized, and uses a delta-transfer algorithm that sends only the changed parts of a file to minimize network bandwidth.

Following the Unix philosophy, rsync is lightweight, easily configurable and scriptable. It has become both popular and ubiquitous. Pretty much every Linux distribution comes with rsync preinstalled; you can even find it on macOS out of the box!

Of course, it can be built from source as well: the source tarball is available from the project's website. For Windows users, though, some additional effort is required. It can be used with the Cygwin runtime, or under WSL.

Basic rsync Operations

$ rsync options source destination

The syntax is pretty straightforward. You call rsync, list the options and choose the source and destination. All options can be found in rsync's man pages. Let's take a look at the key options we'll be using:

  • --archive or -a
    This option is shorthand for -rlptgoD, which boils down to: recurse into directories and preserve as much as possible. That includes symlinks, permissions, mtime, group and owner, and device and special files. This option is almost always used. Hard links are not included, since tracking them down can be a costly operation. They can be explicitly requested with the -H option.

  • --verbose or -v and --human-readable or -h
    These options, very common in command line programs, make rsync log the operations it performs to the standard output stream (-v) and print sizes in human-readable units (-h). They should be skipped in scripts, where the --quiet or -q option is preferred, as it suppresses all non-error output.

  • --compress or -z
    This option compresses file data during the transfer to reduce bandwidth. It is very desirable for transfers over a network, where it is used almost as routinely as --archive; for local copies it only adds CPU overhead.

  • --dry-run or -n
    This tells rsync to simulate the process and tell us what will be updated or deleted without actually doing anything. Useful for double checking our arguments and dealing with sensitive data.

  • --update or -u
    This option tells rsync to skip files that are newer in the destination than in the source. We will take a deeper look at its effects in the examples below.

  • --del
    This option, an alias for --delete-during, tells rsync to delete files in the destination that are not present in the source. Useful for cleaning up files, such as temporary files, that were backed up during a previous synchronization but have since been removed from the source.

  • --inplace
    By default, rsync writes the updated data to a temporary copy of the file and then moves that copy into place, replacing the old file. This option tells it to write the updated data directly into the destination file instead, which avoids needing space for a second copy during the transfer. Useful in situations without much storage headroom, e.g. when transferring to a flash drive.

rsync Examples

$ rsync -avz src dest

This is the most common way you'll be invoking rsync. Preserve and compress everything, while logging the operations.
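
One detail worth knowing with this form is the trailing slash on the source: without it rsync copies the directory itself, with it rsync copies only the directory's contents. A small sketch on scratch directories, assuming rsync is installed (dest1 and dest2 are illustrative names):

```shell
work=$(mktemp -d)
mkdir "$work/src"
echo "data" > "$work/src/file.txt"

# No trailing slash: the directory itself is copied, giving dest1/src/file.txt
rsync -a "$work/src" "$work/dest1"

# Trailing slash: only the contents are copied, giving dest2/file.txt
rsync -a "$work/src/" "$work/dest2"

find "$work/dest1" "$work/dest2" -type f
```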

For remote locations:

$ rsync -avz src user@destination:location

Say you keep your files on an external hard disk but edit them on two devices (for example, editing photos on both a laptop and a desktop). Here the -u flag comes in handy, since you don't want rsync to overwrite the more recent work from one device with older versions from the other:

$ rsync -avuz src /mnt/my_hdd
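
The effect of -u can be checked with a minimal sketch like the one below, assuming rsync is installed. The laptop and hdd directories and the timestamps are made up; touch -t sets the modification times so that the copy on the disk is the newer one.

```shell
work=$(mktemp -d)
mkdir "$work/laptop" "$work/hdd"

echo "old draft" > "$work/laptop/photo.txt"
touch -t 202001010000 "$work/laptop/photo.txt"   # older copy on the laptop

echo "edited on desktop" > "$work/hdd/photo.txt"
touch -t 202301010000 "$work/hdd/photo.txt"      # newer copy already on the disk

# -u skips photo.txt because the destination copy has a newer mtime.
rsync -au "$work/laptop/" "$work/hdd/"
cat "$work/hdd/photo.txt"
```

Without -u, the same command would replace the newer file on the disk with the older laptop version.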

rsync with Remote Hosts

Another thing rsync has going for it is that it excels at remote transfers. It uses ssh as its default transport (other options exist, including rsync's own daemon protocol), so it works with your existing SSH keys and agent. The only caveat is that rsync needs to be installed on both ends, but given its ubiquity, that is almost never an issue.
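
When the remote host listens on a non-default port or needs a specific key, the -e option lets you pass the ssh command rsync should use. The following is a sketch only: the host name, port, key path and the sync_to_remote helper are all hypothetical and should be replaced with your own values.

```shell
# Hypothetical connection details; replace with your own.
remote_user="user"
remote_host="backup.example.com"
ssh_port=2222
ssh_key="$HOME/.ssh/id_ed25519"

# Push a local path to a path on the remote host over SSH.
# -e sets the remote shell command rsync uses for the connection.
sync_to_remote() {
    rsync -avz -e "ssh -p $ssh_port -i $ssh_key" "$1" "$remote_user@$remote_host:$2"
}

# Usage (not run here, since it needs a reachable host):
#   sync_to_remote src/ backups/src/
```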

Performing a Full System Backup

Full system backups are common among *nix users, whether they are particularly worried about their data or just like experimenting with a fail-safe in place. We will be using some additional options, -A and -X, which preserve ACLs and extended attributes. Some trees in our file system hold only runtime or temporary data, so we will leave them out with the --exclude option. We will also delete from the backup any file that has been removed locally since the previous backup:

$ sudo rsync -aAXv --del --exclude={"/dev/*","/proc/*","/sys/*","/tmp/*","/run/*","/mnt/*","/media/*","/lost+found"} / dest

Daily Backups with rsync and crontab

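Using crontab, we can schedule daily, weekly, monthly or even variable time backups. Let's write an entry for the system-wide /etc/crontab, where the sixth field names the user the command runs as, and schedule our rsync command to run every day at 00:00. Cron runs commands with /bin/sh by default, which may not support brace expansion, so each exclusion is spelled out, and sudo is dropped because the job already runs as root:

0 0 * * * root rsync -aAXv --del --exclude='/dev/*' --exclude='/proc/*' --exclude='/sys/*' --exclude='/tmp/*' --exclude='/run/*' --exclude='/mnt/*' --exclude='/media/*' --exclude='/lost+found' / dest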

In your /etc directory, you can find multiple cron.* directories, named after how often the scripts inside them are run. To create a daily backup of our entire system, we can take the rsync command above (without the schedule and user fields), place it in a file called, let's say, daily_backup.sh, and put that file inside /etc/cron.daily/. On Debian-based systems, run-parts skips file names that contain a dot, so drop the .sh extension there.

The script will run at the time your system's crontab assigns to the daily jobs. Make sure your script is executable and starts with a shebang line such as #!/bin/sh.
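
A minimal sketch of such a script is shown below. It writes the file to a scratch location so it can be reviewed first; copying it into /etc/cron.daily/ requires root and is left as a manual step. The destination /mnt/backup is a hypothetical path, and -q is used instead of -v so that cron only reports errors.

```shell
# Write the script to a scratch location for review.
script="$(mktemp -d)/daily_backup.sh"

cat > "$script" <<'EOF'
#!/bin/sh
# /mnt/backup is a hypothetical destination; adjust it to your setup.
# -q keeps the job silent unless rsync reports an error.
rsync -aAXq --del \
    --exclude='/dev/*' --exclude='/proc/*' --exclude='/sys/*' \
    --exclude='/tmp/*' --exclude='/run/*' --exclude='/mnt/*' \
    --exclude='/media/*' --exclude='/lost+found' \
    / /mnt/backup
EOF

# Make it executable and check the syntax without running the backup.
chmod +x "$script"
sh -n "$script" && echo "syntax OK: $script"
```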

Conclusion

What's not to like? A minimal and lightweight command line tool that is simple to use, able to do a lot of things, easy to put into scripts, and pretty fast at what it does. It's no surprise rsync reached such broad popularity and availability. It does a pretty good job at the one thing everyone needs, whether they are aware of it or not.

Last Updated: April 27th, 2023

© 2013-2024 Stack Abuse. All rights reserved.
