Rsync friendly gzip

By Sarah Rodriguez

I can't be the only one with this problem: I'm rsyncing .tar.gz files and notice that the full file gets transferred every time rather than just the differences. From what I've read, back in 1999 someone created an algorithm that fixed the issue (only 5% of the data needed to be transferred).

Has this gone anywhere since? How do I create rsync-friendly .tar.gz files?

5 Answers

My gzip (on Ubuntu and Fedora) has the --rsyncable option, so create the tarballs using:

tar -c whatever/ | gzip --rsyncable > file.tar.gz

BeezNest has a pretty good explanation of the rsyncable option for gzip. In the author's test, this option added about 1% to the file size, but let rsync transfer the updates to a gzipped file over 1,300 times faster.

For the gory details, see this discussion (specifically, section 4.4.2), which they cite. The gist of it is:

The modification is quite simple:

  1. A fast rolling signature is computed for a small window around the current point in the uncompressed file;
  2. stream compression progresses as usual;
  3. when the rolling signature equals a pre-determined value the compression tables are reset and a token is emitted indicating the start of a new compression region.
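To make those three steps concrete, here is a rough Python sketch. It is not the actual gzip patch, just an illustration under my own assumptions: the rolling signature is a plain byte sum over the last 4096 bytes, the "pre-determined value" is "sum divisible by 4096", and zlib's Z_FULL_FLUSH stands in for "reset the compression tables and emit a token". The names rsyncable_compress, plain_compress and shared_tail are made up for this example.

```python
import random
import zlib

WINDOW = 4096    # bytes covered by the rolling sum (assumed value)
MODULUS = 4096   # a boundary is declared when rolling_sum % MODULUS == 0 (assumed)
GZIP_WBITS = 16 + zlib.MAX_WBITS  # ask zlib for a gzip header and trailer


def rsyncable_compress(data: bytes, level: int = 6) -> bytes:
    """Gzip-compress data, resetting the compressor at content-defined points."""
    comp = zlib.compressobj(level, zlib.DEFLATED, GZIP_WBITS)
    out = []
    rolling = 0  # step 1: sum of the most recent WINDOW bytes
    start = 0    # first byte not yet handed to the compressor
    for i, byte in enumerate(data):
        rolling += byte
        if i >= WINDOW:
            rolling -= data[i - WINDOW]  # drop the byte leaving the window
        # Step 3: when the signature hits the chosen value, flush and reset.
        if i + 1 >= WINDOW and rolling % MODULUS == 0:
            out.append(comp.compress(data[start:i + 1]))  # step 2: normal compression
            out.append(comp.flush(zlib.Z_FULL_FLUSH))     # reset tables, emit marker
            start = i + 1
    out.append(comp.compress(data[start:]))
    out.append(comp.flush())  # finish the stream and write the gzip trailer
    return b"".join(out)


def plain_compress(data: bytes, level: int = 6) -> bytes:
    """Same gzip stream, but without any resets, for comparison."""
    comp = zlib.compressobj(level, zlib.DEFLATED, GZIP_WBITS)
    return comp.compress(data) + comp.flush()


def shared_tail(a: bytes, b: bytes) -> int:
    """Count identical trailing bytes, ignoring the 8-byte gzip trailer (CRC32 + size)."""
    a, b = a[:-8], b[:-8]
    n = 0
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n


# Toy data: a text-like buffer, and a copy with a small edit near the start.
rng = random.Random(0)
old = bytes(rng.choice(b"abcdefgh \n") for _ in range(200_000))
new = old[:100] + b"EDIT" + old[100:]

for name, fn in (("plain", plain_compress), ("rsyncable", rsyncable_compress)):
    a, b = fn(old), fn(new)
    assert zlib.decompress(b, GZIP_WBITS) == new  # still an ordinary gzip stream
    print(f"{name}: {len(b)} bytes compressed, {shared_tail(a, b)} trailing bytes unchanged")
```

The point of the comparison at the end: because the reset points depend only on nearby content, the two compressed streams should line up again shortly after the edit, and those identical bytes are what rsync can skip. Without the resets, one small change early in the input tends to alter everything that follows.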

I like this one because I wanted a .tar.gz, not just a .gz:

GZIP='--rsyncable' tar cvzf bobsbackup.tar.gz /home/bob
gzip --rsyncable # need two dashes for long options

I know that Ubuntu applies a patch to the gzip sources that adds the --rsyncable flag. You can download that patch and apply it yourself, or check whether your distribution already includes it.
