Useless fuss about ZIP and RAR

Feb 24 2007 Published by Salvatore Iovene under Software

There has been some fuss generated by Jeff Atwood (who is a Windows developer, which is bad, and a Visual Basic one, which is worse), who seems, in my humble opinion, to be spreading incomplete information, closed as he appears to be in his Windows-only world. In a recent article of his, Jeff makes a basic comparison between the ZIP and RAR compression systems. Unfortunately, most Windows people are completely unaware that there's something much better out there, which has been floating around the *nix world for quite a long time now. I'm talking about the powerful combination of tar and bzip2.

Let's get to the facts right away. For this experiment I used a directory containing the source code of the Linux kernel, and then I built that kernel, so that the directory would be pretty big and would contain both text files and binary files.

Here's the size of the original directory:

$ du -sh /usr/src/linux-2.6.18.3
539M    linux-2.6.18.3

This is what happens with ZIP:

$ time zip -r ~/linux /usr/src/linux-2.6.18.3
...
real    2m35.917s
user    0m32.486s
sys     0m6.024s

$ ls -gGh ~/linux.zip
-rw-r--r-- 1 141M 2007-02-24 01:04 /home/siovene/linux.zip

Fine, 141MB in 2 minutes and 35 seconds. Let's try RAR:

$ time ./rar_static a ~/linux.rar /usr/src/linux-2.6.18.3
...
real    5m8.715s
user    2m14.012s
sys     0m12.473s

$ ls -gGh ~/linux.rar -lh
-rw-r--r-- 1 132M 2007-02-24 01:26 /home/siovene/linux.rar

Ouch! Double the time for only slightly better compression! Let's try TAR and BZIP2:

$ time tar cv linux-2.6.18.3 | bzip2 > ~/linux.tar.bz2
...
real    4m22.265s
user    2m38.134s
sys     0m5.608s

$ ls -gGh ~/linux.tar.bz2
-rw-r--r-- 1 90M 2007-02-24 01:09 /home/siovene/linux.tar.bz2

Not much faster than RAR (and with two programs communicating through a pipe, so there is some overhead), but much more efficient! The compressed file is only 90MB, down from 539MB uncompressed.
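As a side note, GNU tar can invoke bzip2 on its own through the -j option, so the pipe does not have to be written by hand. This is only a minimal sketch, assuming GNU tar and bzip2 are installed. The helper name pack_bz2 is hypothetical, and I have not timed this variant, so the numbers in this post apply to the piped command only.

```shell
# Hypothetical helper (not used for the measurements in this post):
# archive directory $1 into the bzip2-compressed tarball $2.
pack_bz2() {
    # -c create an archive, -j filter it through bzip2, -f write to the given file
    tar -cjf "$2" "$1"
}

# Example call, left commented out:
#   pack_bz2 linux-2.6.18.3 ~/linux.tar.bz2
```

The resulting archive can be listed with tar -tjf to check its contents.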

Let's summarize the data:

Method        Time        Size
zip           2m35s       141MB
rar           5m08s       132MB
tar.bz2       4m22s       90MB
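To make the sizes easier to compare, here is a small sketch that expresses each archive as a percentage of the original 539MB. The sizes are the ones measured in this post. The ratio function is a hypothetical helper of mine, and the values are rounded to whole megabytes, so the percentages are approximate.

```shell
# Original directory size in MB, as reported by du.
original=539

# Hypothetical helper: print compressed size $1 as a percentage of original size $2.
ratio() {
    awk -v c="$1" -v o="$2" 'BEGIN { printf "%.1f", 100 * c / o }'
}

# Each entry is method:size-in-MB, taken from the table.
for entry in zip:141 rar:132 tar.bz2:90; do
    name=${entry%%:*}
    size=${entry##*:}
    echo "$name: $(ratio "$size" "$original")% of the original size"
done
```

This works out to roughly 26% for zip, 24% for rar and 17% for tar.bz2.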

In conclusion, you should use the best tools, independently of their popularity, and remember that there is much more out there than what you can see from your Windows-user perspective.
