Moving Files In Linux
rsync

Dee-Ann LeBlanc
Thursday, May 22, 2003 09:49:23 AM
rsync comes to us from the Samba project, at http://rsync.samba.org/.
This underutilized but valuable tool is excellent for keeping Web and
FTP site mirrors up to date, not to mention for keeping the contents of
local directories within your network in sync. You can also use it for
private "secure" purposes such as data backup, as long as you are sure
to utilize rsync within an ssh connection.
rsync is a client/server application, and like FTP, you can use it for
both anonymous and login-required transfers. For the client end, you can
learn more by typing man rsync, and for the server end it's man
rsyncd.conf. You don't have to use rsync with an rsync server, however.
The client can also work through a remote shell such as ssh, as long as
rsync itself is installed on the remote machine.
Say that I'm using Mandrake Linux 9.1 and want to grab the latest
packages available for this version without using Mandrake Update. I
first go to http://www.mandrakesecure.net/en/ftp.php and select the
mirror: I'll use the one at my alma mater, Penn State
(carroll.cac.psu.edu), for this example.
I begin by finding out whether an rsync server is running on this host
and what it offers. The command I use for this is:
rsync carroll.cac.psu.edu::
The response is, at the time of this writing (without the PSU banner):
Apache          Apache
caldera         Caldera Linux distribution
caldera-iso     Caldera Linux distribution ISO images
collegelinux    Collegelinux Linux distribution
cpan            Comprehensive Perl Archive Network
ctan            Comprehensive Tex Archive Network
cygwin          Cygwin
debian          Debian Linux distribution
debian-cd       Debian Linux distribution CD images
freebsd         FreeBSD
gentoo          Gentoo Linux distribution
gnome           The GNOME ftp site
gnu             GNU repository
kde             The KDE ftp site
kernel          Kernel.org
mandrake        Mandrake Linux distribution
mandrake-devel  Mandrake development tree
mandrake-iso    Mandrake development tree ISOs
mandrake-old    Mandrake old releases
netbsd          NetBSD
openbsd         OpenBSD
opencd          OpenCD Windows Distribution
redhat-redhat   Red Hat, Inc. -- Red Hat FTP Site, RedHat Area
redhat-ftp      Red Hat, Inc. -- Red Hat FTP Site
redhat-beta     Red Hat, Inc. -- Red Hat Linux beta releases
redhat-contrib  Red Hat, Inc. -- Contrib FTP Site
redhat-rawhide  Red Hat, Inc. -- Rawhide FTP Site
redhat-updates  Red Hat, Inc. -- Updates FTP Site
sgifreeware     freeware.sgi.com
slackware       Slackware Linux distribution
sorcerer        Sorceror Linux distribution
splack          Splack Linux distribution
sunfreeware     ftp ftp.sunfreeware.com
suse            SuSE Linux distribution
xfree86         XFree86
ximian          Ximian GNOME
yellowdog       YellowDog Linux distribution
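If I wanted to feed this listing to a script, I could reduce it to the
bare names in the first column. What follows is only a minimal sketch: it
assumes every line starts with the name followed by whitespace and a
description, as in the output above, and the function name module_names
is my own invention rather than anything rsync provides. A server banner,
if present, would need to be filtered out separately.

```shell
#!/bin/sh
# Sketch: reduce an rsync server listing to the names in the first column.
# Assumption: each line is "<name> <description>", as in the listing above.
# module_names is a hypothetical helper name, not part of rsync.
module_names() {
    # Print the first whitespace-separated field of every non-empty line.
    awk 'NF > 0 { print $1 }'
}

# Usage (needs network access, so it is left commented out):
# rsync carroll.cac.psu.edu:: | module_names
```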
Since what I'm interested in is Mandrake updates, I now type the
following to find the contents of the mandrake section:
rsync carroll.cac.psu.edu::mandrake
The results, minus the PSU banner, are:
drwxr-xr-x    4096 2003/04/05 16:30:04 .
drwxr-xr-x    4096 2003/03/25 07:19:02 9.0
drwxr-sr-x    4096 2003/03/25 07:46:53 9.1
lrwxrwxr-x       3 2003/03/25 08:30:05 current
drwxr-xr-x    4096 2003/03/25 13:40:45 iso
-rw-r--r--  287053 2003/04/05 05:00:03 ls-lR.gz
drwxrwsr-x    4096 2003/03/11 12:03:39 updates
Since it's updates I'm looking for, I now try:
rsync carroll.cac.psu.edu::mandrake/updates
This gives me the following:
drwxrwsr-x 4096 2003/03/11 12:03:39 updates
What this tells me is that this is as deep as I can go with rsync
without recursively listing all files and directories. I'll do this by
adding the -r flag:
rsync -r carroll.cac.psu.edu::mandrake/updates | more
The output is too long to list here, but what it shows me is that there
are subdirectories for each Mandrake version. Using:
rsync -r carroll.cac.psu.edu::mandrake/updates/9.1 | more
shows me that I want the RPMS subdirectory, and:
rsync -r carroll.cac.psu.edu::mandrake/updates/9.1/RPMS | more
shows me that I've finally found the directory containing the files
themselves.
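Rather than paging through a long recursive listing by eye, I could
filter it down to the package files. This is a rough sketch, assuming the
listing lines have the same shape as the ones shown earlier (permissions,
size, date, time, name) and that package file names contain no spaces.
The function name rpm_files is hypothetical.

```shell
#!/bin/sh
# Sketch: pick the RPM file names out of an rsync listing.
# Assumption: listing lines look like
#   <permissions> <size> <date> <time> <name>
# and package file names contain no spaces.
# rpm_files is a hypothetical helper name, not part of rsync.
rpm_files() {
    # Keep regular files (permissions start with "-") whose name ends in .rpm.
    awk '$1 ~ /^-/ && $NF ~ /\.rpm$/ { print $NF }'
}

# Usage (needs network access, so it is left commented out):
# rsync -r carroll.cac.psu.edu::mandrake/updates/9.1/RPMS | rpm_files
```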
Ideally, I would now build a script that checked to see if I had the
package installed before bothering to download an item, but for now I'm
content to download everything in the updates directory that I don't
already have. To accomplish this, I'll use (this line of code may show
wrapped for readability, but it's meant to be all one line):
rsync -uv carroll.cac.psu.edu::mandrake/updates/9.1/RPMS/*
/home/dee/Updates/Mandrake
The -u flag tells rsync to skip any file whose copy in my local
directory is newer than the one on the server, so files I've already
downloaded aren't fetched again. The -v tells rsync to be verbose and
show me the name of each file as it grabs it rather than just showing me
the server's banner and then sitting there silently while it does its
work. The path at the end (/home/dee/Updates/Mandrake) tells rsync where
I want the files to go.
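The script I mentioned, one that checks whether a package is installed
before downloading its update, might start out something like the sketch
below. It is not a finished tool. It assumes the files follow the usual
name-version-release.arch.rpm pattern and that the rpm command is
available for the installed check. The names pkg_name, is_installed, and
wanted_updates are hypothetical helpers of my own.

```shell
#!/bin/sh
# Sketch: keep only the update files for packages already installed.
# Assumption: files are named <name>-<version>-<release>.<arch>.rpm.
# pkg_name, is_installed and wanted_updates are hypothetical helper names.

# Derive the package name from an RPM file name.
pkg_name() {
    f=${1##*/}      # drop any leading directory
    f=${f%.rpm}     # drop the .rpm suffix
    f=${f%.*}       # drop the architecture
    f=${f%-*}       # drop the release
    printf '%s\n' "${f%-*}"   # drop the version, leaving the name
}

# Succeed when some version of the package is already installed.
is_installed() {
    rpm -q "$1" >/dev/null 2>&1
}

# Read file names on standard input and print those worth downloading.
wanted_updates() {
    while IFS= read -r file; do
        if is_installed "$(pkg_name "$file")"; then
            printf '%s\n' "$file"
        fi
    done
}
```

Fed a list of file names, one per line, wanted_updates prints only the
ones whose package is already on the system. Each of those could then be
handed to rsync in place of the wildcard.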
If I were using this tool in a situation that required security, I could
add the flag and option:
-e ssh
to tell rsync to tunnel through the secure shell to do its work.
rsync is a powerful, flexible tool. It can also be rather confusing, and
digging around for examples on the Web is the best way I've found to get
a handle on this program's many features.
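To make the -e ssh option a little more concrete, here is a small sketch
that only composes such a command and prints it for review. It assumes
rsync is installed on the remote machine and that the account can log in
over ssh. The function name ssh_pull_cmd and the host and remote path in
the usage line are placeholders of mine, not real systems.

```shell
#!/bin/sh
# Sketch: compose an rsync command that runs over ssh.
# Assumptions: rsync is installed on the remote machine and the account
# can log in with ssh. ssh_pull_cmd is a hypothetical helper name.
ssh_pull_cmd() {
    # $1 = user@host, $2 = remote directory, $3 = local directory
    # A single colon (host:path) goes through the remote shell named by -e,
    # whereas the double colon (host::name) talks to an rsync server.
    # -r recurses into the remote directory; the trailing slash copies its
    # contents rather than the directory itself.
    printf 'rsync -ruv -e ssh %s:%s/ %s\n' "$1" "$2" "$3"
}

# Usage (placeholder host and remote path):
# ssh_pull_cmd dee@backup.example.com /srv/updates /home/dee/Updates/Mandrake
```

Printing the command instead of running it lets me check the host and
paths first. Once it looks right, I can paste it into the shell.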