[ltp] Backup Solutions?

unlisted linux-thinkpad@www.bm-soft.com
30 Aug 2002 11:00:59 -0500


On Thu, 2002-08-29 at 07:16, Mike Taylor wrote:
> > Date: Thu, 29 Aug 2002 00:13:45 -0400
> > From: "D. Sen"
> > 
> > I am looking for suggestions for backing up my hard drive.
> > 
> > With hard-drives sizes greater than 30 Gigs, copying filesystems to
> > CDROM is no longer an option.....Are people using USB(2.0?)/Firewire
> > drives? Do they work reliably on linux?
> 
> You may consider this to be answering the question, and you may not,
> but here's what I do.  I don't bother backing up the whole disk at
> all, since most of what's there is either from the O/S CD or somewhere
> similarly recoverable.

agreed, on both counts.  the original post seemed to be after
hardware-specific backup solutions/approaches (cd-r/rw no, firewire
yes?), so this may not be what was wanted, but i approach the problem
(filesystems too big to back up onto cd-r/rw) differently, like the
post i'm replying to.

i just back up:
- /etc
- /root
- user directories (i.e. /home/<username>, though on a debian system
there's sometimes a lot of other stuff in /home too, like cvs and ftp)
- /usr/local (which mainly holds hand-written scripts, plus some
software compiled from source, but not enough to be worth excluding)
- the state of my debian system (installed packages via dpkg and
hardware state via tpctl and lspnp)

everything else can be reinstalled from "original media" (and would
probably benefit me in getting rid of some of the cruft just lying
around on my system).
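that list maps straight onto the include file the script hands to tar
with -T (plus an exclude file for -X).  the entries below are only an
illustrative guess, not my actual files (which live in /usr/local/etc):

```shell
# hypothetical mkbackup_include.txt / mkbackup_exclude.txt contents;
# a temp dir stands in for /usr/local/etc in this sketch
ETC=$(mktemp -d)
cat > "$ETC/mkbackup_include.txt" <<'EOF'
/etc
/root
/home
/usr/local
EOF
cat > "$ETC/mkbackup_exclude.txt" <<'EOF'
/home/ftp
/usr/local/src
EOF
```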

attached is the script i use to do it.  it compresses ~100 MB (mainly
text, docs, and pictures) into ~33 MB (3:1 compression).

mainly my backup script is an experiment with gnu tar's incremental
(--listed-incremental) feature.  the drawbacks:
- each incremental backup is somewhat large (~12 MB) even if nothing
changed, because tar (as i understand it; i still need to run tests to
be sure) records the filesystem state (which files/directories exist),
so that if a file/directory is deleted after the full backup, the
deletion is recorded too (i.e. it was in the full backup but is absent
from the incremental one).  most simple incremental backup scripts
just use "find" to pick up files with newer timestamps than the last
full backup, but that approach fails to record deletions.
- for any file that has changed (including newly created files), the
entire file is saved, not just the delta (i.e. a diff).  so if you
have a 500 MB database and change only a single ~1 KB record, the
whole 500 MB database lands in the incremental backup.  a system using
diff (which gnu diff supports even for binary files) would be
preferable, recording the deltas instead of the entire file.  the flip
side is that if i want the latest backup of a file, i can extract it
directly from the appropriate incremental archive, without starting
from the full backup and applying every diff since.  rdiff-backup (see
below) tries (succeeds?) at accomplishing exactly this.
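the deletion-recording behaviour is easy to see with a toy run of
tar's -g (--listed-incremental) option, assuming gnu tar; everything
happens in throwaway temp directories:

```shell
set -e
D=$(mktemp -d)
mkdir "$D/data"
echo one > "$D/data/a"
echo two > "$D/data/b"
# level 0 (full): tar creates the snapshot file as a side effect
tar -C "$D" -cf "$D/full.tar" -g "$D/snap" data
rm "$D/data/b"                       # deleted after the full backup...
# level 1 (incremental): ...so this run records that b is gone
tar -C "$D" -cf "$D/incr.tar" -g "$D/snap" data
# restore full then incremental; -g /dev/null makes tar replay deletions
R=$(mktemp -d)
tar -C "$R" -xf "$D/full.tar" -g /dev/null
tar -C "$R" -xf "$D/incr.tar" -g /dev/null
ls "$R/data"
```

a find-based script would have missed the rm entirely, since b simply
stops showing up in any newer-than check.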

benefits:
- speed: a full backup (~100 MB) takes a few minutes (i think ~5-10),
including tarring and bzipping.  an incremental backup takes ~3
minutes.  and that's on a 770x, and laptop hard-drive subsystems
aren't known for speed (compared to desktops & servers).  it's also
only ~50% overall cpu utilization (770x = 300 MHz PII).

i run it nightly from a cron job, but it's quick enough that it could
be run at will to back up recently changed docs (e.g. a big
slideshow/presentation you just finished in OpenOffice).
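for completeness, the cron side is just a one-liner; the install path
and time below are made up:

```shell
# hypothetical /etc/cron.d/mkbackup entry -- nightly at 03:10;
# adjust the path to wherever the script actually lives
# m  h  dom mon dow user command
10 3 * * *  root  /usr/local/sbin/mkbackup
```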

here are some projects/scripts that i found in researching incremental
(tar) backups:

from <http://www.google.com/search?q=tar+incremental+script>

"Automating backups with tar" (script included)
<http://www.tldp.org/LDP/solrhe/Securing-Optimizing-Linux-RH-Edition-v1.3/chap29sec306.html>

"Backup for the Home Network" (script included, but simply an earlier
version of the script above)
<http://www.linuxgazette.com/issue47/pollman.html>

"the best features of a mirror and an incremental backup"
<http://rdiff-backup.stanford.edu/>
(with packages in debian ;-)

"Incremental tar" (takes a very simplistic approach to "incremental"
backup; csh script included)
<http://www.itworld.com/nl/unix_sys_adm/06122002/>


> > I guess my priorities for backing up the laptop are:
> > 
> >      1) Reliability
> >      2) Speed
> >      3) Mobility (so I can back up while I am travelling)

reliability: if i need a file, i just grep the txt-file counterparts
of the tar.bz2 archives to find the latest version (full or
incremental) of the file, then extract it from that archive.
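that lookup is nothing fancier than grep over the listing files.  a
toy version, with made-up listing names standing in for the real
host.DATE.{full,part}.txt ones:

```shell
set -e
B=$(mktemp -d)
# two fake listings: the file appears in the full backup and again in
# a later incremental
printf '%s\n' 'etc/fstab' 'home/user/talk.sxi' > "$B/tp.20020801.full.txt"
printf '%s\n' 'home/user/talk.sxi'             > "$B/tp.20020815.part.txt"
# the last (date-sorted) listing that mentions the file names the archive
LATEST=$(grep -l 'home/user/talk.sxi' "$B"/*.txt | sort | tail -n 1)
echo "$LATEST"
# then extract just that one file from the matching archive, e.g.:
#   tar -xvjf "${LATEST%.txt}.tar.bz2" home/user/talk.sxi
```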

speed: see above.

mobility: the backups easily fit on a cd-r/rw, or even a zip disk.  i
keep them on the laptop itself in the short term, and weekly copy the
full backups to cd-rw (on my desktop).

> So you may like to consider this approach instead of brute-forcing it
> in hardware.  HTH.

maybe what i've shared is not beneficial to the original poster, but
hopefully somebody finds it useful.


anyways...
-- 

PLEASE REQUEST PERMISSION TO REDISTRIBUTE
   AUTHOR'S COMMENTS OR EMAIL ADDRESS.

[attachment: mkbackup]

#!/bin/bash

TIME_CMD="/usr/bin/time -f \ncmd\t%C\nreal\t%E\nuser\t%Us\nsys\t%Ss\ncpu\t%P"
NICE_CMD="nice -n 19"

# backup directory
DIR=/home/backup
# date (YYYYMMDD)
DATE=`date +%Y%m%d`
# week number of year with Monday as first day of week (01..53)
WEEK=`date +%V`
# how many weeks to keep partial/incremental backups
KEEP=2
# name of computer
NAME=`uname -n`
# name of snapshot file
SNAPSHOT=snapshot.$WEEK
# settings directory
ETC=/usr/local/etc
# name of include file
INCLUDE=mkbackup_include.txt
# name of exclude file
EXCLUDE=mkbackup_exclude.txt
# name of listing file
LISTING=listing.$WEEK

# change to backup directory
cd $DIR

# limit file permissions to u=rwx,g=rx,o=
# knowing u=root,g=backup
umask 0027

# if snapshot file doesn't exist, then...
if ! [ -f snapshot.$WEEK ]; then
    # this is a full backup

    # backup previous records of computer state
    mv -v /root/tpctl--all--dull.*.txt /root/state/
    mv -v /root/lspnp-vv.*.txt /root/state/
    mv -v /root/dpkg-l.*.txt /root/state/
    mv -v /root/dpkg--get-selections.*.txt /root/state/

    # create record of current computer (hardware & software) state
    tpctl --all --dull > /root/tpctl--all--dull.$DATE.txt
    lspnp -vv > /root/lspnp-vv.$DATE.txt
    COLUMNS=150 dpkg -l > /root/dpkg-l.$DATE.txt
    dpkg --get-selections > /root/dpkg--get-selections.$DATE.txt
    # create full archive
    $TIME_CMD $NICE_CMD tar -cvj \
        -g snapshot.$WEEK \
        -f $NAME.$DATE.full.tar.bz2 \
        -T $ETC/$INCLUDE \
        -X $ETC/$EXCLUDE \
        -V "$NAME: `date`"

    # create listing of full archive contents
    $TIME_CMD $NICE_CMD tar -tvj \
        -f $NAME.$DATE.full.tar.bz2 \
        2>&1 \
        | tee $NAME.$DATE.full.txt

    # delete partial/incremental backups
    # find listings older than N weeks (N*7 days)
    # and delete the listing and all listed files
    find -maxdepth 1 -mindepth 1 -regex "^\./listing\.[0-9][0-9]$" -mtime +$(($KEEP*7)) \
    | while read LISTINGS; do
	cat $LISTINGS
	echo $LISTINGS
    done \
    | xargs --verbose --no-run-if-empty rm --verbose

else
    # this is a partial/incremental backup

    # create partial/incremental archive
    $TIME_CMD $NICE_CMD tar -cvj \
        -g snapshot.$WEEK \
        -f $NAME.$DATE.part.tar.bz2 \
        -T $ETC/$INCLUDE \
        -X $ETC/$EXCLUDE \
        -V "$NAME: `date -r snapshot.$WEEK` - `date`"

    # create listing of partial/incremental archive contents
    $TIME_CMD $NICE_CMD tar -tvj \
        -f $NAME.$DATE.part.tar.bz2 \
        2>&1 \
        | tee $NAME.$DATE.part.txt

    echo $NAME.$DATE.part.tar.bz2 >> $LISTING
    echo $NAME.$DATE.part.txt >> $LISTING

fi

exit 0



----- The Linux ThinkPad mailing list -----
The linux-thinkpad mailing list home page is at:
http://www.bm-soft.com/~bm/tp_mailing.html