A Rudimentary Stuffit Deluxe Based Nightly Archive System
Written by Michael Salsbury   
Friday, 03 June 2005

To back up the Mac OS X systems where I work, I developed a script that uses the RsyncX utility to back up each Mac to a Mac OS X Server running the daemon version of RsyncX.  The script automatically "mirrors" the Macintosh desktop system's contents on the server.  While this is ideal from a recovery standpoint, the users I work with are familiar with Dantz Retrospect, which takes a nightly "snapshot" of their computer and stores those snapshots in the backup.  A side benefit of the snapshot approach is that they can recover a corrupted file "as it appeared" on Monday, Tuesday, etc.  My RsyncX script didn't offer this snapshot option, and I didn't want to rework the whole process to add it, so I wrote a new script that creates a nightly archive and stores it locally on the computer.  Each archive file is named with the date and time of the run, so a second run on the same day won't overwrite an earlier archive.  Because I didn't want to consume huge amounts of disk space with these archives (which might never be used), I used Stuffit Deluxe to compress them as much as possible.  I could have used UNIX tools like "tar" or "gzip", but they're not HFS-aware, and I was concerned that they could throw away the resource forks on files that have them.  I knew Stuffit wouldn't do that.

To further reduce disk usage, the script looks in the archive directory for any archive files that are more than 30 days old and deletes them.  This means that at any given time, up to 30 days' worth of changed files are available in the archives.  After 30 days, those files are gone.  This is approximately how long the old Retrospect backup scripts kept file snapshots around.
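The pruning rule can be tried out safely on throwaway files before trusting it with real archives.  What follows is only a minimal sketch: the scratch directory /tmp/archive-demo/prune and the file names are made up for the demonstration and do not come from the script.  It uses "-print" instead of "-delete" so nothing is actually removed, and it sticks to plain commands that behave the same in csh and sh.

```shell
# Hypothetical scratch directory, used only for this demonstration.
mkdir -p /tmp/archive-demo/prune
cd /tmp/archive-demo/prune

# One archive stamped January 1, 2000 and one stamped right now.
touch -t 200001010000 old-demo.sitx
touch new-demo.sitx

# List (rather than delete) .sitx files last modified more than 30 days ago.
find . -name '*.sitx' -type f -mtime +30 -print > would-delete.txt
cat would-delete.txt
```

Only "./old-demo.sitx" is listed.  Swapping "-print" for "-delete" would remove it, which is the form the script itself uses.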

If you ran the script as-is, there would be quite a few things in the archive you didn't want, like cache files, log files, etc.  To filter out specific things, just add a line to the archivexclude.txt file.  If, for example, you added "/sample/" to the file, the script would never archive any file whose path contains the string "/sample/".  That would exclude "/sample/test.doc", "/Library/sample/test.doc", and "/usr/test/docs/sample/thisfile.txt", but would NOT exclude "/example/samples/test.doc", because "/samples/" does not match "/sample/".  Similarly, you could exclude all QuickTime movies by adding merely ".mov" to the list, which means anything with ".mov" in its path isn't archived.  Note that the script hands these lines to grep as patterns, so a "." actually matches any single character, and ".mov" will also catch a name like "removal.txt".
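The matching rule can be checked on a few made-up paths.  The sketch below assumes a scratch directory /tmp/archive-demo and hypothetical file names (exclude-demo.txt, candidates.txt, kept.txt), none of which are part of the script.  It applies the same "grep -v -f" filter the script uses, with plain commands that behave the same in csh and sh.

```shell
# Hypothetical scratch directory and file names, used only for this demo.
mkdir -p /tmp/archive-demo
cd /tmp/archive-demo

# A one-line exclude file containing the "/sample/" pattern.
printf '%s\n' '/sample/' > exclude-demo.txt

# Four candidate paths, as they might appear in filelist.txt.
printf '%s\n' \
  '/sample/test.doc' \
  '/Library/sample/test.doc' \
  '/usr/test/docs/sample/thisfile.txt' \
  '/example/samples/test.doc' > candidates.txt

# -f reads the patterns from the exclude file; -v keeps only the
# lines that match none of them.
grep -v -f exclude-demo.txt candidates.txt > kept.txt
cat kept.txt
```

Only "/example/samples/test.doc" is left in kept.txt, since it is the one path that does not contain "/sample/".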

To use the script, copy and paste the code into TextEdit.  You will need to make some minor tweaks to the code to point to where you'd like your archive files to be written, how you might like them named, etc.  I've commented the code as much as I feel I can, to help you understand what the script is doing and why.

As with all the scripts on this site, this one is provided as-is without any kind of warranty or support.  If you can get it to work in your environment, fantastic.  If not, I don't promise to help you or fix the script.  If it results in a loss of data or any other kind of loss, that is the risk you elected to take when you tried out the script or made your own modifications to it.  I take no responsibility for anything good or bad this script does for or to you.

This script requires that Stuffit Deluxe is installed, that the C shell is available, and that the directory specified in the "set archivelocation" statement actually exists.  The script will intentionally ignore some files and directories (based on my specific needs for it to do so) and may archive a lot of junk you don't want archived.  You'll need to tinker with it to make it suit your needs.  I won't do that for you.

Save this as "archive" and make it executable.  Run it from the command line or a cron task by entering the command "csh archive".
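For the cron route, a crontab entry along the following lines is one possibility.  Treat it purely as a sketch: the 2:00 AM schedule and the location of the saved script (/Library/CASAdmin/archive) are assumptions, and the output file name is only a guess borrowed from the "cronoutput.txt" entry in the exclude list.

```
# minute hour day-of-month month day-of-week command
# Hypothetical schedule and paths; adjust both to your own setup.
0 2 * * * /bin/csh /Library/CASAdmin/archive > /Library/CASAdmin/cronoutput.txt 2>&1
```

Redirecting the output to a file keeps the script's banner and progress messages around for troubleshooting.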

#!/bin/csh
#
# This script is designed to maintain an archive of changed documents
# on the system.  It is intended to run each night as a cron task and
# will scan all local drives for recently changed files.  If it finds
# any (see the "find" command below for the exact time window), it
# adds the file(s) to a Stuffit Deluxe archive in the location
# specified in the variable "archivelocation".  It names the Stuffit
# archive files with the "machinename" variable, followed by a ".",
# followed by the current date and time in the format
# "MMDDYYYY_HHMMSS" so that multiple executions won't generate files
# of the same exact name.
#
# At the end of each run, the script also removes archive files older
# than the number of days set in "archiveretention" (30 by default).
#
# Updated: July 28, 2005
# By: Michael Salsbury
#
# When implementing a new machine, adjust "machinename" to match the
# "official" name of the machine (no spaces or special characters
# are permitted, other than "." or "_").
#
set machinename = "`hostname -s`"
#
# This line creates a unique suffix for the file based on the current
# date in 2-digit numeric format for month and day, 4-digit format for
# year, and two-digit format for hours, minutes, and seconds.
#
set uname = `date +"%m%d%Y_%H%M%S"`
#
# This variable specifies a location where archive files created by
# the script should be stored.  Normally this location will be in a
# folder called "archives" on the "Work" drive.
#
set archivelocation = "/Volumes/Work/archives"
#
# Set the working directory where the script will place temporary
# files as it executes.
#
set workdir = "/Library/CASAdmin"
#
# Set the name and location of the "exclude file" containing the
# path names and partial names that will be used to exclude any
# files from the archive that we don't want.
#
set arcexcludefile = "$workdir/archivexclude.txt"
#
# How many days' worth of archive files should we keep?
# Below is 30 days' worth. Change to any value you like.
#
set archiveretention = "30"
#
# Print banner information for user, useful in troubleshooting
# later on.
#
echo " "
echo "Nightly Archive Script"
echo " "
echo "Running on machine: $machinename"
echo "Archive uname:      $uname"
echo "Archive Location:   $archivelocation"
echo "Work Directory:     $workdir"
echo "Exclude file:       $arcexcludefile"
echo " "
if (-e "$arcexcludefile") then
   echo "Found the archive exclude file."
else
   echo "*** PROBLEM: Didn't find the archive exclude file! ***"
   # Exit with a non-zero status so cron and callers see the failure.
   exit 1
endif
if (-e "$workdir") then
   echo "Found the working directory."
else
   echo "*** PROBLEM: Didn't find the working directory! ***"
   exit 1
endif
# Make sure "hostname -s" actually gave us a machine name.
if ("$machinename" == "") then
   echo "*** PROBLEM: Couldn't identify the machine name! ***"
   exit 1
endif
if (-e /usr/local/bin/stuff) then
   echo "Found Stuffit Deluxe in /usr/local/bin/stuff as expected."
else
   echo "*** PROBLEM: Didn't find /usr/local/bin/stuff ! ***"
   exit 1
endif
#
# Before we begin, we check to see if either of our temporary files
# (filelist.txt and templist.txt) exist. If so, we delete them.
#
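if (-e "$workdir/filelist.txt") then
   echo "Removing old filelist.txt file."
   # Remove the same file we just tested for.  A stale list would
   # otherwise be appended to by the "find" commands below.
   rm -f "$workdir/filelist.txt"
endif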
if (-e "$workdir/templist.txt") then
   echo "Removing old templist.txt file."
   rm -f "$workdir/templist.txt"
endif
echo " "
#
# Change to the Volumes directory so we can see local disks.
#
echo "Doing a 'cd' to /Volumes"
cd /Volumes
#
# For each volume in the Volumes directory, we'll execute a
# find in the relevant subdirectories to look for updated files.
#
foreach dir (*)
  # Display the current volume name on screen
  echo "Looking at /Volumes/$dir"
  set mpt = `df "/Volumes/$dir" | grep /dev/`
  if ($status != 0) then
     echo "/Volumes/$dir isn't local, skipping it."
     echo " "
     goto skipit
  else
     echo "/Volumes/$dir is local. Scanning for files to archive."
     echo " "
  endif
  # Change to the current volume.
  cd "/Volumes/$dir"    
  #
  # For each subdirectory on the volume, we'll do various things.
  #
  foreach subdir (*)
    # Display the name of the subdirectory
    echo "  $subdir"
    # Check to see if the disk volume is local. If so, a "df" command
    # results will include the string "/dev/".  Network and other
    # volumes will not contain this string.
    set mpt = `df "/Volumes/$dir/$subdir" | grep /dev/`
    # If we don't find "/dev/" in the above, the system variable
    # $status (which is similar to an MS-DOS ERRORLEVEL) will be
    # something other than zero.
    if ($status == 0) then
     #
     # Based on the name of the subdirectory, we're going to either
     # look for updated files, or not.
     #
     switch ("$subdir")
      # Volumes would get us in an infinite loop, so we skip it
      # if it exists.  An implication of this is that if the user
      # creates a "Volumes" directory and puts work in it, we won't
      # archive it.
      case "Volumes":
      # Network would have us trying to find changed files on all
      # systems on the LAN. This would also be bad.
      case "Network":
      # .vol is a system directory so we don't check it.
      case ".vol":
      # Applications directories can't be written to by the user, so
      # we don't check those either.
      case "Applications":
      case "Applications (Mac OS 9)":
      # Desktop DB and DF are used internally by OS X so we don't want
      # to archive them.
      case "Desktop DB":
      case "Desktop DF":
      # System and System Folder are OS directories and are not user
      # writable, so we ignore those.
      case "System":
      case "System Folder":
      # The directories between this comment line and the next are
      # system directories and files we don't want to archive.
      case "Library":
      case "private":
      case "automount":
      case "bin":
      case "dev":
      case "cores":
      case "mach":
      case "mach.sym":
      case "mach_kernel":
      case "sbin":
      case "usr":
      case "var":
      case "File Transfer Folder":
      case ".Trashes":
      case "etc":
      case "tmp":
      case "Temporary Items":
      case "Cleanup at Startup":
      # The file below is created by Norton Antivirus. We don't want to
      # archive it either.
      case "NAVMac800QSFile":
      # The folder below is CAS created and there is no need to archive it.      
      case "CAS OS X Documentation":
        # For all the above folder names, print a message showing you
        # won't be archiving it, and break out of the "switch" statement.
        echo "  Skipping $subdir..."
        breaksw
      #
      # If we get to here, it's not a directory name we recognize as
      # worth skipping, so we will find all files modified here in the
      # last day and archive them.
      #
      default:
        #
        # Let the user know what we're doing.
        #
        echo "  Finding recently modified files in /Volumes/$dir/$subdir..."
        #
        # Find all items in the current subdirectory of the current
        # volume, of type "file", modified less than two days ago
        # (-mtime -2), print out the full path to them, and append the
        # list to the file filelist.txt in the work directory.  This
        # will result in filelist.txt containing a single long list of
        # all recently changed files on the system, which we'll later
        # use as input to Stuffit Deluxe.
        #
        find  "/Volumes/$dir/$subdir" -type f -mtime -2 -print >> "$workdir/filelist.txt"
        # Skip a line in the on-screen output.
        echo "  "
     endsw
    endif

  end
skipit:
end
#
# Change to the working directory.
#
cd "$workdir"
#
# Sort the contents of the file list into alphabetic order,
# removing any duplicates (-u), and place the result in the
# file "templist.txt".
#
/usr/bin/sort -u "$workdir/filelist.txt" -o "$workdir/templist.txt"
#
# Forcibly remove the filelist.txt file.
#
rm  -f "$workdir/filelist.txt"
#
# Copy the (sorted) templist to filelist.txt
#
cp "$workdir/templist.txt" "$workdir/filelist.txt"
#
# Remove all entries from the list that are contained in the
# exclude file, placing the filtered file in templist.txt in
# the work directory.
#
grep -v -f "$arcexcludefile" "$workdir/filelist.txt" > "$workdir/templist.txt"
#
# Don't include the existing archives in the archive!
#
grep -v "$archivelocation" "$workdir/templist.txt" > "$workdir/filelist.txt"
#rm -f "$workdir/templist.txt"
#
# Call Stuffit Deluxe to stuff, in SITX format, overwriting if needed,
# highest level of compression (16), "most efficient" method of
# compression (6), optimized, highest redundancy level, all files in
# the list "filelist.txt", into a file name based on the machine name,
# date, and time, in the archivelocation directory.
#
/usr/local/bin/stuff -f sitx -o -l 16 -m 6 -O on -r 64 -i "$workdir/filelist.txt" -n "$archivelocation/$machinename.$uname.sitx"
#
# Keep a copy of the final file list (the one Stuffit was given) next
# to the archive, so we can see later what went into it.
#
cp "$workdir/filelist.txt" "$archivelocation/$machinename.$uname.filelist.txt"
if (-e "$workdir/filelist.txt") then
   rm -f "$workdir/filelist.txt"
endif
if (-e "$workdir/templist.txt") then
   rm -f "$workdir/templist.txt"
endif
echo " "
echo "Cleaning up old archive files..."
find  "$archivelocation" -name "*.sitx" -type f -mtime +$archiveretention -delete
echo " "
echo "*** DONE ***"



The archivexclude.txt file contained the following lines:

/.DS_Store
/Users/admin/
/Users/root/
cronoutput.txt
filelist.txt
/File Transfer Folder
.cache
/Recent Servers/
.log
/Adobe/FileBrowser/
/Acrobat User Data/
.Xauthority
/.Trash/
diskutil.txt
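
One thing to keep in mind with entries like ".log" and ".cache": grep reads each line of the exclude file as a regular expression unless told otherwise, so the leading "." matches any character.  The sketch below illustrates the difference using a hypothetical scratch directory (/tmp/archive-demo) and made-up paths.  The "-F" option shown in the second command is NOT used by the script; it is only one possible change if you prefer literal matching.

```shell
# Hypothetical scratch directory and file names, used only for this demo.
mkdir -p /tmp/archive-demo
cd /tmp/archive-demo

printf '%s\n' '.log' > exclude-demo2.txt
printf '%s\n' \
  '/Users/pat/catalog.txt' \
  '/Users/pat/build.log' \
  '/Users/pat/notes.txt' > candidates2.txt

# As in the script: patterns are regular expressions, so ".log" also
# matches the "alog" inside "catalog".
grep -v -f exclude-demo2.txt candidates2.txt > kept-regex.txt

# With -F the patterns are fixed strings, so only a literal ".log" matches.
grep -v -F -f exclude-demo2.txt candidates2.txt > kept-fixed.txt

cat kept-regex.txt
cat kept-fixed.txt
```

Here kept-regex.txt holds only "/Users/pat/notes.txt", while kept-fixed.txt holds both "/Users/pat/catalog.txt" and "/Users/pat/notes.txt".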