Tape Backup Page

  1. Tape and Log File Naming Scheme
  2. Standard Tape Commands
  3. Doing/Checking the Backup
  4. Backup Scripts
  5. Troubleshooting
  6. Revision History


Tape and Log File Naming Scheme

Tapes are usually named obsrun_YYYYMM_tapeT_copyC, where YYYYMM is the year and month of the start of the run, T is the tape number (one run may require multiple tapes) and C is the copy number (we make two copies).  The naming is not known to the tape itself (the tapes do not have volume labels), but serves as a means to keep things organized.

When backup_day does a backup, the log file is named

    allegro:/home/observer/backup/log/backup_day_YYYYMMDD_yyyymmdd_hhmm

where YYYYMMDD is the date of the data being backed up and yyyymmdd and hhmm give the date and time of the start of the backup.  All dates and times are UT because allegro's clock is set to UT.

For each tape, we usually create a file with the same name as the tape in

    allegro:/home/observer/backup/log

and list in that file the backups done to that tape by the log file name.
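
As a rough illustration of the naming scheme, the shell functions below assemble a tape name and a backup log file name from their parts.  The function names tape_name and backup_log_name are hypothetical helpers made up for this example; they are not part of the backup scripts.  When no start time is given, the log name uses the current UT time from date -u.

```shell
# Hypothetical helpers illustrating the naming scheme; not part of the backup scripts.

# tape_name YYYYMM T C  ->  obsrun_YYYYMM_tapeT_copyC
tape_name() {
    printf 'obsrun_%s_tape%s_copy%s\n' "$1" "$2" "$3"
}

# backup_log_name YYYYMMDD [yyyymmdd_hhmm]
# The optional second argument is the backup start time;
# it defaults to the current UT date and time.
backup_log_name() {
    start=${2:-$(date -u +%Y%m%d_%H%M)}
    printf 'backup_day_%s_%s\n' "$1" "$start"
}
```

For example, tape_name 200305 2 1 prints obsrun_200305_tape2_copy1 (the run date here is made up).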


Standard Tape Commands

The generic unix/linux tape utility program is called mt (for magnetic tape).  It lets you check tape status and navigate through a tape.  Its syntax is

     mt -f TAPE command (count)

where TAPE is the tape device name, command is one of a variety of tape commands, and count is a parameter required by some of the commands.  The tape drive on allegro is /dev/nst0.  Some typical commands you will use are

Command                        Explanation
mt -f /dev/nst0 rewind         rewinds the tape
mt -f /dev/nst0 offline        rewinds and ejects the tape
mt -f /dev/nst0 status         checks status -- is there a tape in the drive, where on the tape is the head positioned (file number, block number, partition number)

Many more commands are available; type man mt to see them.
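
If you only want the current file number from the status output, a small filter like the one below may help.  This is a sketch: tape_file_number is a hypothetical helper, and it assumes the status output contains a line of the form "File number=3, block number=0, partition=0.", which is what the Linux mt-st version of mt prints.  Other versions of mt may format this differently.

```shell
# Hypothetical helper: read "mt status" output on stdin and print the file number.
# Assumes a line of the form
#   File number=3, block number=0, partition=0.
# (the mt-st format); prints nothing if no such line is found.
tape_file_number() {
    sed -n 's/^File number=\(-*[0-9][0-9]*\),.*/\1/p'
}

# Usage (requires a tape in the drive):
#   mt -f /dev/nst0 status | tape_file_number
```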


Doing/Checking the Backup

  1. Which tape:

  2. At any terminal logged in to allegro as observer, cd to ~/backup/scripts/ and type

         backup_day YYYYMMDD >& temp.log &


    where YYYYMMDD is the UT day of the data just taken.
     
  3. You can monitor the backup by typing

         tail -f temp.log

    Note the name of the log file mentioned in temp.log -- you will check it later.  To see the entire log file, type

         more temp.log

  4. When the backup is done, the tape will be automatically rewound.  If the backup ends with more than 30 GB on the tape, then the tape will also be automatically ejected.

    NOTE: if you try to back up to a tape that already has more than 30 GB on it, then the tape will be ejected without writing the backup.  This will be clearly indicated in temp.log, so check temp.log to make sure that the backup was actually written!  If not, you need to repeat with a blank tape.

  5. Check the log file indicated in temp.log for errors.  One quick way to do this is to type

         grep "tar:" logfilename

    which will list any errors.  Note that you will usually get the message

         Removing leading `/' from member names

    when you do this; ignore this message.  Notify Sunil Golwala of any errors.  You can check the amount of data written by just scrolling to the bottom of the log file; there will be a line near the end of the form

         Total bytes written: 2423296000 (2.3GB, 2.7MB/s)

    You should get 2.5 to 3 GB for a full night of data. 

  6. If the copy 1 backup has completed successfully, you can start the copy 2 backup.  Eject copy 1 using the offline command, then insert copy 2 and repeat the above.  Note that, if copy 1 was full and you had to start a new tape, then this will be true for copy 2 also and you can just start a fresh tape for copy 2 without trying to write to the full tape.

  7. If your backup filled up the tape, inform Sunil Golwala so he can run a verification on the backup.  Provide the full tape name, including the copy number, and where the tape is located.  Once the full tape is verified, copy 1 will go into the summit archive and copy 2 will go to Hilo.
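
The log checks in step 5 can be wrapped in two small shell functions, sketched below.  The names log_errors and log_bytes are hypothetical and are not part of the backup scripts.  The sketch assumes the log contains tar's usual "Removing leading `/' from member names" notice and a "Total bytes written:" line as shown above.

```shell
# Hypothetical helpers for inspecting a backup_day log file.

# List "tar:" lines, skipping the harmless "Removing leading `/'" notice.
# Prints nothing if the log contains no other tar messages.
log_errors() {
    grep 'tar:' "$1" | grep -v 'Removing leading'
}

# Print the byte count from the last "Total bytes written:" line in the log.
log_bytes() {
    sed -n 's/.*Total bytes written: \([0-9][0-9]*\).*/\1/p' "$1" | tail -n 1
}
```

Run log_errors logfilename first; if it prints nothing, log_bytes logfilename gives the number to compare against the expected 2.5 to 3 GB for a full night.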

Backup Scripts

A set of backup scripts has been written to make doing and checking backups relatively easy.

backup_day YYYYMMDD

This is the script used above; most users will need no others.  It backs up the specified day, appending to the end of the tape.

backup_day_overwrite YYYYMMDD N

This is a useful but dangerous script.  It backs up the specified day, but it first fast-forwards past the first N backups on the tape and begins writing there, so it can overwrite data already on the tape.  You should only use this script when you intentionally want to overwrite a backup (e.g., there was some problem with a backup and you want to try to rewrite it).  Since tapes are not random access devices, be aware that overwriting will destroy not only the data that is overwritten, but also any data after it -- the new backup writes an end-of-data mark at its end, so any data past that point on the tape is likely inaccessible.
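
How backup_day_overwrite positions the tape is not shown on this page.  One plausible way to skip N backups with standard mt commands is a rewind followed by fsf N (forward space over N filemarks), assuming each backup occupies one tape file.  The sketch below only prints the commands rather than running them; TAPE and skip_commands are illustrative names, not taken from the real script.

```shell
# Sketch only: print (do not run) the mt commands that would position the
# tape just past the first N backups, assuming one tape file per backup.
# TAPE and skip_commands are illustrative names, not taken from the real script.
TAPE=${TAPE:-/dev/nst0}

skip_commands() {
    n=$1
    echo "mt -f $TAPE rewind"
    # fsf COUNT spaces forward over COUNT filemarks; nothing to skip when N is 0
    if [ "$n" -gt 0 ]; then
        echo "mt -f $TAPE fsf $n"
    fi
}
```

Printing the commands first gives you a chance to compare N against the list of backups recorded for the tape before anything is overwritten.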

multiday_backup

This isn't really a formal script; it's just an example of a script that can string together a set of backup_day commands.  You may want to do this if you need to catch up on a few days of backups, or you ran into a corruption problem with a tape and want to write a new copy of the tape.  Feel free to modify as necessary.
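
A catch-up loop of this kind might look like the sketch below.  It is not the contents of multiday_backup.  The function name multiday and the variable BACKUP_CMD are made up for this example, and stopping at the first failure assumes that backup_day exits with a non-zero status when something goes wrong, which this page does not state.

```shell
# Sketch of a catch-up loop; not the actual multiday_backup script.
# BACKUP_CMD defaults to backup_day; set it to "echo backup_day" for a dry run.
BACKUP_CMD=${BACKUP_CMD:-backup_day}

# multiday YYYYMMDD [YYYYMMDD ...]
multiday() {
    for day in "$@"; do
        # Stop at the first failure so later days are not written after a bad
        # backup (assumes the backup command exits non-zero on failure).
        $BACKUP_CMD "$day" || return 1
    done
}
```

With BACKUP_CMD set to "echo backup_day", calling multiday with a list of days simply prints the backup_day commands that would be run, in order.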

multiday_backup_overwrite

This is like multiday_backup, but uses backup_day_overwrite.  Again, use with extreme care!

check_backup

This is a script to verify a tape.  It spins through the tape, doing tar -t on each archive on the tape to check that the archive is fully readable.  For each file that it checks, it creates a log file with the name

LOG_STUB_YYYYMMDD_HHMM

where LOG_STUB is the calling argument and YYYYMMDD_HHMM is the date and time at which the verification of that particular file starts.  A global log with filename

LOG_STUB

is also written; it contains the start position, end position, and size of each file on the tape.  It is sensible to use the following for LOG_STUB:

for copy1: ~/backup/log/verify1/TAPE_NAME
for copy2: ~/backup/log/verify2/TAPE_NAME

where TAPE_NAME is the tape name of the form obsrun_YYYYMM_tapeT_copyC.

The main difficulty with the routine is that, in order to do an explicit comparison of the verification logs to the original logs, you have to figure out which backup logs correspond to which files.  You can usually do this by comparing the Total bytes written: line of each of the original log files (use grep to find the lines) to the file size in the global log file LOG_STUB.  Either the file sizes for the two copies written on a given day are identical, in which case it doesn't really matter which one you compare to, or the file sizes are different and you can use the file size to determine which log goes with which tape.

However, it's not really necessary to do such an explicit comparison.  If the verification executes without any errors and the file sizes are roughly correct (1-3 GB per day), then the archives on tape are readable and you shouldn't have any worries.
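
If you do want to match logs by size, a helper along the following lines can do the grep for you.  This is a sketch: logs_with_size is a hypothetical name, and it assumes that the size you read from the global log LOG_STUB is a byte count directly comparable to the number on the Total bytes written: line.

```shell
# Hypothetical helper: list the backup logs whose "Total bytes written:" value
# equals a given size in bytes (as read from the global verification log).
# Usage: logs_with_size SIZE LOGFILE [LOGFILE ...]
logs_with_size() {
    size=$1
    shift
    for log in "$@"; do
        # The trailing space keeps 242329600 from matching 2423296000.
        if grep -q "Total bytes written: $size " "$log"; then
            echo "$log"
        fi
    done
}
```

For example, logs_with_size 2423296000 ~/backup/log/backup_day_* would list every backup log reporting exactly that many bytes.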


Troubleshooting

In general, the only problems you will have are problems writing or reading tapes -- the backup routines themselves are simple enough, and have been used enough, that they are robust.  If you have trouble writing a backup, you can try writing it again using backup_day_overwrite.  If it still fails, try a new tape; remember that you will then need to rewrite all the backups on the tape, which you can do with multiday_backup.  If you still have problems, something is probably wrong with the drive and you should contact Sunil Golwala.


Revision History


Questions or comments?  Contact Sunil Golwala, golwala@astro.caltech.edu