CATLG User's Guide

C. Ma
K. Baver

Abstract:

Table of contents:

1   Introduction

2   Catalog contents

3   Description of program Catlg

4   Catlg operations, interactive mode

4.1   CC - change or list data base disk directories
4.2   CL - clean up after an improper ending
4.3   CR - create a new experiment key and chain
4.4   DE - delete experiment or versions from catalog
4.5   DT - disk to tape operations
4.6   FS - count free space in catalog
4.7   LI - list contents of data base catalog
4.8   LT - list data base tapes
4.9   LU - change CATLG listing destination
4.10   MO - move data base files from one disk LU to others
4.11   OB - obsolete experiment operations
4.12   PU - purge active versions
4.13   TD - tape to disk operations
4.14   TL - label archive tape
4.15   US - user key operations
4.16   MV - move individual data base files between directories
4.17   SD - Select tape drive
4.18   IM - Import (add) a data base from disk

5   CATLG OPERATIONS, BATCH MODE

5.1   DE - batch deletion of ***KEY*** from catalog
5.2   DE input batch command line
5.3   DE output batch status line
5.4   IM - batch import of databases into catalog
5.5   IM input batch command line
5.6   IM output batch status line
5.7   Data base tape archive libraries
5.8   Description of catlg's "archv" subroutines

6   Historical Notes (transfer from the F-processor to UNIX)

6.1   GLOSSARY

1   Introduction

VLBI databases are tracked through the database catalog, which records each database's status (complete or in progress ("pending")), its current location on disk and/or on tape, and other information. Catlg is the high-level access program that lets users manipulate databases (e.g., move them between two disk directories) and updates the catalog to reflect the changes. Catlg also permits users to manipulate the catalog (e.g., add disk directories to the catalog system) without knowing low-level details such as the catalog's structure. This document presents information about Catlg and describes its use.

Readers should note that this is a preliminary version of the user's guide, revived from 1992. Catlg and the database catalog change little over time, so this guide should provide a good general understanding of Catlg and its usage. However, some newer features are still undocumented, and others may differ slightly from their descriptions here. The guide is being released in the hope that it will help users more than an undocumented program would, and users should be able to infer the proper Catlg steps from it. Users are very welcome to e-mail the contact person (Karen Baver, kdb@leo.gsfc.nasa.gov) with questions about running Catlg that this guide does not answer.

2   Catalog contents

Each Solve center may have its own catalog, which tracks which of that center's databases are "active" (on disk) or archived to tape. Each experiment has a chain in the catalog and is assigned a 10-character key of the form ayymmmddbc, where:

a   - the prefix that identifies the center. The following table lists the standard prefixes for data base keys that were assigned to selected centers in the past. However, most, if not all, centers now use a $.

      Computer   Prefix
      Goddard    $
      MIT        /
      NGS        @
      Onsala     >
      SAO        {S

yy  - last two digits of the year the experiment began.
mmm - first 3 letters of the month the experiment began.
dd  - two digit day of the month the experiment began.
bc  - up to two arbitrary characters. Some standard examples are:

      "X"    X-band data
      "S"    S-band data
      "XU"   Intensive X-band data

The entries on each chain contain information about the versions of an experiment's data base. Each entry includes:

1. The experiment name.
2. The first 80 characters of the version's history.
3. The disk directory on which the version's file is located, represented by a 1-2 digit number corresponding to that directory in the catalog. Inactive versions (versions not on disk) store a 0.
4. The tape number on which the version was last archived, and its position on that tape. (E.g., n for the nth data base version on the tape.)
5. A pending/blocking flag. (1 = pending - the data base is currently being updated by CALC, etc. 2 = blocking - virtually obsolete)
6. The version number.
7. The length of the disk file in words.
8. The dates of the data base file's creation and addition to the catalog.

Besides experiment data base files containing VLBI observations, there are data base files which contain log information ('bc' is "Lx") and ephemeris data ('bc' is "EP"). Two special data bases exist:

1) $$SKELETON, which contains all a priori information.
2) BENCH MARK, which is a dummy experiment for program verification.

In addition to chains related to experiments, there are chains linking entries with specific attributes:

1) active database versions (versions on disk)
2) versions located on a particular disk directory
3) versions located on a particular archive tape
4) pending/blocking versions

Experiments themselves are linked to either an active or obsolete chain. User chains can also be created.
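The key layout ayymmmddbc can be illustrated with a short Python sketch. The function names build_key and parse_key are hypothetical and are not part of Catlg. The sketch assumes that the month is written as three upper-case letters and that a one-character 'bc' field simply yields a shorter string; this guide does not say how Catlg pads such keys.

```python
from datetime import date

MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]


def build_key(prefix, start, bc):
    """Assemble a key of the form a yy mmm dd bc from an experiment start date."""
    if len(prefix) != 1:
        raise ValueError("prefix must be a single character")
    if not 1 <= len(bc) <= 2:
        raise ValueError("bc must be one or two characters")
    month = MONTHS[start.month - 1]
    return f"{prefix}{start.year % 100:02d}{month}{start.day:02d}{bc}"


def parse_key(key):
    """Split a key into its prefix, year, month, day and bc fields."""
    month = key[3:6].upper()
    if month not in MONTHS:
        raise ValueError(f"unrecognized month in key: {key!r}")
    return {
        "prefix": key[0],
        "yy": key[1:3],
        "mmm": month,
        "dd": key[6:8],
        "bc": key[8:10].strip(),  # assumption: trailing blanks are padding
    }


print(build_key("$", date(2001, 1, 18), "XU"))
print(parse_key("$01JAN18XU"))
```

Special keys such as $$SKELETON do not follow the date layout, so parse_key raises ValueError for them instead of guessing.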

3   Description of program Catlg

CATLG allows a user to access and change information in the data base catalog and to manipulate data base files. CATLG provides for:

1. Listing or changing the data base directories. (command CC)
2. Correcting the catalog after a program failure involving the data base handler. (command CL)
3. Creating an experiment key. (command CR)
4. Deleting an experiment or version. (command DE)
5. Copying files from disk to tape. (command DT)
6. Counting the free space in the catalog. (command FS)
7. Adding a disk data base file to the catalog as a new key. (command IM)
8. Listing information in the catalog. (command LI)
9. Listing the data bases on a tape. (command LT)
10. Changing the listing destination (terminal or dump file). (command LU)
11. Moving data base files from one disk directory to another. (commands MO,MV)
12. Changing an experiment to and from an obsolete status. (command OB)
13. Purging files from disk. (command PU)
14. Copying files from tape to disk. (command TD)
15. Labeling archive tapes. (command TL)
16. Creating a user chain and hanging entries for experiment versions of particular interest on it. (command US)

Two levels of subroutines support CATLG and its subroutines. The higher level routines are in the qcat4 library (qcat4.a). The lower level ones are in the chain4 library (chain4.a).

Catlg is generally run in interactive mode. However, a limited number of commands are supported in batch mode. Available commands are described in a later section (Catlg operations, batch mode).

4   Catlg operations, interactive mode

To use CATLG, type catlg in lower case. The program prints a header and a message for opening the data base catalog. If opening does not complete and the program waits, another user may have the catalog. After opening is complete, the program shows any news (read from a file specified by the catlg_news parameter in the catalog_parameters.i include file), then the number of free entries in the catalog. Then CATLG prompts the user for a command. "??" or an unrecognized command will cause a list of valid commands to be printed. "::" or "OF" as the reply to "Catalog command ?" terminates CATLG. Although most CATLG questions prompt the user for upper case responses, CATLG also accepts lower case responses. Throughout CATLG, users may specify 0 whenever a version number is solicited, to access the last (newest) version of an experiment.

4.1   CC - change or list data base disk directories

The user is prompted for listing, adding to, or deleting from the list of disk directories which the data base handler may use. A directory cannot be removed from the list if there are data base files active on it.

4.2   CL - clean up after an improper ending

The CL command will clean up after a program ends abnormally while working with a data base file. CATLG prompts for the data base handler mode (read, update, or create), input experiment key and version, and output experiment key and version. The output file is purged, and the catalog pending entry is deleted. The data base handler has some ability to clean up when it ends abnormally, so CL will not always be necessary. The "PEND ENTRY" key can be listed from the LI command's (C)hain mode to look for suspected pending entries left over after an abort. However, CL refuses to delete entries that are not pending, so it is always safe to just run CL.

4.3   CR - create a new experiment key and chain

CATLG prompts for the year, month and day of the experiment (2 digits each), a 2-character identifier and a 36-character experiment description. The 2-character identifier may be arbitrary but generally has an assigned meaning (e.g., XU for x-band intensive). The description should indicate the stations or particular content of the data base to be created. The new experiment name is returned. A new experiment key must be created before using the data base handler in the create mode or in the mode to update onto a new key. The data base handler has no capability of creating a new experiment key in the catalog. CATLG also allows the user to select an arbitrary 10 character key name, instead of a name generated from a year, month and day. To avoid bugs in the catalog system, users should start the key with their center's standard key prefix (e.g., $).

4.4   DE - delete experiment or versions from catalog

This command removes all traces of individual data base versions or data base experiments (including all their versions) from the catalog. It also removes the corresponding data base files from disk. If the user just wants to purge the data base files from disk, but retain the record of the versions in the catalog, he must use the PU command instead. The user is asked for a password. If the password is accepted, the user is asked whether experiments ((K)eys) or versions ((E)ntries) are to be deleted. Before deleting, CATLG writes the experiment description or version history and requires confirmation from the user (unless the user has asked to turn off the confirmation step). If the user is deleting versions, only the last version of an experiment may be deleted. A hidden feature allows the user to input experiments to be deleted via a file. To use this feature, the user should type F when CATLG asks whether to delete a (K)ey or (E)ntry. The file feature deletes the entire experiment or the last version, depending on whether (K)ey (the entire experiment) or (E)ntry (the last version) is selected as a sub-option.

4.5   DT - disk to tape operations

All disk to tape operations involving the data base catalog are invoked through DT. These include archiving data base files or writing such files on a temporary tape for transmission to an external computer. There are three modes:

1) archiving (A)ll versions on disk which are not yet archived. (This mode has not been used in years and is currently disabled.)
2) archiving (P)articular versions within the user's catalog system.
3) writing data base files on a (T)emporary tape.

In the third mode CATLG prompts for an arbitrary tape number and the number of data base files already on the tape. Then the user selects the versions to be written. In the second and third modes CATLG prompts for specific experiments and versions. At any stage in the selection, the operation can be aborted. Then mode 2 automatically selects a tape from the archive library. Messages indicating which tape to mount are then printed on the user's terminal. As each version goes to tape, a confirmation is printed. After the versions are written to tape, the catalog is updated in mode 2 and another confirmation message is printed for each version. CATLG does some checking to make sure the files will not overflow the tape and rejects files which would overflow.

In modes 2 and 3, catlg generates a status file: <auxiliary_dir>/dt_status_<user_id>, where <user_id> is the user's log in id, and <auxiliary_dir> is the directory for auxiliary catlg tape operations set by the AUXDIR parameter in includes/catalog_parameters.i. A line is written for each database version that catlg tries to archive. Each line has the format: key_name version status_code, where status_code is one of:

ACTAC ERROR - Catalog discrepancy. CONTACT A CATLG EXPERT to fix the catalog. However, this is usually a minor error affecting only one database, so catlg is still usable.

ALREADY ARCHIVED - The database has already been archived to tape.

ARCHIVING ERROR - The attempt to archive the database to tape has failed and the tape may be in an incorrect state. The catalog has not been updated for this database. NOTIFY A CATLG EXPERT so s/he can check the tape. Show him/her any output from the terminal.

CAT UPDATE ERR: - The database was archived to tape, but DT failed to update the catalog to indicate this. The catalog may be damaged. ***NOTIFY A CATLG EXPERT IMMEDIATELY*** and show him/her any output from the terminal.

CATALOG LOOKUP ERROR - DT encountered an error while looking in the catalog to make sure the key is catalogued. You should CONTACT A CATLG EXPERT before continuing.

NOT IN CATALOG - The key could not be found in the catalog.

NOT ACTIVE - The catalog thinks the database file is not on disk.

OVERFLOW TAR LIMIT - Some catlg systems archive via tar. Placing too many files in a single tar archive will make accessing the files very slow. So Catlg systems that use tar will not archive more than 500 databases at a time. The flagged database would exceed this limit, so the database has been rejected for this DT call.

REMOTE UNMOUNTED LU - The directory on which the database is located is not mounted on the machine running Catlg. Contact your system administrator to mount the directory.

SUCCESS - The database has been placed on tape, and the catalog has been updated to indicate this.

TOO BIG TO ARCHIVE - The database is too large to be archived to any tape. Contact a Catlg expert for help.

TOO BIG FOR THIS RUN - The database has been eliminated from this DT session because it is too big to fit on the end of the tape. Try again later.
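A script that checks the DT status file after an archiving run might look like the following Python sketch. It is not part of Catlg. It assumes the three fields are separated by blanks and that key names contain no embedded blanks (the guide gives the field order but not the exact column layout), and the sample lines are made up for illustration.

```python
# Status codes for which this guide tells the user to contact a Catlg expert.
EXPERT_CODES = {
    "ACTAC ERROR",
    "ARCHIVING ERROR",
    "CAT UPDATE ERR",
    "CATALOG LOOKUP ERROR",
    "TOO BIG TO ARCHIVE",
}


def parse_dt_status_line(line):
    """Split one status line into (key_name, version, status_code)."""
    # Assumption: blank-separated fields; the status code may itself contain blanks.
    key_name, version, status_code = line.split(None, 2)
    return key_name, int(version), status_code.strip()


def needs_expert(status_code):
    """True when the status code is one that calls for a Catlg expert."""
    return status_code.rstrip(":").strip() in EXPERT_CODES


# Hypothetical sample lines, not taken from a real status file.
sample = [
    "$01JAN18XU 1 SUCCESS",
    "$01JAN18XU 2 ALREADY ARCHIVED",
    "$01JAN19XU 1 CAT UPDATE ERR:",
]
for line in sample:
    key, version, status = parse_dt_status_line(line)
    flag = "  <-- contact a Catlg expert" if needs_expert(status) else ""
    print(f"{key} version {version}: {status}{flag}")
```

Keys with embedded blanks (e.g., BENCH MARK) would need fixed-column parsing instead of blank splitting.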

4.6   FS - count free space in catalog

CATLG searches through the catalog free space chain and counts the number of records left for use.

4.7   LI - list contents of data base catalog

There are seven listing modes:

1) active versions
2) experiment keys
3) user keys
4) entries on a specific chain
5) specific versions
6) tape contents
7) cartridge (data base directory) contents

There are five levels of listing:

a) minimum - only experiment key and version number,
b) description - including experiment description or version history,
c) other - including the disk directory, archive tape, creation date and pend/block flag from the entry,
d) full - all of the preceding,
e) condensed - experiment, version, tape and version history.

In mode 1, all the versions currently active on disk are listed in chronological order. The listing may be limited to a date range.

In mode 2, all the non-obsolete experiments (see OB) are listed in chronological order. The listing may be limited to a date range.

In mode 3, all the keys created by users apart from experiment keys (see US) are listed.

In mode 4, CATLG prompts for keys and lists information for all the entries on the specified chains. The keys may be experiment keys, user keys, or certain internal catalog keys, such as "PEND ENTRY".

In mode 5, CATLG prompts for a specific experiment and version and lists the corresponding catalog entry. Versions may be input interactively or from a file.

In mode 6, Catlg lists data base versions archived to specific tapes.

In mode 7, Catlg lists data base versions residing on specific directories. This mode also reports discrepancies between the catalog and disk - versions not found on the directory but listed as being there by the catalog ('ghosts') and versions not listed in the catalog but found there ('orphans').

In most modes, "br pid" will stop the listing. (Pid is the process id of catlg, as identified by the UNIX ps command.)
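The ghost and orphan report in mode 7 amounts to a two-way set difference between what the catalog lists for a directory and what is actually found there. The Python sketch below illustrates the idea only; it does not show how Catlg itself performs the check, and the function name and file names are hypothetical.

```python
def compare_catalog_to_disk(catalog_files, disk_files):
    """Return (ghosts, orphans) for one data base directory.

    ghosts  - listed by the catalog but not found on disk
    orphans - found on disk but not listed by the catalog
    """
    catalog_files = set(catalog_files)
    disk_files = set(disk_files)
    ghosts = sorted(catalog_files - disk_files)
    orphans = sorted(disk_files - catalog_files)
    return ghosts, orphans


# Hypothetical file names for illustration.
in_catalog = ["01JAN18XU_V001", "01JAN18XU_V002", "01JAN19XU_V001"]
on_disk = ["01JAN18XU_V001", "01JAN19XU_V001", "01JAN20XU_V001"]

ghosts, orphans = compare_catalog_to_disk(in_catalog, on_disk)
print("ghosts: ", ghosts)
print("orphans:", orphans)
```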

4.8   LT - list data base tapes

There are two modes, to list tapes in a catalog system's archive library and to list temporary tapes sent from other Solve centers. CATLG prompts for various things. The most important are the selection of a condensed listing (just the data base versions on the tape) vs. non-condensed (history entries as well) and the numbers of the tapes (arbitrary for temporary tapes). Then LT prints instructions for mounting the tape(s). Listing stops at a double EOF. LT should be used to list the database versions on a temporary tape sent from a different Solve center.

4.9   LU - change CATLG listing destination

This command is used to control the destination of output from most modes of the LI command. The user can switch between terminal and file output or switch from one file to another. (The user selects the file names.)

4.10   MO - move data base files from one disk LU to others

CATLG prompts for the directory to be cleared and lists all versions on the specified directory along with their lengths. The user enters the line number and destination directory for the versions to be moved. The directory to be cleared and the destination directory must be specified by their directory numbers, and both directories must be part of the database catalog system as recognized by the CC command's listing feature. MO can be used to clear contiguous space if a particularly large data base file is to be created or brought from tape.

4.11   OB - obsolete experiment operations

There are three modes:

1) transfer experiment to obsolete status,
2) revive obsolete experiments, and
3) list the obsolete experiments.

In the first two modes CATLG prompts for experiments to be altered. If an experiment is made obsolete, its key is removed from the normal chain of experiments and placed on the obsolete experiment chain. The experiment's entries and versions are not affected. The listing mode can include the experiment descriptions. "br pid" will stop the listing, where pid is the process id of catlg as identified through the UNIX ps command.

4.12   PU - purge active versions

This command purges data base versions from disk and updates the catalog to show that the versions are no longer on disk. This command does not remove the versions' entries from the catalog. The DE command handles that function. CATLG prompts for experiment versions to be purged. The user may enter the versions (I)nteractively or through a (F)ile. There are some safeguards against purging last versions and/or unarchived versions, but users should be careful. This command also purges all data base versions, if the user chooses option A. This option can purge every version or only those that have been archived.

4.13   TD - tape to disk operations

All tape to disk operations involving the data base catalog are invoked through TD. There are three modes, restoring archived versions (R) and adding new versions to the catalog from either a temporary tape from another center (A) or from an archive tape (L) (loading as a new name). In the restore mode CATLG prompts for experiment versions to be restored. The user can enter them interactively or by a file. If a version is already on disk, it is purged. In the adding and loading modes, CATLG prompts for a tape number (arbitrary for temporary tapes) and a list of experiment names and their positions on the tape. The list can be entered interactively or by file. If the user uses a file, he must include two lines per experiment, with the first specifying the experiment and tape position, and the second giving a description of the experiment. (The interactive mode will solicit the description later.) If the experiment is already in the catalog, a blocking entry must exist to accept the new version. CATLG asks for confirmation if such a blocking entry exists. If the experiment does not exist in the catalog, a new experiment chain is created and confirmation is requested. In all modes the tape to disk operation can be aborted any time during the data base selection. When the list is complete, messages indicating which tape(s) to mount are printed on the user's terminal. As each version is written to disk, a confirming message is printed. When all versions have been written, the catalog is updated and messages are printed for each version.

4.14   TL - label archive tape

TL prints records at the start of a tape that identify it as a specific tape within the Solve center's archive library. (The tape is identified by number.) CATLG prompts for the tape number and gives directions for mounting the tape. The tape cannot already have the identifying records for another number within the library.

4.15   US - user key operations

There are four modes:

1) create a user key
2) delete a user key
3) chain entries on a user chain
4) unchain entries from a user chain

In the first mode CATLG prompts for a user key and a 60-character chain description. The user key can be any 10-character string not beginning with "$" or the local experiment key prefix and not conflicting with the internal catalog keys.

In the second mode CATLG prompts for a user key which is then deleted from the catalog. All entries previously on the chain remain on their other chains. Deletion of the user key destroys the corresponding user chain.

In the third mode CATLG prompts for a user key. The user then enters the experiment versions whose entries are to be chained onto the key. Each entry can be chained at the end, at the beginning, or after another specific entry already on the chain.

In the fourth mode CATLG likewise prompts for a user key. The user then must enter experiment versions whose entries are then removed from the specified user chain.

User chains can be created to link versions with particular characteristics.

4.16   MV - move individual data base files between directories

MV prompts for a list of experiment versions to be moved. (Unlike some of the other CATLG commands, this list cannot be entered from a file -- only interactively.) Then MV lists the available data base directories by full path and directory number and prompts for the directory to which all the listed versions should be moved. (The user should specify the directory by its number). MV acknowledges each version by printing its length, moves it and records the new location in the catalog.

4.17   SD - Select tape drive

SD selects the tape archive library CATLG will access until the next use of SD. (CATLG uses information in the catalog to determine which archive library it will initially access.) Specifically SD selects the library, "lu number" and density (which it combines to select the tape drive) and maximum words per tape. SD permits the user to make a choice from all the "lu numbers" and densities in the catalog system, and sets the tape drive, library and words per tape from that. SD also lets the user return to the initial default values from the catalog. This function is virtually obsolete (and therefore may have errors) because every center currently has only one archive library. It will not be disabled, however, because future multiple libraries are currently under consideration. SD also currently sets the "lu number" and density for listing temporary tapes. However, this is a leftover side effect from the days when SD was first created. (It originally set the density etc. for all temporary tape access.) Since then CATLG has been recoded to always solicit the density etc. when the user wants to read from or write to a temporary tape, and at some point it will be recoded to solicit these things when listing a temporary tape as well. Until then, users must use SD to set the density etc. for listing temporary tapes. This is the only valid use of SD.

4.18   IM - Import (add) a data base from disk

IM adds a data base disk file into the catalog. Before using IM, the user must log onto another center's computer, use their CATLG to identify the full paths to data bases he wants and bring them onto his own computer using ftp (in its binary mode). The data bases must come from a UNIX machine. The user must place the data bases in a directory recognized by his CATLG as a data base directory. The file part of each file's name will already be in the form YRMONDYXX_V### (year, month, day, 2 characters (or one character followed by the character "_"), "_V" and a 3 digit version number (zero filled - e.g., 005)). The user must leave it in this form. However, if he wants to rename the data base to another experiment name, he can do so.

IM operates on a single data base. IM first displays the list of available data base directories and their numbers and solicits the directory number identifying the directory where the data base file is located. Next IM solicits the data base's file name (YRMONDYXX_V###). IM reads part of the file and dumps it, asking for confirmation to continue. If the user has renamed the file, IM points out the discrepancy between the new name and the record of the name within the file and asks whether the user truly wants the new name (option D for changing the (D)ata base contents to the new name) or wants to keep the name implied by the old database contents (option F for renaming the (F)ile name). The experiment (key) name must not already exist in the catalog.
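Before running IM, a user may want to check that a transferred file still has a name of the form YRMONDYXX_V###. The Python sketch below is one way to do that. It is not part of Catlg; the function name is hypothetical, and the pattern assumes an upper-case month and treats a trailing "_" in the two-character field as filler for a one-character code.

```python
import re

# yy, mon, dd, two characters (or one character plus "_"), "_V", 3 digit version
FILE_NAME = re.compile(r"^(\d{2})([A-Z]{3})(\d{2})(.{2})_V(\d{3})$")


def parse_db_file_name(name):
    """Return the parts of a data base file name, or None if it does not fit."""
    match = FILE_NAME.match(name)
    if match is None:
        return None
    yy, mon, dd, code, version = match.groups()
    return {
        "yy": yy,
        "mon": mon,
        "dd": dd,
        "code": code.rstrip("_"),  # "X_" stands for the one-character code "X"
        "version": int(version),
    }


print(parse_db_file_name("01JAN18XU_V001"))
print(parse_db_file_name("01JAN18X__V005"))
print(parse_db_file_name("01JAN18XU_V1"))  # version is not zero filled
```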

5   CATLG OPERATIONS, BATCH MODE

The Catlg batch mode lets the user run selected commands from a file, answering all of Catlg's usual questions via the file. The input file must contain one command per line. The status of each command is written to a separate line in a status file. In addition, error messages can be directed to standard output, standard error or an e-mail address. The calling sequence is:

catlg <input_command_file> <output_status_file> {<error_destination>}

In <input_command_file>, each line must have the form:

<catlg_command> <specialized_instructions>

where catlg_command is a two letter command (e.g., IM) and specialized_instructions is a set of instructions and/or input for that command. Only commands described in this section of the catlg user's guide are supported. Subsequent sub-sections of the guide give the actual format for each command's line. <Input_command_file> is a required argument to catlg.

One status line per command is written to <output_status_file>. The form for each command is described in that command's sub-section of this guide. <Output_status_file> is a required argument to catlg.

<Error_destination> is an optional argument. This argument only specifies the destination for messages about the overall execution of Catlg (e.g., a start up error or an error reading the command file). Error messages for each command will be reported as requested in that command's command line. <Error_destination> may have four forms:

terminal       - error messages will be output to the terminal (standard output)
-stderr        - error messages will be output to standard error
e_mail_address - messages will be mailed to this address (e.g., kdb or kdb@leo.gsfc.nasa.gov)
---            - suppresses messages

If no argument is specified, error messages will be suppressed.

In batch mode, catlg returns a status code -- 0 if all commands have been successful or 1 if any error has occurred in any command.
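A batch run can be driven from a script. The following Python sketch builds the argument list from the calling sequence above and treats a return code of 0 as success. It assumes that catlg is on the user's PATH; the helper names and the file names commands.txt and status.txt are hypothetical.

```python
import subprocess


def catlg_batch_argv(command_file, status_file, error_destination=None):
    """Build the argument list: catlg <input> <status> {<error_destination>}."""
    argv = ["catlg", command_file, status_file]
    if error_destination is not None:
        argv.append(error_destination)  # optional third argument
    return argv


def run_catlg_batch(command_file, status_file, error_destination=None):
    """Run catlg in batch mode; True means every command succeeded (status 0)."""
    # Assumption: catlg is installed and found on PATH.
    result = subprocess.run(
        catlg_batch_argv(command_file, status_file, error_destination)
    )
    return result.returncode == 0


# Show the command that would be run, without running it.
print(catlg_batch_argv("commands.txt", "status.txt", "-stderr"))
```

Because a status of 1 only says that some command failed, a caller would still need to read the status file to find out which one.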

5.1   DE - batch deletion of ***KEY*** from catalog

******************WARNING:****************************************
Please note that only the key deletion option is currently supported. Entry deletion is not available, and all entries of the requested key will be deleted from the catalog and purged from disk.
******************************************************************

This command removes all versions of an input key and the key itself from the catalog. It also removes the corresponding data base files from disk. If the user just wants to purge the data base files from disk, but retain the record of the versions in the catalog, he must use the PU command instead.

5.2   DE input batch command line

Each DE command line will only delete one key. The correct format is:

DE <error_dest> <password> <confirmation> <deletion_type> <key_name>

where

<error_dest> is the destination for DE error messages, specified as:
    an e-mail address (e.g., kdb or kdb@leo.gsfc.nasa.gov) - to e-mail the messages to that address
    -stderr - to send error output to standard error
    --- - to suppress error messages

<password> is the value set by the valid_batch_passwd variable in source file catlg4/de.f.

<confirmation> is Y to have catlg ask for confirmation before initiating each deletion OR N to skip confirmation.

<deletion_type> must be K to indicate that the entire key should be deleted.

<key_name> is the name of the key to be deleted (e.g., $01JAN18XU).
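When many keys must be deleted, the command file can be generated rather than typed. The Python sketch below assembles one DE line per key in the field order given above. It assumes the fields are separated by single blanks; the function name is hypothetical, and PASSWORD is a placeholder for the site's actual batch password.

```python
def de_command_line(key_name, password, error_dest="---", confirm=True):
    """Build one batch DE line that deletes an entire key."""
    confirmation = "Y" if confirm else "N"
    deletion_type = "K"  # only key deletion is supported in batch mode
    return f"DE {error_dest} {password} {confirmation} {deletion_type} {key_name}"


# PASSWORD is a placeholder, not a real value.
for key in ["$01JAN18XU"]:
    print(de_command_line(key, "PASSWORD", error_dest="-stderr", confirm=False))
```

The sketch defaults to asking for confirmation, since a DE line removes every version of the key from both the catalog and disk.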

5.3   DE output batch status line

Output lines have the form: <key_name> <status>, where <key_name> is the key that DE tried to delete <status> is failure (for any failure) or success (for complete success) In the case of failure, details are sent to <error_dest>.

5.4   IM - batch import of databases into catalog

IM adds a data base disk file into the catalog. (The key must not already exist in the catalog.) Before using IM, the user must place the data base in a directory recognized by his copy of Catlg as a data base directory. (If the database is transferred to the local file system via ftp, ftp must be run in binary mode.)

5.5   IM input batch command line

Each IM command line will import only one database. The correct format is:

IM <placeholder> <error_dest> <discrep_flag> <db_dir> <db_disk_name>

where

<placeholder> is reserved for an option that may be implemented soon. Any value (e.g., -) may be typed for now.

<error_dest> is the destination for error messages, specified as:
    an e-mail address (e.g., kdb or kdb@leo.gsfc.nasa.gov) - to e-mail error messages to that address
    -stderr - to send error messages to standard error
    --- - to suppress error messages

<discrep_flag> gives instructions for resolving discrepancies between the database file name and its contents. A database file name is supposed to be the name of the experiment and version in format: YYMONDYxx_Vxxx (e.g., 01JAN18XU_V001). IM must resolve any discrepancies between the experiment and version implied by the file name and the experiment and version read from the file contents. In batch mode there are three options:
    F - modify the file name to match the database contents
    D - modify the database contents to match the file name
    E - flag the discrepancy as an error and stop processing this file

<db_dir> is the directory on which the new database file is located. This must be a full UNIX path within the catalog system.

<db_disk_name> is the name of the database file (e.g., 01JAN18XU_V001).
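An IM command file can likewise be generated by a script. The Python sketch below builds one IM line in the field order given above and rejects discrepancy flags other than F, D and E. It assumes single blanks between fields; the function name and the directory /data/dbase are hypothetical.

```python
def im_command_line(db_dir, db_disk_name, discrep_flag="E",
                    error_dest="---", placeholder="-"):
    """Build one batch IM line that imports a single database file."""
    if discrep_flag not in ("F", "D", "E"):
        raise ValueError("discrep_flag must be F, D or E")
    return f"IM {placeholder} {error_dest} {discrep_flag} {db_dir} {db_disk_name}"


# /data/dbase is a made-up directory; use a full path known to the catalog.
print(im_command_line("/data/dbase", "01JAN18XU_V001", error_dest="-stderr"))
```

The default flag E is the conservative choice: a mismatch between file name and contents stops processing of that file and changes neither.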

5.6   IM output batch status line

Output lines have the form: <database_file_name> <status>, where <database_file_name> is the UNIX file name of the database being imported <status> is failure (for any failure) or success (for complete success) In the case of a failure, details are sent to <error_dest>.
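DE and IM status lines share the same shape, a name followed by success or failure, so one small reader can serve both. The Python sketch below is an illustration only. It assumes that the status is the last blank-separated word on each line and that it is spelled "success" or "failure"; the function name and sample lines are hypothetical.

```python
def read_batch_status(lines):
    """Map each key or file name to True (success) or False (failure)."""
    results = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split from the right so a name containing blanks stays intact.
        name, status = line.rsplit(None, 1)
        results[name] = status.lower() == "success"
    return results


# Hypothetical status lines for illustration.
sample = ["01JAN18XU_V001 success", "01JAN19XU_V001 failure"]
results = read_batch_status(sample)
failed = [name for name, ok in results.items() if not ok]
print("failed:", failed)
```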

5.7   Data base tape archive libraries

Catlg allows centers to create DAT tape archive libraries. For backwards compatibility, Catlg also tracks 800, 1600 and 6250 bpi libraries created at specific Solve centers in the past. (Errors are known to exist in the 1600 bpi case.) CATLG currently only allows updates to a single library. However, data bases may be restored from previous libraries provided the center has retained the corresponding tape drives. Tar and non-tar libraries are supported.

Each tape has a tape label as the first file. Archive tapes must be labeled with the CATLG TL command before first use. The label contains the tape number and first use date. Each version archived is written as a separate tape file for non-tar libraries or a collection of data base versions for tar libraries. For non-tar libraries, the disk records are blocked into 1500-word tape records (6144-word records for 6250 bpi) including a checksum. If the DOUBLE_TAPE_REC parameter is set to .TRUE. in the catalog_parameters.i include file, each tape record may be written twice, so that if the first record of a pair is faulty when the tape is later read, as determined by the checksum, the second record can be used. A tape ends with a triple EOF. The DOUBLE_TAPE_REC parameter must be set to .TRUE. or .FALSE. for every tape written within a single library. Setmarks may be added to individual tapes to make restoration of databases quicker.

CATLG checks each tape's label before writing to or reading from it, to protect users from accessing the wrong archive tape. CATLG tracks where specific data base versions are archived and prompts the user to mount the proper tape when the user wants to restore a specific version. CATLG also tracks the amount of tape used up on each tape and automatically selects a tape with enough room when the user wants to archive data bases.

At some point, as time permits, this guide will be updated to give more information about Catlg tape libraries.
Until then, programmers at Solve centers should read the documentation in the catalog_parameters.i include file when setting up the tape libraries for a new catalog system.
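
The record blocking and the optional double writing described above can be pictured with the following sketch. It is not CATLG's code. The checksum formula, the placement of the checksum as the last word of each record, the zero padding of the final record and all function names are assumptions made for illustration; only the 1500-word record length and the idea of falling back to the second copy come from the text.

```python
RECORD_WORDS = 1500  # tape record length in words (6144 for 6250 bpi), from the text


def checksum(words):
    # Hypothetical checksum: sum of the data words modulo 2**16.
    # The real algorithm is not described in this guide.
    return sum(words) % 65536


def block_records(words, record_words=RECORD_WORDS, double=False):
    """Block data words into fixed-length records, each ending in a checksum."""
    data_per_record = record_words - 1  # assumption: one word holds the checksum
    records = []
    for start in range(0, len(words), data_per_record):
        chunk = list(words[start:start + data_per_record])
        chunk += [0] * (data_per_record - len(chunk))  # pad the final record
        record = chunk + [checksum(chunk)]
        records.append(record)
        if double:  # mimics DOUBLE_TAPE_REC = .TRUE.
            records.append(list(record))
    return records


def record_ok(record):
    """True if the stored checksum matches the data words."""
    return checksum(record[:-1]) == record[-1]


def read_records(records, double=False):
    """Recover the data words, using the second copy when the first is bad."""
    step = 2 if double else 1
    words = []
    for i in range(0, len(records), step):
        good = next((r for r in records[i:i + step] if record_ok(r)), None)
        if good is None:
            raise OSError(f"no readable copy of record {i // step}")
        words.extend(good[:-1])
    return words


data = list(range(3000))
tape = block_records(data, double=True)
tape[0][5] += 1  # simulate a faulty first copy
assert read_records(tape, double=True)[:len(data)] == data
```

The reader must be told whether records were doubled, which is one way to see why the setting has to stay the same across a library. The recovered words include the padding of the last record, so a caller would trim them to the known file length.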

5.8   Description of catlg's "archv" subroutines

The ARCHV subroutines ("ARCHV") are used as CATLG's main interface to the tape drive. Three of the four tape commands (DT, TD and LT) use "ARCHV". Only TL accesses the tape drive directly. Because the ARCHV routines were once collected into a separate program, CATLG passes some of its information between DT, TD and LT and "ARCHV" via files.

The file for the save mode (DT) begins with a record identifying the tape and starting file. The subsequent records have the full paths to the data bases and their experiment keys and version numbers. "ARCHV" updates the file with the tape file position to which each data base version was written.

In the file for the restore mode (TD), each record identifies a data base version to be restored. Each record lists the tape (and position) where the version is archived, the 14-character file name it will have on disk, and its experiment key and version number. After each data base is restored, its record is updated with the actual disk file name it was given and the directory where it was placed (represented by its number). In addition, the length of each file in words is written in the restore file. During the restore, the file name on tape is checked against the file name to be used on disk. If the version is being restored (as opposed to added or loaded; see TD), the two names must match for the copy to proceed.

In the file for the list mode (LT), each record lists a tape number to be listed.

All three types of files are ordinarily written, used and purged without bothering the user. However, if the tape action fails, CATLG tries to preserve the file for debugging purposes. The files are located in /tmp and are called abcsave, abcrstr and abclist, where abc are initials solicited by DT, TD and LT.

"ARCHV" checks tape labels and drive status before proceeding with the operations. Consequently the instructions given on the user's terminal should be followed explicitly as to tape number and write ring status.

At the moment, "ARCHV"'s first message asks if the tape drive is a DAT (DDS-format). Next, in restore mode, "ARCHV" asks if the user wants to select an lu (directory) where (A)ll his data bases will be restored or if he wants to let CATLG select a (D)efault directory. For (A), the actual directory is selected later. In list mode, "ARCHV" asks if the user wants a condensed listing (one with just the experiments and versions) or not (one with history entries as well). Next "ARCHV" beeps until the user hits any key. Then "ARCHV" asks if the user really wants to break. Then "ARCHV" gives instructions for mounting the tape and asks the user to type "R" (ready to continue) or "C" (cancel). Finally, in the restore mode, if the user asked to direct (A)ll his data bases to a directory, "ARCHV" displays the choices and lets him select one.

The "R" (ready to continue) / "C" (cancel) message will be repeated once for each tape being accessed. Entering "C" will cancel all the remaining tape operations. However, it will not affect any of the previous operations. Data bases already archived or read from disk will remain on tape or disk, and the necessary catalog updates will be done.

Users may encounter one other message requiring action. Each data base version is written to its own file on the tape. If the previous attempt to save to an archive tape failed, or the user specified that a temporary tape already had n files when it had n+1, "ARCHV" will try to overwrite the last file on the tape. In this case, "ARCHV" first dumps some information about the data base in that file and asks whether or not to continue. "ARCHV" cannot overwrite any file other than the last one, so saying that a tape has n files when it really has n+2, etc., will fail.

"ARCHV" produces a progress message as it completes each data base restoration or save.
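
Two of the rules in this section, the naming of the work files and the limit on overwriting tape files, can be summarized in a short sketch. The function names and return values are hypothetical and do not come from CATLG. The sketch also assumes that a save simply appends when the claimed and actual file counts agree, and that any other mismatch fails.

```python
import posixpath

# Work-file suffix for each tape command that uses "ARCHV", as named in the text.
WORK_FILE_SUFFIX = {"DT": "save", "TD": "rstr", "LT": "list"}


def archv_work_file(command, initials, tmp_dir="/tmp"):
    """Return the path of the work file for one tape command."""
    if command not in WORK_FILE_SUFFIX:
        raise ValueError(f"{command!r} does not use an ARCHV work file")
    if not initials:
        raise ValueError("initials must not be empty")
    return posixpath.join(tmp_dir, initials + WORK_FILE_SUFFIX[command])


def plan_save(claimed_files, actual_files):
    """Decide how a save would proceed on a tape.

    claimed_files is the number of files the tape is said to hold;
    actual_files is the number really on it.
    """
    if actual_files == claimed_files:
        return "append"          # assumption: the normal case
    if actual_files == claimed_files + 1:
        return "overwrite_last"  # from the text: only the last file can be overwritten
    return "fail"                # n+2 or more (from the text); other mismatches assumed


print(archv_work_file("DT", "abc"))  # /tmp/abcsave
print(plan_save(3, 4))               # overwrite_last
print(plan_save(3, 5))               # fail
```

TL is deliberately absent from the suffix table because it accesses the tape drive directly rather than through "ARCHV".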

6   Historical Notes (transfer from the F-processor to UNIX)

On the F-processor, data base disk files were located in disk areas called lus. Each lu was identified by a one- or two-digit number. On UNIX, the disk areas are known as directories or paths (also called "areas" within the catalog system), and are identified by names up to 140 characters in length. However, to provide a shorthand way to refer to these directories, the catalog system asks each center to assign a distinct one- or two-digit number to each data base directory. CATLG and the other catalog system programs then ask users to refer to the directories by those numbers. The CATLG CC command lists the directories and corresponding numbers for a given center.

Tape drives are also accessed differently on the F-processor and UNIX. On the F-processor, tape drives were accessed by single-digit numbers known as "lus". The catalog system recorded each available "lu" and its density(ies), and when the user wanted a drive with multiple densities, solicited the specific density. When the catalog system was transferred to UNIX, it was coded to make differences in the machine's tape drives transparent to users. However, there are differences, of which programmers, at least, should be aware and which may be useful to users. On UNIX, tape drive access occurs through files, not lu numbers. By convention, these files are located in /dev/rmt, and they have names of the form #c, where # is a single-digit number (usually 0 is assigned, then 1, ...) and c is a character (usually l for 800 bpi, m for 1600 bpi and h for 6250 bpi, although one important exception has been developed and will be described later). The catalog system bridges the gap between the non-UNIX and UNIX tape systems by tracking and soliciting lu numbers and densities and using them to construct the tape file names. (E.g., if the user wants lu "0" at 6250 bpi, the catalog system will actually access /dev/rmt/0h.) In general, the representation of tape drives as lu-density combinations works.

However, recently centers have begun to acquire DAT (DDS-format) tape drives, which most centers, if not all, will connect to /dev/rmt/#m, based on the examples in the center manuals which accompany DATs. The big problem with this is that the DATs are not 1600 bpi (medium density) drives. They are a new density, unofficially calculated as over 4 million bpi. At some point, the catalog system must be revised to handle this exception cleanly. This change is tentatively planned to occur between March and the fall of 1992. For now, coding has begun to handle DATs as a special case. Centers should install their DATs as /dev/rmt/#m, and the catalog system will use special, temporary code to handle the discrepancies. Currently, the catalog system can write temporary tapes to DATs. The capability of writing archive tapes is scheduled for development by March of 1992.
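
The mapping from an lu-density pair to a UNIX tape device file can be sketched as follows. This is an illustration only, not the catalog system's code. The function name and the is_dat flag are hypothetical, and the sketch assumes the 800 bpi suffix is the letter l (low), in line with m (medium) and h (high).

```python
# Density suffixes under /dev/rmt, following the convention described above.
DENSITY_SUFFIX = {800: "l", 1600: "m", 6250: "h"}


def tape_device_path(lu, density_bpi=None, is_dat=False, dev_dir="/dev/rmt"):
    """Build the device file name for a tape lu and density."""
    if not 0 <= lu <= 9:
        raise ValueError("lu must be a single-digit number")
    if is_dat:
        # Special case from the text: DATs are installed as /dev/rmt/#m
        # even though they are not 1600 bpi drives.
        suffix = "m"
    else:
        try:
            suffix = DENSITY_SUFFIX[density_bpi]
        except KeyError:
            raise ValueError(f"unsupported density: {density_bpi!r}") from None
    return f"{dev_dir}/{lu}{suffix}"


print(tape_device_path(0, 6250))       # /dev/rmt/0h
print(tape_device_path(1, is_dat=True))  # /dev/rmt/1m
```

Because a DAT and a 1600 bpi drive would produce the same device name here, the caller has to carry the drive type separately, which is the discrepancy the temporary DAT code works around.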

6.1   GLOSSARY

blocking entry - entry on an experiment chain created by CATLG indicating the next version will be returned from an external source (i.e., a temporary tape). Obsolete.

chain - series of catalog entries linked sequentially and associated with a key.

data base handler - a set of subroutines (in library dbase96/dbase96.a) that read existing data base files or create new ones for programs such as Calc. The data base handler has three modes:
    read - reads the file for a specific data base key and version
    create - creates a file from scratch
    update - creates a file from an existing data base file (usually version 2+ of a key based on the previous version)

entry - single record in the catalog with information about a version or key.

experiment chain - chain of catalog records representing the versions of a key. There is a key record and a record for each version linked in order.

experiment key or name - a key for an experiment chain.

key - 10-character name associated with a chain.

pending entry - entry on an experiment chain created by the data base handler indicating a new version is being made in the create or update modes.

user chain - chain of versions selected by a user for a special purpose.

version - a data base file (on disk or tape) containing one update of an experiment (e.g., a version made by Dbedit, a version made by Calc, a version made by a routine Solve update for ambiguity resolution).

version number - sequence number assigned to a version of an experiment. Version numbers increase by one for each new version of an experiment's data base file. A new version is made each time a data base file is updated or created. (Note: the version number of an experiment's current last version may be referenced as 0.)



Questions and comments about this document should be sent to:

Karen Baver (kdb@leo.gsfc.nasa.gov)

Last update: 2001.01.18