User guide to dbedit

Leonid Petrov
Brent Archinal
Gregg Cooke

Abstract:

Table of contents:

1   Overview

2   Command parameters

3   Control file

3.1   Special Note: Quoting Names

4   How does dbedit work

5   Restrictions

6   History


1   Overview

Program DBEDIT is a tool for manipulating database files. It allows the user to

    a) create a new database;
    b) sort a database;
    c) eliminate duplicate observations;
    d) add some additional information, i.e., apriori values;
    e) remove information from existing databases;
    f) merge several databases;
    g) remove sources, stations, and baselines from the database;
    h) rename stations and/or sources.

A database is a file which is related to a VLBI experiment and contains a) a header; b) history entries; c) a table of contents; d) a set of items. Usually one experiment has two databases: one database contains primarily information about X-band, the second database contains information about S-band. Programs which process the experiment read information from the databases, and some programs, e.g. CALC, add information to the databases.

Each item has

    1) NAME, called "lcode";
    2) TYPE: INTEGER*2, INTEGER*4, REAL*8, CHARACTER;
    3) SCOPE: session, observation;
    4) DIMENSIONS: each item is considered as a three-dimensional array;
    5) VALUES.

Sequential access to lcodes is provided.

The main functions of DBEDIT are a) creation of a new database from the output of post-correlation software; b) adding apriori values (site positions, source positions, UT1, polar motion) to existing databases.
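The five attributes of an item listed above can be pictured with a small Python sketch. This is only an illustration of the description in this section, not the layout dbedit uses internally: the class name Item, its field names, and the lcode "DEMO" are all made up for the example.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple

ITEM_TYPES = ("INTEGER*2", "INTEGER*4", "REAL*8", "CHARACTER")
ITEM_SCOPES = ("session", "observation")


@dataclass
class Item:
    """Hypothetical in-memory picture of one database item."""
    lcode: str                   # NAME of the item
    type: str                    # one of ITEM_TYPES
    scope: str                   # one of ITEM_SCOPES
    dims: Tuple[int, int, int]   # every item is treated as a 3-D array
    values: Sequence             # flattened values, one per array element

    def __post_init__(self) -> None:
        if self.type not in ITEM_TYPES:
            raise ValueError(f"unknown item type: {self.type}")
        if self.scope not in ITEM_SCOPES:
            raise ValueError(f"unknown item scope: {self.scope}")
        if len(self.dims) != 3:
            raise ValueError("an item must have exactly three dimensions")
        if len(self.values) != self.n_elements():
            raise ValueError("number of values does not match dimensions")

    def n_elements(self) -> int:
        """Number of elements in the three-dimensional array."""
        d1, d2, d3 = self.dims
        return d1 * d2 * d3


# "DEMO" is a made-up lcode used only for illustration.
demo = Item("DEMO", "REAL*8", "observation", (2, 1, 1), [0.5, 1.5])
```

A scalar value fits the same picture as an array with dimensions (1, 1, 1).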

2   Command parameters

Usage: dbedit <control_filename> [-s <filename>] [-v n] [-e <filename>] [-o <filename>] [-p <filename>] [-h <filename>]

Normally to run the program, the user just sets up a control file and then types the program name "dbedit". However, some of the values the program uses by default can be set through the run-string using run parameters. Specifically, the control file name, the scratch file name, the verbosity level, and the error, output, progress, and history files can be set, as follows:

(filename)
    where filename is the name of a file to be used as the control file. This must be the first parameter in the run-string. The control file, described below, can be used by other programs that are able to read it. NOTE: THE USER MUST HAVE WRITE permission to the directory where the control file resides. (I don't know why)

-s (filename)
    where filename is the name of a file to be used as the scratch file. This file is quite large -- the size of an average database. The user may want to move this to a directory with more space than the default when processing particularly large databases. (The default name of this file is "/tmp/scrdatafil_[PID]" where PID is the specific user process #, unique for each run. This name is currently set in ../includes/param.templ)

-v n
    where n is the verbosity level, as described below:

    0     Silent mode. All output to stdout is suppressed.
    1     Muted output. Only the barest status output.
    2-3   Standard output. Same as in the old KBMSG program.
    4-7   Extended output. Tells you everything it's doing...
    >7    Debug mode. See the programmers guide for a table of debug flags.

    (Default value is 2, as set in ../includes/param.templ)

-e (error file)
    file for error messages. If omitted, then the errors go to stderr.

-o (output file)
    file for information messages. If omitted, then the information messages go to stdout.

-p (progress file)
    file for progress messages in addition to information messages. It is useful when dbedit is called as a subprocess. If omitted, then progress messages are redirected to /dev/null.

-h (history file)
    name of the file with history. This is a plain ASCII file with lines no longer than 78 characters which is attached to the history section. If the file contains a line with "$" or "." as the first character, then this line and all subsequent lines are ignored.
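When dbedit is called from another program, the run-string described above can be assembled programmatically. The following Python sketch only builds the argument list from the options documented in this section; the function name build_dbedit_argv and its parameter names are made up for the example, and the file names in the usage line are placeholders.

```python
from typing import List, Optional


def build_dbedit_argv(control_file: str,
                      scratch_file: Optional[str] = None,
                      verbosity: Optional[int] = None,
                      error_file: Optional[str] = None,
                      output_file: Optional[str] = None,
                      progress_file: Optional[str] = None,
                      history_file: Optional[str] = None) -> List[str]:
    """Build a dbedit run-string as a list of arguments.

    The control file always comes first; every other option is
    added only when a value was supplied by the caller.
    """
    if verbosity is not None and verbosity < 0:
        raise ValueError("verbosity level must not be negative")
    argv = ["dbedit", control_file]
    options = [
        ("-s", scratch_file),    # scratch file
        ("-v", verbosity),       # verbosity level
        ("-e", error_file),      # error messages
        ("-o", output_file),     # information messages
        ("-p", progress_file),   # progress messages
        ("-h", history_file),    # history file
    ]
    for flag, value in options:
        if value is not None:
            argv += [flag, str(value)]
    return argv


# Placeholder file names, for illustration only.
argv = build_dbedit_argv("demo.ctl", verbosity=1, progress_file="demo.prg")
```

The resulting list could be passed to subprocess.run. According to the History section, dbedit returns completion code 0 on normal termination and a non-zero code otherwise, so the caller can test the return code.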

3   Control file

Control files consist of

    a) SECTION TITLES. A section title opens a section and declares that the keywords which follow belong to that section.
    b) KEYWORDS. A value may follow the keyword. Keywords determine the action which dbedit should perform and the qualifiers to be applied. Keywords themselves are case insensitive (they are transformed to upper case anyway), while values are case sensitive.
    c) COMMENTS. Lines starting with * are considered comments and are ignored.

If the user wishes to cut off processing of the control file at a certain point, the $EOF section title can be used. Placed anywhere in the file, it causes an end-of-file to be sensed at that point.
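The rules above can be summarized by a small reader written in Python. It is a sketch under several assumptions that this guide does not state explicitly: that every section title starts with "$" (as $EOF and $INPUT_FILTER do), that section titles are case insensitive, and that each line holds one keyword followed by its value. The function name parse_control_file is made up for the example.

```python
from typing import Dict, List, Tuple


def parse_control_file(text: str) -> Dict[str, List[Tuple[str, str]]]:
    """Split control file text into sections of (KEYWORD, value) pairs."""
    sections: Dict[str, List[Tuple[str, str]]] = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("*"):
            continue                      # blank line or comment
        if line.startswith("$"):
            title = line.split()[0].upper()
            if title == "$EOF":
                break                     # end-of-file sensed here
            current = sections.setdefault(title, [])
            continue
        if current is None:
            raise ValueError("keyword found before the first section title")
        parts = line.split(None, 1)
        keyword = parts[0].upper()        # keywords are case insensitive
        value = parts[1] if len(parts) > 1 else ""   # values keep their case
        current.append((keyword, value))
    return sections


demo_text = """* a comment line
$INPUT_FILTER
Input_lcode ALL
$EOF
$OUTPUT_DATABASE
Database ignored_because_of_EOF
"""
parsed = parse_control_file(demo_text)
```

In this example everything after $EOF is skipped, so only the $INPUT_FILTER section is returned.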

3.1   Special Note: Quoting Names

Names that are thought of as single objects AND that can be listed more than one to a line must be quoted if they contain embedded white space. This works just as it does in Unix; double quotes or single quotes can be used. The objects affected are baselines, stations, and sources.

Remember: a station name is composed of 8 characters, so a baseline with a station name with fewer than 8 characters (e.g. Kauai) as its first station MUST BE QUOTED. If the baseline had such a station as its second name it need not be quoted. HOWEVER, also note that the order of stations within a baseline specification is significant, e.g. 'gilcreekwestford' is not the same as 'westfordgilcreek', so just switching the order of the names to avoid using quotes may not give the user the desired results.

For example (for baselines):

    'kauai gilcreek'    is the correct and recommended form
    kauaigilcreek       is incorrect
    gilcreekkauai       will work, but will give different results

When in doubt, just remember:

    1. Always use one set of quotes around a complete baseline, station, or source.
    2. Spaces in station and source names are significant. Always use 8 characters including blanks.
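The two rules above can be applied mechanically. The Python sketch below assumes that a short station name is padded on the right with blanks to 8 characters, which is how rule 2 reads; the exact number of blanks dbedit expects should be checked against a working control file. The function names pad_station and baseline_spec are made up for the example.

```python
def pad_station(name: str, width: int = 8) -> str:
    """Pad a station name on the right with blanks to the full width."""
    if len(name) > width:
        raise ValueError(f"station name longer than {width} characters: {name}")
    return name.ljust(width)


def baseline_spec(first: str, second: str) -> str:
    """Build a quoted baseline from two station names.

    The order of the stations is preserved because it is significant.
    """
    return "'" + pad_station(first) + pad_station(second) + "'"


spec = baseline_spec("kauai", "gilcreek")
```

Here spec holds a 16-character baseline inside one set of single quotes, with "kauai" followed by three blanks.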

4   How does dbedit work

First, dbedit parses the control file. Then it reads the so-called input media, which can be a database, a set of databases, correlator output, or both.

If the input media is correlator output, dbedit creates a listing of the directory tree which contains FOURFIT-supplied files: files of extent 52 in the case of Mark-3 correlator output and files of type 2xxx in the case of Mark-4 correlator output. Each input file corresponds to the fringe output for one observation. In general there may be several fringe outputs for the same observation. This list is saved in the temporary directory. Then dbedit reads all input files from that list. It extracts information from these files and makes some necessary transformations. Dbedit puts all extracted information related to the observation in the intermediary record.

If the input media is a database, then the database is read. The database file is opened and the table of contents is extracted. If the keyword INPUT_LCODE is ALL or this keyword is not specified, then all lcodes are read. If the keyword INPUT_LCODE specifies a list of permitted lcodes, then all lcodes which are both saved in the database and specified in the INPUT_LCODE list are read. Dbedit puts all extracted information related to the observation in the intermediary record.

If a file with unwanted observations is specified (keyword NOINPUT_OBSER), then each observation which passed through all other filters is checked against the list of unwanted observations. If it matches an observation specified in the list, it is not considered any more. Then dbedit applies the input filters, if specified, to the intermediary records of the observation. If the record passes through the input filter, it is written to the temporary direct-access file and a list of keys for each observation is created.

After that the keys are sorted in time order. Duplicates, i.e. records which correspond to the same observation, are eliminated. Then the output database is opened.
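The lcode selection, the check against unwanted observations, the time sort, and the duplicate elimination described above can be sketched in Python as follows. This is an in-memory illustration only: dbedit itself keeps the records in a direct-access scratch file and sorts keys. The record fields ("time", "baseline", "source"), the choice of these three fields as the key of an observation, and the rank function used to pick among duplicates are all assumptions made for the example.

```python
from typing import Callable, Dict, Iterable, List, Optional, Set, Tuple

Record = Dict[str, object]
Key = Tuple[float, str, str]


def select_lcodes(available: Iterable[str],
                  input_lcode: Optional[Iterable[str]]) -> List[str]:
    """Return the lcodes to read.

    None stands for INPUT_LCODE ALL (or keyword absent): read everything.
    Otherwise read only lcodes present both in the database and in the list.
    """
    available = list(available)
    if input_lcode is None:
        return available
    permitted = set(input_lcode)
    return [lcode for lcode in available if lcode in permitted]


def sort_and_deduplicate(records: Iterable[Record],
                         unwanted: Optional[Set[Key]] = None,
                         rank: Callable[[Record], float] = lambda r: 0.0
                         ) -> List[Record]:
    """Drop unwanted observations, keep one record per key, sort by time."""
    unwanted = unwanted or set()
    best: Dict[Key, Record] = {}
    for rec in records:
        key = (rec["time"], rec["baseline"], rec["source"])
        if key in unwanted:
            continue                      # listed as unwanted: skip it
        if key not in best or rank(rec) > rank(best[key]):
            best[key] = rec               # keep the higher-ranked duplicate
    return [best[key] for key in sorted(best)]   # time is the first key field


records = [
    {"time": 2.0, "baseline": "AB", "source": "S1", "snr": 10.0},
    {"time": 1.0, "baseline": "AB", "source": "S1", "snr": 5.0},
    {"time": 2.0, "baseline": "AB", "source": "S1", "snr": 20.0},
]
result = sort_and_deduplicate(records, rank=lambda r: r["snr"])
```

In this example the two records at time 2.0 are duplicates, and the one with the larger "snr" value is kept.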
Four blocks are written before the first observation:

    1) mathematical and astronomical constants;
    2) (unless APRIORI NO) apriori station/source coordinates, antenna axis offsets, ocean loading parameters etc.;
    3) (unless UT1-PM NO) UT1, pole coordinates and (optionally) daily values of nutation offsets around the date of the experiment;
    4) (if EPHEMERIDES YES) positions of Sun, Earth, Moon taken from DE200/LE200 numerical ephemerides. This option is considered obsolete. The early versions of Solve required ephemeris information. Calc 9.1 and later do not need it, since Calc reads ephemerides (DE403) itself and ignores ephemerides saved in the database. Therefore it is recommended that EPHEMERIDES NO be specified unless there is a specific need to have ephemerides in the database.

Then the information saved in the intermediary records which passed through the filters is written to the output database. If the keyword Calc is specified, dbedit creates a control file for CALC (the so-called calcon file) and calls Calc.

Then, if a history file is specified, dbedit analyzes its contents. It puts all lines of the history file into the history section of each database until it finds a marker: a line with the dollar character in the first column. Dbedit does not put the marker and the lines following the marker into the database.
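The treatment of the history file can be sketched in a few lines of Python. The sketch follows the description above and also accepts "." as a marker, as the description of the -h option in section 2 does. The function name history_lines is made up for the example, and rejecting lines longer than 78 characters is an assumption: this guide states the limit but not what dbedit does when it is exceeded.

```python
from typing import List


def history_lines(text: str, max_len: int = 78) -> List[str]:
    """Return the lines that would be copied to the history section."""
    kept: List[str] = []
    for line in text.splitlines():
        if line[:1] in ("$", "."):
            break                         # marker: ignore it and the rest
        if len(line) > max_len:
            raise ValueError(f"history line longer than {max_len} characters")
        kept.append(line)
    return kept


demo_history = "First line of history\nSecond line\n$ end of history\nIgnored\n"
kept = history_lines(demo_history)
```

Only the two lines before the marker are kept in this example.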

5   Restrictions

If several databases are specified as input, then special care should be taken. Dbedit does not allow reading several databases with contradictory definitions of lcodes. It checks whether lcodes with the same names have the same dimensions and type. As a rule, two databases DO have some lcodes with different definitions, e.g. lcodes supplied by the apriori section of dbedit, by calc, dbcal, solve and other programs. The only way to process a set of databases with different specifications of lcodes is to apply the input filter and not allow dbedit to read these lcodes! It means that dbedit supports a rather restricted mode of merging several databases: many Calc-supplied, Dbcal-supplied, Solve-supplied lcodes should be dropped and the resulting databases should be run through Calc, Dbcal, Solve anew.

There is a limit on the size of the internal buffer for keeping one observation: 32768 bytes. If the input media is a database and the input database contains a lot of lcodes, then this buffer can overflow and dbedit will terminate abnormally after issuing an error message. This problem will be solved in the next version of dbedit. Currently, if you need to process an input database which has records exceeding the length of the internal buffer, you can try to read not all lcodes, but only part of them. (Refer to the description of keyword Input_lcode in the section $Input_Filter.) For example, you can remove all Calc-supplied lcodes (and later re-Calc the database).
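Before merging, it can help to find out which lcodes are defined differently in the input databases, so that they can be left out of the Input_lcode list. The Python sketch below assumes the tables of contents have already been obtained by some other means as dictionaries mapping an lcode to its type and dimensions; it does not read database files. The function names and the lcodes in the example are made up.

```python
from typing import Dict, List, Tuple

# Table of contents: lcode -> (type, (dim1, dim2, dim3))
Toc = Dict[str, Tuple[str, Tuple[int, int, int]]]


def conflicting_lcodes(tocs: List[Toc]) -> List[str]:
    """Return lcodes whose type or dimensions differ between databases."""
    seen: Toc = {}
    conflicts = set()
    for toc in tocs:
        for lcode, definition in toc.items():
            if lcode in seen and seen[lcode] != definition:
                conflicts.add(lcode)
            seen.setdefault(lcode, definition)
    return sorted(conflicts)


def mergeable_lcodes(tocs: List[Toc]) -> List[str]:
    """Return lcodes that can be read from all inputs without conflict."""
    bad = set(conflicting_lcodes(tocs))
    names: List[str] = []
    for toc in tocs:
        for lcode in toc:
            if lcode not in bad and lcode not in names:
                names.append(lcode)
    return names


# Made-up lcodes and definitions, for illustration only.
toc_a: Toc = {"LCODE_A": ("REAL*8", (2, 1, 1)), "LCODE_B": ("INTEGER*2", (1, 1, 1))}
toc_b: Toc = {"LCODE_A": ("REAL*8", (2, 1, 1)), "LCODE_B": ("INTEGER*4", (1, 1, 1))}
bad = conflicting_lcodes([toc_a, toc_b])
good = mergeable_lcodes([toc_a, toc_b])
```

Reading fewer lcodes also shortens each record, which is the workaround suggested above for the 32768-byte buffer limit.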

6   History

Who              When         What

Gregg Cooke      90.02        Creation.
Gregg Cooke      90.04.15     Release of the first beta-version.
Brent Archinal   90.12.11     Name changed to "dbedit" from Dave Gordon's working name of "kbmsg". Database output loop fixed to work with multiple databases. Time selection fixed.
Brent Archinal   91.04.30     Changed to operational version 1.0, call to datsv added to dbedit.f
Brent Archinal   91.07.25     Size of Uflag fixed in several places, call to phist_ephut changed in gen_outdb. Version 1.1.
Darin Miller     91.11.19     Changed variable "ut1nm" to "dut1nm" to correct a redundant common specification, and dimensioned "dut1nm" to be 255 characters. Added error messages.
Darin Miller     91.12.11     Added calc run option, including user specified calcon file name. Changed default calcon file path to user's home directory, and temporary file names have process id appended to them.
Darin Miller     91.12.13     Added verbosity level to keep the temporary files or to delete them.
Brent Archinal   92.01.14     Version 2.0.
K. Wilcox        92.04.30     Version 2.1. Changed for 700 series use and system default to be gotten from ../includes/param.i. "sqzspc" used instead of "sqz". "implicit none" added. Sscat interface changed. Initialization of integer part of ltoc added. Get_outdb ierr=-2 handled.
Brent Archinal   92.10.06     Version 2.2. Changes made to handle ~20000 obs. Also fix to read_ddbuf and fixes to minimize number of units opened (e.g. "rjdcb" set only once here).
Kaybee Wilcox    92.11.10     Call to geterr.f added.
Brent Archinal   93.01.21     Fixes made to above change. "geterr" now "chkcf". Version 2.3. End of program message added.
Melvin White,    94.02.14     Version 3.0. Hard coded LU numbers replaced by variables Luout and Luerr. Numerous routines modified. Added "Primary_elim" option. Changed "kill" to "pkill" and "dkill". Added "idpcp" and "idpcd". Modifications made for SunOS compatibility. Handled changed error returns from get_ut1pm.f. clcdcb properly initialized.
Brent Archinal
Brent Archinal   95.02.21     Version 3.1. datbuf.i and read_dbh.f updated to handle more lcodes. open_mk3.f updated to better handle more observations, and sitstr.i to handle more sources.
Brent Archinal   97.02.12     Version 3.2. Handling Fourfit data. Several other minor changes since version 3.1.
" "              98.03.23     Version 4.0. Allowing integer*4 number of observations.
" "              99.05.18     Version 4.1. Improving error message and allowing for C band data in open_mk3.f. Also allowing 1000 radio sources per experiment.
" "              1999.11.04-  Version 5.0. Tentative handling of Mark IV correlator on input and year 2000 data. Fixed faulty UT1 and polar motion printout. Increased length of scrdcb record by "ntnew" words. Added use of intrv4, and fexpnm_ch.
                 2000.04.19
" "              2000.05.12   Version 5.1. Some improvements for handling Mark IV data. Final changes by B. Archinal.
Leonid Petrov    2000.05.19   Added setting appropriate completion codes. Improved comments. Now dbedit returns completion code 0 in the case of normal termination and not zero in the case of abnormal termination.
Leonid Petrov    2000.07.04   Version 6.0. Massive re-writing. Changed internal data structure: instead of using commons with obscure address arrays, the same data structure is used throughout dbedit. Provided full support of Mark IV correlator output. Added NOOUTPUT mode. Made adding aprioris, UT1PM, ephemerides optional. Lifted restrictions on using databases as input media. Allowed merging mode (with some reservations).
Leonid Petrov    2000.07.19   Corrected a bug: the previous version didn't allow reading correctly more than one database as input source since the table of contents was read only for the first database.
Leonid Petrov    2000.07.24   Corrected a bug: the previous version didn't work correctly when more than one input media had been specified.
Leonid Petrov    2000.08.23   Corrected a bug: the previous version didn't react to the error of overflow in GEN_ADR. It proceeded despite error conditions and then crashed. The error message in GEN_ADR was misleading.
Leonid Petrov    2000.09.06   Corrected a bug: the previous version discarded the last observation if the input media was a database.
Leonid Petrov    2000.09.06   Corrected a bug: the previous version didn't make substitution of names of sites and stars for lcodes BASELINE and "STAR ID " in the records of the 3-rd type.
Leonid Petrov    2000.09.07   Changed the logic of adding apriori EOP: forced add_ut1pm to remove all old lcodes with EOP which the database might have had before adding new lcodes. As a result the output database will have only one type of aprioris: FINAL, PRELIMINARY or EXTRAPOLATED.
Leonid Petrov    2000.10.17   Fixed a bug: the previous version failed to work correctly if the input media was a database in DBH format and the database didn't have records of type 3 for the last observation.
Leonid Petrov    2000.11.08   Fixed a bug: the previous version ignored keyword EPHEMERIS YES in the control file and worked as if the user always specified EPHEMERIS NO.
Leonid Petrov    2000.12.29   Fixed a bug: the previous version didn't stop after the failure to parse the OVEX root file. The previous version didn't recognize the signature of vex file "$OVEX_REV ;" and the signature of version "rev = 1.5 ;". The new version recognizes them and considers the failure to read the VEX file as a fatal error.
Leonid Petrov    2001.01.25   Fixed two bugs: the previous version of dbedit tried to read the skeleton database even if the user specified "APRIORI NO". The previous version of dbedit returned completion code "success" in the case of failure to read the skeleton database.
Leonid Petrov    2001.05.24   Added support of the -h option -- inclusion of a history file.
Leonid Petrov    2001.06.14   Fixed a nasty bug in dupelim_obs.f: in the case when the primary elimination criterion was quality code and the secondary elimination criterion was SNR, the previous, post June-2000 version used the latest fourfit time criterion instead of it.
Leonid Petrov    2001.06.22   Added new features: a) support of keyword SCAN_CHECK in $INPUT_FILTER section -- for checking whether there are duplicates with slightly different time tags; b) support of keyword NOINPUT_OBSER in $INPUT_FILTER section -- for specifying the file with the list of observations which should not be included in the output database; c) support of qualifier "-" after keyword Database in $OUTPUT_DATABASE section for not creating the output database.
Leonid Petrov    2001.07.09   Added support of Fourfit version 3.0 -- new lcode #SAMPLES is put in the database. #SAMPLES is the array which keeps the number of samples PROCESSED by fourfit for each channel, each baseband: low and upper.
Leonid Petrov    2001.07.11   Fixed a bug in parsing the root vex-file: the previous version didn't work correctly if frequency tables (channel names) for two stations of a baseline were different.
Leonid Petrov    2001.07.11   Added code for an attempt to deal with a bug in Fourfit version 3.0 and older: sampling rate was incorrect. If the version of Fourfit is greater than 3.0 then dbedit stops with the error message. If not, then it takes the sampling rate from the root vex-file (unless it is zero there).
Leonid Petrov    2001.09.19   Fixed the bug made, probably, on 2001.04.24: the units transformation for phase cal frequency was wrong: it was transformed from Hz to milliHz instead of kHz, which resulted in integer overflow.
Leonid Petrov    2002.07.30   Allowed Q suffix in Mark-3 input media in order to process correctly Q-band data. Converted year time tag from 2-digit year to 4-digit year in reading Mark-3 media.
Leonid Petrov    2004.04.13   Fixed numerous problems related to the transition to a Fortran 90 compiler.
Leonid Petrov    2004.11.04   Version 6.10. Added lcode BITS_SAMPLE -- the number of bits per sample.
Leonid Petrov    2004.11.01   Version 6.11. Fixed an error: the index at SOB array was omitted, which resulted in a crash. The old version did not complain at a wrong keyword in $INPUT_FILTER section. The new version abnormally terminates when it finds an unsupported value.
Leonid Petrov    2005.04.11   Version 6.12. Fixed an error: the old version ignored the seconds part of the date tag when it made a time window test.
John Gipson      2006.12.12   Version 6.13. Updated to perform the necessary conversion BigEndian --> LittleEndian to make databases from Mark4 fringe files on computers with Little Endian architectures, such as i386 and x86_64. The conversion is handled internally within the program. Therefore, no changes are needed to the control files. The same control files (using the 'Mark4_Directory' input type) should work on either HP Unix or on Linux PC systems.



Questions and comments about this guide should be sent to:

Leonid Petrov ( pet@lyra.gsfc.nasa.gov )


Last update: 2006.12.12