User's guide to interactive VTD/Post-Solve

L. Petrov


Abstract

This manual describes using the program pSolve for analyzing geodetic VLBI observations in interactive mode. It contains some recipes for using this program.


Introduction

CALC/SOLVE is a program for analyzing geodetic VLBI observations. It was originally developed in the 1970s at Haystack Observatory and then heavily modified at the Goddard Space Flight Center in the mid-1970s. Since then it has been under continuous development through the joint efforts of NASA/GSFC, the U.S. Naval Observatory, the University of Bonn, and Astrogeo Center. SOLVE is a tool for estimating geodetic and astrometric parameters from VLBI observables using least squares.

There are several clones of SOLVE. In the mid-1980s a clone supported by a group at Mizusawa Observatory diverged. That clone is called mSolve. It supports import and export of data in the FITS Database format (which is *not* compatible with the FITS-IDI or FITS-image formats). In 2008 a clone supported by NVI Inc. diverged; it is now called cSolve. pSolve is currently supported by the NASA Goddard Space Flight Center. The main differences are 1) complete support of databases in the GVF (Geodetic VLBI Format); 2) replacement of the obsolete Calc program with the modern VTD (VLBI Time Delay) library that computes a priori path delays; and 3) revision of the way flags are handled for different observables. These new features significantly simplify interactive analysis. In 2017 pSolve was converted from the Intel compiler to gfortran and from 32 bits to 64 bits, and the maximum number of estimated parameters was raised to 100,000. In 2022 a new installation utility was developed, and pSolve became a part of SGDASS (Space Geodesy Data Analysis Software Suite).

The present document explains the use of VTD/pSolve.

pSolve supports both interactive and batch [1] modes. This manual covers the interactive mode.

Customization

Before using pSolve for the first time you have to customize your environment. Don't try to use pSolve without proper customization: you will waste your time.
  1. Shell. tcsh is recommended. Although pSolve is able to run under either the tcsh or bash shell, this manual doesn't cover how to use pSolve under shells other than tcsh.

  2. Check your path variable. It should contain directories where system executables and pSolve executables are located. Consult your system administrator.
  3. Environment variables. pSolve has plenty of configuration parameters. There are three levels of configuration:
    • Pre-compiled system-wide default.
    • User defaults specified by environment variables.
    • Current settings.
    Definitions of user defaults are gathered in a special file. If you don't have it, copy the file solve_env.template from $MK5_ROOT/example/ to your home directory, rename it, read it, and edit it. Then add execution of this file to your startup shell file ~/.tcshrc :
    source {file_name}

    Environment variable DISPLAY should be set up properly. Refer to X11 user guide.

  4. X-resources. Some pSolve programs use the PGPLOT library of graphic utilities, which in turn invokes the XW driver. It requires proper setting of X-resources. Add the following lines to the file ~/.Xdefaults. If you use a big screen (340x280 mm):

       pgxwin.Win.geometry: 1260x800+0+90
       pgxwin.Win.maxColors: 69
       pgxwin.Win.iconize: True

     or, if you use a small screen (300x230 mm):

       pgxwin.Win.geometry: 1000x680+0+90
       pgxwin.Win.maxColors: 69
       pgxwin.Win.iconize: True

    Then you have to activate the settings defined in ~/.Xdefaults by the command

    xrdb -merge ~/.Xdefaults

    (NB: the environment variable DISPLAY should be set up properly before using xrdb.) This operation should be done each time you start a terminal session, therefore you have to put the call of xrdb in your customization file, for instance in $HOME/.tcshrc (see the sketch after this list).
  5. pSolve initials. pSolve creates a number of scratch files. In order to allow more than one user to work simultaneously, a two-letter code called "pSolve initials" is granted to each pSolve user. A given user may have more than one set of initials and is able to run more than one instance of pSolve. pSolve initials are defined in the file $SAVE_DIR/letok . Add a record for your initials if you don't have one.
  6. Creation of scratch files. pSolve keeps intermediate results in so-called scratch files. In order to prevent abnormal termination when no free disk space is left, pSolve reserves space for scratch files from the very beginning and doesn't remove these files after termination. The size of these files is determined by a) the maximum number of observations in the sessions to be analyzed; b) the maximum number of parameters to be solved for.

    Run the program psolve_reset if you don't have pSolve scratch files. NB: scratch files should also be re-created after each pSolve upgrade! You should also be sure that your environment variable WORK_DIR has been set before running psolve_reset: echo $WORK_DIR . Usage of psolve_reset:

    Usage: psolve_reset xx max_obs max_parms

    where
      xx        = user's initials
      max_obs   = maximum number of observations
      max_parms = maximum number of parameters

    Parameter max_obs should be set to twice the maximum number of observations in one session if the first analysis is intended to be performed. Suggested parameters: max_obs: 40000, max_parms: 24000 (an example invocation is given in the sketch after this list).
  7. C-shell program sol. The script sol facilitates using pSolve. It expects to find your pSolve environment variable definition file in your home directory under the name {solve_initials}.solve_env, where {solve_initials} consists of lower case symbols, e.g., px.solve_env if pSolve initials PX are in use. The script sol can be found in the $MK5_ROOT/example directory.
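To summarize the steps above, here is a minimal sketch of the customization (the file name px.solve_env and the initials PX are examples, not defaults; substitute your own names and sizes):

    # --- fragment of ~/.tcshrc ---
    # Load the pSolve environment variables from your renamed copy
    # of solve_env.template (item 3).
    source $HOME/px.solve_env
    # Activate the PGPLOT X-resources (item 4); DISPLAY must be set.
    if ( $?DISPLAY ) then
        xrdb -merge $HOME/.Xdefaults
    endif

    # --- run once from the shell, and after each pSolve upgrade (item 6) ---
    psolve_reset PX 40000 24000
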
pSolve processes VLBI data collected in databases in GVF format. A database contains the data from a specific experiment. That includes group delays, phase delay rates, and single band delays for all bands and all observations, their uncertainties, their flags, a number of auxiliary parameters, and the setup of previous analysis, if available. An observation in this manual is an estimate of group delay at a given baseline for a given scan. A database in GVF format is generated either 1) by PIMA using visibility data and results of fringe fitting, 2) converted from VGOSDA format, or 3) converted from MARK3 DBH format by the program mark3_to_gvh. The data in VGOSDA format are produced either from data in GVF format or from data in VGOSDB format.

NB: the word database is used for describing parameters related to a given VLBI experiment, which are stored in 4-6 files. It has nothing to do with popular SQL database programs.

The file vcat.conf located in the pSolve $SAVE_DIR directory defines the directories where all databases are located:

GVF_REP_NAMES:    ... 
GVF_DB_DIR:      directory
GVF_ENV_DIR:     directory
...
GVF_DB_DIR:      directory
GVF_ENV_DIR:     directory
VTD_CONF_FILE: file name
The keyword GVF_REP_NAMES defines the list of so-called repository names. A repository name is in upper case and is limited to 4 characters. The keyword GVF_DB_DIR defines the directory where binary sections of databases of a given repository are located. The keyword GVF_ENV_DIR defines the directory where ASCII files with database envelopes of a given repository are located. The keyword VTD_CONF_FILE defines the control file for the VTD (VLBI Time Delay) library. pSolve computes theoretical path delays and partial derivatives when a database is loaded and every time the function "update of theoretical delays" (~) is invoked. Keep in mind that the VTD control file usually contains parameters that are supposed to be updated periodically (e.g., once a day or once a week). Among them are a priori Earth Orientation Parameters, displacements caused by mass loadings, slant path delays in the neutral atmosphere, and path delays in the ionosphere. A user is supposed to launch a process to update these parameters (vtd_apriori_update.py from the VTD library). If some parameters are not updated, pSolve will stop during the process of loading a database and issue an error message that explains the reason.
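As an illustration, a vcat.conf with two repositories might look like this (the repository names OBS and AST and the directory paths are hypothetical examples, assuming, as the template above suggests, that each GVF_DB_DIR/GVF_ENV_DIR pair corresponds to a repository name in the order listed):

GVF_REP_NAMES:  OBS AST
GVF_DB_DIR:     /vlbi/gvf/obs/db
GVF_ENV_DIR:    /vlbi/gvf/obs/env
GVF_DB_DIR:     /vlbi/gvf/ast/db
GVF_ENV_DIR:    /vlbi/gvf/ast/env
VTD_CONF_FILE:  /vlbi/conf/vtd.cnf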

If pSolve stops during an attempt to load a database, you need to update the a priori parameters. All stations and sources used in the experiment being analyzed must be defined in the a priori station and source catalogues.

Interactive pSolve

Overview

Interactive pSolve communicates with the user by means of a set of menus. The majority of programs display menus in a text window and prompt you to enter a command. Each command has a one-letter code and an active field on the screen. A user has two ways to activate a command: either position the cursor on the active field and then hit the space bar (space key), or hit the command code letter. Command codes are enclosed in parentheses and emphasized in bold in this manual. The parentheses themselves are not a part of a command code. Many programs have sub-menus which operate the same way.

pSolve requires a terminal window with a size of not less than 80x24 characters. NB: if you want to change the terminal window size, you need to first quit pSolve, then change the terminal size, then start pSolve again. Otherwise, the program will crash.

pSolve consists of a set of programs which call each other. In order to launch interactive pSolve, enter a command

solve

for example,

solve PE

The first program of interactive pSolve is OPTIN. When the program OPTIN starts, it displays the main pSolve menu and prompts for user input. You can terminate pSolve by hitting the key (T) from the main menu, or by hitting (CNTRL/C) when pSolve is waiting for user input. The menu settings remain untouched and you can resume analyzing the experiment after the interruption. NB: be cautious with using (CNTRL/C) while pSolve is working: if you interrupt pSolve during disk writing you may not be able to resume analysis. pSolve expects to find VLBI observations and other intermediary data in the scratch files located in your working directory $WORK_DIR. Each scratch file has a suffix with the pSolve user initials; this is what allows pSolve to distinguish scratch files that belong to different users.

pSolve scratch files contain information about one or more sessions. To learn the status of scratch files hit the key (X) from the main pSolve menu. pSolve reports the database name, version number, total number of observations (including non-detections), band and status.

Interactive pSolve allows the user to perform the following operations:

Data loading

pSolve is able to read data in GVF format. Although pSolve still supports two other obsolete formats, MARK-3 DBH and the superfile format, the use of data in these formats is not explained in this document. Support of MARK-3 DBH and superfile formats will be removed in the near future.

The first operation in the data analysis is to read the experiment and to load it into the scratch files. Only one database can be loaded at any one time. Option (G) or (CNTRL/G) in the OPTIN menu invokes the program GETDB, which loads the experiment into the scratch area. On launch, GETDB shows the list of all databases in your database repository. OPTIN shows the current repository on the 6th line. You can change the repository by hitting (R). You can scroll the list using the (ArrowUp), (ArrowDown), (PageUp), and (PageDown) keys. When you find the database that you want to load, position the cursor on that database and hit the (space) key. However, if you have many databases in your database directory, this method is inconvenient. As an alternative, hit the key (T) and pSolve will prompt you to enter the database name. If you enter a complete database name, including the version, pSolve will load the database immediately. If you enter a partial database name, pSolve will show you the list of databases with names that start with the substring that you have entered. Option (X) inquires about the current status of the scratch area.

Initial solution

The initial solution is carried out when the experiment is analyzed for the first time. The purposes of the initial solution and the sequence of operations are described below.

Initial settings

Hit the (L) key. You should see at the bottom "Last page Parms used / Max parms available:". Select the solution type and suppression method. If you have a database generated by PIMA, you will find that the suppression method is set to SUPMET__META. If your database was just converted from Mark-3 DBH and you loaded it into VTD/pSolve for the first time, it may have a different suppression method. In that case, hit key (') and change the suppression method to SUPMET__META.

The next step is to select the observable. In the case of single-band data you can select "Group delay X-band", "Phase delay X-band", or "Single band X-band". It is suggested that beginners use "Group delay X-band". NB: for pSolve, "X-band" means the upper band and "S-band" means the lower band. For instance, in K/C observations, pSolve will refer to the K-band data as "X-band" and the C-band data as "S-band". This may sound confusing.

In the case of dual-band data you have more choices:

  "Group delay X-band", "Phase delay X-band", "Single band X-band",
  "Group delay S-band", "Phase delay S-band", "Single band S-band",
  "G-Gxs combination"

It is recommended to start with the low band.

(Exception: there are a number of databases created in 1985--1993 where the S-band data are lost, but where the ionosphere calibration is present. For these experiments, do not change the solution type and suppression method.)

After you change the data type, re-compute theoretical path delays. The ionosphere contribution depends on the solution type. This is a general rule: every time you change the solution type, you need to re-compute theoretical path delays. Failure to do so may lead to erroneous results.

Setting initial parameterization

Go to the SETFL last page menu by hitting (L) in the OPTIN menu. Turn the estimation of site coordinates off by hitting (#), set all EOP flags to 0 (off), and set the nutation estimation flag to zero by hitting (.). Then set the clock parameters for all stations except the station taken as a clock reference. Hit the key (E); then you'll see the "site menu" of the SETFL program. It is irrelevant which station is taken as the clock reference station at this step, except in the following cases: a) when there are too few good observations (say, less than 8) at that station; b) when the network is split into two independent subnetworks. Set the clock polynomial flags to 1 for the first three terms (clock shift, clock drift, frequency drift), e.g.,

    Clock polynomials   93/07/21 19:59   1 1 1 0 0 * * *

Be sure that all other parameters related to that station are disabled: atmosphere path delay, atmosphere gradients, axis offset. Keys (N) and (P) allow you to move to the next and the previous station.

Check your parameterization once more. Only clock polynomials of orders 0, 1, 2 should be set up. All other parameters should not be activated. Then run an LSQ solution by hitting (Q). Look at the listing of the solution. Check once more that only clock polynomials are in the solution.

Check the wrms. It should not exceed 1 microsecond. If the wrms exceeds that value, it means you have trouble, e.g., several very strong outliers.

Check the clock offsets and rates (CL 0) and (CL 1). If there are stations with clock offsets greater in absolute value than 10^-4 sec (100,000 ns), and/or stations with clock rates greater in absolute value than 10^-9 (100,000 in units of 10^-14, printed as D-14), you should apply an a priori clock model for those stations. The number of digits in a floating point number presentation is not enough to handle the case when adjustments to clock parameters are too large: rounding errors may corrupt results. To overcome this problem, an a priori clock model is added to the theoretical delays and delay rates. If you notice that clock offsets and clock rates exceed the limit for all stations, it means that the clock-reference station itself has an anomalous clock offset or rate. Change the clock-reference station in that case. You can find preliminary values for a clock model in the correlation report, if it is available.

Setting a priori clock values.

If all clock offsets and rates are below the limit you can skip this step.

Find all stations with anomalous clock offsets and/or rates. Write down the values of clock offset and rate for these stations. pSolve supports an a priori clock model for up to 4 stations. Go back to the OPTIN menu. Then go to the last page of SETFL. Then hit (<). You will see the menu of the SET_ACM program. Follow the SET_ACM manual [2]. Make one more LSQ solution after applying the a priori clock model. Clock parameters for the stations with applied clock models should now be close to zero.

Group delay ambiguity resolution.

The next step is to check whether you have to resolve group delay ambiguities at both bands. If your database was generated by PIMA, you rarely have ambiguities in group delays. If your database was generated by Fourfit and then converted to GVF format with the program mark3_to_gvh, you may have group delay ambiguities. Group delay ambiguities are an artifact of the Fourfit algorithm. The way Fourfit works, group delays are not determined in a unique way, but as τ + N * S, where S is the so-called ambiguity spacing and N is an arbitrary integer number. The group delay ambiguity spacing is the quantity that is reciprocal to the minimal frequency separation between intermediate frequencies if the delay is determined with Fourfit, and the quantity that is reciprocal to the resolution of the visibility spectrum if the delay is determined with PIMA. Fourfit and PIMA set up the initial value of N in order to keep the residual group delay, i.e. the difference between the estimated group delay and the theoretical group delay, below 1/2 of the group delay ambiguity spacing. But the accuracy of the theoretical model used by Fourfit is often not sufficient, and as a result the initial value is often wrong, and for a subset of observations it needs to be changed. This process is called group delay ambiguity resolution. Since the spectral resolution is much lower than the minimal frequency separation between intermediate frequencies, group delay ambiguity spacings for delays determined with PIMA are much greater, typically 1–10 microseconds, which is usually significantly greater than errors in the a priori model.
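As a worked illustration of the ambiguity relation (the numbers below are hypothetical examples, not properties of any particular experiment), the measured group delay is

    \tau_{obs} = \tau + N \cdot S, \qquad S = 1 / \Delta f

where \Delta f is the minimal frequency separation between intermediate frequencies (Fourfit) or the resolution of the visibility spectrum (PIMA). For instance, \Delta f = 10 MHz gives S = 100 ns, while a spectral resolution of 0.5 MHz gives S = 2 microseconds, consistent with the typical 1–10 microsecond range quoted above.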

If there are more than 3–5 observations with group delay ambiguities determined incorrectly, these errors corrupt the least squares solution to such an extent that the residuals, after subtraction of the contribution of adjusted parameters from the initial group delay, are so severely distorted that it becomes close to impossible to determine which observations caused the solution distortion by inspecting residuals. Therefore, a more sophisticated algorithm is required.

When group delay ambiguities are resolved, the residuals that correspond to observations affected by wrong ambiguities are aligned along lines above or below the zero line on plots of residuals versus time. The deviation from the zero line is close to N * S.

There is another factor that can lead to the appearance of a significant number of observations with residuals that are aligned in plots versus time after the affected observations are identified and eliminated from the solution: errors in the fringe fitting algorithm that result in picking up a sidelobe of the delay resolution function, or, using other language, a secondary maximum of the Fourier transform of the visibility spectrum. If the phase bandpass was not determined correctly, or there were instrumental factors that distorted fringe phases at some intermediate frequencies and observations had relatively low SNR, the noise in the data may change the amplitude of a maximum of the delay resolution function. The maximum that is secondary for the undisturbed data may have greater amplitude than the original main maximum. A high level of sidelobes, narrow bandwidth of intermediate frequencies, and low SNR increase the chance of an error in the fringe fitting algorithm. Observations affected by this problem have errors in delay of 1–2 main sidelobes.

After suppressing these points from the least squares solution, the residuals have a pattern that resembles errors in group delay ambiguities. The main difference in the plots is that residuals affected by group delay ambiguities have fixed spacings N*S, but observations affected by errors in picking the maximum have residuals that are not commensurate with S. These phenomena are called sub-ambiguities. Group delay ambiguities can be resolved by pSolve. Sub-ambiguities cannot be resolved by pSolve alone; pSolve only marks them as outliers. The sub-ambiguities usually can be resolved by re-running PIMA with a narrow fringe search window. This procedure is called re-fringing. See the section "Resolving sub-ambiguities" for details. To make things complicated, observations can be affected by both group delay ambiguities and sub-ambiguities.
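For reference, the delay resolution function mentioned above can be written (a standard definition, stated here as an assumption since this manual does not spell it out) as

    D(\tau) = \left| \sum_{j} V_j \, e^{2 \pi i f_j \tau} \right|

where V_j are the visibilities at the sky frequencies f_j of the intermediate frequency channels. The positions and amplitudes of the sidelobes of D(\tau) are determined by the frequency sequence {f_j}, which is why phase distortions at individual IFs combined with low SNR can raise a sidelobe above the main maximum.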

There are two ways to resolve group delay ambiguities: a manual procedure and the automatic procedure GAMB. The automatic procedure should be used, except in rare cases which it currently doesn't support. If you have a dataset produced by Fourfit, read the guideline for resolving group delay ambiguities; otherwise follow this document.

Manual elimination of observations affected by sub-ambiguities.

Manual group delay ambiguity resolution is an alternative technique.

Steps of manual sub-ambiguity resolution:

Database update

Some parameters related to the solution, such as group and phase delay ambiguities, suppression status, clock and atmosphere parameterization, baseline-dependent clock status, and reweighting parameters, can be saved in the database. Database update is the final step of analysis. Hit (U) or (CNTRL/U). You have a choice either to update the current version of the database (option 1) or to create a new database with an incremented version counter (option 2). Usually, option 2 is used when the experiment is analyzed for the first time. This allows you to start the analysis anew if an error in the analysis is found. It is rarely needed to have a version counter greater than 2.

Resolving sub-ambiguities

If the fringe fitting procedure picked a maximum of the Fourier transform of the visibilities that is not the global maximum, the estimate of the group delay is wrong. The placement of the secondary maximum depends on the frequency sequence. A good frequency sequence has the amplitude of the secondary maximum in a range of 0.5–0.8, provided there are no systematic phase offsets between intermediate frequencies; in that case the probability of picking up a wrong maximum is low even at a marginal SNR of 6. However, in the presence of systematic phase offsets and/or losses of IFs, the secondary maximum may become close to, or even exceed, the amplitude of the main maximum. If the excessive phase distortion is persistent over the experiment, the points that correspond to observations where fringe fitting picked up a secondary maximum are aligned along horizontal lines on plots of residual group delays. These points are called sub-ambiguities.

Resolving sub-ambiguities is done differently depending on whether the amplitude of the secondary maximum of the delay resolution function is less than or greater than 0.96. Below we consider the case when the secondary maximum is less than 0.96 and the experiment is processed with PIMA.

We first suppress all outliers. When we are satisfied with the solution, we store the residuals. First hit (L), then hit (A) to set "Print residu(A)ls: ON". Then hit (O), then (;) to rewind the spool file, and then hit key (C) in order to set "(C)hange Spooling current: on". Check that the spool file was rewound, spooling is ON, and print residuals is ON. Hit (Q) to run the least squares solution. Hit (space) twice in order to get the listing, then hit (O) and hit (CNTRL/U) in order to save the database. You need to save the database with version > 1. After that, terminate pSolve by hitting (T). Examine the spool file. You will see the residual section. Check that you have the listing for only one run. Copy the spool file into the file /vlbi/$exp/${exp}_${band}_init.spl, where $exp is the lower case experiment name and $band is the lower case band; for instance, /vlbi/bp192b0/bp192b0_c_init.spl . Check that 1) the PIMA control file is /vlbi/$exp/${exp}_${band}_pima.cnt; 2) the keyword BAND is the same as $band, but in upper case; 3) EXP_NAME and EXP_CODE in the file /vlbi/$exp/$exp.desc are $exp in lower case; 4) DB_NAME in /vlbi/$exp/$exp.desc is the 10 character long GVF database name that pSolve just processed.
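A hedged tcsh sketch of the copying step (the location of the pSolve spool file depends on your installation, so it is left as a placeholder; bp192b0 and band c are the example values from above):

    # Copy the stored residuals to the place pima_samb.csh expects.
    # Replace {spool_file} with the path of your pSolve spool file.
    set exp  = bp192b0
    set band = c
    cp {spool_file} /vlbi/$exp/${exp}_${band}_init.spl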

Run script pima_samb.csh:

pima_samb.csh {exp} {band} {snr} [db/no_db] [min_res] [no_staging]

where
  exp        = the experiment name (lower case);
  band       = the band (lower case);
  snr        = the SNR detection limit;
  db         = create the output database (default); no_db = skip database creation;
  min_res    = the minimum absolute value of the group delay residual for an observation to be eligible for the sub-ambiguity resolution procedure;
  no_staging = prohibit using the staging directory for PIMA even if it is defined in the PIMA control file.

Since the search window is narrower, you can reduce the SNR detection limit; a value of 4.8 is usually adequate. If you have one band, you need to use the fourth argument db, i.e. to create the database. After pima_samb.csh finishes, you load the last version of the database, run the least squares solution, and examine the residuals. pima_samb.csh rarely resolves all sub-ambiguities, but almost always resolves some of them. Run ELIM to remove possible outliers and then MILE to restore points, including those that were marked as outliers before running pima_samb.csh.
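For example, for the single-band case above (experiment bp192b0, band c, SNR detection limit 4.8):

    pima_samb.csh bp192b0 c 4.8 db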

If your experiment has two bands, you create the residual files for each band separately. Then you first run pima_samb.csh for the lower band (i.e. S-band for X/S observations) and specify no_db as the fourth argument. Then you run pima_samb.csh for the higher band and specify db as the fourth argument (see the sketch below). Then load the database and run ELIM and MILE, first with the lower band, then with the higher band, then with the ionosphere-free linear combination of the two bands.
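A sketch of the dual-band sequence for a hypothetical X/S experiment (replace {exp} with your lower case experiment name; the SNR limit 4.8 is the example value from above):

    pima_samb.csh {exp} s 4.8 no_db
    pima_samb.csh {exp} x 4.8 db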

What happens under the hood?

pima_samb.csh calls the program samb. Program samb reads the residual file and computes a narrow search window over group delay for the subsequent fringe search. The center of the search window corresponds to the group delay predicted on the basis of the least squares solution. The accuracy of the path delay prediction is within 2–3 wrms of the postfit residuals, typically 100–300 ps. The optional min_res argument of pima_samb.csh defines the semi-width of the fringe search window; if it is omitted, a meaningful default is used.

pima_samb.csh generates a control file for PIMA with the narrow fringe search window for each observation marked as an outlier in the residual file, i.e. with < or R in the 8th column. Then it runs PIMA with that control file. This forces PIMA to pick up the maximum of the delay resolution function within the specified window. Upon completion, pima_samb.csh runs the PIMA task mkdb and creates version 1 of the database file in GVF format. Finally, pima_samb.csh runs the program gvf_supr_promote . Since you lowered the SNR detection limit during re-fringing, you need to carry the flag of detection into the latest database version. Program gvf_supr_promote does this.

References

Some user documentation related to pSolve:

[1] pSolve batch mode user guide.
[2] SET_ACM user manual.

This document was prepared by Leonid Petrov.

Last update: 2015.12.05_22:35:35