Accumulation period (integration time) in DiFX.

I wanted to force the DiFX correlator to place every accumulation period (AP, i.e. integration time) on a regular grid across the entire experiment. The benefit of such a time tag assignment is that if the array observed source A in scan #1, source B in scan #2, and source A again in scan #3, scans #1 and #3 can be glued together and the fringe search procedure can be applied to the glued scan, provided the gap between the scans is not long enough to hit the decorrelation limit.

Unfortunately, as of August 2010 the DiFX correlator did not, by default, make the gap between the last uv-point of one scan and the first uv-point of the next scan commensurate with the AP length.

Adam Deller, in his letters of 2010.08.10, provided a detailed explanation:

The correlator divides integrations ("APs") into subintegrations, where one subintegration is processed on one node and the results returned to a central location for further processing. Obviously, a subintegration is forced to be an integer number of FFTs. An FFT duration is given by 2 * num spectral points per band * sampling time.

For continuous observations the sampling time in seconds is usually 1/(2B), where B is the IF width in Hertz.
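
To make the FFT duration formula above concrete, here is a minimal Python sketch. The function name fft_duration and its arguments are my own and are not part of DiFX; it assumes Nyquist sampling (sampling time 1/(2B)) and an integer bandwidth in Hz so that the arithmetic stays exact.

```python
from fractions import Fraction

def fft_duration(bandwidth_hz, n_spectral_points):
    """FFT duration in seconds, as an exact fraction.

    Assumes Nyquist sampling, i.e. sampling time = 1/(2B), and that one FFT
    spans 2 * n_spectral_points samples. bandwidth_hz must be an integer.
    """
    sampling_time = Fraction(1, 2 * bandwidth_hz)
    return 2 * n_spectral_points * sampling_time

# 8 MHz band with 512 spectral points
d = fft_duration(8_000_000, 512)
print(f"{float(d) * 1e6:.1f} microseconds")
```

With these assumptions the duration reduces to n_spectral_points / B, which gives 64 microseconds for an 8 MHz band with 512 spectral points.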

By default, if an experiment has an integration time which would not lead to an integer number of subintegrations, the VLBA correlator adjusts the integration time to be an integer number of subintegrations. This means the weights remain constant and the visibility time tags of every integration are exactly correct. This behaviour can be relaxed, retaining the original integration time and allowing DiFX to vary the number of subintegrations to remain always within 1/2 subintegration of the desired integration duration. However, this leads to varying integration widths and hence weights (e.g. 0.99, 1.01, 0.99, 1.01 ...) and time tags which are not exactly correct, although they do then lie on an exact grid.
If you need your visibilities to fall on a grid, please include with your instructions to the correlator facility a request to turn off the feature "tweakIntTime". This will prevent the integration time being changed as the correlator job is queued. Depending on your requirements for spectral resolution, it may or may not still be possible to have an integer number of subintegrations per integration. For example, if you had 8 MHz bands and required 512 channels per band, a single FFT would span 64 microseconds, and there is no way to get an integer number of FFTs in exactly 0.25 seconds. If that is the case you will have to accept the drawbacks listed above if having a uniform visibility grid is crucial. If, however, you calculate that you can fit an integer number of FFTs into your integration time, it would then greatly assist the correlator facility if you include this information and specifically request that the subintegration time be an integer divisor of this duration/number of FFTs.
If a subintegration length is not explicitly specified, the queuing software tries to select something sensible based on some simple heuristics, which try to satisfy constraints such as optimal buffer sizes. To see what was chosen, you would need to look at the correlator configuration file. For current and past jobs, there is a line BLOCKS PER SEND which gives the number of FFTs in a subintegration, which is converted to a time by multiplying by the FFT duration. In the development version of DiFX, this is instead specified as SUBINT NANOSECONDS (and if you want to know how many FFTs that is, divide by the FFT duration).
Since the integrations always begin anew at the start of a new scan, the most convenient way to ensure this happens is to have integrations which are 2^N seconds, where N is one of 0, -1, -2, -3... This is only because scans always start on an integer second boundary.
Let me clarify. When I say "scan", I mean the scan defined in the vex file. The first visibility that is generated and written will depend on when a given antenna got on source and started recording. This might be the very first visibility possible, or it might be 5 seconds later, or 10, or whatever. But the visibility time tags will follow an exact sequence beginning at the start of the vex scan. Of course, AIPS and other post-processing software only see the visibilities, so they have no idea when the vex "scan" started. But if you look at the vex file for your experiments, and see when the scans start, you will find that the visibilities fall on a grid relative to the start of the vex scan, which starts on an integer second.
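
The feasibility check that Adam describes (does an integer number of FFTs fit into the requested integration time?) and the conversion of BLOCKS PER SEND into a subintegration length can be sketched as follows. This is only an illustration: the function names are hypothetical, and the code does not read a real DiFX configuration file.

```python
from fractions import Fraction

def ffts_per_integration(int_time, fft_duration):
    """Exact (possibly fractional) number of FFTs in one integration."""
    return int_time / fft_duration

def fits_integer_ffts(int_time, fft_duration):
    """True if the integration time is an integer number of FFTs."""
    return ffts_per_integration(int_time, fft_duration).denominator == 1

def subint_seconds(blocks_per_send, fft_duration):
    """Subintegration length: number of FFTs per subintegration times FFT duration."""
    return blocks_per_send * fft_duration

# Example from the letter: 8 MHz bands, 512 channels -> 64 microsecond FFTs.
fft_64us = Fraction(64, 1_000_000)
print(float(ffts_per_integration(Fraction("0.25"), fft_64us)))  # 3906.25
print(fits_integer_ffts(Fraction("0.25"), fft_64us))            # False
```

Times are passed as Fraction objects (built from strings such as "0.25") so that values like 0.26 are not distorted by binary floating point.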

As far as I understand, the correlator uses (or may use?) flag data to bypass time when the station is not on source. The flag data are rounded to an integer second.

If you have e.g. 0.64 second integrations and e.g. a 15 second scan, that scan would have 24 integrations but the last one would not be complete, and the first integration of the following scan would not be timestamped at 0.64 seconds after the preceding one, which is no good.
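
The arithmetic of this example can be checked with a short Python sketch. The helper names are hypothetical, and the sketch assumes that the integration grid restarts at each scan start, as described in the letter.

```python
from fractions import Fraction

def scan_integrations(scan_len, ap):
    """Return (number of integrations, True if the last one is complete)."""
    n_full, remainder = divmod(scan_len, ap)
    if remainder == 0:
        return n_full, True
    return n_full + 1, False

def on_common_grid(start_offset, ap):
    """True if a scan starting start_offset seconds after the previous scan
    start keeps its time tags on the AP grid of the previous scan."""
    return start_offset % ap == 0

ap = Fraction("0.64")
print(scan_integrations(Fraction(15), ap))             # (24, False)
print(on_common_grid(Fraction(15), ap))                # False
print(on_common_grid(Fraction(15), Fraction("0.25")))  # True
```

With an AP of 0.25 seconds, any scan start on an integer second falls on the same grid, which is the point of choosing integrations of 2^N seconds.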

I noticed that in experiment BC191A, processed with DiFX 1.5-3, the integration time specified as 0.25 sec in the key file (coravg = 0.25) was changed to 0.26 sec, which made the gap between scans not commensurate with the AP length.

Adam continues:

I guess in your case the heuristics chose a subint length of 3250 FFTs, and then changed the integration time to be 0.26 seconds = 16250 FFTs, thus 5 subintegrations per integration. In this case, having 3125 FFTs would have given the correct integration time and still been reasonable. I can look at improving the heuristics, but the best way to be sure this situation does not arise is to 1) make sure it is feasible to have an integer number of FFTs (in this case, yes), and 2) include in your notes/requests to the correlator a request to turn "tweakIntTime" off and to check that the chosen subintegration duration is still an integer divisor of the integration time.
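
To see which subintegration lengths would have preserved the requested 0.25 second integration, one can list the divisors of the number of FFTs per integration. The sketch below is only an illustration and does not reproduce the actual heuristics of the queuing software. The FFT duration of 16 microseconds is inferred from the quoted "0.26 seconds = 16250 FFTs"; the function name is my own.

```python
from fractions import Fraction

def subint_choices(n_ffts_per_integration):
    """All FFT counts per subintegration that divide the integration exactly."""
    n = n_ffts_per_integration
    return [d for d in range(1, n + 1) if n % d == 0]

fft = Fraction(16, 1_000_000)          # inferred: 0.26 s / 16250 FFTs
n_ffts = Fraction("0.25") / fft        # FFTs in the requested integration
assert n_ffts.denominator == 1         # an integer number of FFTs is feasible
choices = subint_choices(int(n_ffts))
print(int(n_ffts), choices)
```

The list contains 3125 (5 subintegrations per integration) but not 3250, which is consistent with the explanation above.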

Am I right that with "tweakIntTime" off, the number of subintegrations in each AP will be the same, except for the last AP?

Not necessarily. Imagine you requested an integration time of 1 second, and you had 8 MHz bands with 4096 spectral points. The FFT duration is 512 microseconds, so there are 1953.125 FFTs per second. So say the subintegration duration is set to 125 FFTs, or 64 ms. On average there will be 15.625 subintegrations per integration. But what will actually happen is the first integration will have 16 subintegrations, so it will actually last 1.024 seconds. The next will only have 15 subintegrations, for a duration of 0.96 s. It will go on: 16, 15, 16, 16, 15, 16, 15, 16, ... keeping the error in the time tag to at most 1/2 a subint = 32 ms ~= 3% of an integration.
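
The pattern of subintegration counts can be reproduced with a simple model, sketched below in Python. The model is an assumption on my part, not the actual DiFX code: each integration is taken to end at the subintegration boundary nearest to its ideal end time, with ties rounded up. The function name is hypothetical.

```python
import math
from fractions import Fraction

def subints_per_integration(int_time, subint, n_integrations):
    """Number of subintegrations in each of the first n_integrations.

    Assumed model: integration k ends at the subint boundary nearest to
    k * int_time (ties rounded up), so the time tag error never exceeds
    half a subintegration.
    """
    counts = []
    prev_boundary = 0
    for k in range(1, n_integrations + 1):
        boundary = math.floor(k * int_time / subint + Fraction(1, 2))
        counts.append(boundary - prev_boundary)
        prev_boundary = boundary
    return counts

# 1 s integrations, 64 ms subintegrations (125 FFTs of 512 microseconds)
counts = subints_per_integration(Fraction(1), Fraction(64, 1000), 8)
print(counts)  # [16, 15, 16, 16, 15, 16, 15, 16]
```

Under this model the counts match the sequence quoted in the letter, and 8 integrations contain exactly 125 subintegrations, i.e. 8 seconds.
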
Remember that the integration ("AP") can be made up of some number of subintegrations, and it is the subintegration length which is specified by BLOCKS PER SEND. So here the subintegration length is 0.4 seconds, and there are 10 subintegrations per integration.
The FITS files have the time tags of the integrations. The time tag of an integration is the center of the visibility, not the start. This is why the scan started on 0.13 seconds (half an integration). So the scan itself started on an integer second, but the first visibility record (and hence the start of the scan reported in AIPS) has time "scan start + half an AP".
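
Since the time tag refers to the center of an integration, the tags relative to the scan start follow (k + 1/2) * AP. A minimal sketch, with a hypothetical function name, for the 0.26 second case:

```python
from fractions import Fraction

def time_tags(ap, n):
    """Time tags (seconds after the vex scan start) of the first n
    integrations; each tag is the center of its integration."""
    return [(k + Fraction(1, 2)) * ap for k in range(n)]

tags = time_tags(Fraction("0.26"), 3)
print([float(t) for t in tags])
```

The first tag is half an AP after the scan start, i.e. 0.13 seconds for a 0.26 second integration.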

I got confirmation from Walter today, 2010.08.10, that "tweakIntTime" has a known bug in DiFX-1.5-3 and older versions. He will try to address it when he gets a chance. But for now, so long as it is turned off and we ensure that the subintegration durations match what we'd like, we are fine.

2010.08.10_21:18:35