Convert SMA to CASA

Creates one FITS-IDI file for each sideband of each chunk of a raw SMA data set. The FITS-IDI files are later read by a second script, smaImportFix.py, which creates a CASA Measurement Set (MS). The reason for this two-step conversion is that the CASA Measurement Set format is not a stable target; the CASA software group can change it at any time. The FITS-IDI format, on the other hand, is a standard that is unlikely to change, because changing it would break many existing software packages. See the "Getting started" section for information about obtaining these scripts.
Uses Tsys information in the SMA dataset to convert SMA visibilities to "pseudo-Jansky" units, which (barring significant problems with the track, e.g., bad pointing) are within ~20% of the correct Jy values. It also uses Tsys, along with integration time, to calculate weights for the visibilities. These calculations are performed before the FITS-IDI files are written.

2. Getting started

By default, sma2casa.py calls the C module makevis, which requires that the compiled object file makevis.so be present in your current working directory. If it is not, or if sma2casa.py does not work properly, you can recompile makevis.c using the source code and Makefile in the GitHub repository. Alternatively, the script can be run with the -P option. This option has the advantage of being more portable (no recompiling necessary), but the disadvantage of running more slowly (by about a factor of 2).

The following Python modules must be present:

numpy
astropy
pyfits (now a part of astropy)

To verify that they are installed, type (on successive lines):

python
help()
modules

If you don't have the necessary Python modules, install the Anaconda Python distribution, which has everything needed for running sma2casa.py.
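
As an alternative to the interactive help() listing, the same check can be scripted. The sketch below is only an illustration: the helper name check_modules is hypothetical, and it reports whether each module can be located, not whether its version is suitable for sma2casa.py.

```python
import importlib.util


def check_modules(names):
    """Return a dict mapping each module name to True if Python can find it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # numpy and astropy are the modules listed above (pyfits now lives in astropy).
    for name, found in check_modules(["numpy", "astropy"]).items():
        print(name, "found" if found else "MISSING")
```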

3. How to run sma2casa.py

Execute the command in your terminal window:

$ sma2casa.py [path-to-SMA-data] [options]

Script options

-c x,y,z [--chunks x,y,z]     Process chunks x, y and z only (comma-separated list)
-h [--help]                   Print this message and exit
-l [--lower]                  Process lower sideband only
-n [--newChunk]               Define a new, synthetic, spectral chunk
-p [--percent]                Percent to trim on band edge (default = 10)
-P [--PythonOnly]             Do not use the C module "makevis"
-r [--receiver]               Specify the receiver for multi-receiver tracks (230, 345, 400 or 650)
-R [--RxFix]                  Force the data to be treated as single receiver
-s [--silent]                 Run silently unless an error occurs
-t [--trim]                   Set the amplitude at chunk edges to 0.0
-T n=m                        Use antenna m's Tsys for antenna n
-u [--upper]                  Process upper sideband only
-w x,y,z [--without x,y,z]    Do NOT process chunks x, y and z (comma-separated list)

The -T option provides a crude way to handle bad Tsys information. If antenna n has noisy or garbage Tsys values, -T allows antenna m's Tsys to be used instead for that antenna. Dual receiver tracks must use the -r option, and each receiver must be processed separately.
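
The effect of -T n=m can be pictured with a small sketch. This is not the code used by sma2casa.py; the function names and the per-antenna dictionary layout are assumptions made purely for illustration.

```python
def parse_tsys_swaps(specs):
    """Turn strings like "4=7" into a {bad_antenna: donor_antenna} dict."""
    swaps = {}
    for spec in specs:
        bad, donor = spec.split("=")
        swaps[int(bad)] = int(donor)
    return swaps


def apply_tsys_swaps(tsys, swaps):
    """Return a copy of tsys with each bad antenna's values replaced.

    tsys is assumed to map antenna number -> list of Tsys samples.
    """
    fixed = dict(tsys)
    for bad, donor in swaps.items():
        fixed[bad] = list(tsys[donor])  # always read from the original values
    return fixed


# Hypothetical Tsys samples: antennas 4 and 5 are garbage, antenna 7 is good.
tsys = {4: [9999.0, -1.0], 5: [0.0, 0.0], 7: [210.0, 215.0]}
fixed = apply_tsys_swaps(tsys, parse_tsys_swaps(["4=7", "5=7"]))
```

After the call, antennas 4 and 5 carry antenna 7's values, and the original tsys dictionary is left unchanged.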

The -n switch allows you to define new, synthetic, spectral "chunks" (spws in CASA terminology) which have different total bandwidths and spectral resolutions. The main purpose for this feature is to allow SWARM chunks to be processed at reduced spectral resolution, especially for continuum projects, in order to reduce the size of the CASA Measurement Sets. Although SWARM chunks are the ones we are most apt to want to resample, the -n option will work with ASIC correlator chunks too. One may produce a synthetic chunk from a chunk which has been excluded with the -w switch; indeed, doing so will probably be the most common use case. The syntax for a synthetic chunk is

-n originalChunk:startChannel:endChannel:nAverage

where originalChunk is the hardware chunk number (49 or 50 for SWARM), startChannel is the first channel to use in the original hardware chunk, endChannel is the last channel to use, and nAverage is the number of channels of the original chunk to vector average to produce a single channel in the synthetic chunk. (1 + endChannel - startChannel) must be evenly divisible by nAverage. One can define any number of synthetic chunks. For that reason, the -n option must be the last parameter on the command line: it grabs all the remaining text on the line and attempts to parse it into new synthetic chunk specifications.
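
The divisibility rule and the vector averaging can be sketched as follows. This is a minimal illustration, not the parser in sma2casa.py: the function names are hypothetical, and channel numbers are assumed to be 0-based and inclusive.

```python
def parse_new_chunk(spec):
    """Parse 'originalChunk:startChannel:endChannel:nAverage'.

    Returns (chunk, start, end, n_average, n_output_channels).
    """
    chunk, start, end, n_avg = (int(field) for field in spec.split(":"))
    n_in = 1 + end - start
    if n_avg <= 0 or n_in <= 0 or n_in % n_avg != 0:
        raise ValueError(
            "(1 + endChannel - startChannel) must be evenly divisible by nAverage"
        )
    return chunk, start, end, n_avg, n_in // n_avg


def vector_average(vis, start, end, n_avg):
    """Average complex visibilities in consecutive groups of n_avg channels.

    Vector averaging means the complex values are averaged directly, so
    phase information is kept (amplitudes alone are not averaged).
    """
    selected = vis[start:end + 1]
    return [
        sum(selected[i:i + n_avg]) / n_avg
        for i in range(0, len(selected), n_avg)
    ]


# 14335 - 2048 + 1 = 12288 input channels, averaged by 128 -> 96 channels.
print(parse_new_chunk("49:2048:14335:128"))
```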

4. Examples

Process both sidebands of all chunks in the data set located at /sma/SMAusers/taco/130408_17:20:01/
$ sma2casa.py /sma/SMAusers/taco/130408_17:20:01/
Process data from the upper sideband of the 400 GHz Rx only
$ sma2casa.py /sma/SMAusers/taco/130408_17:20:01/ -u -r 400
Make a FITS-IDI file for the lower sideband of chunks s03, s05 and s20, if data exists for those chunks (i.e. if the number of channels has not been set to 0 in the restartCorrelator command).
$ sma2casa.py /sma/SMAusers/taco/130408_17:20:01/ -l -c 3,5,20
Make a FITS-IDI file for the lower sideband of chunk s40, and trim the highest and lowest 10% of the channels by setting their amplitudes to 0.0 (which will ultimately cause them to be flagged bad).
$ sma2casa.py /sma/SMAusers/taco/130408_17:20:01/ -l -c 40 -t
Make a set of FITS-IDI files with the Tsys values for antennas 4 and 5 replaced by the Tsys values for antenna 7.
$ sma2casa.py /sma/SMAusers/taco/130408_17:20:01/ -T 4=7 -T 5=7
Make a set of FITS-IDI files with synthetic SWARM chunks (and no real SWARM chunks) with the edge 2048 channels chopped off, and points vector averaged in sets of 128.
$ sma2casa.py /sma/SMAusers/taco/130408_17:20:01/ -w 49,50 -n 49:2048:14335:128 50:2048:14335:128
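
The edge trimming used in the -t example above can be sketched in a few lines. This is an assumption-laden illustration, not the script's own code: the function name is hypothetical, the percentage is applied to each edge separately (as the "highest and lowest 10%" wording suggests), and the channel count is rounded down.

```python
def trim_edges(amplitudes, percent=10.0):
    """Return a copy of amplitudes with the edge channels set to 0.0.

    percent is applied to each edge separately; the number of trimmed
    channels per edge is rounded down (an assumption).
    """
    n = len(amplitudes)
    n_trim = int(n * percent / 100.0)
    trimmed = list(amplitudes)
    for i in range(n_trim):
        trimmed[i] = 0.0          # lowest channels
        trimmed[n - 1 - i] = 0.0  # highest channels
    return trimmed


# A 20-channel toy spectrum: 2 channels are zeroed at each edge.
spectrum = trim_edges([1.0] * 20, percent=10)
```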

5. Script caveats

sma2casa.py calls the C language module, makevis, which does the low-level processing of the visibilities data. The object code for this module is makevis.so and it must be present in your working directory. Since makevis.so is compiled code, it is not as portable as Python code.

If the script does not work properly in this mode, you can do one of the following:

Recompile makevis.c. The source code and a Makefile are included in the git repository.
Run the script, sma2casa.py, with the -P option. This will tell the script to use Python code only, but will slow down execution by ~ a factor of 2.

sma2casa.py maps the entire visibilities data file into RAM. Hence, the script is apt to run very slowly if the computer's available RAM is smaller than the size of the SMA file, sch_read.
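
Before starting a long conversion, it may help to compare the size of sch_read with the machine's memory. The sketch below makes several assumptions: sch_read is taken to sit directly inside the data directory, the os.sysconf names are POSIX-specific and may not exist on every platform, and the figure returned is total physical RAM, which is only an upper bound on the available RAM.

```python
import os


def total_ram_bytes():
    """Total physical RAM via POSIX sysconf (not available on every platform)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


def fits_in_ram(file_bytes, ram_bytes):
    """True if a file of file_bytes can be held in ram_bytes of memory."""
    return file_bytes <= ram_bytes


def check_sch_read(data_dir):
    """Compare the size of data_dir/sch_read with the machine's total RAM."""
    size = os.path.getsize(os.path.join(data_dir, "sch_read"))
    return fits_in_ram(size, total_ram_bytes())
```

If check_sch_read returns False for a track, expect the conversion to run slowly.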

smaImportFix.py

1. What the script does

The Python script smaImportFix.py is run from inside CASA. It reads the FITS-IDI files output by sma2casa.py and writes them in CASA Measurement Set (MS) format. A single output file is produced.

Makes a list of which FITS-IDI files are in the current directory, so that it knows which chunks should be processed.

Reads each FITS-IDI file into a separate CASA measurement set (MS).

To indicate that a data value is bad, smaImportFix.py sets its amplitude to 0.0. It runs the flagdata task on the newly created MS in order to explicitly flag those data points bad. Chunk edge channels are also flagged bad in this step, if you passed arguments to sma2casa.py indicating that you wanted to trim edge channels.
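
The zero-amplitude convention amounts to building a flag mask from the data themselves. In CASA that work is done by the flagdata task; the plain-Python sketch below only illustrates the idea, and the function name is hypothetical.

```python
def zero_amplitude_flags(vis):
    """Return True for every visibility whose amplitude is exactly 0.0."""
    return [abs(value) == 0.0 for value in vis]


# Toy spectrum: the two edge channels were zeroed and are therefore flagged.
flags = zero_amplitude_flags([0j, 1.5 + 0.2j, 0.3 - 0.4j, 0j])
```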

Deletes the FITS-IDI files

Fixes the weights. For all chunks except the pseudo-continuum chunk, the CASA importfitsidi task sets the data weights to 1.0. smaImportFix.py fixes this problem and puts the proper weights, proportional to (integration time)/Tsys**2, in the weight table of the MS.
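
The weight relation can be written out as a short sketch. Two things here are assumptions rather than statements about smaImportFix.py: the constant of proportionality is set to 1, and the Tsys**2 term for a baseline is taken to be the product of the two antennas' Tsys values.

```python
def visibility_weight(t_int, tsys_i, tsys_j):
    """Weight proportional to (integration time) / Tsys**2.

    Assumption: for a baseline between antennas i and j, Tsys**2 is taken
    as tsys_i * tsys_j. The proportionality constant is set to 1.
    """
    return t_int / (tsys_i * tsys_j)


# Doubling Tsys on both antennas cuts the weight by a factor of 4.
w_good = visibility_weight(30.0, 200.0, 200.0)
w_poor = visibility_weight(30.0, 400.0, 400.0)
```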

Generates new scan numbers. In the raw SMA data sets, each timeslice of data stored by the correlator has a unique scan number. CASA MSs usually have scan numbers which change only when the source is changed (helpful in controlling how calibration information is averaged and interpolated). smaImportFix.py therefore generates a new set of scan numbers which increment only when the source changes. This means that there will usually be several integrations which share the same scan number.
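
The renumbering rule can be illustrated with a short sketch. It is not the code in smaImportFix.py: the function name is hypothetical, and starting the count at 1 is an assumption.

```python
def renumber_scans(sources, first_scan=1):
    """Assign scan numbers that increment only when the source changes.

    sources is the time-ordered list of source names, one per integration.
    """
    scans = []
    scan = first_scan
    previous = None
    for index, source in enumerate(sources):
        if index > 0 and source != previous:
            scan += 1
        scans.append(scan)
        previous = source
    return scans


# Six integrations on three source visits share three scan numbers.
track = ["3c84", "3c84", "orion", "orion", "orion", "3c84"]
print(renumber_scans(track))  # [1, 1, 2, 2, 2, 3]
```

Note that returning to an earlier source starts a new scan rather than reusing the old number.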

Concatenates individual chunk MSs into a single MS. One such concatenated MS is made for each sideband, MyDataLower and MyDataUpper.

2. Getting started

The following Python modules are part of a standard CASA installation and should be present:

3. Running smaImportFix.py

This script is run from inside CASA.

CASA: execfile('smaImportFix.py')

4. Script caveats

There is a strange, intermittent problem with importing dual receiver data into CASA. Occasionally, the frequency scale gets set incorrectly. There is a work-around for this issue:

make an empty directory

copy the FITS-IDI files into the new directory

cd to the new directory

(re)start CASA in the new directory and run the script, smaImportFix.py

copy the MSs to their permanent location
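
The first two steps of the work-around can be scripted. The sketch below is a convenience under stated assumptions: the function name is hypothetical, and the file-name pattern must be supplied by the caller because the naming of the FITS-IDI files is not specified here.

```python
import glob
import os
import shutil


def stage_fits_files(source_dir, new_dir, pattern):
    """Create new_dir and copy the files matching pattern into it.

    os.makedirs raises an error if new_dir already exists, which
    guarantees that the directory starts out empty.
    """
    os.makedirs(new_dir)
    copied = []
    for path in sorted(glob.glob(os.path.join(source_dir, pattern))):
        copied.append(shutil.copy(path, new_dir))
    return copied
```

After staging, cd to the new directory and (re)start CASA there as described above.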

The CASA SYSCAL table has the Tsys values stored in it, but that is probably only useful for plotting the Tsys data (via browsetable). The SYSCAL table is not in the format expected by CASA, so the CASA commands which use Tsys information will not work properly.

5. Script notes

If you don't import the SMA pseudo-continuum "chunk" into CASA, i.e., spectral chunks only:

s01 will correspond to spw0
s02 will correspond to spw1

Note that this has to be set and is not the default. While the SMA pseudo-continuum "chunk" is not very useful in CASA, it keeps the numbering of SMA chunks and CASA spws the same.
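
The chunk-to-spw correspondence can be summarized in one small function. It is a sketch that assumes every chunk from s01 upward is imported with no gaps; the function name is hypothetical.

```python
def chunk_to_spw(chunk_number, continuum_imported):
    """Map an SMA chunk number (s01 -> 1) to a CASA spw index.

    With the pseudo-continuum chunk imported it occupies spw 0, so chunk
    and spw numbers agree; without it, every spw index is one lower.
    Assumes all chunks from s01 upward are present.
    """
    if chunk_number < 1:
        raise ValueError("spectral chunk numbers start at 1")
    return chunk_number if continuum_imported else chunk_number - 1


print(chunk_to_spw(1, continuum_imported=False))  # 0, i.e. s01 -> spw0
print(chunk_to_spw(2, continuum_imported=False))  # 1, i.e. s02 -> spw1
```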

The names of the antennas in CASA will be "SMA1", "SMA2", etc. The "Station" parameter for each antenna is set to the pad name for the pad the antenna was sitting on during the observation.

If you get the error message "NameError: name 'tflagdata' is not defined", change "tflagdata" to "flagdata" in the smaImportFix.py script. CASA 4.2.0 stable (r26945) and earlier versions recognize tflagdata; CASA 4.2.0 release (r28322) and later versions do not.
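
Instead of editing the script by hand, one could select whichever task name the running CASA defines. This is only a sketch of that idea and not part of smaImportFix.py; the helper name pick_flag_task is hypothetical, and inside CASA it would be called as pick_flag_task(globals()).

```python
def pick_flag_task(namespace):
    """Return the flagging task defined in namespace.

    tflagdata is tried first (older CASA versions), then flagdata.
    """
    for name in ("tflagdata", "flagdata"):
        task = namespace.get(name)
        if task is not None:
            return task
    raise NameError("neither 'tflagdata' nor 'flagdata' is defined")


# Stand-in namespace for illustration; inside CASA, pass globals() instead.
fake_casa = {"flagdata": lambda **kwargs: "flagdata called"}
flag = pick_flag_task(fake_casa)
```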

CENTER FOR ASTROPHYSICS | HARVARD & SMITHSONIAN
60 GARDEN STREET, CAMBRIDGE, MA 02138