SMA Data FAQ


General
  1. Can I get help?
  2. How do I acknowledge the SMA in my publication?
  3. Can I get access to the reduction software at CfA?
  4. What is the size of the synthesized beam for my configuration?
Archive
  1. What data is available in the archive?
  2. Do you provide the data in CASA format?
  3. Do you offer reduced data?
  4. How soon is data available on the RTDC machines and in the archive?
  5. Is there a quick way to get all the observed data for my project?
  6. Why do I see multiple data files for the same target on the same UT date?
  7. Can I find out what happened during the observation of this data?

Data Format

  1. What is the difference between a chunk and a spectral window?
  2. What is the resolution of these data?
  3. What will be the size of the final map?
  4. What are all the files in the data directory?
  5. Why does this raw data file have multiple science sources in it?
  6. What is the continuum channel?
  7. Can I find out whether this data is SWARM or ASIC?
  8. Can I find out what sources are in this data set from the command line?
  9. How can I tell if this is polarimetry data?

Data Processing

  1. What is the best way to reduce SMA data?
  2. What about updating baselines?
  3. I have polarization data; how do I reduce it?
  4. How can I rebin these data?
  5. Why did SMARechunker crash?
  6. How can I stop sma2casa thinking SWARM data is old format?
  7. Where can I find the passband calibrator? I don't see it in the data I downloaded.
  8. How do I choose a reference antenna?
  9. How can I combine two files so I can use the passband data from a different file?
  10. Is a single pointing for the whole track sufficient?
  11. Why do I need to regenerate the continuum channel?
  12. My flux (or gain) calibration fails, why?
  13. What should the phase look like?
  14. What is a phase jump and what should I do about it?
  15. Does it matter if my bandpass calibrator is before or after my science track?
  16. Do I have to do the full flux calibration?
  17. All the recorded Tsys values equal zero. What should I do?
  18. MIR gives me the message 'Two antennas on the same pad !' What should I do?
[Any questions you'd like to see answered here? Email holly.thomas@cfa.harvard.edu]


General

  1. Can I get help?

    Yes. The SMA can provide help with data reduction using IDL/MIR and MIRIAD. Please email smarequester@cfa.harvard.edu to be put in touch with someone, or email smamiriad@cfa.harvard.edu with MIRIAD-specific questions.

  2. How do I acknowledge the SMA in my publication?

    "The Submillimeter Array is a joint project between the Smithsonian Astrophysical Observatory and the Academia Sinica Institute of Astronomy and Astrophysics and is funded by the Smithsonian Institution and the Academia Sinica."

  3. Can I get access to the reduction software at CfA?

    Remote access to the RTDC requires a CF (CfA Computing Facility) account. If you have one, you may request an RTDC account. If you do not have a CF login, you can gain RTDC access as a guest, but only on site. Alternatively, you can request guest access to the SMA computing facilities in Hilo to reduce SMA data.

  4. What is the size of the synthesized beam for my configuration?

    At 345GHz the sizes of the synthesized beams are subcompact=5", compact=2", extended=0.7" and very extended=0.25".
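
    For other frequencies, a rough estimate follows from the fact that the synthesized beam of a fixed configuration scales inversely with observing frequency. The sketch below applies that scaling to the 345GHz values quoted above. The function and dictionary names are illustrative only, and the real beam also depends on uv coverage, source declination and imaging weights.

```python
# Approximate synthesized beam sizes (arcsec) at 345 GHz, as quoted in this FAQ.
BEAM_AT_345GHZ = {
    "subcompact": 5.0,
    "compact": 2.0,
    "extended": 0.7,
    "very extended": 0.25,
}


def approx_beam_arcsec(configuration, freq_ghz):
    """Scale the quoted 345 GHz beam to another frequency (beam ~ 1/frequency).

    This is a rule-of-thumb estimate, not a substitute for the beam
    reported by the imaging software.
    """
    return BEAM_AT_345GHZ[configuration] * 345.0 / freq_ghz


for name in BEAM_AT_345GHZ:
    print(f"{name:14s} ~{approx_beam_arcsec(name, 230.0):.2f} arcsec at 230 GHz")
```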

[Back to the list]


Archive

  1. What data is available in the archive?

    Raw SMA data is available from April 13, 2002 to the present. For observations made before October 2015, calibrated data sets may be available for some sources. A small subset of this data has been imaged. These can be accessed via the science archive or you can search the Processed Image Archive independently.

    See here for more details.

  2. Do you provide the data in CASA format?

    No, but it is possible to convert it to CASA MS format yourself. See Converting SMA data to CASA MS format for instructions.

  3. Do you offer reduced data?

    We do not provide pipeline-reduced data at the moment. A small subset of historical data has been calibrated and imaged. You can search just these files at our Processed Image Archive.

  4. How soon is data available on the RTDC machines and in the archive?

    SMA data gets copied from Hilo to RTDC8 every night at 21:00 EST. This is 3 or 4pm HST depending on the time of year. This time is chosen to ensure any daytime observing is completed, and data has been copied from the summit to Hilo.

    At midnight the data is copied from the RTDC machines to the CF to be made available through the web-based archives.
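
    The two possible Hawaii times arise because Hawaii does not observe daylight saving time. The sketch below illustrates this with fixed UTC offsets (EST = UTC-5, EDT = UTC-4, HST = UTC-10). It assumes the 21:00 copy time is local Cambridge time all year round, and the calendar date used is arbitrary.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets; Hawaii stays on UTC-10 all year.
HST = timezone(timedelta(hours=-10), "HST")
EASTERN = {
    "EST": timezone(timedelta(hours=-5), "EST"),  # winter
    "EDT": timezone(timedelta(hours=-4), "EDT"),  # summer
}


def copy_time_in_hst(zone_name):
    """Convert the 21:00 Eastern copy time to Hawaii Standard Time."""
    # Assumption: the nightly copy starts at 21:00 local Eastern time.
    local = datetime(2020, 1, 15, 21, 0, tzinfo=EASTERN[zone_name])
    return local.astimezone(HST)


for zone in EASTERN:
    print(zone, "21:00 ->", copy_time_in_hst(zone).strftime("%H:%M %Z"))
```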

  5. Is there a quick way to get all the observed data for my project?

    Yes. Both the Proprietary Archive and the main Science Archive now support searching by project code. The Proprietary Archive also allows PIs/authorized users to request rebinned data.

  6. Why do I see multiple data files for the same target on the same UT date?

    Computer issues during observing may have resulted in the creation of a new data file in the middle of a track. Check the observing reports (which will be the same for each if the files are associated with the same project), but it is probable that these files can be concatenated. In rare cases there may be 20+ data files created during an observation, although many of these will not be long enough to contain useful data.

  7. Can I find out what happened during the observation of this data?

    You can view the observing report for non-proprietary data. This will give full details of priming (usually) and the time-line of the observation itself. RTDC versions of the observing reports are only generated for non-proprietary data. Proprietary observing reports can be viewed from your project page in the SMA Observer Center.

[Back to the list]


Data Format

  1. What is the difference between a chunk and a spectral window?

    In each case these terms refer to spectral regions dealt with independently by the correlator. The different terms refer to the different correlators: for SWARM they are called chunks and are numbered 1-4, for ASIC they are called spectral windows and there were either 24 or 48 of them.

  2. What is the resolution of these data?

    SWARM data has a fixed frequency resolution of 140kHz. This converts to approximate velocity resolutions of 0.19km/s @ 225GHz, 0.12km/s @ 345GHz, and 0.065km/s @ 650GHz. ASIC data is not so predictable, as it allowed a different number of channels to be set for every spectral band. SWARM file sizes can be huge so it is likely you will want to rebin your data; ASIC file sizes are very small in comparison so rebinning is unnecessary.

    In MIR you can find the resolution for ASIC data with

    IDL> print, sp[0:48].fres

    This reports the frequency resolution in MHz for each spectral window. The first number reported (0) is the continuum channel. You can also print sp[0:48].nch which gives the number of channels.
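
    The SWARM velocity resolutions quoted above follow from dv = c * df / f. The sketch below reproduces them and shows the effect of rebinning. The function name is illustrative, and the rebin argument assumes that averaging N adjacent channels simply widens the channel by a factor of N.

```python
C_KM_S = 299792.458          # speed of light in km/s
SWARM_CHANNEL_KHZ = 140.0    # fixed SWARM channel width quoted in this FAQ


def velocity_resolution_kms(freq_ghz, channel_khz=SWARM_CHANNEL_KHZ, rebin=1):
    """Velocity width of one channel: dv = c * df / f.

    rebin is the number of adjacent channels averaged together
    (assumed to multiply the channel width by the same factor).
    """
    df_hz = channel_khz * 1e3 * rebin
    return C_KM_S * df_hz / (freq_ghz * 1e9)


for freq in (225.0, 345.0, 650.0):
    print(f"{freq:.0f} GHz: {velocity_resolution_kms(freq):.3f} km/s per channel")

# Averaging 4 channels at 345 GHz gives channels 4 times wider in velocity.
print(f"345 GHz, rebin=4: {velocity_resolution_kms(345.0, rebin=4):.3f} km/s")
```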

  3. What will be the size of the final map?

    For a single pointing (a single pair of coordinates is observed for the whole track), the size of the final processed map is set by the beam size of a single SMA dish. This is ~40'' at 345GHz and ~55'' at 230GHz.

    The resolution of the data is determined by the synthesized beam, which is primarily dependent on the array configuration.

  4. What are all the files in the data directory?

    When taking interferometer data, the SMA records data in a set of files all contained in a single directory. You can find a description of each file here.

  5. Why does this raw data file have multiple science sources in it?

    All observations include the necessary calibrators along with a science target. Some data will cycle between a number of science targets over the course of the observation. This results in a single raw data file being associated with a number of different sources. You can extract your source of interest during data reduction.

  6. What is the continuum channel?

    The instruments are spectrometers, not continuum cameras, therefore pure continuum data is not being recorded. Instead the continuum is generated by compressing all the spectral channels into a single value for each pixel. This is why it is sometimes referred to as the pseudo-continuum channel. Given the wide bandwidth of SWARM this is an excellent alternative to a continuum instrument.

  7. Can I find out whether this data is SWARM or ASIC?

    You can run SMARechunker in list rather than rebin mode. This will report the number of scans and give the chunk numbers. If it reports c1, s1, s2, s3, s4 for the chunks then you only have SWARM data. If the numbering is s01-s48, these are the ASIC bands. Before ASIC was switched off, SWARM chunks were added to the end of the list, hence any bands above number s48 are SWARM. Note that although the SWARM bands are appended to the numbering sequence, they overlap with the ASIC bands in frequency.

    SMARechunker -i /sma/data/science/mir_data.2015/151230_04:37:57/ -L
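
    The labelling rule above can be written down as a small check. This sketch takes a list of chunk labels that you have copied from the listing by hand; it does not parse SMARechunker output, whose exact layout is not assumed here, and the function name is hypothetical.

```python
import re


def classify_correlator(chunk_labels):
    """Guess the correlator(s) from labels such as 'c1', 's1', 's01', 's49'.

    Rule taken from the text: only s1-s4 means SWARM only, numbering up
    to s48 means ASIC, and anything above s48 is SWARM appended to ASIC.
    """
    numbers = []
    for label in chunk_labels:
        match = re.fullmatch(r"s(\d+)", label)
        if match:  # 'c1' is the continuum channel and is skipped
            numbers.append(int(match.group(1)))
    if not numbers:
        raise ValueError("no spectral chunk labels found")
    highest = max(numbers)
    if highest <= 4:
        return "SWARM only"
    if highest <= 48:
        return "ASIC only"
    return "ASIC and SWARM"


print(classify_correlator(["c1", "s1", "s2", "s3", "s4"]))
```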

  8. Can I find out what sources are in this data set from the command line?

    You can use the command 'whatishere' available on RTDC machines. This lists the sources with their scan numbers, frequency, and number of baselines.

    whatishere /sma/data/science/mir_data.2015/151230_04:37:57/

  9. How can I tell if this is polarimetry data?

    You can check the bl.ipol value in MIR. You can check how many different polarization states are present by printing the different values that were recorded.

    IDL> result=uti_distinct(bl[pbf].ipol,npol,/many)
    IDL> print,'npol = ',npol

    If npol=4, the data are full (dual) polarization data, which need special calibration. If npol=1 then it is non-polarization data and you can do normal calibration. There may be in-between cases though, so as a rule of thumb, any data with npol<4 can be treated as regular (non-polarization) data. You can see the values printed for each integration by simply typing

    IDL> bl[pbf].ipol

[Back to the list]


Data Processing

  1. What is the best way to reduce SMA data?

    The best supported and documented way is to use the IDL package MIR (Millimeter Interferometer Reduction) for calibration. You can find instructions here. For imaging, MIRIAD or CASA are recommended. Find more information at An Overview of SMA Data Reduction.

  2. What about updating baselines?

    After an array configuration change (e.g. from compact to extended) a 'baseline track' must be run to get accurate new antenna positions. Once obtained, these positions are applied to future science tracks (as the antennas file in the science directory), but the updated file is not applied retroactively to earlier tracks.

    You can check which data needs a corrected antenna file applied by checking Updating SMA Baselines. Apply the correction in MIR with a single command.

    select,/pos,/res
    sma_cal_bas
        Enter the current ANTENNAS file:
        Enter the new ANTENNAS file: 
    

    For MIRIAD see Correction for uvw coordinates.

    Uncorrected baseline positions will show the trend in phase seen in the plot below. To generate this plot, the phase-based gain calibration (IDL: gain_cal, cal_type='pha') was applied, then the gain calibrators selected. This pattern will be more pronounced on some antenna pairs than others. Although the two fits don't look wildly different, a variation of 50 degrees is considerable.

  3. I have polarization data; how do I reduce it?

    Polarization data is difficult to reduce and no tutorials are provided at this time. You can get specialized help by emailing smarequester@cfa.harvard.edu.

  4. How can I rebin these data?

    We have a program called SMARechunker which rebins your data by a given factor. There are a number of options allowing you to select specific scan/chunk ranges. See Reducing the Size of your Data for details.

    You can download the script from GitHub:
    git clone https://github.com/kenyoung/SMARechunker

  5. Why did SMARechunker crash?

    You might see an error like this:
    Output buffer overflow (2): max: 1571640, now: 1636824 - abort

    If your data file contains any ASIC data, the -A flag must be included. For a few months there was overlap, so many files from 2016 contain both ASIC and SWARM data. You can check this by running SMARechunker with the -L flag, which lists the spectral windows; a SWARM-only file will contain only s1-4.

  6. How can I stop sma2casa thinking SWARM data is old format?

    All SWARM data is in the new file format. Occasionally, however, sma2casa.py gets confused and crashes when trying to process it as the old file format. You can force sma2casa.py to process data as a new-format file by editing the script. Replace the first block of code below with the second.

        f = open(dataDir+'/in_read', 'rb')
        data = f.read()
        firstInt = makeInt(data[0:], 4)
        if firstInt != 0:
            newFormat = True
        else:
            newFormat = False
        f.close()

        f = open(dataDir+'/in_read', 'rb')
        data = f.read()
        firstInt = makeInt(data[0:], 4)
        if firstInt != 0:
            newFormat = True
        else:
            newFormat = False
        newFormat = True        # This is the new line
        f.close()

    Likewise, in MIR you can load data while forcing the new file format flag with

    IDL> readdata,dir='160703_04:43:55', /newformat

  7. Where can I find the passband calibrator? I don't see it in the data I downloaded.

    Sometimes the bandpass observation was shared with the other science script run that night and may fall inside the other project's data file. You can check the observing report to confirm this was the case. If so, you will need to download the other data file (look for a file with the same UT date) and extract the bandpass source from it. The file names reflect the observation date and start time, so just input the date in question to the webform (e.g. 140101-140101). Please email holly.thomas@cfa.harvard.edu if you have problems or if the data in question is proprietary.
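
    To build the date range for a track you already have, its directory name can be decoded as follows. This is a sketch that assumes the YYMMDD_HH:MM:SS naming seen in the examples on this page (e.g. 151230_04:37:57) and that the times are UT; the function names are illustrative.

```python
from datetime import datetime


def parse_track_name(dir_name):
    """Decode a data directory name such as '151230_04:37:57'."""
    # Assumed layout: two-digit year, month, day, then the start time.
    return datetime.strptime(dir_name.strip("/"), "%y%m%d_%H:%M:%S")


def same_date_query(dir_name):
    """Build a 'YYMMDD-YYMMDD' range covering only the track's own date."""
    day = parse_track_name(dir_name).strftime("%y%m%d")
    return f"{day}-{day}"


start = parse_track_name("151230_04:37:57")
print("Track started:", start.isoformat())
print("Search range: ", same_date_query("151230_04:37:57"))
```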

  8. How do I choose a reference antenna?

    Choose any antenna which covers the full observing time range and that looks stable. It is best to choose one from the center of the array (depending on the configuration). If you have login access to the SMAOC you can find this out from the Array Configuration History page, or else you can see which antenna has the 0,0 position in the antennas file in the data directory.

  9. How can I combine two files so I can use the passband data from a different file?

    In MIR you can combine the two files using sma_dat_merge. You will be prompted for the two files in MIR format, so you will first have to read in (readdata) and then save (mir_save) each file. See an example here.

  10. Is a single pointing for the whole track sufficient?

    Yes. Pointing once a track is normal. SMA pointing is stable throughout a given night with respect to the beam size.

  11. Why do I need to regenerate the continuum channel?

    When you collapse the spectral channels you include any noise or spikes, which will throw off your continuum value. It is especially important to mask or throw out noisy parts of the spectrum where the underlying spectral baseline is not around 0. For the SMA this may be a full chunk/spectral window which has different characteristics and responses to the rest.
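
    The effect is easy to see with a toy example. The sketch below averages the channels of every chunk into one pseudo-continuum value, with and without a chunk that contains a spike. It is a plain unweighted mean written for illustration only, not the MIR implementation, and the chunk names and values are made up.

```python
def pseudo_continuum(chunks, exclude=()):
    """Average complex visibilities over all channels of the kept chunks.

    chunks maps a chunk name to a list of per-channel visibilities;
    exclude lists chunk names to leave out of the average.
    """
    kept = [vis for name, channels in chunks.items()
            if name not in exclude
            for vis in channels]
    if not kept:
        raise ValueError("all chunks were excluded")
    return sum(kept) / len(kept)


# Made-up data: chunk s3 contains a spike in its first channel.
chunks = {
    "s1": [1 + 0j, 1 + 0j],
    "s2": [1 + 0j, 1 + 0j],
    "s3": [9 + 0j, 1 + 0j],
}

print("all chunks:   ", pseudo_continuum(chunks))
print("s3 left out:  ", pseudo_continuum(chunks, exclude={"s3"}))
```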

  12. My flux (or gain) calibration fails, why?

    You may have insufficient signal-to-noise. Try choosing only the best behaving chunks and regenerating the continuum.

  13. What should the phase look like?

    Phases should be around zero for a point source. That they aren't is due to instrumental and atmospheric effects. Occasionally phases for point sources may look very scattered. This may be because the source was at very low elevation, or the weather was changing. System temperature correction, bandpass and gain calibration correct for these effects along with instrumental ones.

    For extended sources you would not expect to see any phase coherence in the raw data.

  14. What is a phase jump and what should I do about it?

    A phase jump can happen at any point during a track. However it can only be seen during calibration by looking at your reference quasar as it has a strong enough real-time signal (science targets generally being too weak). Looking at the plots you will see the phase change by a large amount (typically >50-70°).

    Sometimes this happens from one scan to the next whilst observing the quasar. In this case it is possible to set the time resolution to be very short when fitting the polynomial and fit right through it.

    More commonly the phase jump occurs at some unknown time while observing the science target. Here you only notice the jump between the quasar observations before and after. As you cannot be sure when it occurred, all science data in between the quasar observations must be flagged.
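
    If you have the calibrator phases as a list of numbers, candidate jumps can be flagged automatically. The sketch below compares consecutive quasar phases on one baseline, handling the wrap at +/-180 degrees. The 60 degree threshold is an assumption chosen from the 50-70 degree range mentioned above, and the function names are illustrative; always confirm against the plots.

```python
def wrapped_difference(a_deg, b_deg):
    """Signed difference b - a, wrapped into the range [-180, 180) degrees."""
    return (b_deg - a_deg + 180.0) % 360.0 - 180.0


def find_phase_jumps(phases_deg, threshold_deg=60.0):
    """Return indices i where the phase changed by more than the threshold
    between calibrator measurement i-1 and measurement i."""
    jumps = []
    for i in range(1, len(phases_deg)):
        step = wrapped_difference(phases_deg[i - 1], phases_deg[i])
        if abs(step) > threshold_deg:
            jumps.append(i)
    return jumps


# Made-up calibrator phases (degrees) for one baseline.
phases = [10.0, 15.0, 12.0, 95.0, 100.0, 98.0]
for i in find_phase_jumps(phases):
    print(f"jump between calibrator measurements {i - 1} and {i}")
```

    Any science data taken between the two calibrator measurements either side of a reported jump would then be a candidate for flagging.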

  15. Does it matter if my bandpass calibrator is before or after my science track?

    No, as long as there is enough integration time (typically >45 minutes) to get sufficient signal-to-noise. The system does change slightly through the night but the effect is very small.

  16. Do I have to do the full flux calibration?

    Yes. Reference quasars do not have a stable enough flux to calibrate on.

  17. All the recorded Tsys values equal zero. What should I do?

    Some scans can get mislabeled as the wrong receiver which leads to tracks suffering from problems loading the Tsys. The solution is to identify the mislabeled data, manually set the label to the correct code, then reload the Tsys values for the whole dataset.

    • First figure out the receiver headers in the data. You can find the receiver header index by
      IDL> print, c.rec
      230 345 400 240

    • Then check which receivers it thinks are there.
      IDL> print,uti_distinct(bl.irec,nrec,/many)
      0 2 3

      The code means 0 = rx230, 1 = rx345, 2 = rx400, and 3 = rx240. In this case the data was taken with a 230GHz/240GHz receiver configuration so any data with code 2 is mislabelled. Note that it is possible to see data mislabelled as code = -1.

    • The mislabelled data usually comes from a single receiver (although that is hard to verify). Nominally the value returned for the mislabeled data is the difference of the two headers with correctly labeled data; however, this is not always the case, due to internal flagging during data acquisition affecting each receiver differently.

    • To check which insert (0, 1, 2 or 3) it should be, count the scans for each receiver and see which one is missing data. E.g.
      IDL> result=dat_filter(s_f,'"irec" eq "0"',/reset)
      283310 passed in filter
      IDL> result=dat_filter(s_f,'"irec" eq "2"',/reset)
      630 passed in filter
      IDL> result=dat_filter(s_f,'"irec" eq "3"',/reset)
      282640 passed in filter

    • Here it looks like the code 2 data must belong to rx240 (code 3). To rectify this, select the code 2 data, set it to code 3, reselect all the data, then re-read the Tsys.
      IDL> result=dat_filter(s_f,'"irec" eq "2"',/reset)
      IDL> bl[pbf].irec=3
      IDL> select,/res
      IDL> readtsys2

    • You may end up with very high Tsys values which will need flagging. The steps below flag Tsys values above 1200K.
      IDL> result=dat_filter(s_f, ' "tssb" gt "1200" ',/reset)
      430 passed in filter
      IDL> flag,/flag
      IDL> select,/pos,/res
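
    The reasoning used above to decide where the stray records belong can be sketched as follows. Given the record counts reported by dat_filter and the receiver codes you expect for the track, it assigns any unexpected code to the expected receiver with the fewest records. This is only a heuristic with a hypothetical function name; as noted above, the counts do not always match exactly, so check the result by eye.

```python
RECEIVER_CODES = {0: "rx230", 1: "rx345", 2: "rx400", 3: "rx240"}


def suggest_relabel(counts, expected_codes):
    """Map each unexpected receiver code to the expected code that is
    missing the most data (i.e. has the fewest records)."""
    stray = sorted(code for code in counts if code not in expected_codes)
    target = min(expected_codes, key=lambda code: counts.get(code, 0))
    return {code: target for code in stray}


# Counts from the example above; the track used the 230/240 receivers.
counts = {0: 283310, 2: 630, 3: 282640}
for wrong, right in suggest_relabel(counts, expected_codes=[0, 3]).items():
    print(f"code {wrong} ({RECEIVER_CODES[wrong]}) -> "
          f"code {right} ({RECEIVER_CODES[right]})")
```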

  18. MIR gives me the message 'Two antennas on the same pad !' What should I do?

    The antennas file is loaded to check the uvw coordinates calculation in the data header. This message implies the antennas file is corrupted, although it can arise when one of the antennas in question is in the hangar. If you open the file you will see two antenna numbers at the same x,y coordinates. You can choose to ignore this, although if the original uvw calculation was wrong, it will not be fixed. This happens only very occasionally, and it can be corrected and the data fixed: send an email requesting the correct antennas file.


[Back to the list]

CENTER FOR ASTROPHYSICS | HARVARD & SMITHSONIAN
60 GARDEN STREET, CAMBRIDGE, MA 02138