
Reducing the Size of your Data


Using SMARechunker

Warning (12 Feb 2019): A bug has been found in SMARechunker. If a range of channels is selected (Example 2 below), the central velocity of the band is not properly recalculated. Note that in CASA the central velocity of the band is an important parameter for tasks involving resampling and velocity corrections. Data for which the full spectral window is kept are not affected. Updates will be posted here.

SMARechunker is a dedicated SMA script that can be used to:

(1) Rechunk (rebin) raw SMA data.
(2) Extract a channel range from the spectra of raw SMA data.

Because the SWARM correlator has a fixed bandwidth and resolution, file sizes in excess of 100 GB are now common (see these plots). Users often wish to shrink their data set to reduce memory requirements during reduction. SMARechunker lets users rechunk (rebin) the full bandwidth and/or extract a small window of the spectrum by specifying a channel range.
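
As a rough guide to how much a given rebin factor helps, the sketch below assumes that the output size scales inversely with the rebin factor. This ignores headers and other bookkeeping data that do not shrink, so treat the result as a lower bound. The function name estimate_rebinned_gb is hypothetical and is not part of SMARechunker.

```shell
# Rough size estimate after rebinning (hypothetical helper, not part of SMARechunker).
# Assumption: spectral data dominate the file, so the size scales as 1/factor.
estimate_rebinned_gb() {
    size_gb=$1   # input data set size in GB
    factor=$2    # rebin factor, e.g. 8
    awk -v s="$size_gb" -v f="$factor" 'BEGIN { printf "%.1f\n", s / f }'
}

estimate_rebinned_gb 100 8   # prints 12.5
```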

Where to find it

Internal users can find it installed on all machines.

  • RTDC - simply type SMARechunker
  • CF - find it at /sma/bin/SMARechunker
  • SMA Hilo - find it at /application/bin/SMARechunker

External users can find SMARechunker on GitHub.

The git repository contains a Makefile, so you should only need to type make to build the executable. Note that this program has only been tested on 64-bit Linux distributions and is not expected to work on a 32-bit distribution.
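
Before building, it may be worth confirming that the machine matches the tested platform. The check below is a minimal sketch: is_supported_platform is a hypothetical helper, and it only compares the kernel name and word size reported by the standard uname and getconf utilities.

```shell
# Pre-build platform check (hypothetical helper).
# The 64-bit Linux requirement comes from the note above.
is_supported_platform() {
    os=$1     # kernel name, e.g. the output of `uname -s`
    bits=$2   # word size, e.g. the output of `getconf LONG_BIT`
    [ "$os" = "Linux" ] && [ "$bits" = "64" ]
}

if is_supported_platform "$(uname -s)" "$(getconf LONG_BIT)"; then
    echo "64-bit Linux detected: run make in the repository directory"
else
    echo "Untested platform: SMARechunker has only been tested on 64-bit Linux"
fi
```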


How to run it

The required flags are -i (input) and -o (output), along with the rebin information. If your track contains any ASIC data, you must also include the -A flag.

Example 1) Rebinning a SWARM dataset by a constant factor of 8

$ SMARechunker -i /sma/data/science/mir_data/170101_01:02:03 -o 170101_rebin8 -r 8

This example rebins all chunks/spectral bands by the same factor using the -r flag.
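
When several tracks need the same treatment, the Example 1 command can be generated in a loop. The sketch below only prints the commands (a dry run) and uses just the -i, -o and -r flags shown above. The helper name rebin_cmd, the output naming pattern (copied from Example 1) and the second track path are assumptions made for illustration.

```shell
# Build the Example 1 command line for a given track and rebin factor.
# The output name <date>_rebin<factor> copies the pattern in Example 1;
# it is a naming convention only, not something SMARechunker requires.
rebin_cmd() {
    input=$1                     # path to the raw track directory
    factor=$2                    # constant rebin factor for all chunks
    name=$(basename "$input")    # e.g. 170101_01:02:03
    track_date=${name%%_*}       # part before the first underscore, e.g. 170101
    echo "SMARechunker -i $input -o ${track_date}_rebin${factor} -r $factor"
}

# Dry run: print the commands without executing them.
# The second path is a made-up placeholder.
for track in /sma/data/science/mir_data/170101_01:02:03 \
             /sma/data/science/mir_data/170102_04:05:06; do
    rebin_cmd "$track" 8
done
```

Once the printed commands look right, they can be pasted into the terminal or piped to sh.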

Example 2) Selective rebinning based on chunk and channel number (see warning at top of the page)

This example rebins chunk 1 by a factor of 4 between channels 2500 and 3500, and by a factor of 64 across the whole chunk (channels 0 to 16385).

$ SMARechunker -d -i 170101_01:02:03 -o 170101bin4_64 1:2500:3500:4 1:0:16385:64

The -d option means that the rebinned sections are not extracted but are appended to the input file. As a result, the output file, 170101bin4_64, will have a total of 6 chunks: s1-s4 = original raw unbinned chunks; s5 = the channels rebinned by 4; s6 = the channels rebinned by 64.
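
Each selection in Example 2 appears to follow the pattern chunk:first:last:factor. A quick sanity check before launching a long job can catch malformed selections. The sketch below is a hypothetical helper (check_spec is not part of SMARechunker, which may apply its own checks); it only verifies that there are four non-negative integer fields, that the first channel does not exceed the last, and that the factor is at least 1.

```shell
# Sanity-check a selection of the form chunk:first:last:factor
# (format inferred from Example 2; hypothetical helper).
check_spec() {
    IFS=: read -r chunk first last factor extra <<EOF
$1
EOF
    # Reject anything with more than four fields.
    [ -z "$extra" ] || return 1
    # Every field must be a non-empty string of digits.
    for field in "$chunk" "$first" "$last" "$factor"; do
        case $field in
            ''|*[!0-9]*) return 1 ;;
        esac
    done
    # The channel range must be ordered and the factor at least 1.
    [ "$first" -le "$last" ] && [ "$factor" -ge 1 ]
}

for spec in 1:2500:3500:4 1:0:16385:64 1:3500:2500:4; do
    if check_spec "$spec"; then
        echo "$spec ok"
    else
        echo "$spec rejected"
    fi
done
```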

You can also select based on time by using the -f and -l flags for first and last scan numbers.


Detailed instructions

SMARechunker offers many options for tailoring the rebinning. You can find detailed instructions at Using SMARechunker.

Alternatives

File size is critical if you plan to use MIR for calibration. MIR typically requires 2.5-3 times the file size in memory. If you do not use SMARechunker, you can reduce the size of the data being read into MIR by selecting sub-sections of the data set. You can choose to read in just the data from a single receiver, a single sideband, or even a single chunk. This may be appealing for Galactic targets with narrow emission lines.
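
The memory rule of thumb above can be turned into a quick estimate. The sketch below is a hypothetical helper (mir_memory_range is not part of MIR or SMARechunker) that multiplies a file size by 2.5 and by 3 to give the expected range in the same unit.

```shell
# Memory range MIR is expected to need, using the 2.5-3x rule of thumb above.
# Hypothetical helper; prints "low high" in the same unit as the input.
mir_memory_range() {
    awk -v s="$1" 'BEGIN { printf "%.1f %.1f\n", 2.5 * s, 3 * s }'
}

mir_memory_range 100   # 100 GB data set: prints 250.0 300.0
mir_memory_range 20    # 20 GB subset:    prints 50.0 60.0
```

Comparing the result with the memory available on the machine (for example, as reported by free -g on Linux) indicates whether rebinning or reading in a sub-section is needed.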


CENTER FOR ASTROPHYSICS | HARVARD & SMITHSONIAN
60 GARDEN STREET, CAMBRIDGE, MA 02138