Running CASA in Parallel Mode
Before proceeding, be aware of the known issues with parallelized tclean, described in "Checking the output of parallelized tclean with CASA 5.1.2" and "Checking the output of parallelized tclean with CASA 5.3.0".
Running in parallel involves three steps.
(1) Initialize CASA with mpicasa instead of regular casa.
(2) Divide your data into multiple MSs using the partition task.
(3) Call your task with parallel=True.
When CASA has been started with mpicasa and a task is called with parallel=True, CASA automatically looks for the sub-MSs and processes them in parallel.
- First create a CASA script (mytclean.py in the example below) which partitions your data (fulldata.ms), then runs tclean.
partition(vis='fulldata.ms', outputvis='input.ms', numsubms='auto', separationaxis='auto')
tclean(vis='input.ms', selectdata=True,imsize=[1600, 2560],cell="0.3arcsec",start="230km/s",
For partition, the parameter numsubms is the number of sub-MSs to create; leave it as auto and it will match the number of cores you specify when starting your script. Leaving separationaxis at auto is also recommended, as partition will then divide your data in the way that best balances the load across the processors.
For tclean, it is important to set parallel=True and restart=True, and to set vis to the outputvis name given to the partition task.
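Before partitioning, it can help to confirm that the paths named in the script are usable. The following is a minimal sketch in plain Python, not part of CASA: the helper name check_partition_paths is hypothetical, and it assumes that partition will not overwrite an existing outputvis, which you should verify for your CASA version.

```python
import os


def check_partition_paths(vis, outputvis):
    """Return a list of problems likely to stop partition from running.

    An empty list means both paths look usable.
    """
    problems = []
    # A Measurement Set is stored as a directory, so the input must exist as one.
    if not os.path.isdir(vis):
        problems.append("input MS not found: " + vis)
    # Assumption: partition does not overwrite an existing output.
    if os.path.exists(outputvis):
        problems.append("output already exists: " + outputvis)
    return problems
```

Calling check_partition_paths('fulldata.ms', 'input.ms') at the top of mytclean.py and stopping if the returned list is not empty avoids starting a multi-core run that would fail immediately.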
- Next run your script with mpicasa. The -n flag specifies the number of cores.
> setenv CASAPATH /opt/casa-release-5.3.0-143.el6/bin
> $CASAPATH/mpicasa -n 3 $CASAPATH/casa --nogui -c $PWD/mytclean.py
If running parallel CASA on the RTDC, do not use more than 3 cores.
This is sufficient to get a speed-up without significantly slowing down the machine for other users.
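If you launch runs from a wrapper script, the command line above can be assembled programmatically while enforcing the three-core limit. This is a rough sketch in plain Python: build_mpicasa_command and RTDC_MAX_CORES are hypothetical names introduced here, and the argument order simply mirrors the mpicasa command shown above.

```python
import os

RTDC_MAX_CORES = 3  # core limit stated above for the shared RTDC machine


def build_mpicasa_command(casa_bin, script, ncores, max_cores=RTDC_MAX_CORES):
    """Return the mpicasa command line as a list of arguments."""
    if ncores < 1:
        raise ValueError("ncores must be at least 1")
    if ncores > max_cores:
        raise ValueError("ncores=%d exceeds the limit of %d" % (ncores, max_cores))
    return [
        os.path.join(casa_bin, "mpicasa"), "-n", str(ncores),
        os.path.join(casa_bin, "casa"), "--nogui", "-c",
        os.path.abspath(script),  # same role as $PWD/mytclean.py above
    ]
```

The returned list can be passed to subprocess.call from the standard library. Raising an error rather than silently lowering ncores makes an over-sized request visible to the person submitting it.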