Testing Parallelized CASA on the Hydra Cluster

Progress Report

The Tests

We calibrate the M100 Band 3 Science Verification dataset using an edited version of the Python script alma-m100-analysis-hpc-regression.py, which is included in the (parallelized) CASA tarballs. For comparison, we run the initial tests on rtdc7.

General Notes:

  1. CASA has two parallelization modes depending on the version:
    • Old Parallel (OP) Framework, which makes use of an external cluster configuration file (deprecated in CASA 4.5.0).
    • Message Passing Interface (MPI) Framework (implemented in CASA 4.3.0).
  2. The following python scripts were used:
            m100-casa44-tarball-script.py (CASA 4.4.0).
            m100-pre-casa44-tarball-script.py (pre-CASA 4.4.0).
  3. The scripts were executed with the commands:
            m100-casa44-tarball-script.sh (CASA 4.4.0, uses MPI framework).
            m100-pre-casa44-tarball-script.sh (pre-CASA 4.4.0, uses OP framework).
  4. Steps 0-18 constitute the calibration.
  5. All interactive lines have been removed to enable comparison with hydra batch jobs.
  6. References to ALMA Science Data Model (ASDM) format files have also been removed, since they are not readily available. Before running the script, we rename:
            mv X54.ms X54-monolith.ms
            mv X220.ms X220-monolith.ms
  7. We remove the directories (CASA 4.4.0 and later only):
            \rm -r X54.ms.flagversions
            \rm -r X220.ms.flagversions
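
The preparation in items 6 and 7 can be scripted so that every run starts from the same state. The following is a minimal sketch only: the function name prepare_workdir and its remove_flagversions switch are hypothetical, and it assumes the measurement sets sit directly in the working directory.

```python
import shutil
import tempfile
from pathlib import Path

# Measurement sets named in the general notes above.
DATASETS = ("X54", "X220")


def prepare_workdir(workdir, remove_flagversions=True):
    """Rename the measurement sets and optionally drop the flag versions.

    Returns a list describing what was done; missing paths are skipped.
    """
    workdir = Path(workdir)
    actions = []
    for name in DATASETS:
        src = workdir / f"{name}.ms"
        dst = workdir / f"{name}-monolith.ms"
        if src.exists() and not dst.exists():
            src.rename(dst)  # same effect as: mv X54.ms X54-monolith.ms
            actions.append(f"renamed {src.name} -> {dst.name}")
        flagdir = workdir / f"{name}.ms.flagversions"
        if remove_flagversions and flagdir.is_dir():
            shutil.rmtree(flagdir)  # same effect as: rm -r X54.ms.flagversions
            actions.append(f"removed {flagdir.name}")
    return actions


if __name__ == "__main__":
    # Demonstrate on a throwaway directory with empty stand-in directories.
    with tempfile.TemporaryDirectory() as tmp:
        for name in DATASETS:
            (Path(tmp) / f"{name}.ms").mkdir()
            (Path(tmp) / f"{name}.ms.flagversions").mkdir()
        for line in prepare_workdir(tmp):
            print(line)
```

For the pre-CASA 4.4.0 script, where item 7 does not apply, the same helper could be called with remove_flagversions=False.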

New Results: July 2017

CASA 4.7.2 (and earlier versions) appears to have no real parallelism capability: changing the number of engines (CPUs) makes a negligible difference to the execution time. The dominant factor is the I/O speed of the disk. In the tests below, running CASA on an NFS-mounted disk takes approximately 4 to 5.5 times longer than running it on a local disk.
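
The report does not say how the real, user and system times were recorded; the format suggests the shell's time keyword. As an illustration only, comparable numbers can be gathered from Python on a Unix-like system. The helper names time_command and fmt are hypothetical, and the command being timed is a stand-in, not the actual CASA batch command.

```python
import os
import subprocess
import sys
import time


def time_command(argv):
    """Run argv and return (real, user, system) in seconds for the child."""
    before = os.times()
    start = time.perf_counter()
    subprocess.run(argv, check=True)
    real = time.perf_counter() - start
    after = os.times()
    # children_* accumulate CPU time of child processes that have been waited for.
    user = after.children_user - before.children_user
    system = after.children_system - before.children_system
    return real, user, system


def fmt(seconds):
    """Format seconds in the 'Xm Y.Ys' style used in the tables."""
    minutes, secs = divmod(seconds, 60)
    return f"{int(minutes)}m {secs:.1f}s"


if __name__ == "__main__":
    # Stand-in workload; replace with the real batch command.
    real, user, system = time_command([sys.executable, "-c", "sum(range(10**6))"])
    print("real", fmt(real), "user", fmt(user), "system", fmt(system))
```

User and system times summed over several engines can exceed the real time, which is why the tables list all three.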

CASA 4.7.2 (release date: 3/23/2017)

MACHINE  Details    # Engines  Disk type  real       user       system
RTDC     rtdc7      3          local      40m 48.6s  26m 21s    4m 59s
RTDC     rtdc7      7          local      43m 11s    30m 28s    5m 38s
RTDC     rtdc7      13         local      41m 40s    30m 35s    8m 40s
RTDC     rtdc9      13         nfs        49m 7s     15m 2s     8m 27s
RTDC     rtdc9      3          local      11m 20s    12m 57s    1m 55s
RTDC     rtdc9      13         local      8m 51s     14m 6s     4m 20s
RTDC     rtdc9      15         local      9m 26s     15m 27s    8m 20s
RTDC     rtdc8      3          local      14m 56s    15m 28s    4m 1s
RTDC     rtdc13     13         local      20m 7s     30m 48s    14m 42s
. . . . . . .
hydra    head node  7          nfs        52m 1.2s   32m 18.9s  8m 56.8s
hydra    node 2-9   7          nfs        42m 54s    15m 38s    4m 11s
hydra    node 2-9   7          local      11m 16s    13m 34s    2m 58s
hydra    node 2-9   13         nfs        41m 51s    19m 2s     4m 39s
hydra    node 2-9   13         local      11m 1s     13m 55s    3m 27s
hydra    node 2-9   3          local      14m 22s    13m 23s    2m 25s
hydra    node 0-4   7          nfs        52m 4s     25m 35s    7m 39s
hydra    node 0-4   7          ssd        20m 59s    31m 40s    12m 12s

Notes:
  1. Tests on node 0-4 were run while the machine was already busy with another job.
  2. "ssd" means the test was run from a local SSD disk.
  3. The rtdc7 (7 engines, local) and rtdc8 (3 engines) tests were run while other jobs were running.
  4. By far the most important factor is the I/O speed of the disk (e.g. local vs. NFS-mounted).
  5. The difference between rtdc7 and rtdc9 is RAM (rtdc7 = 48 GB, rtdc9 = 132 GB). Both have 3.5 GHz CPUs.
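
The disk and engine effects can be quantified directly from the real times in the CASA 4.7.2 table above. The sketch below copies a few of those entries; the names REAL_TIMES, to_seconds, nfs_slowdown and engine_speedup are introduced here for illustration only.

```python
import re

# (machine, engines, disk) -> real time, copied from the CASA 4.7.2 table.
REAL_TIMES = {
    ("rtdc9", 13, "nfs"): "49m 7s",
    ("rtdc9", 13, "local"): "8m 51s",
    ("hydra node 2-9", 7, "nfs"): "42m 54s",
    ("hydra node 2-9", 7, "local"): "11m 16s",
    ("hydra node 2-9", 13, "nfs"): "41m 51s",
    ("hydra node 2-9", 13, "local"): "11m 1s",
    ("hydra node 2-9", 3, "local"): "14m 22s",
}

UNITS = {"h": 3600.0, "m": 60.0, "s": 1.0}


def to_seconds(text):
    """Convert a string such as '42m 54s' or '1h 2m 56.8s' to seconds."""
    pairs = re.findall(r"([\d.]+)\s*([hms])", text)
    return sum(float(value) * UNITS[unit] for value, unit in pairs)


def nfs_slowdown(machine, engines):
    """Real time on NFS divided by real time on local disk."""
    nfs = to_seconds(REAL_TIMES[(machine, engines, "nfs")])
    local = to_seconds(REAL_TIMES[(machine, engines, "local")])
    return nfs / local


def engine_speedup(machine, few, many, disk="local"):
    """Real time with `few` engines divided by real time with `many`."""
    slow = to_seconds(REAL_TIMES[(machine, few, disk)])
    fast = to_seconds(REAL_TIMES[(machine, many, disk)])
    return slow / fast


if __name__ == "__main__":
    print(f"rtdc9, 13 engines, nfs/local:      {nfs_slowdown('rtdc9', 13):.2f}")
    print(f"node 2-9, 7 engines, nfs/local:    {nfs_slowdown('hydra node 2-9', 7):.2f}")
    print(f"node 2-9, 13 engines, nfs/local:   {nfs_slowdown('hydra node 2-9', 13):.2f}")
    print(f"node 2-9, local, 3 -> 13 engines:  {engine_speedup('hydra node 2-9', 3, 13):.2f}")
```

With these entries the NFS-to-local ratio comes out at about 5.5 on rtdc9 and about 3.8 on hydra node 2-9, while going from 3 to 13 engines on node 2-9 (local disk) shortens the run by a factor of only about 1.3.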


Historical Results: Sept. 2015

All tests were run with seven engines unless otherwise stated.

CASA 4.4.0 (release date: 6/22/2015)

MACHINE  Details                   CPU (GHz)  RAM (GB)  user         system       real
rtdc7    RTDC                      16x3.5     48.0      37m 30.223s  5m 56.912s   31m 4.924s
. . . . . . .
hydra    head node                 24         ---       44m 13.792s  11m 49.211s  1h 2m 56.820s
"        node 2-9                  64         ---       43m 33.455s  7m 43.456s   58m 1.409s
"        node 2-9 (local disk)     "          ---       37m 50.057s  8m 10.176s   26m 6.772s
"        node 0-9 (with SSD disk)  72         ---       18m 58.938s  5m 4.302s    12m 43.140s

Notes:
  1. We run these tests using the MPI framework (see general notes above).
  2. The hydra tests were run post-upgrade (September, 2015).
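
One rough way to judge how much parallelism a run achieved is the ratio (user + system) / real, i.e. the average number of cores kept busy. The sketch below applies it to the CASA 4.4.0 table above. It assumes that the user and system columns are CPU times summed over all engine processes, which the report does not state; the names ROWS, seconds and busy_cores are hypothetical.

```python
import re

# (label, user, system, real) copied from the CASA 4.4.0 table above.
ROWS = [
    ("rtdc7", "37m 30.223s", "5m 56.912s", "31m 4.924s"),
    ("hydra head node", "44m 13.792s", "11m 49.211s", "1h 2m 56.820s"),
    ("hydra node 2-9", "43m 33.455s", "7m 43.456s", "58m 1.409s"),
    ("hydra node 2-9 (local disk)", "37m 50.057s", "8m 10.176s", "26m 6.772s"),
    ("hydra node 0-9 (with SSD disk)", "18m 58.938s", "5m 4.302s", "12m 43.140s"),
]

_FIELD = re.compile(r"(\d+(?:\.\d+)?)\s*([hms])")
_SCALE = {"h": 3600.0, "m": 60.0, "s": 1.0}


def seconds(text):
    """Convert '1h 2m 56.820s'-style strings to seconds."""
    return sum(float(value) * _SCALE[unit] for value, unit in _FIELD.findall(text))


def busy_cores(user, system, real):
    """Average number of cores kept busy: (user + system) / real."""
    return (seconds(user) + seconds(system)) / seconds(real)


if __name__ == "__main__":
    for label, user, system, real in ROWS:
        print(f"{label:32s} {busy_cores(user, system, real):.2f}")
```

Under that assumption the ratio stays below 2 for every row of the table, even though seven engines were requested, and it is below 1 for the two hydra runs that do not use a local or SSD disk.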

CASA 4.3.0 (release date: 1/12/2015)

MACHINE  Details                         CPU (GHz)  RAM (GB)  user            system       real
rtdc7    RTDC                            16x3.5     48.0      20m 30.728s     11m 9.580s   1h 40m 54.629s
. . . . . . .
hydra    /pool/cluster7 (NetApp disk)    64         256       1h 10m 23.062s  32m 27.517s  3h 23m 5.59s
"        /state/partition1 (local disk)  "          "         1h 14m 15.682s  55m 58.055s  6h 17m 18.93s
"        /pool/cluster7 (NetApp disk)    "          "         1h 10m 3.464s   32m 0.967s   3h 16m 48.75s
"        "                               "          "         1h 13m 18.721s  53m 1.329s   3h 25m 30.87s

Notes:
  1. The CASA 4.3.0 release notes claim:
        "CASA has a new MPI parallelization framework and is currently testing use cases."
    We use the OP Framework here (see general notes above), since the MPI Framework is still in the early stages of testing.
  2. The first three hydra tests employed 7 engines, the last one 14 engines.
  3. The first two hydra tests ran without any memory specification, the last two ran with mem=8192.
  4. The hydra tests were run pre-upgrade (May, 2015).

CASA 4.2.0 (release date: 2/11/2014)

MACHINE  Details                   CPU (GHz)  RAM (GB)  user  system  real
rtdc7    test failed (see report)  ---        ---       ---   ---     ---
. . . . . . .
hydra    did not test              ---        ---       ---   ---     ---

Notes:
  1. The test fails due to a bug in the software. For details, consult:
            The CASA 4.2.0 report
  2. Tests were not attempted on hydra.

CASA 4.1.0 (release date: 7/2/2013)

MACHINE    Details                     CPU (GHz)  RAM (GB)  user            system          real
rtdc7      RTDC                        16x3.5     48.0      18m 30.924s     9m 22.946s      1h 22m 41.580s
rglinux12  "                           8x2.9      "         23m 48.106s     9m 35.289s      1h 11m 56.657s
. . . . . . .
hydra      compute-0-32 (NetApp disk)  64x2.2     256       1h 29m 18.459s  1h 18m 32.504s  4h 46m 53.30s
"          compute-0-32 (local disk)   "          "         1h 30m 2.212s   1h 37m 44.280s  3h 43m 13.11s

Notes:
  1. The RTDC machine, rglinux12, was run for comparison.
  2. The hydra tests were run on a single dedicated machine (compute-0-32.local) as the sole user. Any other way of running them gave far poorer results.
  3. For details about the test and results, consult:
            The CASA 4.1.0 report.



CENTER FOR ASTROPHYSICS | HARVARD & SMITHSONIAN
60 GARDEN STREET, CAMBRIDGE, MA 02138