Running CASA: Timing Tests
A series of tests designed to find the optimal way to run CASA on the RTDC machines.
The tests run tclean on a mosaicked dataset (for the parallel option, this includes the partition task). tclean is the CASA task that gains the most from parallelization. Unless noted otherwise, tests were run using CASA 5.1.2.
- Massive time gains from running tclean in parallel. Even 3 cores is ~3x faster than serial processing.
- Significant gains (~50%) from using machines with faster processors (actually a combination of CPU and disk performance).
- No significant gain from using more than 7 cores.
- Better to use fewer, unshared cores than more, shared cores.
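The speedup claims above can be checked against the first-run timings in the table. The following is a minimal Python sketch, not part of the original tests: the function names are illustrative, and it assumes the "# Cores" column counts the processes doing work. It uses the 3h50m serial run on local disk as the baseline.

```python
# Wall-clock times in minutes for the first run of each configuration
# (6.4G dataset, local disk, RTDC8, blockdev=6496), copied from the table.
SERIAL_MIN = 3 * 60 + 50  # 3h50m serial run

# Assumption: "# Cores" is the number of processes doing work.
parallel_min = {3: 85, 5: 57, 7: 42, 9: 45}


def speedup(serial, parallel):
    """Ratio of serial to parallel wall-clock time."""
    return float(serial) / parallel


def summarize(serial, timings):
    """Return (cores, speedup, speedup per core) tuples sorted by core count."""
    rows = []
    for cores in sorted(timings):
        s = speedup(serial, timings[cores])
        rows.append((cores, round(s, 2), round(s / cores, 2)))
    return rows


for cores, s, per_core in summarize(SERIAL_MIN, parallel_min):
    print("{} cores: {:.2f}x faster, {:.2f} speedup per core".format(cores, s, per_core))
```

Speedup per core falls as cores are added, and the 9-core first run is no faster than the 7-core one, which matches the conclusion that more than 7 cores brings no significant gain. Repeat runs differ by several minutes, so these ratios are approximate.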
|Data size||Disk type||Machine||Parallel?||# Cores||Other parameters||Time||Comments|
|6.4G ||NFS ||RTDC8 ||N ||- ||blockdev=6496 ||4h16m ||Data read from rglinux13 |
|6.4G ||local ||RTDC8 ||N ||- ||blockdev=6496 ||3h50m ||Local disks definitely better |
|6.4G ||local ||RTDC8 ||Y ||9 ||blockdev=6496 ||45m ||Repeated and it completed in 36m |
|6.4G ||local ||RTDC8 ||Y ||7 ||blockdev=6496 ||42m ||Repeated and it completed in 38m & 43m. Only a couple of minutes separate 7 and 9 cores. |
|6.4G ||local ||RTDC8 ||Y ||7 ||blockdev=250 ||67m ||Confirms that blockdev makes a significant difference |
|6.4G ||local ||RTDC8 ||Y ||5 ||blockdev=6496 ||57m ||Fewer cores = longer. Repeated and it completed in 47m. |
|6.4G ||local ||RTDC8 ||Y ||3 ||blockdev=6496 ||85m ||Fewer cores = longer. As expected. Repeated and it completed in 73m. |
|6.4G ||local ||RTDC8 ||N ||- ||blockdev=6496, mstransform[1, 256, 54] ||3h3m ||Saving 20% on serial processing time using mstransform to trim |
|6.4G ||local ||RTDC8 ||Y ||7 ||blockdev=6496, mstransform[1, 256, 54] ||39m ||No time gains |
|6.4G ||NFS ||RTDC8 ||Y ||7 ||blockdev=6496 ||56m ||Data read from NFS mounted disk (rglinux13) & written locally. |
|6.4G ||NFS ||RTDC8 ||Y ||7 ||blockdev=6496 ||78m ||All work done on NFS mounted disk (rtdc9 mounted on rtdc8) |
|6.4G ||NFS ||RTDC8 ||Y ||7 ||blockdev=6496 ||94m ||All work done on NFS mounted disk (rglinux13 mounted on rtdc8) |
|6.4G ||NFS ||RTDC8 ||Y ||7 ||blockdev=6496 ||58m ||All work done on local RTDC8 disk that is mounted as NFS at mount point. |
|6.4G ||NFS ||RTDC8 ||Y ||7 ||blockdev=6496 ||56m ||Local disks on rtdc8 mounted as NFS |
|6.4G ||local ||RGLINUX13 ||Y ||7 ||blockdev=6496, CPU speed 2.9GHz ||78m || |
|13G ||local ||RTDC8 ||Y ||5 ||blockdev=6496 ||57m ||Compare to 57 mins for half the data |
|13G ||local ||RTDC8 ||Y ||9 ||blockdev=6496 ||54m ||Compare to 42 mins for half the data |
|13G ||local ||RTDC8 ||Y ||13 ||blockdev=6496 ||48m || |
|13G ||local ||RTDC8 ||Y ||15x2 ||blockdev=6496 ||101m & 102m ||Set two identical scripts running simultaneously. |
|13G ||local ||RTDC8 ||Y ||7x2 ||blockdev=6496 ||79m & 77m ||Better to run with fewer unshared cores than with more cores that are shared. |
|6.4G ||local ||RTDC8 ||Y ||5 ||blockdev=6496 ||110m ||Re-run with CASA 5.3.0 |
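One way to read the simultaneous-run rows is as aggregate throughput: total data imaged divided by the time until the slower of the two jobs finishes. The sketch below is illustrative Python, not part of the original tests; the helper name is hypothetical, "G" is assumed to mean gigabytes, and the timings are copied from the 13G rows.

```python
DATA_GB = 13  # size of one dataset, from the "Data size" column (assumed GB)

# (label, number of simultaneous jobs, wall-clock minutes of each job)
runs = [
    ("1 job x 13 cores", 1, [48]),
    ("2 jobs x 15 cores", 2, [101, 102]),
    ("2 jobs x 7 cores", 2, [79, 77]),
]


def throughput_gb_per_hour(n_jobs, minutes, data_gb=DATA_GB):
    """Total data imaged divided by the time until the slowest job ends."""
    return n_jobs * data_gb * 60.0 / max(minutes)


for label, n_jobs, minutes in runs:
    rate = throughput_gb_per_hour(n_jobs, minutes)
    print("{}: {:.1f} GB/h".format(label, rate))
```

On these numbers the two 7-core jobs give roughly 20 GB/h, against about 15 GB/h for two 15-core jobs and about 16 GB/h for a single 13-core job. The table does not state how many physical cores RTDC8 has, so this only restates the measured timings.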