
Running Jobs on Hydra


■ Getting started

You will need to request an account through CF by filling in the form on this webpage.

Log on to one of the Hydra login nodes (login01 or login02). You can access Hydra directly from the RTDC.

$ ssh -X username@hydra-login01.si.edu

Your home space is intended for small files, e.g. scripts and configuration files. A larger working area is available in /pool/sao/username or /scratch/sao/username; create and run your jobs from there. Note that files in /pool are scrubbed after 180 days, while files in /scratch are scrubbed after 90 days.

You can copy files to and from Hydra in the usual way with rsync or scp.
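
As a minimal sketch of such a transfer, the helper below builds an rsync command from the login host used in the ssh example and the /pool/sao working area. The helper name hydra_push, the DRY_RUN switch, and the directory name mydata are hypothetical; replace username with your own account name.

```shell
#!/bin/sh
# Sketch: copy a local directory into the Hydra working area with rsync.
# The helper and the example directory name are illustrative only.
HYDRA_USER="username"                    # replace with your Hydra username
HYDRA_HOST="hydra-login01.si.edu"        # login node from the ssh example
HYDRA_DIR="/pool/sao/${HYDRA_USER}"      # working area (scrubbed after 180 days)

hydra_push() {
    # $1 = local directory to copy.
    # -a preserves timestamps and permissions, -v lists the files transferred.
    src="$1"
    dest="${HYDRA_USER}@${HYDRA_HOST}:${HYDRA_DIR}/"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Only show what would be run; nothing is copied.
        echo "rsync -av ${src} ${dest}"
    else
        rsync -av "${src}" "${dest}"
    fi
}

# Example (prints the command without contacting Hydra):
#   DRY_RUN=1 hydra_push mydata
```

Swapping the source and destination arguments of rsync copies results back from Hydra in the same way.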


■ Creating a job script

The job script can be generated using the QSub Generator, or written manually.
You can find some template scripts for running CASA jobs below.

High-memory job (contains detailed comments)
Low-memory script (job script only)
Serial script (job script only)
Parallel + SSD request script (job script only)

Find more information on all the queue options at HPC: Available Queues
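
For orientation, a manually written job script might look like the sketch below. It uses only generic Grid Engine options. The job name, log file name, and CASA script name are hypothetical, and the CASA command line is an assumption about how CASA is installed for your account. Queue, memory, and parallel-environment requests are deliberately left out: take those from the templates above or from the QSub Generator.

```shell
#!/bin/sh
# Minimal job script sketch (save as e.g. myscript.job).
# Lines starting with "#$" are read by qsub as embedded options.
#
# Shell used to run the job:
#$ -S /bin/sh
# Run in, and write output to, the directory the job was submitted from:
#$ -cwd
# Merge stderr into stdout:
#$ -j y
# Job name and log file name (both hypothetical):
#$ -N casa_test
#$ -o casa_test.log

# JOB_ID and JOB_NAME are set by Grid Engine when the job runs.
echo "+ $(date) job ${JOB_NAME:-unknown} (ID ${JOB_ID:-none}) started on $(hostname)"

# Hypothetical CASA script; the flags assume a standard CASA installation.
CASA_SCRIPT="my_casa_script.py"
if command -v casa >/dev/null 2>&1; then
    casa --nogui --nologger -c "${CASA_SCRIPT}"
else
    echo "casa not found on PATH; set up your CASA installation first" >&2
fi

echo "= $(date) job ${JOB_NAME:-unknown} done"
```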


■ Running a job

Submit your job to the queue from the command line in your current working directory (cwd). If the -cwd flag is set in the job script, all output files will be written to that directory.

$ qsub myscript.job
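
If you want to reuse the job number in later commands, it can be extracted from the confirmation message that qsub prints. The sketch below assumes the usual Grid Engine wording for a single (non-array) job; the helper name job_id_from_qsub is hypothetical.

```shell
#!/bin/sh
# Sketch: pull the numeric job ID out of qsub's confirmation message.
# Assumes the usual Grid Engine wording for a single job:
#   Your job 123456 ("myscript.job") has been submitted
job_id_from_qsub() {
    # Reads the confirmation text on stdin and prints the third word
    # of the first line that begins with "Your job ".
    awk '/^Your job / { print $3; exit }'
}

# Typical use on a login node (not run here):
#   JOBID=$(qsub myscript.job | job_id_from_qsub)
#   echo "submitted job ${JOBID}"
```

Grid Engine's qsub also has a -terse option that prints only the job ID; whether it behaves the same way on Hydra has not been checked here.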

qsub sends the job from the login node (/pool/sao) to the compute node assigned to you. You can see a list of the compute nodes here and can select particular nodes to run your job on at submission time; however, there may be a long wait if all the slots on those nodes are in use. In the example below, the quotes stop the shell from expanding the wildcard before qsub sees it.

$ qsub -q 'mThM.q@compute-9-*' myscript.job

Find more information on submitting jobs at HPC: Submitting Jobs


■ Monitoring progress

Your first check can be with qstat. This will confirm your job has been submitted successfully and is either running (r) or queued and waiting (qw).

$ qstat -u username
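
To check on one job repeatedly without retyping the command, the state column can be read from the qstat listing. This sketch assumes the default Grid Engine layout, in which the job ID is the first column and the state is the fifth. The helper names job_state and wait_for_job are hypothetical.

```shell
#!/bin/sh
# Sketch: read a job's state from "qstat -u" output and poll until it is gone.
# Assumes the default qstat columns: job-ID, prior, name, user, state, ...

job_state() {
    # $1 = job ID. Reads qstat output on stdin and prints the state
    # (e.g. r or qw); prints nothing if the job is not listed.
    awk -v id="$1" '$1 == id { print $5; exit }'
}

wait_for_job() {
    # $1 = job ID, $2 = seconds between checks (default 60).
    while :; do
        state=$(qstat -u "${USER}" | job_state "$1")
        # A job that has finished (or never existed) is not listed by qstat.
        [ -n "${state}" ] || break
        echo "job $1 state: ${state}"
        sleep "${2:-60}"
    done
    echo "job $1 is no longer listed by qstat"
}

# Typical use on a login node (not run here):
#   wait_for_job 123456 120
```

An empty state only means that qstat no longer lists the job; it does not say whether the job succeeded, so check the job's log file as well.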

For more options, you must first load the tools/local module:

$ module load tools/local

For alternative views based on qstat, try these:

$ q+ +a%
$ q+ +rr%

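To plot the memory usage of a job, use the command below with the job number (found with qstat or q+). This opens an X window, so you need to have logged in with X forwarding (ssh -X).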

$ plot-qmemuse.pl -x jobnumber

You can see an overview of the whole cluster at Hydra Status. You can follow links to display metrics for individual users.


■ Killing a job

You can kill all of your jobs at once by passing your username to qdel.

$ qdel -u username

Alternatively, you can kill specific jobs using the job ID reported by qstat.

$ qdel 123456


CENTER FOR ASTROPHYSICS | HARVARD & SMITHSONIAN
60 GARDEN STREET, CAMBRIDGE, MA 02138