
How to Submit an MPI Job, using OpenMPI, over the IB fabric

  • This primer describes how to submit parallel MPI jobs, using the OpenMPI implementation (GNU compiler, ORTE), over IB
  • The executable(s) MUST be compiled and linked using OpenMPI and linked with the IB-capable libraries
  • The primer on compilers describes how to compile and link using OpenMPI (a compile-and-link sketch also follows this list).
  • An executable compiled with another compiler (Intel, PGI, etc.) should not be submitted this way. It may run, but is likely to give you grief.
  • An executable compiled and linked using plain OpenMPI (i.e., without the IB-capable libraries) will run fine, but will use TCP/IP, not the IB, for the message passing, so you will not see any speed-up.

  • There is a different primer that explains how to submit OpenMPI jobs over TCP/IP.
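  • For illustration only, a minimal compile-and-link sketch, assuming the IB-capable OpenMPI compiler wrappers (mpicc for C, mpif90 for Fortran) live under the same prefix as the mpirun used below, and that mycode.c / mycode.f90 are hypothetical source file names; see the primer on compilers for the authoritative instructions:
    hydra% /usr/mpi/gcc/openmpi-1.4.3/bin/mpicc  -o mycode mycode.c
    hydra% /usr/mpi/gcc/openmpi-1.4.3/bin/mpif90 -o mycode mycode.f90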

To Submit a Job

  • The basics on how to submit a job are described in the primer's introduction on job submission, so read that one first.
  • The job file includes the command to launch your OpenMPI program, using the mpirun command.
  • The number of processors and the machinefile (list of hosts to use) are not explicitly specified (hardwired) with the mpirun command.
  • You must invoke the corresponding mpirun (OpenMPI implementation, aka ORTE), see example below.
  • The qsub file (or command)
    • will request a number of processors (CPUs, cores)
    • and specify the corresponding PE (parallel environment), via the qsub command
  • The job scheduler will grant the request and determine the hosts list for that specific job (i.e., the machinefile)
  • The job file can specify the PE and the number of processors via an embedded directive (e.g., #$ -pe orte_ib 8); a sketch of such a self-contained job file follows this list.
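  • For illustration, a hedged sketch of such a self-contained job file (the name mycode-embedded.job is made up; the executable, OpenMPI path, and options mirror the examples below), which can then be submitted with a bare qsub:
    hydra% cat mycode-embedded.job
    #$ -pe orte_ib 8
    #$ -cwd -j y
    #$ -N mycode
    #$ -o mycode.log
    #$ -S /bin/sh
    export OMPI_MCA_plm_rsh_disable_qrsh=1
    /usr/mpi/gcc/openmpi-1.4.3/bin/mpirun -np $NSLOTS mycode
    hydra% qsub mycode-embedded.job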

Example

A Minimal Job File

  • Let's assume that you want to run the OpenMPI executable mycode
    [t]csh syntax:
      hydra% cat mycode-csh.job
      setenv OMPI_MCA_plm_rsh_disable_qrsh 1
      /usr/mpi/gcc/openmpi-1.4.3/bin/mpirun -np $NSLOTS mycode
    [ba]sh syntax:
      hydra% cat mycode-sh.job
      export OMPI_MCA_plm_rsh_disable_qrsh=1
      /usr/mpi/gcc/openmpi-1.4.3/bin/mpirun -np $NSLOTS mycode
  • Note that the environment variable $NSLOTS is not defined in the job file.
  • The variable NSLOTS will be set by the Grid Engine at execution time and holds the number of granted slots.

A Minimal qsub File

  • The corresponding qsub file is
    [t]csh syntax:
      hydra% cat mycode-csh.qsub
      qsub -pe orte_ib 8 \
           -cwd -j y \
           -N mycode \
           -o mycode.log \
           mycode-csh.job
    [ba]sh syntax:
      hydra% cat mycode-sh.qsub
      qsub -pe orte_ib 8 \
           -cwd -j y \
           -N mycode \
           -o mycode.log \
           -S /bin/sh \
           mycode-sh.job

NOTE

  • The above example requests 8 processors (CPUs, cores); adjust that number to your needs.
  • The flag -pe orte_ib 8 tells SGE to use the parallel environment (PE) orte_ib and requests 8 processors.
    • ORTE is the PE to use with, and only with, OpenMPI executables,
    • by specifying orte_ib, not orte, your job will run on compute nodes connected to the IB fabric,
    • the -pe orte_ib 8 flag can be embedded (using #$) in the job file, like any other flag.
  • Another page describes the available queues in more detail.
  • For OpenMPI over IB you must use one of the following queues: sTNi.q, mTNi.q, or lTNi.q.
    These correspond to short, medium, and long execution times, respectively (the queue is specified with the -q flag).
  • Options passed to the qsub command override directives embedded in the job file (including -pe or -q); see the sketch after this list.
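  • For instance, a sketch (the slot count and queue are chosen arbitrarily here) that requests the IB PE and the medium-time IB queue on the qsub command line, overriding any -pe or -q directive embedded in the job file:
    hydra% qsub -pe orte_ib 8 -q mTNi.q \
         -cwd -j y \
         -N mycode \
         -o mycode.log \
         mycode-csh.job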

Details for Experienced Users

  • At run time, the scheduler defines the following variables for the parallel environment:
    NSLOTS: the granted number of slots, i.e., the number of processors for this MPI run
    PE_HOSTFILE: the name of the file that lists the distribution of processors over the compute nodes

  • Hence you can use, in the job file, commands like
      echo number of slots is $NSLOTS     to print out the granted value of NSLOTS, and
      echo pe host file is $PE_HOSTFILE
      cat $PE_HOSTFILE                    to print out the name and content of the PE_HOSTFILE; see also the sketch below.
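  • As an illustration, a small sh fragment you could add to a job file to report the granted resources; the awk step assumes the host name is the first column of each PE_HOSTFILE line, so verify that layout on your system before relying on it:
      echo "number of slots is $NSLOTS"
      echo "pe host file is $PE_HOSTFILE"
      cat $PE_HOSTFILE
      # assumed layout: host name in column 1; extract a plain list of host names
      awk '{print $1}' $PE_HOSTFILE > hosts.list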

More Examples

Look on hydra, in ~hpc/tests/mpi/gnu, for some examples.

  • To execute them, create a test directory and extract the compressed tar-ball:
hydra% mkdir -p ~/tests/mpi/gnu/orte+ib
hydra% cd ~/tests/mpi/gnu/orte+ib
hydra% tar xvzf ~hpc/tests/mpi/gnu/orte+ib/tests.tgz

  • Build the executable
hydra% make

  • Run (some of) the tests
hydra% source hello-csh.qsub
hydra% source hello-sh.qsub
hydra% qsub hello-csh-opts.job
hydra% qsub -pe orte 4 hello-csh-opts.job
hydra% qsub -pe orte 4 hello-sh-opts.job

Run only one job at a time; use qstat to monitor the job, then look at the hello.log file.
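For illustration, a typical submit-and-monitor sequence might look like this (qstat's -u flag limits the listing to your own jobs; the job script and hello.log names follow the examples above):

hydra% qsub hello-sh-opts.job
hydra% qstat -u $USER
hydra% cat hello.log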

-- SylvainKorzennikHPCAnalyst - 12 Jul 2012
