Quick Introduction

JUROPA/HPC-FF Usage Model

The JUROPA/HPC-FF cluster is logically divided into two partitions: JSC and HPCFF. Users can run applications only on the partition they have applied for. Access is automatically controlled by the batch system, which chooses the appropriate partition depending on the given user account. The following table lists the size of the partitions according to the number of compute nodes:

Partition    Nodes    Batch    Interactive
JSC           2208     2176             32
HPCFF         1080     1064             16

Please note that operation of the HPC-FF partition ends on June 30, 2013. After this date, only access to existing HPC-FF data will be possible; log in to the nodes hpcff or juropagpfs for this purpose.

Batch System on JUROPA/HPC-FF

The batch system on JUROPA/HPC-FF is Moab with the underlying resource manager TORQUE. The batch system is responsible for managing jobs on the machine, returning job output to the user, and providing job control at the user's or the system administrator's request.

Job Limits

Compute nodes are used exclusively by the jobs of a single user; nodes are never shared between jobs. The smallest allocation unit is one node (8 processors). Users are charged for the number of compute nodes multiplied by the wall-clock time used. Approximately 22 GB of memory per node are available for applications.
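For example, a job that occupies 8 nodes for 4 hours of wall-clock time is charged 8 × 4 = 32 node-hours, regardless of how many of the 64 allocated processors it actually uses.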

Resource                                                          Value
Interactive jobs   max. wallclock time (JSC)                      6 h
                   max. wallclock time (HPC-FF)                   1 h
                   default wallclock time                         30 min
                   max. number of nodes                           8
                   default number of nodes                        1
                   max. no. of running jobs (incl. batch jobs)    15
Batch jobs         max. wallclock time                            24 h
                   default wallclock time                         30 min
                   max. number of nodes (JSC)                     1024
                   max. number of nodes (HPC-FF)                  512
                   default number of nodes                        1
                   max. no. of running jobs                       15

Access to JUROPA / HPC-FF

See here for details on how to log on to the system.
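A minimal login sketch (the exact login host name is listed on the linked access page; juropa.fz-juelich.de is assumed here):

# log in with X-forwarding enabled (host name is an assumption, see the access page)
ssh -X <userid>@juropa.fz-juelich.de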

Compile with the Intel Compilers

See here for details on how to compile Fortran, C or C++ programs with the Intel compilers.
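A minimal compile sketch, assuming the MPI compiler wrappers around the Intel compilers described on the linked page (the wrapper names mpicc and mpif90 are assumptions; see that page for the exact commands):

# C code (mpicc assumed to wrap the Intel C compiler)
mpicc -O2 -o mpi_prog mpi_prog.c
# Fortran code (mpif90 assumed to wrap the Intel Fortran compiler)
mpif90 -O2 -o mpi_prog mpi_prog.f90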

Write a Batch Script

Write a batch script that includes the mpiexec command for the newly compiled executable.
The minimal template to be filled in is:

#!/bin/bash -x
#MSUB -l nodes=<no of nodes>:ppn=<no of procs /node>
#MSUB -l walltime=<hh:mm:ss>
#MSUB -e <full path for error file>
# if keyword omitted : default is submitting directory
#MSUB -o <full path for output file>
# if keyword omitted : default is submitting directory
#MSUB -v tpt=<no of threads per task>
# for OpenMP/hybrid jobs only

### start of jobscript
export OMP_NUM_THREADS=<no of threads/task>
# for OpenMP jobs only
cd $PBS_O_WORKDIR
echo "workdir: $PBS_O_WORKDIR"
NSLOTS=<nodes * ppn>
echo "running on $NSLOTS cpus ..."
mpiexec -np $NSLOTS [--exports=var1,...] <executable>

NOTE: The --exports option, followed by a comma-separated list of environment variables, ensures that all specified variables are exported from the job script to the processes spawned by the mpiexec command. This is necessary, for instance, if OMP_NUM_THREADS is defined for OpenMP. The use of -x to export all environment variables is deprecated.

Job Script Examples

Example 1: MPI application starting 64 tasks on 8 nodes, using 8 CPUs per node and running for a maximum of 4 hours

#!/bin/bash -x
#MSUB -l nodes=8:ppn=8
#MSUB -l walltime=4:0:00
#MSUB -e /home/jhome3/test_user/my-error.txt
#MSUB -o /home/jhome3/test_user/my-out.txt
### start of jobscript

cd $PBS_O_WORKDIR
echo "workdir: $PBS_O_WORKDIR"

# NSLOTS = nodes * ppn = 8 * 8 = 64
NSLOTS=64
echo "running on $NSLOTS cpus ..."

mpiexec -np $NSLOTS ./mpi_prog


Example 2: Hybrid application (MPI and OpenMP) on 8 nodes, allocating 8 CPUs per node and starting 8 threads per node

#!/bin/bash -x
#MSUB -l nodes=8:ppn=8
#MSUB -e /home/jhome3/test_user/my-error.txt
#MSUB -o /home/jhome3/test_user/my-out.txt
#MSUB -v tpt=8
### start of jobscript
export OMP_NUM_THREADS=8
cd $PBS_O_WORKDIR
echo "workdir: $PBS_O_WORKDIR"

# NSLOTS = nodes * ppn / tpt = 8 * 8 / 8 = 8
NSLOTS=8
mpiexec -np $NSLOTS --exports=OMP_NUM_THREADS ./mpi_prog

Submit the Job

Use msub to submit the job:

msub <jobscript>

On success, msub returns the job ID of the submitted job.

NOTE: You can also define msub options on the command line, e.g.:

msub -l nodes=8:ppn=8,walltime=4:00:00 <jobscript>

Start an Interactive Session

To start an interactive session for 30 minutes on two nodes with 8 processors each use:

msub -I -X -l nodes=2:ppn=8,walltime=00:30:00

You will then automatically get access to one of the allocated nodes and can start your applications right there. The option '-X' enables X-forwarding, which is necessary if you want to use applications or tools that provide a GUI (e.g. TotalView).
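Within the interactive session the application can then be launched directly, for example (mpi_prog is a placeholder for your own executable; 16 = 2 nodes * 8 processors):

mpiexec -np 16 ./mpi_prog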

Other Useful msub Options

  • Receive mail under specified job conditions:

    #!/bin/bash -x
    #MSUB -M <mail-address>
    # official mail address
    #MSUB -m n|a|b|e
    # send mail: never, on abort, beginning or end of job


    Example:

    #MSUB -M v.name@fz-juelich.de
    #MSUB -m abe


    Sample output at job end:

    PBS Job id: 21316.jj28b01
    Job Name: test_mpiexec
    Exec host: jj25c96/3+jj25c96/2+jj25c96/1+jj25c96/0
    Execution terminated
    Exit_Status=0
    resources_used.cput=00:00:01
    resources_used.mem=4272kb
    resources_used.vmem=48116kb
    resources_used.walltime=00:10:11
  • Combine stderr and stdout:

    #MSUB -j oe

  • Define a jobname:

    #MSUB -N <jobname>

Define Job Chains or Dependencies

You can submit a job defining dependencies or even submit job chains.
Example for a simple job dependency:

msub <jobscript>
⇒ Moab answers with a jobid
msub -l depend=<jobid> <jobscript>

In this case, the second job will only start when the job with jobid <jobid> has finished.

Another possibility to define a job dependency is given by:

msub <jobscript>
⇒ Moab answers with a jobid
msub -W depend=afterok:<jobid> <jobscript>

In this case, the second job will only start when the job with jobid <jobid> has finished successfully.

Example script for a job chain:

#!/bin/bash
# submit a chain of jobs with dependencies

# number of jobs to submit
NO_OF_JOBS=<no of jobs>

# define jobscript
JOB_SCRIPT=<jobscript>

# submit the first job and remember its job ID
echo "msub $JOB_SCRIPT"
JOBID=$(msub $JOB_SCRIPT 2>&1 | grep -v -e '^$' | sed -e 's/\s*//')

# submit the remaining jobs, each one depending on its predecessor
i=1
while [ $i -lt $NO_OF_JOBS ]; do
    echo "msub -W depend=afterok:$JOBID $JOB_SCRIPT"
    JOBID=$(msub -W depend=afterok:$JOBID $JOB_SCRIPT 2>&1 | grep -v -e '^$' | sed -e 's/\s*//')
    let i=$i+1
done

A chain of $NO_OF_JOBS jobs will be submitted and run one after the other; each job will start only after successful completion of the preceding one. Please note that a job which exceeds its time limit is NOT marked as successful!

Summary of Moab msub Options

The following table summarizes important msub command options:

Option   Suboption    Description
-l                    set job limits (controlled by suboptions)
         nodes        number of compute nodes used by the job
         :ppn         processes per node
         :turbomode   enable CPU over-clocking
         walltime     wallclock time limit for the job
         depend       define a job dependency
-e                    define file name of the job's error output
-o                    define file name of the job's standard output
-j       oe           join standard output and error output into one file
-v       tpt          define the number of threads per MPI task for an OpenMP job
-I                    submit an interactive job
-M                    define the mail address to receive mail notifications
-m                    define when to send a mail notification
         n            never (default)
         b            at job begin
         e            at job end
         a            in case of job abort
-N                    define the job's name
-W       depend       define the job ID this job depends on
         afterok:     only start the job if the previous job in the chain was ok

Other Useful Moab Commands

Command                          Description
showq [-r]                       show status of all (running) jobs
canceljob <jobid>                cancel a job
mjobctl -q starttime <jobid>     show the estimated start time of the specified job
mjobctl --help                   show detailed information about this command
checkjob -v <jobid>              get detailed information about a job
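
For example, to list only your own running jobs and then inspect one of them in detail (the -u filter is assumed to be available in the installed Moab version):

showq -r -u <userid>
checkjob -v <jobid>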

For further information please see also the Moab documentation.

