Quick Introduction

JUGENE Usage Model

The IBM Blue Gene/P uses so-called front-end nodes (login nodes) running Linux for interactive access and the submission of batch jobs. Parallel applications have to be cross-compiled on the front-end nodes and can only be executed on partitions of the BG/P compute nodes. Partition allocation is handled automatically by the LoadLeveler batch system, which chooses an appropriate partition depending on the requested resources. Serial jobs are executed on the front-end nodes.

Batch System on JUGENE

The batch system on JUGENE is LoadLeveler. It is responsible for managing jobs on the machine, returning job output to the user, and providing job control at the user's or system administrator's request.

Job Limits

Compute nodes are used exclusively by jobs of a single user; nodes are never shared between jobs. The smallest allocation unit is 32 nodes (128 processors). Users are charged for the number of allocated compute nodes multiplied by the wallclock time used. Approximately 400 MB of memory per core are available for applications.

Job type          Resource                 Value
Interactive jobs  max. wallclock time      30 min
                  default wallclock time   30 min
                  min. number of nodes     32
                  max. number of nodes     256
Batch jobs        max. wallclock time      24 h
                  default wallclock time   30 min
                  min. number of nodes     32
                  max. number of nodes     73728
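
As a worked example of the charging rule above (the numbers are chosen for illustration only): a batch job that allocates 512 nodes and runs for 6 hours of wallclock time is charged 512 x 6 = 3072 node-hours, independently of how many of the four cores per node the application actually uses.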

Access to JUGENE

See here for details on how to log on to the system.
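
For illustration, interactive access to the front-end nodes is via SSH; the host name below is an assumption, the linked page lists the actual front-end address and account prerequisites:

ssh <userid>@jugene.fz-juelich.de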

Compilation

See here for details on how to compile Fortran, C or C++ programs.
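
As a minimal sketch, assuming the IBM XL MPI cross-compiler wrappers typical for a Blue Gene/P front-end (the wrapper names and optimization flags are assumptions; the linked page lists the compilers actually installed on JUGENE):

mpixlf90 -O3 -qarch=450d -qtune=450 -o myprogp.rts myprog.f90    # Fortran 90 + MPI
mpixlc   -O3 -qarch=450d -qtune=450 -o myprogp.rts myprog.c      # C + MPI
mpixlcxx -O3 -qarch=450d -qtune=450 -o myprogp.rts myprog.cpp    # C++ + MPI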

Write a Batch Script

Write a job script that includes the mpirun command for the executable you just compiled.
A minimal template to be filled in is:

# @ job_name = LoadL_Sample_1
# @ comment = "BGP Job by Size"
# @ error = $(job_name).$(jobid).out
# @ output = $(job_name).$(jobid).out
# @ environment = COPY_ALL
# @ wall_clock_limit = 00:20:00
# @ notification = error
# @ notify_user = v.nachname@fz-juelich.de
# @ job_type = bluegene
# @ bg_size = 32
# @ queue
mpirun -exe myprogp.rts -mode VN -np 128 -verbose 2 -args "-t 1"

NOTES:

  • Files for #@ output and #@ error messages have to be defined; here they are named after the job (script) name and the LoadLeveler job number and combined into one file (same name for both).
  • The keyword #@ environment = COPY_ALL ensures that all environment variables defined in the current job script are exported to the processes spawned by the mpirun command.
  • A valid email address has to be specified in # @ notify_user.
  • # @ bg_size specifies the number of requested nodes.
  • The user commands start after the #@ queue keyword.
  • mpirun starts the parallel executable; its execution mode determines how many MPI tasks run per node (see the sketch after these notes).
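
The relation between # @ bg_size and the -np value depends on the execution mode. A sketch, assuming the standard Blue Gene/P modes (SMP, DUAL, VN) and the bg_size = 32 from the template above:

mpirun -exe myprogp.rts -mode SMP  -np 32    # 1 MPI task per node, all 4 cores free for threads
mpirun -exe myprogp.rts -mode DUAL -np 64    # 2 MPI tasks per node
mpirun -exe myprogp.rts -mode VN   -np 128   # 4 MPI tasks per node (virtual node mode)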

Jobscript Examples

Further job script examples (incl. multi-step jobs, job chains, dependencies) can be found here.

Submit the Job

Use llsubmit to submit the job:

llsubmit <jobscript>

On success llsubmit returns the job ID of the submitted job.
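
For illustration only (the script name, host name, and job number below are placeholders, and the exact message format may differ):

llsubmit jugene_job.ll
llsubmit: The job "jugene1.fz-juelich.de.123456" has been submitted.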

Start an Interactive Session

To start an interactive session for 30 min on two nodes:

llrun -np 2 myprog.rts

Your application is started with a connection to your terminal.

Advanced Information

Useful LoadLeveler Commands

Command            Description
llq                show status of all jobs
llq -s <jobid>     get detailed information about a job
llqx               show status of all jobs incl. priority
llcancel <jobid>   cancel a job
llclass            show information about job classes
llstatus           show general status of LoadLeveler
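
A typical monitoring workflow might look as follows (<userid> and <jobid> are placeholders; -u is the standard LoadLeveler option for filtering by user):

llq -u <userid>       # list only your own jobs
llq -s <jobid>        # check in detail why a particular job has not started yet
llcancel <jobid>      # remove the job from the queue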

For further information please see also the LoadLeveler documentation.

