- JUQUEEN Usage Model
- Batch System on JUQUEEN
- Access to JUQUEEN
- Write a Batch Script
- Start an Interactive Session
- Advanced Information
- Useful LoadLeveler Commands
JUQUEEN Usage Model
The IBM Blue Gene/Q uses so-called front-end nodes (login nodes) running Linux for interactive access and for the submission of batch jobs. Parallel applications have to be cross-compiled on the front-end nodes and can only be executed on a partition residing on the BG/Q compute nodes. Access is controlled automatically by the LoadLeveler batch system, which chooses the appropriate partition depending on the requested resources. Serial jobs are executed on the front-end node.
Batch System on JUQUEEN
The batch system on JUQUEEN is LoadLeveler. It manages jobs on the machine, returns job output to the user, and provides job control at the request of the user or the system administrator.
Job Limits and Conditions
- Compute nodes are used exclusively by jobs of a single user; no node sharing between jobs is done.
- The smallest allocation unit is 32 nodes (512 cores).
- The number of nodes and the number of ranks per node must each be a power of 2 (1, 2, 4, 8, 16, 32, ...).
- The maximum number of ranks per node is 64.
- Users are charged for the number of compute nodes multiplied by the wall clock time used.
With 1 rank per node, approximately 16 GB (minus 16 MB for the kernel) are available to the application.
With 'n' ranks per node (n>1), approximately 16 GB/n are available to each rank.
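The rules above come down to a little arithmetic, sketched here in POSIX shell. The function names are hypothetical helpers and not part of LoadLeveler or the JUQUEEN tools. The sketch assumes 1 GB = 1024 MB and reports the charge in node-hours, a unit chosen only for illustration.

```shell
# Hypothetical helpers, not part of LoadLeveler or the JUQUEEN tools.

# Succeeds if $1 is a positive power of two.
is_power_of_two() {
    case $1 in
        '' | *[!0-9]*) return 1 ;;
    esac
    [ "$1" -gt 0 ] && [ $(( $1 & ($1 - 1) )) -eq 0 ]
}

# Prints the approximate memory per rank in MB for $1 ranks per node.
# Assumes 1 GB = 1024 MB.
mem_per_rank_mb() {
    if ! is_power_of_two "$1" || [ "$1" -gt 64 ]; then
        echo "ranks per node must be a power of 2, at most 64" >&2
        return 1
    fi
    if [ "$1" -eq 1 ]; then
        echo $(( 16 * 1024 - 16 ))   # 16 GB minus 16 MB for the kernel
    else
        echo $(( 16 * 1024 / $1 ))   # roughly 16 GB / n
    fi
}

# Prints nodes * wall clock time in node-hours.
# $1 = number of compute nodes, $2 = wall clock time used (HH:MM:SS).
charged_node_hours() {
    echo "$1 $2" | awk '{
        split($2, t, ":")
        printf "%.2f\n", $1 * (t[1] + t[2] / 60 + t[3] / 3600)
    }'
}

mem_per_rank_mb 8                 # prints 2048
charged_node_hours 32 00:30:00    # prints 16.00
```

Whether the charge is based on the requested or the allocated number of nodes is not stated here; pass whichever count applies.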
|Batch jobs||max. wall clock time||24 h|
|Batch jobs||default wall clock time||6 h|
|Batch jobs||min. number of nodes||257 (allocating 512)|
|Batch jobs||max. number of nodes||28672|
|Medium batch jobs||max. wall clock time||12 h|
|Medium batch jobs||default wall clock time||6 h|
|Medium batch jobs||min. number of nodes||65 (allocating 128)|
|Medium batch jobs||max. number of nodes||256|
|Small batch jobs||max. wall clock time||30 min|
|Small batch jobs||default wall clock time||30 min|
|Small batch jobs||min. number of nodes||32|
|Small batch jobs||max. number of nodes||64|
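As a quick way to read the limits above, the following sketch maps a requested node count to the matching group of limits and its maximum wall clock time. The function is a hypothetical helper. The labels "small", "medium" and "batch" only mirror the table and are not necessarily the LoadLeveler class names. Requests below 32 nodes are assumed to fall into the small group, since 32 nodes is the smallest allocation unit.

```shell
# Hypothetical helper; labels mirror the limits table, not actual class names.
# $1 = requested number of nodes. Prints "<label> <max wall clock HH:MM:SS>".
job_class_for_nodes() {
    case $1 in
        '' | *[!0-9]*)
            echo "node count must be a positive integer" >&2
            return 1 ;;
    esac
    if [ "$1" -lt 1 ] || [ "$1" -gt 28672 ]; then
        echo "node count must be between 1 and 28672" >&2
        return 1
    elif [ "$1" -le 64 ]; then
        echo "small 00:30:00"     # assumption: fewer than 32 nodes still allocates 32
    elif [ "$1" -le 256 ]; then
        echo "medium 12:00:00"
    else
        echo "batch 24:00:00"
    fi
}

job_class_for_nodes 128    # prints "medium 12:00:00"
```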
Access to JUQUEEN
See Application Optimization on how to compile Fortran, C or C++ programs.
Write a Batch Script
Write a job script that calls the runjob command for the executable you have just compiled.
A minimal template to fill in:
# @ job_name = LoadL_Sample_1
# @ comment = "BG Job by Size"
# @ error = $(job_name).$(jobid).out
# @ output = $(job_name).$(jobid).out
# @ environment = COPY_ALL
# @ wall_clock_limit = 00:30:00
# @ notification = error
# @ notify_user = email@example.com
# @ job_type = bluegene
# @ bg_size = 32
# @ queue
runjob --exe myprogp.elf --args "-t 1" --ranks-per-node 8
- Files for # @ output and # @ error messages have to be defined. Here they are named after the job (script) name and the LoadLeveler job number, and both streams are combined into one file (same name for both).
- The keyword # @ environment = COPY_ALL ensures that all variables defined in the current job script are exported to the processes spawned by the runjob command.
- A valid email address has to be specified in # @ notify_user.
- # @ bg_size specifies the number of requested nodes.
- The user commands start after the # @ queue keyword.
- runjob starts the parallel executable.
For runjob options, see the BG/Q Administration Manual, chapter 6.2.1, page 75.
The above and further job script examples (incl. multi-step jobs for job chains with dependencies) can be found here or in /bgsys/local/samples/LoadLeveler.
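When the same template is reused with different sizes, it can be convenient to generate it. The following is a minimal sketch that writes the template shown above with the node count, ranks per node, wall clock limit and executable filled in. The function name and the output path are hypothetical, and the e-mail address is the placeholder from the template.

```shell
# Hypothetical helper that writes the minimal job script template.
# $1 = output file, $2 = bg_size (nodes), $3 = ranks per node,
# $4 = wall clock limit (HH:MM:SS), $5 = executable
write_job_script() {
    # \$ keeps the LoadLeveler variables $(job_name) and $(jobid) literal.
    cat > "$1" <<EOF
# @ job_name = LoadL_Sample_1
# @ comment = "BG Job by Size"
# @ error = \$(job_name).\$(jobid).out
# @ output = \$(job_name).\$(jobid).out
# @ environment = COPY_ALL
# @ wall_clock_limit = $4
# @ notification = error
# @ notify_user = email@example.com
# @ job_type = bluegene
# @ bg_size = $2
# @ queue
runjob --exe $5 --ranks-per-node $3
EOF
}

# Example: 64 nodes, 16 ranks per node, 30 minutes.
write_job_script "${TMPDIR:-/tmp}/sample_job.cmd" 64 16 00:30:00 myprogp.elf
```

The sketch does no validation; the limits on node counts, ranks per node and wall clock time still apply to the values passed in.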
Submit the Job
Use llsubmit to submit the job: llsubmit <jobscript>
On success, llsubmit returns the job ID of the submitted job.
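In scripts it can help to capture the returned job ID for later use with llq or llcancel. The sketch below assumes that llsubmit reports success with a message of the form `llsubmit: The job "<jobid>" has been submitted.`; check the actual output on the system before relying on it. The function name and the sample job ID are hypothetical.

```shell
# Hypothetical helper: reads llsubmit output on stdin and prints the job ID.
# Assumes a success message of the form:
#   llsubmit: The job "<jobid>" has been submitted.
extract_job_id() {
    sed -n 's/.*The job "\([^"]*\)" has been submitted.*/\1/p'
}

# Demonstration with a made-up message instead of a real submission.
echo 'llsubmit: The job "frontend.12345" has been submitted.' | extract_job_id
```

On the front-end node this would be used as `jobid=$(llsubmit <jobscript> | extract_job_id)`. An empty result means that no success message was found.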
Start an Interactive Session
Not available on JUQUEEN.
Advanced Information
- LoadLeveler Keywords
- Tuning of applications
- Installed Software
- Data Limits
- CPU Quota and Accounting
- Submission of job chains see job examples in
Useful LoadLeveler Commands
|llsubmit <jobscript>||submit job|
|llq||show status of all jobs|
|llq -s <jobid>||get detailed information about a job|
|llqx [-l]||show status of all jobs incl. priority|
|llcancel <jobid>||cancel a job|
|llclass[x]||show information about job classes|
|llstatus||show general status of LoadLeveler|
|llbgstatx||show status of Blue Gene midplanes|
For further information please see also LoadLeveler Commands.