Simulator for Job Schedulers on Supercomputers
JuFo (Juelich Forecast) simulates the global job schedulers of supercomputers. It is based on the Large-Scale System markup Language (LML) and retrieves its input data from the server component LML_da, which is part of the monitoring tool LLview. The simulator takes jobs, compute nodes, queue configurations and a large set of configuration parameters as input and simulates the future allocation of the jobs on the target parallel system. It extends the input LML file with attributes for the predicted start times and compute resources of the simulated jobs. JuFo can be used for online prediction of job start times as well as for highly configurable simulation of various global job schedulers, which helps supercomputer administrators optimize the workload of the target system.
JuFo is based on an analysis of the batch systems Moab and LoadLeveler, which are the main batch systems used on JSC supercomputers. All configuration parameters for a simulation are set via an input LML file. The input data for a simulation can be collected with LLview. As a result, JuFo can be used as an additional module in the workflow of LML_da. The simulation results can be visualized as Gantt charts by the LLview client.
A simulation can be configured to use the scheduling algorithms First-Come-First-Served, List-Scheduling and Backfilling. These algorithms offer many configuration parameters so that the real job scheduler's behavior can be reproduced as closely as possible. The most important configurable simulation features are:
- Generic job prioritization by an arbitrarily configurable formula
- Advanced reservations
- Resource requests for CPUs, GPUs and memory
- Queue constraints, e.g. maximum limits for jobs per user
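To illustrate how a configurable priority formula and Backfilling interact when start times are predicted, here is a minimal Python sketch. It is not JuFo's implementation: the names `Job`, `priority`, `earliest_start` and `simulate`, the two priority weights, and the restriction to node counts only (no GPUs, memory, reservations or queue limits) are all assumptions made for illustration. The variant shown is conservative backfilling, in which every job receives a reservation in priority order and a lower-priority job may start earlier only if it delays none of the jobs placed before it.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Interval = Tuple[int, int, int]  # (start, end, nodes); end is exclusive


@dataclass
class Job:
    name: str
    nodes: int      # requested compute nodes
    walltime: int   # requested runtime in seconds
    queued_at: int  # submission time in seconds


def priority(job: Job, now: int, wait_weight: float = 1.0,
             size_weight: float = 10.0) -> float:
    """Hypothetical priority formula: weighted waiting time plus job size."""
    return wait_weight * (now - job.queued_at) + size_weight * job.nodes


def free_nodes(total: int, placed: List[Interval], t: int) -> int:
    """Nodes that are idle at time t, given all placed intervals."""
    return total - sum(n for s, e, n in placed if s <= t < e)


def earliest_start(total: int, placed: List[Interval], nodes: int,
                   walltime: int, now: int) -> int:
    """Earliest time at which `nodes` are free for the whole walltime."""
    # Free capacity only grows when a placed job ends, so the earliest
    # feasible start is either `now` or one of those end times.
    candidates = sorted({now} | {e for _, e, _ in placed if e > now})
    for t in candidates:
        # Free capacity only shrinks when a placed job starts, so checking
        # t and every start inside the window covers the whole window.
        checkpoints = [t] + [s for s, _, _ in placed if t < s < t + walltime]
        if all(free_nodes(total, placed, c) >= nodes for c in checkpoints):
            return t
    raise ValueError("job requests more nodes than the system provides")


def simulate(total: int, running: List[Tuple[int, int]],
             queue: List[Job], now: int = 0) -> Dict[str, int]:
    """Predict start times; `running` holds (end_time, nodes) pairs."""
    placed: List[Interval] = [(now, end, n) for end, n in running]
    starts: Dict[str, int] = {}
    # Highest priority first; each job keeps its reservation once placed.
    for job in sorted(queue, key=lambda j: priority(j, now), reverse=True):
        t = earliest_start(total, placed, job.nodes, job.walltime, now)
        placed.append((t, t + job.walltime, job.nodes))
        starts[job.name] = t
    return starts


if __name__ == "__main__":
    queue = [
        Job("A", nodes=4, walltime=50, queued_at=0),
        Job("B", nodes=2, walltime=70, queued_at=10),
        Job("C", nodes=2, walltime=200, queued_at=20),
    ]
    # 4-node system, one running job holds 2 nodes until t=100.
    print(simulate(total=4, running=[(100, 2)], queue=queue, now=30))
```

In this example, job A has the highest priority but must wait until t=100 for all four nodes. Job B fits into the two idle nodes and finishes by t=100, so it is backfilled at t=30 without delaying A. Job C would overlap A's reservation and therefore starts at t=150. The script prints `{'A': 100, 'B': 30, 'C': 150}`.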
Detailed documentation on job scheduling, the design of JuFo and how to write your own extensions to JuFo is given here. To simplify the configuration of a JuFo simulation, the Java application JuCon is available in the downloads section. It provides a GUI for configuring all available simulation parameters. JuCon reads LML files, adds the user-defined configuration parameters and generates the input file for a JuFo simulation run.