Scalable I/O library for parallel access to task-local files
SIONlib is a library for writing and reading data from several thousands of parallel tasks into/from one or a small number of physical files. Only the open and close functions are collective while file access can be performed independently.
SIONlib can be used as a replacement for standard I/O APIs (e.g. POSIX) that are used to access distinct files from every parallel process. SIONlib will bundle the data into one or few files in a coordinated fashion in order to sidestep sequentialising mechanism in the file system. At the same time, the task-per-file picture is maintained for the application, every process has access to its logical file only. File access is performed using SIONlib equivalents to standard C-I/O functionality (fwrite becomes sion_write, fseek becomes sion_seek, etc.) which have similar semantics as their C counterparts.
Internally, the physical files are sub-divided into sequences of blocks, which themselves contain one chunk of data belonging to every logical file. In case the amount to be written to a file is known up front, it can optionally be specified when opening the file and the sequence of blocks collapses into a single block with one chunk per task containing all of its data. If a chunk size cannot be specified ahead of time, a sensible default is chosen and reads and writes that cross chunk boundaries are handled transparently by SIONlib.
SIONlib also uses information about the block size of the underlying file system, because access to the same block from different tasks often leads to contention.
Both, the estimated chunk size and file system block size are used to align individual chunks with file system blocks. Ensuring contention-free access to file system blocks enables efficient parallel writing and reading.
SIONlib provides two different interfaces: one for parallel access (with implementations for different parallel programming technologies such as MPI, OpenMP and hybrid MPI+OpenMP) and one for sequential access which is also used internally by the SIONlib utilities.
SIONlib is a joint project of:
- Forschungszentrum Jülich, Jülich Supercomputing Centre
- German Research School for Simulation Sciences
- Laboratory for Parallel Programming