Data Projects Started at JSC
The migration to the new usage model at JSC in December 2018 initiated the separation of computing time projects and data projects. Data projects comprise a number of data resources located on specific file systems at JSC, each with its own characteristics regarding bandwidth, capacity, and retention time. They allow project data to be managed beyond the scope of a computing time project. Among other things, this approach makes it possible to retain data resources across a series of computing time projects.
Initially, each ongoing computing time project was granted a data project containing its existing data resources, with a duration aligned with that of the computing time project. In the future, principal investigators must apply for data projects in addition to their computing time projects in order to retain their existing data resources.
Four file systems are available within the scope of a data project: ARCHIVE, DATA, FASTDATA, and USERSOFTWARE. The ARCHIVE file system offers high capacity for long-term storage at the expense of higher latencies when recalling data. The newly introduced DATA file system has been designed for large capacity at a reasonable level of performance. Its main uses are sharing data among projects, medium-term data storage, and making data available to JSC’s OpenStack-based cloud environment, which allows for community-specific services based on community data. Note that the DATA file system is not available on the compute nodes of the HPC systems, so data must be staged to a file system that is accessible there before it can be used in compute jobs. FASTDATA is intended for higher-volume projects that cannot employ data staging prior to executing jobs on JSC’s HPC systems, for example because they use most of their data set for each job. USERSOFTWARE is a file system intended for sharing software installations among several projects. Other file systems, such as PROJECT and SCRATCH, are available exclusively to computing time projects and are not part of the data project grant.
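As a rough illustration of the data staging mentioned above, the following minimal Python sketch copies a data set from one directory tree to another before a job runs. It is not a JSC tool: the function name stage_in and both directory arguments are hypothetical, and the actual mount points of DATA and SCRATCH are not given here and would have to be taken from JSC's documentation.

```python
import shutil
from pathlib import Path


def stage_in(source: Path, scratch_root: Path) -> Path:
    """Copy the directory `source` into `scratch_root` and return the copy's path.

    `source` stands in for a data set on a file system such as DATA, and
    `scratch_root` for a directory visible to compute nodes such as SCRATCH.
    Both are assumptions made for illustration only.
    """
    target = scratch_root / source.name
    # dirs_exist_ok=True (Python 3.8+) allows a repeated staging run to
    # overwrite files in an existing copy instead of raising an error.
    shutil.copytree(source, target, dirs_exist_ok=True)
    return target
```

Such a step would typically run on a node that can see both file systems, with the compute job then reading only from the returned path.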
The first round of applications for data projects resulted in approximately 50 projects requesting a total of about 15 PB of storage space on ARCHIVE, 10 PB on DATA, and 10 PB on FASTDATA. These projects have now been set up, and users are gaining initial experience with the new offering at JSC. Starting on 1 May 2019, users will be able to apply for data projects in a rolling call.
Contact: Björn Hagemeier, b.hagemeier@fz-juelich.de
from JSC News No. 265, 21 May 2019