What data quotas exist and how can usage be listed?
Disk quota limits on the $HOME and $WORK file systems have been in effect since the end of October 2007. They became necessary because individual users had created millions of files, which blocked the file systems and degraded the performance of system commands such as ls and du. In addition, migration of $HOME data no longer worked reliably, so the new archive file system $ARCH was introduced. The following limits have applied in general since December 2009. The numbers are updated according to the actual capacity of the file systems.
Data quota per group/project within GPFS file systems
|File System||Disk Space Soft Limit||Disk Space Hard Limit||Number of Files Soft Limit||Number of Files Hard Limit|
|$HOME||6 TB||7 TB||2 million||2.2 million|
|$WORK||20 TB||21 TB||4 million||4.4 million|
|$ARCH||-||(see note)||2 million||2.2 million|
Note: There is no hard disk space limit for $ARCH, but if you expect to need more than 100 TB, please contact the supercomputing support at JSC ( email@example.com ) to discuss optimal data processing, particularly with regard to the end of the project. In addition, special guidelines may exist for some projects.
File size limit
Although the file size limit at operating system level (Linux for JUDGE and JUQUEEN) is set to unlimited (ulimit -f), a file can never be larger than the GPFS group quota limit for the corresponding file system. The actual limits can be listed with q_dataquota.
List data quota and usage by group and user
Members of a group/project can display the hard limits, quotas (soft limits) and the usage of each user of the group in a group-specific file (/homex/group/usage.quota), which is updated every three hours during prime shift (see the timestamp at the top of the file). Since the end of January 2013 the unit of measure has been GB instead of KB for easier reading. As a consequence, displayed values are always rounded up to the next full GB. If less than 1 GB is used, e.g. 256 KB or 128 MB, 1 GB is displayed.
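The rounding described above can be sketched with shell arithmetic. The function name kb_to_gb_ceil is hypothetical, and the sketch assumes binary units (1 GB = 1048576 KB), which this page does not state explicitly.

```shell
# Hypothetical helper: round a usage value given in KB up to the next full GB.
# Assumption: 1 GB = 1024 * 1024 KB = 1048576 KB.
kb_to_gb_ceil() {
    # Adding (divisor - 1) before the integer division rounds upwards.
    echo $(( ($1 + 1048575) / 1048576 ))
}

kb_to_gb_ceil 256        # 256 KB used
kb_to_gb_ceil 131072     # 128 MB used
kb_to_gb_ceil 1048577    # just over 1 GB used
```

Under this assumption the first two calls report 1 and the third reports 2, so small users always appear with at least 1 GB in the listing.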
This file can also be listed in a short and long format by the command
The short format will display the group quota limits and group data usage for each file system followed by the usage of the user herself/himself. The long listing includes the data usage of all users of the group in descending order.
- Even if no quota limits for a group are listed for the $WORK file system, quotas are set! Quota counting starts with the first file created by a user of the group.
- If the message Cannot exceed the user or group quota is displayed when writing data to a file, the sum of used and in_doubt blocks has exceeded the hard limit. Please be aware that not only the used blocks are taken into account!
The column grace reports the status of the quota:
- none - no quota exceeded
- xdays - remaining grace period to clean up after the soft limit is exceeded
- expired - no data can be written before cleanup
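As a rough illustration, a script that reads the grace column could interpret the three states like this. The function name explain_grace is hypothetical, and the sketch assumes the remaining period is written as a number followed by "days" (with or without a space), which may differ from the real listing.

```shell
# Hypothetical helper: explain a value taken from the "grace" column.
# Assumption: a running grace period looks like "7days" or "7 days".
explain_grace() {
    case "$1" in
        none)
            echo "no quota exceeded" ;;
        expired)
            echo "grace period over: clean up before writing" ;;
        *days)
            n=${1%days}   # drop the "days" suffix
            n=${n% }      # drop an optional trailing space
            echo "soft limit exceeded: $n day(s) left to clean up" ;;
        *)
            echo "unrecognized grace value: $1" ;;
    esac
}

explain_grace none
explain_grace 7days
explain_grace expired
```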
List current data quota and usage by group
The group's current data usage and limits can be displayed with:
mmlsquota -g <group> [ <FS_without_leading_/> | -C just.fz-juelich.de ]
For the specified file system, or for all file systems of the JUST storage cluster, the output shows the usage summary of the specified group (not of its members), by default in KB. For easier reading, a unit of measure can be specified, or GPFS can select the best-fitting unit. To do so, specify the corresponding option (with GPFS 3.5.x).
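The two call forms of the mmlsquota line above can be sketched as a small wrapper that only prints the command instead of running it. The function name quota_cmd, the group name mygroup and the file system name /work are hypothetical placeholders; the only point illustrated is that the file system is passed without its leading slash.

```shell
# Hypothetical helper: print the mmlsquota call for a group.
# With a file system argument the leading "/" is stripped;
# without one, all file systems of the JUST cluster are queried.
quota_cmd() {
    group=$1
    fs=${2:-}
    if [ -n "$fs" ]; then
        echo "mmlsquota -g $group ${fs#/}"
    else
        echo "mmlsquota -g $group -C just.fz-juelich.de"
    fi
}

quota_cmd mygroup /work   # single file system
quota_cmd mygroup         # whole storage cluster
```

Printing the command first makes it easy to check the arguments before running it on a login node.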
System actions when limits are exceeded
- Soft limit
If any soft limit is exceeded, a grace period of 14 days starts to count down. If not enough data is deleted to get back below the limit, the quota expires at the end of the grace period and files can no longer be created or extended. If the hard limit is exceeded in the meantime, the quota expires immediately.
- Hard limit
If any hard limit is exceeded (the sum of used and in_doubt is taken into account), the users in the group cannot create new files or extend existing files in the corresponding file system until the number of files or the allocated disk space is below the limit again.
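The two rules above can be summarized in a short sketch. The function name quota_state is hypothetical. The hard-limit check uses the sum of used and in_doubt as described; comparing only the used value against the soft limit is an assumption of this sketch, not something this page specifies.

```shell
# Hypothetical helper: classify usage against soft and hard limits.
# Arguments: used in_doubt soft hard (integers in the same unit,
# either blocks or number of files).
quota_state() {
    total=$(( $1 + $2 ))            # used + in_doubt counts for the hard limit
    if [ "$total" -gt "$4" ]; then
        echo "hard limit exceeded"
    elif [ "$1" -gt "$3" ]; then    # assumption: soft limit compares "used" only
        echo "soft limit exceeded"
    else
        echo "ok"
    fi
}

# File-count limits of $HOME from the table: soft 2000000, hard 2200000
quota_state 1900000 50000 2000000 2200000
quota_state 2100000 0     2000000 2200000
quota_state 2150000 60000 2000000 2200000
```

The third call shows the case behind the Cannot exceed the user or group quota message: the used value alone is below the hard limit, but used plus in_doubt is above it.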
Recommendation for users with a lot of small files
Users with applications that create a lot of relatively small files should reorganize the data by collecting these files within tar-archives using the
tar -cvf archive-filename ...
command. The real problem is the number of files (inodes) that have to be managed by the underlying operating system, not the total space they occupy. On the other hand, please keep in mind the recommendations under File size limit.
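A minimal sketch of this recommendation follows. The function name pack_small_files and all directory and file names are hypothetical; the sketch packs one directory into a single archive and prints how many files the archive contains, so the count can be compared with the original directory before anything is deleted.

```shell
# Hypothetical helper: pack a directory of small files into one tar archive
# and print the number of files stored in it.
pack_small_files() {
    src_dir=$1
    archive=$2
    # -C switches to the parent directory so the archive holds relative paths.
    tar -cf "$archive" -C "$(dirname "$src_dir")" "$(basename "$src_dir")"
    # Directory entries are listed with a trailing "/"; count everything else.
    tar -tf "$archive" | grep -vc '/$'
}

# Demonstration in a temporary directory
demo=$(mktemp -d)
mkdir "$demo/results"
for i in 1 2 3 4 5; do echo "$i" > "$demo/results/part_$i.dat"; done
pack_small_files "$demo/results" "$demo/results.tar"   # prints the file count
rm -rf "$demo"
```

After packing, the directory counts as a single file against the file-number quota once the originals are removed, while the archive size still counts against the disk space quota.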