Institute for Advanced Simulation (IAS)


GPFS data questions

GPFS data answers

What file system to use for different data?

In principle there are three GPFS file systems for different types of user data. Each file system has its own data policies.

  • $HOME
    Acts as a repository for source code, binaries, libraries and applications with small size and I/O demands. Data within $HOME is backed up by TSM.

  • $WORK
    Acts as a temporary storage location with high I/O bandwidth (160 GB/s measured from a JUQUEEN application). If the application can handle large files and high I/O demands, $WORK is the right file system for them. Data within $WORK is not backed up, and a cleanup runs daily.

    • Normal files older than 90 days are purged automatically. Both modification and access date are taken into account. For performance reasons the access date is not updated automatically by the system, but the user can set it explicitly with
      touch -a <filename>.
      The time stamps recorded for a file can easily be listed with
      stat <filename>.
    • Empty directories, which arise among other things from the deletion of old files, are deleted after 3 days. This also applies to trees of empty directories, which are deleted recursively from bottom to top in one step.
  • $ARCH
    Acts as storage for all files that have not been in use for a longer time. Data is migrated to tape storage by TSM-HSM. It is recommended to use tar files with a maximum size of 1 TB because of the speed of reading from and writing to tape. All data in $ARCH first has to be backed up, which takes 10 h per 1 TB. Next the data is migrated to tape, which takes 3 h per 1 TB. Please keep in mind that a recall of the data needs approximately the same time.
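
The 90-day cleanup rule for $WORK described above can be checked ahead of time. The following is a minimal sketch, assuming GNU find; the function name list_purge_candidates and the default threshold of 80 days are illustrative choices and not part of any JSC tooling. It lists regular files whose modification time and access time are both older than the given number of days.

```shell
# list_purge_candidates DIR [DAYS]
# Hypothetical helper, not a JSC command.
# Prints regular files below DIR whose modification time AND access time
# both lie more than DAYS days in the past (default: 80, an arbitrary
# warning margin before the 90-day purge). find counts whole 24-hour
# periods, so "+80" matches files that are at least 81 days old.
list_purge_candidates() {
    local dir=$1 days=${2:-80}
    find "$dir" -type f -mtime +"$days" -atime +"$days" -print
}
```

A call such as list_purge_candidates "$WORK" 80 prints one path per line; files that are still needed can then be copied elsewhere or have their access date refreshed with touch -a.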

All GPFS file systems are managed by quotas on disk space and/or number of files.

What data quotas do exist and how to list usage?

Disk quota limitations in the $HOME and $WORK file systems have been in effect since the end of October 2007. This became necessary because file systems had been blocked by single users creating millions of files, which degraded the performance of system commands (ls, du). In addition, migration of $HOME data no longer worked successfully, and therefore the new archive file system $ARCH was introduced. The following limitations have applied in general since December 2009. The numbers are adjusted to the actual capacity of the file systems.


Data quota per group/project within GPFS file systems

File System   Disk Space                  Number of Files
              Soft Limit   Hard Limit     Soft Limit    Hard Limit
$HOME         6 TB         7 TB           2 million     2.2 million
$WORK         20 TB        21 TB          4 million     4.4 million
$ARCH         - (see note)                2 million     2.2 million

Note:
There is no hard disk space limit for $ARCH, but if more than 100 TB is needed, please contact the supercomputing support at JSC (sc@fz-juelich.de) to discuss optimal data processing, particularly with regard to the end of the project. Furthermore, special guidelines may exist for some projects.

File size limit

Although the file size limit at operating system level (Linux for JUDGE and JUQUEEN) is set to unlimited (ulimit -f), the maximum file size cannot exceed the GPFS group quota limit for the corresponding file system. The actual limits can be listed with q_dataquota.

List data quota and usage by group and user

Members of a group/project can display the hard limits, quotas (soft limits) and usage of each user of the group in a group-specific file (/homex/group/usage.quota), which is updated every three hours during prime shift (see the timestamp at the top of the file). Since the end of January 2013 the unit of measure is GB instead of KB for easier reading. As a result, displayed values are always rounded up to the next full GB: if less than 1 GB is used, e.g. 256 KB or 128 MB, 1 GB is shown.

more $HOME/../usage.quota

This file can also be listed in a short and long format by the command

q_dataquota [-l]

The short format will display the group quota limits and group data usage for each file system followed by the usage of the user herself/himself. The long listing includes the data usage of all users of the group in descending order.

Notes:

  • Even if no quota limits are listed for a group in the $WORK file system, quotas are set! Quota counting starts with the first file created by a user of the group.
  • If the message "Cannot exceed the user or group quota" is displayed when writing data to a file, the sum of used and in_doubt blocks has exceeded the hard limit. Please be aware that not only the used blocks are taken into account!
  • The column grace reports the status of the quota:

    none - no quota exceeded
    x days - remaining grace period to clean up after the soft limit is exceeded
    expired - no data can be written before cleanup

List current data quota and usage by group

A prompt update of the group's data usage and limits can be displayed with:

mmlsquota -g <group> [ <FS_without_leading_/> | -C just.fz-juelich.de ]

The output for the specified file system, or for all file systems of the JUST storage cluster, shows the usage summary of the specified group (not of its members), in KB by default. For easier reading, a unit of measure can be specified, or GPFS can select the best-fitting one. To do so, specify the option (with GPFS 3.5.x)

--block-size {M|G|T|auto}
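
The pieces above can be combined into one call. The following is a minimal sketch, assuming bash; the wrapper name show_group_quota and the DRY_RUN switch are hypothetical conveniences, and the group and file system names in the usage line are placeholders. It builds the mmlsquota command line from the syntax shown above, strips a leading "/" from the file system name, and falls back to all file systems of the JUST cluster when no file system is given.

```shell
# show_group_quota GROUP [FILESYSTEM]
# Hypothetical wrapper around mmlsquota, not a JSC command.
# With DRY_RUN set, the command line is only printed, not executed.
show_group_quota() {
    local group=$1 fs=${2:-}
    local cmd=(mmlsquota -g "$group" --block-size auto)
    if [ -n "$fs" ]; then
        cmd+=("${fs#/}")                  # file system name without leading "/"
    else
        cmd+=(-C just.fz-juelich.de)      # all file systems of the JUST cluster
    fi
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "${cmd[*]}"
    else
        "${cmd[@]}"
    fi
}
```

For example, DRY_RUN=1 show_group_quota mygroup /work prints the command without running it, which is useful for checking the arguments first.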


System actions when limits are exceeded

  • Soft limit
    If any soft limit is exceeded, a grace period of 14 days starts to count down. If not enough data is deleted to get back under the limit, the quota expires after the grace period and files can no longer be created or extended. If the hard limit is exceeded in the meantime, the quota expires immediately.
  • Hard limit
    If any hard limit is exceeded (the sum of used and in_doubt is taken into account), the users in the group cannot create new files or extend existing files in the corresponding file system until the number of files or the allocated disk space is below the limit again.
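
The two rules above can be written down as a small decision function. This is a sketch only: the function name quota_state is hypothetical, and comparing the soft limit against the used value alone (without in_doubt) is an assumption, since the text states the in_doubt rule only for the hard limit. The grace period countdown is not modelled.

```shell
# quota_state USED IN_DOUBT SOFT HARD
# Hypothetical helper. All arguments are integers in the same unit
# (for example KB, or a number of files).
quota_state() {
    local used=$1 in_doubt=$2 soft=$3 hard=$4
    if [ $((used + in_doubt)) -gt "$hard" ]; then
        echo "hard"    # used + in_doubt is above the hard limit: writes are refused
    elif [ "$used" -gt "$soft" ]; then
        echo "soft"    # above the soft limit: grace period is running (assumption: used only)
    else
        echo "ok"
    fi
}
```

Note the third case in practice: a group with 290 units used and 20 units in_doubt is already blocked at a hard limit of 300, although the used value alone is still below it.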

Recommendation for users with a lot of small files

Users with applications that create a lot of relatively small files should reorganize the data by collecting these files within tar-archives using the

tar -cvf archive-filename ...

command. The real problem is the number of files (inodes) that have to be managed by the underlying operating system, not the space they occupy in total. On the other hand, please keep in mind the recommendations under File size limit.
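
Before the original small files are removed, it is worth checking that the archive is complete. The following is a minimal sketch; the function name pack_small_files is hypothetical. It assumes that the directory contains only regular files and subdirectories, that no file name contains a newline, and that the archive is written outside the source directory.

```shell
# pack_small_files SRC_DIR ARCHIVE
# Hypothetical helper. Packs SRC_DIR into ARCHIVE and returns success only
# if the archive lists as many regular files as exist below SRC_DIR.
pack_small_files() {
    local src=$1 archive=$2
    local on_disk in_tar
    tar -cf "$archive" -C "$(dirname "$src")" "$(basename "$src")" || return 1
    on_disk=$(find "$src" -type f | wc -l)
    # Directory members are listed with a trailing "/" and are not counted.
    in_tar=$(tar -tf "$archive" | grep -vc '/$')
    [ "$on_disk" -eq "$in_tar" ]
}
```

A typical call would be pack_small_files results results.tar && echo "archive complete", with results standing for any directory of small files. The sketch deliberately does not delete anything.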

How can I recall migrated data?

Normally migrated files are automatically recalled from TSM-HSM tape storage when the file is accessed at JUQUEEN (login nodes only), JUDGE (login and compute nodes), JUROPATEST (login nodes only), or JUROPA (GPFS gateway nodes only).

For an explicit recall the native TSM-HSM command dsmrecall is not available. Please use

tail <filename>
or:
head <filename>

to start the recall process. These commands will not change any file attribute and the migrated version of the file stays valid.

It is strongly recommended not to use

touch <filename>

because this changes the timestamp of the file, so a new backup copy must be created and the file has to be migrated again. These two additional processes waste compute resources if the file is only read by further processing.
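
To recall several files in one go, the tail approach can be wrapped in a loop. This is a sketch under assumptions: the function name recall_files is hypothetical, and it assumes that reading the last byte of a file is enough to start the recall, just as a plain tail <filename> does. The file content and modification time are not changed.

```shell
# recall_files FILE...
# Hypothetical helper. Reads the last byte of each file and discards it;
# on an HSM-managed file system this read access starts the recall.
# Returns non-zero if any file could not be read.
recall_files() {
    local f rc=0
    for f in "$@"; do
        tail -c 1 -- "$f" > /dev/null || { echo "cannot read: $f" >&2; rc=1; }
    done
    return "$rc"
}
```

For example, recall_files "$ARCH"/run1/*.tar would request every tar file in a (placeholder) directory run1. The files are read one after the other, so the call blocks until each recall has finished.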

How can I see which data is migrated?

There are three file systems that hold migrated data: /arch, /arch1, /arch2

  • These are so-called archive file systems.
  • In principle all data in the file systems will be migrated to TSM HSM tape storage in tape libraries.
  • Data is copied to TSM backup storage prior to migration.
  • Every user owns a personal archive directory, which is given by the $ARCH or $GPFSARCH variable.
  • Data is not limited by storage volume but by the number of files per group/project. This is done because UNIX is still not able to handle millions of files in a file system with acceptable performance.

The native TSM-HSM command dsmls, which shows whether a file is migrated, is not available on JUQUEEN, JUDGE or JUROPA. This command can only run on JUST, the storage cluster that hosts the file systems for the HPC systems. However, JUST is not open for user access.

Please use

ls -ls [mask | filename]

to list the files. Migrated files can be identified by a block count of 0 in the first column (-s option) and an arbitrary number of bytes in the sixth column (-l option).

0 -rw-r----- 1 user group 513307 Jan 22 2008 log1
0 -rw-r----- 1 user group 114 Jan 22 2008 log2
0 -rw-r----- 1 user group 273 Jan 22 2008 log3
0 -rw-r----- 1 user group 22893504 Jan 23 2008 log4
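
The same check can be automated by filtering the ls -ls output. The following is a minimal sketch; the function name filter_migrated is hypothetical. It assumes the column layout shown above and file names without spaces, and it treats "0 blocks but non-zero size" as the only criterion, so sparse files would be reported as well.

```shell
# filter_migrated
# Hypothetical helper. Reads `ls -ls` output on stdin and prints the names
# of regular files with 0 allocated blocks (column 1) and a non-zero size
# (column 6). The "total" line, directories and symbolic links are skipped.
filter_migrated() {
    awk '$1 == 0 && $2 ~ /^-/ && $6 > 0 { print $NF }'
}
```

Usage: ls -ls | filter_migrated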

How to restore a file from the home directory?

All files within the users' home directories ($HOME) are automatically backed up by TSM (Tivoli Storage Manager). To restore a file, use

adsmback [-type={home | gpfshome} ] &

on the login nodes of JUQUEEN or JUDGE, or the GPFS gateway nodes of JUROPA. If the option -type is not specified, the user is prompted for the type of file system:

Which type of filesystem should be restored? Enter: {home | arch | gpfshome}

This command grants access to the correct backup data of the user's assigned home directory. 'gpfshome' applies to JUROPA only because JUROPA users have an additional GPFS home directory besides the standard Lustre home directory.

Follow the GUI by selecting:

File level -> [j]homeX -> group -> userid -> ...
Select files or directories to restore
Press the [Restore] button

If the data should be restored to its original location, choose the following in the Restore Destination window:

  • for JUQUEEN: Original location
  • for JUDGE: Original location
  • for JUROPA (GPFS): Following location + /gpfs/homeX + Restore complete path

Don't use the native dsmj command, which will not show any home data.

How to restore a file from the archive directory?

All files within the user's archive directory ($ARCH) for long-term storage are automatically backed up by TSM (Tivoli Storage Manager). To restore a file, use

adsmback [-type=arch] &

on the login nodes of JUQUEEN or JUDGE, or the GPFS gateway nodes of JUROPA. If the option -type is not specified, the user is prompted for the type of file system:

Which type of filesystem should be restored? Enter: {home | arch | gpfshome}

This command grants access to the correct backup data of the user's assigned archive directory.

Follow the GUI by selecting:

File level -> archX -> group -> userid -> ...
Select files or directories to restore
Press the [Restore] button

If the data should be restored to its original location, choose the following in the Restore Destination window:

  • for JUQUEEN: Original location
  • for JUDGE: Original location
  • for JUROPA: Following location + /gpfs/archX + Restore complete path

Don't use the native dsmj command, which will not show any archive data.

How to share files by using ACLs?

ACLs (Access Control Lists) provide a means of specifying access rights on files. GPFS access control lists allow the definition of access rights for other users or groups.

Create or change a GPFS access control list

mmeditacl <filename>

which will open the ACL-definition of <filename> with an editor.


Note that for this command to work the EDITOR environment variable must contain a complete path name, for example on JUQUEEN: export EDITOR=/usr/bin/vim

Example:
Set read and execute permission for user1 and execute-only permission for user2 on directory dir1:

mmeditacl dir1
.... (append 3 lines to the displayed lines) ....
mask::r-x-
user:user1:r-x-
user:user2:--x-

Note that the mask must contain at least every permission granted to any user in this ACL, and that access must be granted to every directory in the hierarchy (especially the home directory). The 4th character stands for the GPFS-specific control permission.

When the file is saved, the following has to be answered:

mmeditacl: 6027-967 Should the modified ACL be applied? (yes) or (no)
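
The mask rule above is easy to get wrong when editing by hand, so a quick consistency check can help. The following is a minimal sketch; the function name check_acl_mask is hypothetical. It assumes ACL text in the format shown in the example (one entry per line, fields separated by ":") and checks only named user and group entries against the mask.

```shell
# check_acl_mask
# Hypothetical helper. Reads ACL entries on stdin and verifies that every
# permission granted to a named user or group is also set in the mask.
# Exit status: 0 = consistent, 1 = an entry exceeds the mask, 2 = no mask.
check_acl_mask() {
    awk -F: '
        /^#/ { next }                          # skip comment lines
        $1 == "mask" { mask = $3; next }
        ($1 == "user" || $1 == "group") && $2 != "" {
            n++; name[n] = $1 ":" $2; perm[n] = $3
        }
        END {
            if (mask == "") { print "no mask entry found"; exit 2 }
            bad = 0
            for (i = 1; i <= n; i++) {
                for (k = 1; k <= length(perm[i]); k++) {
                    p = substr(perm[i], k, 1)
                    if (p != "-" && substr(mask, k, 1) != p) {
                        print name[i] " exceeds mask " mask
                        bad = 1
                        break
                    }
                }
            }
            exit bad
        }'
}
```

Usage: mmgetacl dir1 | check_acl_mask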

Which files have an access control list?

The command

ls -l

will show a "+" for every file that has an ACL set, e.g.

drwx------+ 2 user group 32768 Feb 21 09:25 dir1

Delete a GPFS access control list

mmdelacl <filename>

or remove the added lines with mmeditacl.

Apply a GPFS ACL recursively

Example:
To apply the ACL of dir1 to dir1 and all files and directories below it (file names containing blanks are handled correctly), use:

mmgetacl -o dir1.acl dir1
find dir1 -exec mmputacl -i dir1.acl {} \;
rm dir1.acl

Documentation

Please see the man pages or IBM documentation for further commands:

mmdelacl, mmgetacl, mmputacl

