File Systems and File Management
A*CRC systems provide several file systems for users’ data. Each file system has a different purpose and a different management policy, so users need to know which one to use for which kind of data. The principal user file systems are $HOME, $DATADIR, $FLUSHDIR, $TMPDIR and $LOCALDIR.
User home directory ($HOME) and $DATADIR areas
The $HOME directories should be used for data that is used often. Typically, this includes files that define the user’s environment, scripts, source code, frequently used executables, and small data sets.
New systems (axle, fuji, aurora, cirrus)
The traditional file structure wastes disk space because the same files are often duplicated on several systems. The newer A*CRC systems (axle, fuji and beyond) therefore use a file structure with a fixed 200 MB quota for the home directory, which is sufficient for system and application configuration files plus a small number of commonly used applications. All jobs must be run from the temporary scratch disk, as described in the next section.
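As an illustration of working within the 200 MB home quota, the following is a minimal Python sketch that totals the size of the files under a directory and compares it with the quota. The function names (directory_size_bytes, quota_report) are hypothetical, and the sketch assumes that 1 MB is counted as 1024 * 1024 bytes and that the quota is charged on apparent file size; the accounting actually used on the A*CRC systems may differ.

```python
import os

# Assumption: the 200 MB home quota is counted in units of 1024 * 1024 bytes.
HOME_QUOTA_BYTES = 200 * 1024 * 1024


def directory_size_bytes(root):
    """Sum the apparent sizes of regular files under root.

    Symbolic links are skipped, and os.walk does not descend into
    symlinked directories by default.
    """
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if not os.path.islink(path):
                    total += os.path.getsize(path)
            except OSError:
                # The file vanished or is unreadable; leave it out of the total.
                continue
    return total


def quota_report(root, quota_bytes=HOME_QUOTA_BYTES):
    """Return usage under root relative to quota_bytes (hypothetical helper)."""
    used = directory_size_bytes(root)
    return {
        "used_bytes": used,
        "quota_bytes": quota_bytes,
        "fraction_used": used / quota_bytes,
        "over_quota": used > quota_bytes,
    }
```

Calling quota_report(os.path.expanduser("~")) on a login node gives a rough estimate only; any quota tool provided on the system remains the authoritative source.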
There is also a global data space ($HOME/data or $DATA) shared across the login nodes of all the new systems. (NOTE: this space is visible only from the head nodes and not from the compute nodes, so users must NOT attempt to run jobs from there.)
Users may use this space to store their permanent files. The key advantages of this new structure are:
- files are not duplicated on every system, thereby conserving disk space
- users need not explicitly transfer files via scp or sftp between systems
- with only one set of files, data consistency is maintained across all systems
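Because the global data space is visible only from the head nodes, a script that may also run on a compute node can check for the directory before using it. The following is a minimal Python sketch; resolve_data_dir is a hypothetical helper, and it assumes that $DATA is set as an environment variable and that the directory simply does not exist where the space is not visible. Neither assumption is confirmed here.

```python
import os


def resolve_data_dir(environ=None):
    """Return the shared data directory if it is visible on this node, else None.

    Tries $DATA first, then $HOME/data, the two names given for the space.
    """
    env = os.environ if environ is None else environ
    candidates = []
    if env.get("DATA"):
        candidates.append(env["DATA"])
    if env.get("HOME"):
        candidates.append(os.path.join(env["HOME"], "data"))
    for path in candidates:
        # Assumption: on nodes without the shared space, the path is absent.
        if os.path.isdir(path):
            return path
    return None
```

A result of None suggests the script is running where the data space is not available, in which case any files it needs should already have been copied to the scratch disk.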
The new file structure is explained in the figure below. We are monitoring usage patterns before implementing any quota on $DATA.