File Systems and File Management

A*CRC systems provide multiple file systems for users’ data.  The file systems serve different purposes and have different management policies, which users need to be aware of.  The principal user file systems are $HOME, $DATADIR, $FLUSHDIR, $TMPDIR and $LOCALDIR.

User home directory ($HOME) and $DATADIR areas

The $HOME directories should be used for data that is used often.  Typically, this includes files that define the user’s environment, scripts, source code, frequently used executables, and small data sets.

New systems (axle, fuji, aurora, cirrus)

The traditional file structure often wastes disk space because the same files are duplicated on several systems. The newer A*CRC systems (axle, fuji and beyond) therefore have a file structure with a fixed 200 MB quota for the home directory, which is sufficient for any system or application configuration files plus a small number of commonly used applications. All jobs must be run from the temporary scratch disk, as described in the next section.
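As a rough way to see how close a home directory is to the 200 MB quota, the sketch below sums file sizes under a directory. It is only an approximation: the function names are illustrative, "200 MB" is assumed to mean decimal megabytes, and the real quota system may count disk blocks rather than file sizes.

```python
import os

# Assumption: the 200 MB quota is read as decimal megabytes.
HOME_QUOTA_BYTES = 200 * 1000 * 1000


def directory_usage_bytes(root):
    """Sum the sizes of all regular files below root, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if not os.path.islink(path):
                    total += os.path.getsize(path)
            except OSError:
                # The file vanished or is unreadable; skip it.
                continue
    return total


def over_quota(root, quota_bytes=HOME_QUOTA_BYTES):
    """Return True if the files under root add up to more than quota_bytes."""
    return directory_usage_bytes(root) > quota_bytes
```

For example, over_quota(os.path.expanduser("~")) reports whether the current user's home directory exceeds the assumed limit.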

There is also a global data space ($HOME/data or $DATA) that is shared across the login nodes of all the new systems. (NOTE: this space is visible only from the head nodes and not from the compute nodes, so users are advised NOT to run any jobs from there.) Users may use this space to store their permanent files. The key advantages of this new structure are:
  • files are not duplicated on every system, thereby conserving disk space
  • users need not explicitly transfer files via scp or sftp between systems
  • with only one set of files, data consistency is maintained across all systems
The new file structure is explained in the figure below. We are monitoring usage patterns before implementing any quota on $DATA.
[Figure: file_struct]

Working directory $SCRATCH and $TMP

Users who need a temporary work space for their jobs may make use of either the system-wide $SCRATCH on each system or the $TMP directory which is found on every node.  The differences between $SCRATCH and $TMP are summarised below.

               $SCRATCH                                                              $TMP
Visibility     system-wide                                                           per node
Size           larger – in the order of TBs shared across the whole system           smaller – in the order of 10s of GBs per device
Performance    lower – in the order of 10s of GB/s shared across the whole system    higher – about 100 GB/s per device
Reliability    RAID5                                                                 single disk

IMPORTANT
Users who intend to make use of $TMP should note that they need to explicitly transfer their files to either their home directory or $SCRATCH at the end of each job run. While no quota is currently imposed on $SCRATCH and $TMP, files residing in these areas will not be backed up and may be automatically removed if they have not been accessed for a certain period of time. (Please see “Flushing Policy” below.)
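The stage-out step at the end of a job could look something like the sketch below. It assumes that $TMP and $SCRATCH are exported as environment variables inside the job; the function name stage_out and the destination directory name are hypothetical.

```python
import os
import shutil


def stage_out(tmp_dir, dest_dir):
    """Copy everything under tmp_dir into dest_dir; return the destination file paths."""
    # dirs_exist_ok=True (Python 3.8+) merges into dest_dir if it already exists.
    shutil.copytree(tmp_dir, dest_dir, dirs_exist_ok=True)
    staged = []
    for dirpath, _dirnames, filenames in os.walk(tmp_dir):
        for name in filenames:
            # Map each source file to the path it was copied to.
            rel = os.path.relpath(os.path.join(dirpath, name), tmp_dir)
            staged.append(os.path.join(dest_dir, rel))
    return sorted(staged)
```

A job script might call stage_out(os.environ["TMP"], os.path.join(os.environ["SCRATCH"], "job_output")) as its last step, where "job_output" is a placeholder name. Tools such as rsync or tar achieve the same result from the shell.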

Useful commands for mass file transfers include cp, mc, mv, rsync, scp, sftp, and tar. Please read the respective man pages or browse the web for documentation on their usage. Please note also that this list is by no means exhaustive; users may use any file transfer mechanism that suits them best.

Flushing Policy

Flushing is implemented on $SCRATCH based on necessity, but with a minimum lifetime of 60 days (current setting). Files that are newer than the minimum lifetime will never be (automatically) flushed.

A*CRC has instituted automated flushing on $SCRATCH on all our systems. When the space usage on $SCRATCH reaches a certain threshold (currently set at 85%) users will be sent a daily e-mail announcing the impending flush. (The e-mails will cease once the usage on $SCRATCH falls below 85%.) When the space reaches a second threshold (currently set at 90%) users will be sent a final reminder that flushing will soon be initiated. If the space usage is still 90% or more 24 hours after the final reminder has been sent then flushing will commence.
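The notification sequence described above can be summarised as a small decision function. This is only an illustration of the stated thresholds, not the actual flushing software; the function name, the stage labels, and the treatment of the exact boundary values (85% and 90% counted as "reached") are assumptions.

```python
WARN_THRESHOLD = 85.0   # percent usage at which daily e-mails start (current setting)
FLUSH_THRESHOLD = 90.0  # percent usage at which the final reminder is sent
GRACE_HOURS = 24.0      # wait after the final reminder before flushing starts


def flush_stage(usage_percent, hours_since_final_reminder=None):
    """Return the stage implied by $SCRATCH usage: none, daily_warning, final_reminder or flush."""
    if usage_percent < WARN_THRESHOLD:
        return "none"
    if usage_percent < FLUSH_THRESHOLD:
        return "daily_warning"
    # Usage is at or above the second threshold.
    if hours_since_final_reminder is not None and hours_since_final_reminder >= GRACE_HOURS:
        return "flush"
    return "final_reminder"
```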

During flushing the system will sort files and empty directories from the oldest to youngest, and remove these starting from the oldest until sufficient space is available, or until it runs out of old files to flush. The automated flush process stops flushing at a set age (currently 60 days) so that files newer than this age are not automatically flushed. The age of the files is based on access time, not creation or modification times. A log of the files removed can be found in $FLUSH/flush.YYMMDD where YYMMDD is the date when the flush was performed.
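To preview which files would be first in line, a user could list files by access time in the same order. The sketch below is a dry run under stated assumptions: it deletes nothing, it ignores empty directories, the function name and parameters are illustrative, and the real flush process may differ in its details.

```python
import os
import time

MIN_LIFETIME_DAYS = 60  # current minimum lifetime stated in the policy


def flush_candidates(root, bytes_to_free, min_age_days=MIN_LIFETIME_DAYS, now=None):
    """Return the files a flush would pick, oldest access time first (nothing is deleted)."""
    now = time.time() if now is None else now
    cutoff = now - min_age_days * 86400
    old_files = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue
            # Age is judged by access time, not creation or modification time.
            if st.st_atime <= cutoff:
                old_files.append((st.st_atime, st.st_size, path))
    old_files.sort()  # oldest access time first
    selected, freed = [], 0
    for _atime, size, path in old_files:
        if freed >= bytes_to_free:
            break  # enough space would have been reclaimed
        selected.append(path)
        freed += size
    return selected
```

Passing a very large bytes_to_free lists every file older than the minimum lifetime, which is the most that an automated flush could remove.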

Backup

All data on $HOME and $DATA are backed up daily. The retention period is 30 days. Data on $SCRATCH and $TMP are not backed up, so users should not store any permanent data there. More information on backup facilities can be found on the Storage page.

Archive

The storage space on A*CRC’s computing systems is meant for users’ current data. Users who need to keep their data for long-term retention should make separate arrangements for data archiving. More information on archiving can be found on the Storage page.

File transfer

File transfer methods, including WinSCP and SFTP, are listed on the Access Information page.