
Storage and File Permission Management

Default Storage Allocations

Each user will automatically be allocated, free of charge:

  • 20GB Home folder on SLOW Network File System (NFS) storage. This is generally used for configuration files and scripts. The amount of storage is fixed and cannot be increased. This folder is not purged and is backed up. The path to this folder is /hpc/users/<username>. For performance-critical work, use GPFS (the General Parallel File System, a massively parallel distributed file system) instead.
  • 100GB Work folder on our GPFS file system for their general use. The amount of storage is fixed and cannot be increased. The path to this folder is /sc/arion/work/<username>. This folder is not purged but is not backed up. It is the user’s responsibility to back up all important data to archival storage or other storage resources.
  • Scratch storage on our GPFS file system. Scratch storage is a 100TB pool shared by all users, intended for temporary production work and not for long-term storage. A per-user quota of 15TB prevents any one user from monopolizing the pool. This quota is an upper limit, NOT a guarantee that 15TB will be available to you; scratch usage frequently reaches its maximum. Files in scratch are not backed up and are automatically deleted once they have gone 14 days without being read or written. The path to this folder is /sc/arion/scratch/<username>.

For HIPAA compliance, the default permission for these files allows read, write and execute by the file's owner only (rwx------). If a user wants to share files or a directory with others, the chmod, setfacl and getfacl commands can be used. Please see the documents or video on basic Linux file permissions for greater detail.
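As a minimal sketch of what sharing with chmod looks like, the snippet below works on a throwaway directory so it can be run anywhere. The name shared_dir is hypothetical, and whether group sharing is appropriate for your data is a separate question from the mechanics shown here.

```shell
# Illustration only: "shared_dir" is a hypothetical name inside a temp directory.
dir=$(mktemp -d)/shared_dir
mkdir "$dir"

chmod 700 "$dir"              # owner-only access (rwx------), as in the default above
stat -c '%A %n' "$dir"

chmod 750 "$dir"              # additionally let the owning group read and enter it
mode=$(stat -c '%a' "$dir")   # octal mode after the change
echo "$mode"
```

For access that should go to one specific user rather than a whole group, setfacl is the finer-grained tool.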

 

Purge and Quota Policy on Scratch Storage

Purge Policy: The scratch file system cannot be used for long-term storage, and files on scratch are not backed up or guaranteed by Minerva. In the event of a file system crash or purge, files in scratch directories cannot be recovered. It is the user's responsibility to back up all important data to archival storage or other storage resources. Files are exempt from purge if they have been written to or read within the last 14 days. To see the list of files that will be purged, you can use:

find  /sc/arion/scratch/$USER -atime +14
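To get a quick summary instead of a long file list, the same find test can be wrapped in a small helper. This is only a sketch: the function name purge_candidates is made up for illustration, and it assumes GNU find (for -printf) and awk are available.

```shell
# Hypothetical helper: count the files find would list with -atime +14
# and total their size. Defaults to the scratch path given above.
purge_candidates() {
    local dir="${1:-/sc/arion/scratch/$USER}"
    find "$dir" -type f -atime +14 -printf '%s\n' |
        awk '{ n++; bytes += $1 }
             END { printf "%d files, %.1f GiB\n", n, bytes / (1024 ^ 3) }'
}

# Usage (run on a login node):
#   purge_candidates                          # your own scratch folder
#   purge_candidates /sc/arion/scratch/$USER/run42   # "run42" is a made-up subfolder
```

Copying anything important from that list to work, project or archival storage is the supported way to keep it.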

Special Note: Modifying file access times (using touch or any other method) for the purpose of circumventing purge policies may result in the loss of access to the scratch file systems. Also note that GPFS maintains a “Born On” date that is unaffected by touch or any standard unix command.

Quota policy: Scratch usage frequently reaches its maximum. This causes unnecessary errors for users who rely on scratch to hold large data temporarily. To ameliorate the issue, a per-user quota of 15TB is implemented in the scratch folders. This policy prevents any one user from consuming all of the scratch space and reduces related job failures.

 

Project Storage

The bulk of our storage is allocated to the various projects underway on Minerva. If laboratory projects require more storage than the default, the PI or a delegate can request that a project be created.

Only the PI or a delegate can authorize changes to a project, e.g., add/remove users, change ownership of files, grant access by researchers not in the project group, etc.

To request a project allocation please fill out the Minerva Project Allocation Form.

The cost for project folders is $100/TB/yr, billed to the PI every 2 months at a rate of $8.33/TB/mo. The storage charge includes the cost of all Minerva services, including computation, and is set yearly by the Mount Sinai Compliance and Finance Departments. To specify the fund number(s) that will be used to pay for the storage, complete the Minerva Storage Information Collection Form.
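As a rough worked example of the rate above, the sketch below estimates one two-month charge. It assumes a bill is simply allocated TB × $8.33 × 2 months; the actual invoicing rules may differ, and the function name is hypothetical.

```shell
# Hypothetical estimate of one two-month storage charge, in dollars.
# Assumption: charge = allocated TB x $8.33 per TB per month x 2 months.
bimonthly_charge() {
    awk -v tb="$1" 'BEGIN { printf "%.2f\n", tb * 8.33 * 2 }'
}

bimonthly_charge 10    # a 10TB project prints 166.60
```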

Project directories are not backed up. The Minerva team recommends that critical information be backed up to archival storage independently; see the TSM page. To request an increase in the size of your project directory, please submit a request to hpchelp@hpc.mssm.edu.

 

Storage and File Permission Management

When a project is created:

  • A unix group is created with the acronym chosen by the PI for the project.
    • Researchers designated by the PI or a delegate are made members of this group which will give them access to the project folder.
  • A folder is allocated with the approved allocation of storage on our GPFS file system.
    • The owner of the folder is the PI if they have a Minerva account. Otherwise, the owner is another member of the project group.
    • The group owner is the project group.
    • Protections of the top level directory are set to rwxrws---. The setgid bit (the "s" in the group permissions, sometimes called the group sticky bit) is set to ensure that files and folders created below the top level inherit the project group as their group owner.
    • This folder is not purged but is not backed up. We recommend that you use the Archive feature of our Tivoli Storage Manager system to save your important data.
    • The allocated amount can be expanded by modifying the requested allocation on the Allocation Request Form. To do this, open the allocation website, click on the “Returning” button in the upper right corner, and enter the return code that was sent to you when the allocation submission was acknowledged.
    • The path to the project folder is /sc/arion/projects/<project_acronym>
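The effect of the rwxrws--- protection can be reproduced on a scratch directory. The sketch below is only an illustration: it uses a temporary directory as a stand-in for a project folder, and your own current group in place of a project group.

```shell
# Stand-in for a project folder; real ones live under /sc/arion/projects.
proj=$(mktemp -d)
chmod 2770 "$proj"        # 770 = rwxrwx---, leading 2 = setgid, giving rwxrws---

mkdir "$proj/sub"
touch "$proj/data.txt"

perms=$(stat -c '%A' "$proj")
echo "$perms"             # drwxrws---

# New entries report the same group owner as the top-level directory.
stat -c '%G %n' "$proj" "$proj/sub" "$proj/data.txt"
```

Because of the setgid bit, group ownership follows the directory rather than the primary group of whoever creates the file.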

By default, project directories are created with the project group as the group owner, so members of that group have read and write permissions. A team member must belong to the project group to access the project directory.

To find out which Unix group owns a project directory (assume your group’s name is projectA):

$ ls -ld /sc/arion/projects/yourGroupDirectory
drwxrwx--- 2 48 projectA 4096 2011-03-05 12:42 /sc/arion/projects/yourGroupDirectory
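To check from the command line whether your account is actually in the group that owns a directory, something like the following can help. It is a sketch; the function name in_owning_group is hypothetical, and it compares numeric group IDs using GNU stat and id.

```shell
# Hypothetical helper: succeeds if the current user belongs to the group
# that owns the given directory. Returns 2 if the directory cannot be read.
in_owning_group() {
    local gid
    gid=$(stat -c '%g' "$1") || return 2
    id -G | tr ' ' '\n' | grep -qx -- "$gid"
}

# Usage:
#   in_owning_group /sc/arion/projects/yourGroupDirectory \
#       && echo "member of the owning group" \
#       || echo "not a member; ask the PI or a delegate to add you"
```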

 

File System Access Control Lists (FACL)

An ACL is a list of permissions that are associated with a directory or file. It defines which users and groups are allowed to access a particular directory or file. This is over and above the standard unix permissions. This can be used to authorize users or groups that are not normally associated with the files to have access to those specific files and no other.

A typical case is giving a specific user access to one sub-folder of a project but to no other folder. To do this, use the setfacl command (see the man pages for setfacl and getfacl). As an example, to give a user, userj01, read and execute access to the files and folders in /sc/arion/projects/myproject/toBeShared:

cd /sc/arion/projects/myproject
setfacl -R -m u:userj01:rX toBeShared

The capital “X” means execute privileges are granted only where they make sense, e.g., on folders.
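chmod understands the same capital X, so its behavior can be seen without touching any ACLs. The sketch below uses a temporary directory tree with made-up names and grants group access rather than access for a named user.

```shell
# Throwaway tree: one sub-directory containing one ordinary file.
top=$(mktemp -d)
mkdir "$top/sub"
touch "$top/sub/notes.txt"

chmod -R go-rwx "$top"    # start from owner-only access everywhere
chmod -R g+rX "$top"      # read for everything, execute only where it makes sense

# The directory gains group r-x; the plain file gains group r-- only.
stat -c '%A %n' "$top/sub" "$top/sub/notes.txt"
```

A lowercase x in the same command would have marked notes.txt as executable as well.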

To complete the grant of access, you need to give the user at least execute privileges on every directory in the path leading to the shared folder. So:

setfacl -m u:userj01:x /sc/arion/projects/myproject

Since the user did not get read privileges, they won’t be able to list the contents of the myproject directory and see what you have stored underneath. The only way to get to the toBeShared folder is to reference it with the full path name. But from there on down they will be able to list the file names and read the files.
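When the shared folder sits several levels below the project root, each intermediate directory needs the same traverse-only entry. The sketch below prints the commands for review rather than running them; the function name traverse_cmds and the intermediate folder "data" are hypothetical.

```shell
# Hypothetical helper: print (do not run) one "setfacl -m u:USER:x DIR" line
# for every directory from the shared folder's parent up to the project root.
traverse_cmds() {
    local user="$1" root="${2%/}" dir
    dir=$(dirname "${3%/}")
    while :; do
        case "$dir" in
            "$root"|"$root"/*) echo "setfacl -m u:$user:x $dir" ;;
            *) break ;;                    # target is not under the project root
        esac
        [ "$dir" = "$root" ] && break      # reached the project root
        dir=$(dirname "$dir")
    done
}

traverse_cmds userj01 /sc/arion/projects/myproject \
    /sc/arion/projects/myproject/data/toBeShared
```

Printing first makes it easy to check exactly which directories would be opened for traversal before applying anything.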

getfacl is the companion command that lists the ACLs on a file. For example:

getfacl /sc/arion/scratch/fludee01

yields:
getfacl: Removing leading '/' from absolute path names

# file: sc/arion/scratch/fludee01
# owner: fludee01
# group: hpcstaff
user::rwx
user:pintod02:--x
user:linx19:r-x
group::r-x
mask::r-x
other::---

 

In this example, user pintod02 can only “pass through” the folder but not read it. User linx19 can both read and “pass through” the folder.
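For folders with many entries, the named users can be pulled out of the getfacl listing with a short filter. This is a sketch: acl_users is a hypothetical name, and it assumes the plain "user:name:perms" line format shown above, with no trailing "#effective" annotations.

```shell
# Hypothetical filter: read getfacl output on stdin and print each named
# user with their permissions. The owner line (user::...) is skipped.
acl_users() {
    awk -F: '$1 == "user" && $2 != "" { print $2, $3 }'
}

# Demonstration on the listing shown above; on the cluster you would run
#   getfacl /sc/arion/scratch/fludee01 | acl_users
out=$(acl_users <<'EOF'
# file: sc/arion/scratch/fludee01
# owner: fludee01
# group: hpcstaff
user::rwx
user:pintod02:--x
user:linx19:r-x
group::r-x
mask::r-x
other::---
EOF
)
echo "$out"
```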

To remove access, use the -x option:

setfacl -x u:pintod02,u:linx19 /sc/arion/scratch/fludee01

getfacl /sc/arion/scratch/fludee01
# file: sc/arion/scratch/fludee01
# owner: fludee01
# group: hpcstaff
user::rwx
group::r-x
mask::r-x
other::---

How to Check your GPFS Quota

The work, scratch and project storage all use GPFS, the General Parallel File System, a massively parallel distributed file system. To check your GPFS quotas, run the showquota script, which reports either per user or per project:

$ showquota -h
usage: showquota [-h] [-u USER | -p PROJECT]

optional arguments:
  -h, --help            Show this help message and exit
  -u USER, --user USER  Show quota for user in groups
  -p PROJECT, --project PROJECT
                        Show quota for a project

Please note that these figures come from GPFS’s quota report, which is refreshed only every 15 minutes, so they can lag behind files that were just created or deleted. For a small set of files the report may even show negative numbers, which you should ignore.

Summary of the four types of folders you can have:

Home

/hpc/users/<userid>

$ quota -s

  • 20GB quota
  • Slow. Use for configuration files and executables, but NOT for data
  • NOT purged and is backed up

Work

/sc/arion/work/<userid>

$ df -h /sc/arion/work/<userid>

  • 100GB quota
  • Fast, keep your personal data here
  • NOT purged but is NOT backed up

Scratch

/sc/arion/scratch/<userid>

$ df -h /sc/arion/scratch

  • Free for all, shared by all; for temporary data
  • Current size is about 100TB
  • Files not accessed for 14 days are purged; the per-user limit is 15TB

Project

/sc/arion/projects/<projectid>

$ df -h /sc/arion/projects/<projectid>

  • Size is set by the approved project allocation
  • NOT purged but is NOT backed up