Data Storage and Management System

The Data Storage and Management System (DSMS) resides on the Hybrid Parallel File System (HPFS).

"Home" directories are on /global/u

/global/u is a partition in a parallel, high-performance Linux file system based on the HPE parallel file system (HPFS). It holds the home directories of all individual users. When users request and are granted an allocation of HPC resources, they are assigned a <userid> and a 100 GB allocation of disk space for a home directory at /global/u/<userid>. These home directories are on a global file system mounted only on login nodes. There is no local storage on the nodes, so data can be accessed only from HPFS and the login node(s). Codes that write intermediate results to disk are typically slow and should be run only on condo nodes that have local disk storage. All home directories are backed up on a weekly basis.
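As a rough way to see how close a home directory is to the 100 GB allocation mentioned above, the sketch below sums disk usage with du. The function name home_usage_report and its arguments are illustrative assumptions, not an HPCC-provided tool, and the file system's own accounting remains authoritative.

```shell
# Hypothetical helper (not an HPCC tool): report usage of a directory
# against a limit given in GB.
home_usage_report() {
  dir=${1:-/global/u/$USER}   # directory to measure
  limit_gb=${2:-100}          # allocation size taken from the text above
  # du -sk prints total usage in 1 KiB blocks followed by the path
  used_kb=$(du -sk "$dir" | awk '{print $1}')
  [ -n "$used_kb" ] || return 1
  used_gb=$((used_kb / 1024 / 1024))
  echo "$dir: ${used_gb} GB used of ${limit_gb} GB"
  # Exit status is non-zero once the limit is reached or exceeded
  [ "$used_gb" -lt "$limit_gb" ]
}
```

Calling home_usage_report with no arguments measures /global/u/$USER. On a large directory tree du may take a while to finish.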

/scratch

/scratch is a fast file system from which all jobs start. There is no quota on /scratch, but scratch files are temporary and are not backed up, and a single user cannot fill the whole space with his/her data. Users can therefore work with data sets that exceed their home space, but they cannot use /scratch for storage. /scratch must be used only for submitting jobs, and output from jobs may be stored on /scratch ONLY temporarily (up to 10 days).

Consequently, to submit a job for execution, a user must stage or mount the files required by the job to /scratch, either from /global/u using UNIX commands or from SR1 using iRODS commands. The latter mounts data directly from project space. Note that SR1 is a slower file system, so large files are better staged from home directories.

Files in /scratch are automatically relocated to a local archive when their inactive residence on /scratch exceeds 90 days. The local archive has limited capacity and serves as a data buffer, so strict cleanup policies for this temporary archive are in place. Upon relocation the user receives a warning (via e-mail) and must move the files either to his/her home directory or to SR1. Files left in the temporary archive are purged after 30 days. Users must not store valuable data or compiled codes on /scratch, since these are static types of files.

“Project” directories

“Project” directories are managed through iRODS and accessible through iRODS commands, not standard UNIX commands. In iRODS terminology, a “collection” is the equivalent of “directory”.

A “Project” is an activity that usually involves multiple users and/or many individual data files. A “Project” is normally led by a “Principal Investigator” (PI), who is a faculty member or a research scientist. The PI is the individual responsible to the University or a granting agency for the “Project”. The PI has overall responsibility for “Project” data and “Project” data management. To establish a Project, the PI completes and submits the online “Project Application Form”. Project data are stored in project space on the main file system. Valuable parts of a project must be curated by the PI and stored in the HPCC long-term archive.

Typical Workflow

Typical workflows for Penzias, Appel, and Karle are described below:

1. Copy files from a user’s home directory or from SR1 to SCRATCH.
If working with HOME:

  cd /scratch/<userid>
  mkdir mySLURM_Job && cd mySLURM_Job
  cp /global/u/<userid>/myProject/a.out ./
  cp /global/u/<userid>/myProject/mydatafile ./

If working with SR1 (storage repository):

  cd /scratch/<userid>
  mkdir mySLURM_Job && cd mySLURM_Job
  iget myProject/a.out 
  iget myProject/mydatafile

2. Prepare a SLURM job script. A typical SLURM script is similar to the following:

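  #!/bin/bash
  # Resource requests: production partition, job name "test",
  # 1 node, 8 tasks, 4000 MB of memory
  #SBATCH --partition production
  #SBATCH -J test
  #SBATCH --nodes 1
  #SBATCH --ntasks 8
  #SBATCH --mem 4000
  echo "Starting…"

  # Run from the directory the job was submitted from (on /scratch)
  cd "$SLURM_SUBMIT_DIR"
  # Launch 4 MPI processes; program output is written to myoutputs
  mpirun -np 4 ./a.out ./mydatafile > myoutputs
  echo "Done…"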

Your SLURM script may differ depending on your needs. See the section Submitting Jobs for reference.

3. Run the job

  sbatch ./mySLURM_script
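When the job ID is needed later (for example to check the queue), the submission can be wrapped as in the sketch below. The function name submit_job is a hypothetical helper; it relies only on the standard sbatch --parsable option, which prints the job ID alone or as "jobid;cluster".

```shell
# Hypothetical helper: submit a job script and print only its SLURM job ID.
submit_job() {
  script=$1
  # --parsable makes sbatch print "jobid" or "jobid;cluster"
  out=$(sbatch --parsable "$script") || return 1
  # Drop the optional ";cluster" suffix
  echo "${out%%;*}"
}
```

A possible use is jobid=$(submit_job ./mySLURM_script) followed by squeue -j "$jobid" to see whether the job is still pending or running.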

4. Once the job is finished, clean up SCRATCH and store outputs in your home directory or in SR1.

If working with HOME:

  mv ./myoutputs /global/u/<userid>/myProject/.
  cd ../
  rm -rf mySLURM_Job
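When working with SR1 instead, the same cleanup step can be sketched with iput. The function below is a hypothetical helper, not part of iRODS or the HPCC environment; it assumes iinit has already been run and that the target collection (for example myProject) exists.

```shell
# Hypothetical helper: upload output files to an iRODS collection, then
# remove the job directory from scratch only if every upload succeeded.
store_outputs_sr1() {
  job_dir=$1      # e.g. /scratch/<userid>/mySLURM_Job
  collection=$2   # e.g. myProject
  shift 2 || return 1
  [ -d "$job_dir" ] || return 1
  for f in "$@"; do
    # -f overwrites an existing data object with the same name
    iput -f "$job_dir/$f" "$collection/$f" || return 1
  done
  rm -rf "$job_dir"
}
```

For the example job this would be called as store_outputs_sr1 /scratch/<userid>/mySLURM_Job myProject myoutputs. Because the directory is removed only after all uploads succeed, a failed transfer leaves the outputs on scratch.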

iRODS (The iRODS Section is in REVIEW and may not be CURRENT)

iRODS is the integrated Rule-Oriented Data-management System, a community-driven, open source, data grid software solution. iRODS is designed to abstract data services from data storage hardware and provide users with a hardware-agnostic way to manipulate data.

iRODS is a primary tool used by CUNY HPCC users to seamlessly access the 1 PB storage resource (referred to here as SR1) from any of the HPCC's computational systems.

Access to SR1 is provided via so-called i-commands:

iinit
ils
imv

A comprehensive list of i-commands with detailed descriptions is available on the iRODS wiki.

To obtain quick help on any of the commands while logged into any of the HPCC's machines, type the i-command followed by -h. For example:

ils -h

Following is the list of some of the most relevant i-commands:

iinit -- Initialize session and store your password in a scrambled form for automatic use by other icommands.

iput -- Store a file

iget -- Get a file

imkdir -- Like mkdir, make an iRODS collection (similar to a directory or Windows folder)

ichmod -- Like chmod, allow (or later restrict) access to your data objects by other users.

icp -- Like cp or rcp, copy an iRODS data object

irm -- Like rm, remove an iRODS data object

ils -- Like ls, list iRODS data objects (files) and collections (directories)

ipwd -- Like pwd, print the iRODS current working directory

icd -- Like cd, change the iRODS current working directory

ichksum -- Checksum one or more data objects or collections in iRODS space.

imv -- Move/rename an iRODS data object or collection.

irmtrash -- Remove one or more data objects or collections from the iRODS trash bin.

imeta -- Add, remove, list, or query user-defined Attribute-Value-Unit triplets metadata

iquest -- Query (pose a question to) the ICAT, via a SQL-like interface

Before using any of the i-commands, users need to identify themselves to the iRODS server by running the command

# iinit

and providing their HPCC password.

A typical workflow that involves files stored in SR1 includes storing data to and getting data from SR1, tagging data with metadata, searching for data, and sharing data (setting permissions).

Storing data to SR1

1. Create an iRODS directory (aka 'collection'):

  # imkdir myProject

2. Store all files 'myfile*' into this directory (collection):

  # iput myfile* myProject/.

3. Verify that files are stored:

  # ils
  /cunyZone/home/<userid>:
  C- /cunyZone/home/<userid>/myProject
  # ils myProject
  /cunyZone/home/<userid>/myProject:
     myfile1
     myfile2
     myfile3

The symbol 'C-' at the beginning of a line of 'ils' output shows that the listed item is a collection.

4. By combining 'ils', 'imkdir', 'iput', 'icp', 'ipwd', and 'imv', a user can create iRODS directories and store files in them much as is normally done with the UNIX commands 'ls', 'mkdir', 'cp', 'pwd', 'mv', etc.

Getting data from SR1

1. To copy a file from SR1 to the current working directory, run

  # iget myProject/myfile1

2. Listing the current working directory should now show myfile1:

  # ls
  myfile1

3. Instead of individual files, a whole directory (with its sub-directories) can be copied with the '-r' flag (which stands for 'recursive'):

  # iget -r myProject

NOTE: Wildcards are not supported, so the command below will not work:

  # iget myProject/myfile*
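One possible workaround is to list the collection with ils and fetch each matching name in a loop. The sketch below assumes the ils output format shown earlier (a header line ending in ':', data objects indented, sub-collections prefixed with 'C-'). The function name iget_matching is a hypothetical helper, not an i-command.

```shell
# Hypothetical helper: fetch every data object in a collection whose
# name matches a shell pattern, e.g. iget_matching myProject 'myfile*'
iget_matching() {
  collection=$1
  pattern=$2
  # Drop the header line and strip the leading indentation
  ils "$collection" | sed -e '1d' -e 's/^[[:space:]]*//' |
  while IFS= read -r name; do
    case $name in
      "C- "*) continue ;;                     # skip sub-collections
      $pattern) iget "$collection/$name" ;;   # unquoted: pattern match
    esac
  done
}
```

The pattern must be quoted on the command line so that the local shell does not expand it against files in the current directory.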

Tagging data with metadata

iRODS provides users with an extremely powerful mechanism for managing data with metadata. While working with large datasets, it is easy to forget what is stored in a particular file. Metadata tags help organize data in an easy and reliable manner.

Let's tag files from previous example with some metadata:

# imeta add -d myProject/myfile1 zvalue 15 meters
AVU added to 1 data-objects
# imeta add -d myProject/myfile1 colorLabel RED
AVU added to 1 data-objects
# imeta add -d myProject/myfile1 comment "This is file number 1"
AVU added to 1 data-objects
# imeta add -d myProject/myfile2 zvalue 10 meters
AVU added to 1 data-objects
# imeta add -d myProject/myfile2 colorLabel RED
AVU added to 1 data-objects
# imeta add -d myProject/myfile2 comment "This is file number 2"
AVU added to 1 data-objects
# imeta add -d myProject/myfile3 zvalue 15 meters
AVU added to 1 data-objects
# imeta add -d myProject/myfile3 colorLabel BLUE
AVU added to 1 data-objects
# imeta add -d myProject/myfile3 comment "This is file number 3"
AVU added to 1 data-objects

Here we've tagged myfile1 with 3 metadata labels:

- zvalue 15 meters

- colorLabel RED

- comment "This is file number 1"

Similar tags were added to 'myfile2' and 'myfile3'.

Metadata come in the form of AVUs: Attribute|Value|Unit. As the examples above show, the Unit is optional.

Let's list all metadata assigned to file 'myfile1':
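When the same AVU has to be attached to many files, the repeated imeta calls can be wrapped in a loop. The sketch below uses only the 'imeta add -d' form shown above. The function name tag_files and its argument order are illustrative assumptions.

```shell
# Hypothetical helper: attach one Attribute/Value/Unit to several data objects.
# Usage: tag_files ATTRIBUTE VALUE UNIT OBJECT...   (pass "" to omit the unit)
tag_files() {
  attr=$1
  value=$2
  unit=$3
  shift 3 || return 1
  for obj in "$@"; do
    if [ -n "$unit" ]; then
      imeta add -d "$obj" "$attr" "$value" "$unit" || return 1
    else
      imeta add -d "$obj" "$attr" "$value" || return 1
    fi
  done
}
```

For instance, tag_files zvalue 15 meters myProject/myfile1 myProject/myfile3 reproduces two of the commands above, and tag_files colorLabel RED "" myProject/myfile1 myProject/myfile2 adds a tag without a unit.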

# imeta ls -d myProject/myfile1
AVUs defined for dataObj myProject/myfile1:
attribute: zvalue
value: 15
units: meters
----
attribute: colorLabel
value: RED
units:
----
attribute: comment
value: This is file number 1
units:

To remove an AVU assigned to a file run:

# imeta rm -d myProject/myfile1 zvalue 15 meters
# imeta ls -d myProject/myfile1
AVUs defined for dataObj myProject/myfile1:
attribute: colorLabel
value: RED
units:
----
attribute: comment
value: This is file number 1
units:
#
#
# imeta add -d myProject/myfile1 zvalue 15 meters

Metadata may be assigned to directories as well:

# imeta add -C myProject simulationsPool 1
# imeta ls -C myProject
AVUs defined for collection myProject:
attribute: simulationsPool
value: 1
units:

Note the '-C' flag that is used instead of '-d'.

Searching for data

The power of metadata becomes obvious when data needs to be found in large collections. Here is an illustration of how easily this is done in iRODS with imeta queries:

# imeta qu -d zvalue = 15
collection: /cunyZone/home/<userid>/myProject
dataObj: myfile1
----
collection: /cunyZone/home/<userid>/myProject
dataObj: myfile3

We see both files that were tagged with the label 'zvalue 15 meters'. Here is a different query:

# imeta qu -d colorLabel = RED
collection: /cunyZone/home/<userid>/myProject
dataObj: myfile1
----
collection: /cunyZone/home/<userid>/myProject
dataObj: myfile2
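The query output lists the collection and the data object on separate lines. To pass the results on to other commands, it can help to join them into full paths. The awk filter below is a minimal sketch that assumes the exact 'collection: ' and 'dataObj: ' prefixes shown above. The name imeta_paths is hypothetical.

```shell
# Hypothetical filter: turn "collection:/dataObj:" pairs from 'imeta qu -d'
# into one full iRODS path per line.
imeta_paths() {
  awk '
    /^collection: / { coll = substr($0, 13) }             # text after "collection: "
    /^dataObj: /    { print coll "/" substr($0, 10) }     # text after "dataObj: "
  '
}
```

A possible use is imeta qu -d colorLabel = RED | imeta_paths, which for the example above would print the paths of myfile1 and myfile2.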

Another powerful mechanism for querying data is provided by 'iquest'. The following examples show some of its capabilities:

iquest "SELECT DATA_NAME, DATA_SIZE WHERE DATA_RESC_NAME like 'cuny%'"
iquest "For %-12.12s size is %s" "SELECT DATA_NAME ,  DATA_SIZE  WHERE COLL_NAME = '/cunyZone/home/<userid>'"
iquest "SELECT COLL_NAME WHERE COLL_NAME like '/cunyZone/home/%' AND USER_NAME like '<userid>'"
iquest "User %-6.6s has %-5.5s access to file %s" "SELECT USER_NAME,  DATA_ACCESS_NAME, DATA_NAME WHERE COLL_NAME = '/cunyZone/home/<userid>'"
iquest " %-5.5s access has been given to user %-6.6s for the file %s" "SELECT DATA_ACCESS_NAME, USER_NAME, DATA_NAME WHERE COLL_NAME = '/cunyZone/home/<userid>'"
iquest no-distinct "select META_DATA_ATTR_NAME"
iquest  "select COLL_NAME, DATA_NAME WHERE DATA_NAME like 'myfile%'"
iquest "User %-9.9s uses %14.14s bytes in %8.8s files in '%s'" "SELECT USER_NAME, sum(DATA_SIZE),count(DATA_NAME),RESC_NAME"
iquest "select sum(DATA_SIZE) where COLL_NAME = '/cunyZone/home/<userid>'"
iquest "select sum(DATA_SIZE) where COLL_NAME like '/cunyZone/home/<userid>%'"
iquest "select sum(DATA_SIZE), RESC_NAME where COLL_NAME like '/cunyZone/home/<userid>%'"
iquest "select order_desc(DATA_ID) where COLL_NAME like '/cunyZone/home/<userid>%'"
iquest "select count(DATA_ID) where COLL_NAME like '/cunyZone/home/<userid>%'"
iquest "select RESC_NAME where RESC_CLASS_NAME IN ('bundle','archive')"
iquest "select DATA_NAME,DATA_SIZE where DATA_SIZE BETWEEN '100000' '100200'"

Sharing data

Access to the data can be controlled via the 'ichmod' command. Its behavior is similar to that of the UNIX 'chmod' command. For example, to give user <userid1> read access to the file myProject/myfile1, execute the following command:

  ichmod read <userid1> myProject/myfile1

To see who has access to a file/directory use:

  # ils -A myProject/myfile1
  /cunyZone/home/<userid>/myProject/myfile1
  ACL - <userid1>#cunyZone:read object   <userid>#cunyZone:own

In the above example, user <userid1> has read access to the file and user <userid> is the owner of the file.

Possible levels of access to a data object are null/read/write/own.
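Since 'null' is one of the access levels, the same command can also revoke access that was granted earlier. The wrapper below is a small sketch that checks the level against the four values listed above before calling ichmod. The function name set_access is a hypothetical helper.

```shell
# Hypothetical helper: set an access level for a user on a data object,
# rejecting anything other than the four levels listed above.
set_access() {
  level=$1
  user=$2
  target=$3
  case $level in
    null|read|write|own) ichmod "$level" "$user" "$target" ;;
    *) echo "unknown access level: $level" >&2; return 2 ;;
  esac
}
```

For example, set_access null <userid1> myProject/myfile1 would withdraw the read access granted in the example above.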

Backups (IN REVIEW)

/global/u user directories and Project files are backed up automatically to a remote tape silo system over a fiber optic network. Backups are performed daily.

If the user deletes a file from /global/u, it will remain on the tape silo system for 30 days, after which it will be deleted and cannot be recovered. If a user finds it necessary to recover a file within the 30-day window, the user must promptly submit a request to hpchelp@csi.cuny.edu.

Less frequently accessed files are automatically transferred to the HPC Center robotic tape system, freeing up space in the disk storage pool and making it available for more actively used files. The selection criteria for the migration are age and size of a file. If a file is not accessed for 90 days, it may be moved to a tape in the tape library – in fact to two tapes, for backup. This is fully transparent to the user. When a file is needed, the system will copy the file back to the appropriate disk directory. No user action is required.

Data retention and account expiration policy (IN REVIEW)

Project directories on SR1 are retained as long as the project is active. The HPC Center will coordinate with the Principal Investigator of the project before deleting a project directory. If the PI is no longer with CUNY, the HPC Center will coordinate with the PI’s departmental chair or Research Dean, whichever is appropriate.

For user accounts, current user directories under /global/u are retained as long as the account is active. If a user account is inactive for one year, the HPC Center will attempt to contact the user and request that the data be removed from the system. If there is no response from the user within three months of the initial notice, or if the user cannot be reached, the user directory will be purged.

DSMS Technical Summary (IN REVIEW)


File Space: Scratch, /scratch/<userid> on PENZIAS, ANDY, SALK, BOB
Purpose: High Performance Parallel scratch filesystems. Work area for jobs, datasets, restart files, files to be pre-/post processed. Temporary space for data that will be removed within a short amount of time.
Accessibility: Not globally accessible. Separate /scratch/<userid> exists on each system. Visible on login and compute nodes of each system and on the data transfer nodes.
Quota: None
Backups: None
Purges: Files older than 2 weeks are automatically deleted OR when scratch filesystem reaches 70% utilization

File Space: Home, /global/u/<userid>
Purpose: User home filespace. Essential data should be stored here, such as user's source code, documents, and data structures.
Accessibility: Globally accessible on the login and on the data transfer nodes through native GPFS or NFS mounts
Quota: Nominally 50 GB
Backups: Yes, backed up nightly to tape. If the active copy is deleted, the most recent backup is stored for 30 days.
Purges: Not purged

File Space: Project, /SR1/<PID>
Purpose: Project space allocations
Accessibility: Accessible on the login and on the data transfer nodes. Accessible outside CUNY HPC Center through iRODS.
Quota: Allocated according to project needs
Backups: Yes, backed up nightly to tape. If the active copy is deleted, the most recent backup is stored for 30 days and retrievable on request, but the iRODS metadata may be lost.
Purges: Not purged

• SR1 is tuned for high bandwidth, redundancy, and resilience. It is not optimal for handling large quantities of small files. If you need to archive more than a thousand files on SR1, please create a single archive using tar.

• A separate /scratch/<userid> exists on each system. On PENZIAS, SALK, KARLE, and ANDY, this is a Lustre parallel file system; on HERBERT it is NFS. These /scratch directories are visible only on the login and compute nodes of the system and on the data transfer nodes, and are not shared across HPC systems.

• /scratch/<userid> is used as a high performance parallel scratch filesystem. For example, temporary files (e.g. restart files) should be stored here.

• There are no quotas on /scratch/<userid>; however, any files older than 2 weeks are automatically deleted. Also, a cleanup script is scheduled to run every two weeks or whenever the /scratch disk space utilization exceeds 70%. Dot-files are generally left intact by these cleanup jobs.

• /scratch space is available to all users. If the scratch space is exhausted, jobs will not be able to run. Purge any files in /scratch/<userid> that are no longer needed, even before the automatic deletion kicks in.

• The /scratch/<userid> directory may be empty when you log in. You will need to copy any files required for submitting your jobs (submission scripts, data sets) from /global/u or from SR1. Once your jobs complete, copy any files you need to keep back to /global/u or SR1 and remove all files from /scratch.

• Do not use /tmp for storing temporary files. The file system where /tmp resides is in memory and is very small and slow. Files will be regularly deleted by automatic procedures.

• /scratch/<userid> is not backed up and there is no provision for retaining data stored in these directories.
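To see which files are approaching the two-week limit described in the bullets above, a find command can list them before the cleanup removes them. This is a sketch only: the function name is hypothetical, and it assumes the cleanup is based on modification time, which the text does not state explicitly.

```shell
# Hypothetical helper: list regular files under a scratch directory whose
# modification time is more than DAYS days old (default 14).
list_old_scratch_files() {
  dir=${1:-/scratch/$USER}
  days=${2:-14}
  # -mtime +N matches files whose age in whole days is greater than N
  find "$dir" -type f -mtime +"$days" -print
}
```

Running list_old_scratch_files /scratch/<userid> 10 gives a few days of warning for anything that still needs to be copied back to /global/u or SR1.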

Data Handling Practices

HPFS, i.e., /global/u

• The HPFS is not an archive for non-HPC users. It is an archive for users who are processing data at the HPC Center. “Parking” files on the HPFS as a back-up to local data stores is prohibited.

• Do not store more than 1,000 files in a single directory. Store collections of small files in an archive (for example, tar). Note that for every file, a stub of about 4 MB is kept on disk even if the rest of the file is migrated to tape, meaning that even migrated files take up some disk space. It also means that files smaller than the stub size are never migrated to tape, because that would not make sense. Storing a large number of small files in a single directory degrades file system performance.
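Bundling a directory of small files into one tar archive, as recommended above, can be sketched as follows. The function name bundle_dir and the default archive name are illustrative assumptions.

```shell
# Hypothetical helper: pack a directory of small files into a single tar
# archive and report how many files it contained.
bundle_dir() {
  src=${1%/}                    # directory to pack (trailing slash removed)
  archive=${2:-$src.tar}        # archive name, default <dir>.tar
  count=$(find "$src" -type f | wc -l | tr -d ' ')
  echo "bundling $count files from $src into $archive"
  # -C keeps paths in the archive relative to the parent of the directory
  tar -cf "$archive" -C "$(dirname "$src")" "$(basename "$src")"
}
```

After checking the archive with tar -tf, the original directory can be removed so that only one file remains in its place.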

/scratch

• Please regularly remove unwanted files and directories, and avoid keeping duplicate copies in multiple locations. File transfer among the HPC Center systems is very fast. It is forbidden to use "touch jobs" to prevent the cleaning policy from automatically deleting your files from the /scratch directories. Use tar -xmvf, not tar -xvf, to unpack files. tar -xmvf sets the time stamp on the unpacked files to the time of unpacking, whereas tar -xvf preserves the time stamp of the original file. Consequently, the automatic deletion mechanism may remove files unpacked by tar -xvf even when they were unpacked only a few days ago.
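The effect of the -m option can be wrapped in a small function so that archives are always unpacked with fresh time stamps. This is a sketch: unpack_fresh is a hypothetical name, and the destination argument is an added convenience.

```shell
# Hypothetical helper: unpack an archive so that extracted files carry the
# time of extraction rather than the time stamp stored in the archive.
unpack_fresh() {
  archive=$1
  dest=${2:-.}
  # -m: do not restore the stored modification times
  mkdir -p "$dest" && tar -xmf "$archive" -C "$dest"
}
```

With this, unpack_fresh mydata.tar /scratch/<userid>/mySLURM_Job leaves files whose modification time is the moment of unpacking.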
