
Hyak: Managing Files and Storage

Managing Files on Hyak 

Storage on Klone consists of several areas with different quotas, access permissions, and intended uses. It is important to understand how each of these file areas is allocated, and what it should and should not be used for. Every user has a personal home directory, which is limited in size and contains files that are accessible only to that user.

The primary storage location for computational data is an area called ‘gscratch’. Its name refers to the space’s intended use as ‘scratch’ space. In other words, it is a large pool of storage, accessible to all users, where researchers can keep large amounts of data while computation is being performed. It should not be used as long-term storage, and it is important for users to back up their data regularly.

Storage quotas come in two types: block quotas (the total amount of file space used) and inode quotas (the number of individual files used). Use the command hyakstorage on Klone to view your current quota usage by user and group.
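To see which directory contributes most to either quota, standard utilities can approximate both numbers. The following is a minimal sketch: the function name usage_summary is illustrative and not a Hyak tool, and its figures may differ from what the quota system reports.

```shell
# Hypothetical helper (not part of Hyak): approximate the block and inode
# usage of one directory with standard utilities.
usage_summary() {
    local dir="${1:-.}"
    local kib entries
    # Block usage: total size of everything under the directory, in KiB.
    kib=$(du -sk "$dir" | cut -f1)
    # Inode usage: each file, directory, or link below the directory counts once.
    entries=$(find "$dir" -mindepth 1 | wc -l | tr -d ' ')
    echo "size_kib=$kib entries=$entries"
}
```

For example, usage_summary ~ summarizes your home directory. A directory holding many small files can exhaust the inode quota long before the block quota.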

Home Directory 

Each Hyak user has a home directory, which is shared among all Hyak nodes. The quota is 10 GB, with a limit of 250,000 files (inodes). Hyak home directories are intended for files you want to keep completely private (such as SSH keys) or that are unique to you (such as your login scripts). They are not for storing large source code trees or doing any computation.

When you log in to Klone, your working directory will be your home directory. To find out what your home directory path is, type pwd directly after logging in:

~ $ pwd
/mmfs1/home/shrike
~ $

The output of the pwd command above shows that this user’s home directory path is /mmfs1/home/shrike.

Note: in UNIX environments the ‘~’ character is shorthand for your home directory, and the ‘$’ is the standard command prompt character.

gscratch 

Hyak users are encouraged to read the Hyak Storage documentation before proceeding, as a better understanding of gscratch will help with the steps below.

The first thing you will want to do is create your gscratch directory on Hyak Klone. To do so, you need to know which group(s) you belong to. If you are a student and have joined the Research Computing Club, you will belong to the “stf” group. You may also belong to the “coenv” group. To see your list of groups, type the command groups:

~ $ groups
all coenv test
~ $

To begin, log in to Hyak per the instructions above. Once logged in, go to the gscratch directory by typing cd /gscratch/<groupname>. For instance, to use the coenv gscratch directory, you would type:

~ $ cd /gscratch/coenv
coenv $
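If you belong to several groups, it can help to check which of them actually has a gscratch directory. The following is a rough sketch that assumes group directories sit directly under /gscratch, as in the cd command above; the function name list_group_dirs is hypothetical.

```shell
# Hypothetical helper: print <base>/<group> for each listed group that has
# a directory there. Assumes the /gscratch/<groupname> layout.
list_group_dirs() {
    local base="$1"
    shift
    local group
    for group in "$@"; do
        if [ -d "$base/$group" ]; then
            echo "$base/$group"
        fi
    done
}

# On Klone this might be called with your own group list:
#   list_group_dirs /gscratch $(id -Gn)
```

Groups without a matching directory are skipped silently, so the output lists only the locations you can cd into.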

Then list the directory’s contents:

coenv $ ls -a
 .            containers         jhrag      sadm_shrike   .snapshots
 ..           darrd              jswon11    samwhite      upanpra
 avancise     erfanbh            local      sgarrote     
 bumblereem   farrm              noahrose   shared
 chrisbee     .hyakstorage.csv   nwharton   shrike
 cliu18       isabrand           qgoestch   sjuber
coenv $

This will show you a directory list of the subfolders in the ‘coenv’ gscratch directory.  

Create a new directory with your username, then change to that directory and list its contents:

coenv $ mkdir $USER
coenv $ cd $USER
shrike $ ls -a
.  ..
shrike $ pwd
/gscratch/coenv/shrike
shrike $

NOTE: The environment variable ‘$USER’ expands to your login ID.

The new directory will be empty, showing only the ‘.’ and ‘..’ entries, which represent the current directory and its parent directory (up one level). You will use this path in your jobs and code as the location where your temporary working data is stored. The commands above show this process.
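The directory creation step can also be written so that it is safe to repeat, for example at the top of a job script. This is a minimal sketch; make_scratch_dir is a hypothetical name, and it assumes the same /gscratch/<groupname>/<username> layout used above.

```shell
# Hypothetical helper: create <base>/<group>/<user> if it does not exist yet
# and print the resulting path.
make_scratch_dir() {
    local base="$1" group="$2"
    # Fall back to `id -un` in case USER is not set in the environment.
    local user="${USER:-$(id -un)}"
    local path="$base/$group/$user"
    # -p makes the call repeatable: an existing directory is not an error.
    mkdir -p "$path" && echo "$path"
}

# On Klone this might be called as:
#   make_scratch_dir /gscratch coenv
```

Because the function prints the path, a script can capture it in a variable and use it as the working directory for temporary data.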

Tip: Create a shortcut in your home directory directly to your gscratch folder for easier access:

shrike $ cd
~ $ ln -s /gscratch/coenv/$USER gscratch
~ $ ls -lh
lrwxrwxrwx  1 shrike all   22 Feb 10 15:20 gscratch -> /gscratch/coenv/shrike
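Running ln -s a second time fails if the link name is already taken. The sketch below, with the hypothetical name link_scratch, checks for an existing entry first so that nothing in your home directory is touched by accident.

```shell
# Hypothetical helper: create a symbolic link <link> pointing at <target>,
# but refuse if anything (file, directory, or link) already exists at <link>.
link_scratch() {
    local target="$1" link="$2"
    if [ -e "$link" ] || [ -L "$link" ]; then
        echo "already exists: $link" >&2
        return 1
    fi
    ln -s "$target" "$link"
}

# On Klone this might be called as:
#   link_scratch "/gscratch/coenv/$USER" "$HOME/gscratch"
```

The extra -L test catches a dangling link, which -e alone would report as missing.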

File Transfers 

The main method for transferring data to and from Hyak is rsync, a file transfer program that copies specified files or folders from one location to another. Additionally, it verifies data integrity after the files have been transferred. This feature is critical because of the large file sizes we frequently work with.

Copy files to Klone: 

rsync --archive --progress --verbose username@remote_computer_IP:/path/to/remote/file /path/to/klone/directory

Copy an entire folder to Klone (make sure there is no / at the end of the remote path; with a trailing /, rsync copies the folder’s contents rather than the folder itself):

rsync --archive --progress --verbose username@remote_computer_IP:/path/to/remote/directory /path/to/klone/directory
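The effect of the trailing / is easy to try out locally before running a large transfer. The following sketch uses throwaway directories instead of a remote host; the function name and all paths in it are illustrative.

```shell
# Local demonstration of rsync's trailing-slash rule, using temporary
# directories rather than a remote host. Prints the working directory
# so the results can be inspected afterwards.
demo_trailing_slash() {
    local work
    work=$(mktemp -d)
    mkdir -p "$work/data" "$work/without_slash" "$work/with_slash"
    echo "example" > "$work/data/sample.txt"

    # No trailing slash on the source: the directory itself is copied.
    rsync --archive "$work/data" "$work/without_slash/"
    # Trailing slash on the source: only the directory's contents are copied.
    rsync --archive "$work/data/" "$work/with_slash/"

    echo "$work"
}
```

Afterwards, without_slash/ contains data/sample.txt, while with_slash/ contains sample.txt directly.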

Copy files from Klone (run command on the Klone login node): 

rsync --archive --progress --verbose /path/to/local/file username@remote_computer_IP:/path/to/remote/directory 

Copy entire folder from Klone: 

Navigate to the directory immediately above the one you want to copy, then run the following command:

rsync --archive --progress --relative --verbose ./directory_to_copy username@remote_computer_IP:/path/to/remote/directory 

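The switches in the above command can be shortened to their single-character versions. Note that -P is shorthand for --partial --progress, so it also keeps partially transferred files if a transfer is interrupted: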

rsync -aPRv ./directory_to_copy username@remote_computer_IP:/path/to/remote/directory 
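Before starting a large transfer, it can be useful to preview what rsync would copy. The sketch below wraps rsync’s --dry-run option; the wrapper name preview_transfer is hypothetical, and the paths in the example comment are the same placeholders used above.

```shell
# Hypothetical wrapper: list what rsync would transfer without copying anything.
# --dry-run makes rsync report the planned transfer and leave the destination
# unchanged.
preview_transfer() {
    rsync --archive --verbose --dry-run "$@"
}

# Example on Klone:
#   preview_transfer --relative ./directory_to_copy username@remote_computer_IP:/path/to/remote/directory
```

If the listed files look right, run the same rsync command again without --dry-run to perform the transfer.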

Another option for transferring data to and from Hyak is Rclone. You can find more information about Rclone at Hyak: Using Synology Storage.