
Hyak: HPC at the College of the Environment

All users please read the Hyak User Agreement before continuing

OVERVIEW 

HYAK is the name of a UWIT-operated, integrated, shared, and scalable scientific supercomputer that uses the Lolo archive system and a high-performance research network. “Shared” means that while groups purchase their own nodes and group members have priority use of those nodes, idle nodes become available to anyone with a Hyak account.

KLONE 

Klone Hyak is the third-generation Hyak high-performance computing (HPC) cluster. Klone nodes run Rocky 8 Linux, and the cluster currently hosts over 18,000 compute cores for use by UW researchers. The UW Research Computing group maintains comprehensive documentation on the Hyak cluster at https://hyak.uw.edu/docs

Requesting Access & Groups 

Students: Any student can request access to Hyak through the Research Computing Club and use the STF nodes. It is HIGHLY RECOMMENDED that students join this club, as the STF provides approximately 98 nodes for student use!

Faculty and Staff: HPC at the UW is not a free service, and faculty and researchers are encouraged to invest in shared CPU/GPU nodes. It is recommended that you read the College of the Environment’s HYAK Use Policy for more information on HYAK.

The college has a shared pool of nodes in the COENV group. These nodes have been purchased by various faculty, researchers, and the Dean’s Office, and their use is restricted to these investors.

Once you are placed in a group in Hyak, you will have access to that group’s compute resources, or ‘nodes’.  For example, the student STF group has access to 98 nodes. 

To determine which groups you belong to in Hyak, simply use the command: groups
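
For example, running the groups command on a Klone login node might print something like the line below; the group names shown here are illustrative, and yours will differ:

$ groups
youruwnetid stf coenv

Group names such as stf or coenv correspond to accounts you can use when requesting compute resources.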

Setup 2-Factor Authentication “Duo” 

Once you have received a confirmation email that you have been added to the STF or coenv  group on Hyak, you will need to set up your 2FA devices by going to this link, and scrolling down to the “manage your device(s)” section. Follow the instructions there. Once you have set up your 2FA devices, you’ll be able to use your UW NetID, password, and 2FA device to log into Hyak. 

Activating the Hyak Computing Service 

At this point, you’ve either joined the Research Computing Club and requested access to their nodes, or requested access to the coenv nodes. Once you receive confirmation of your request, you will need to activate and subscribe to these new UWIT Computing Services.

Open your web browser of choice, go to the Manage UWNETID Resources page, and log in using the UWNETID that will be accessing Hyak.

Click on “Computing Services”; in the right-hand window you will see your Active Services and Inactive Services. Hyak Server and Lolo Server should be listed under your Inactive Services. Check the box for Hyak Server and click Subscribe. When that subscription is successful, refresh the page. Next, check the box for Lolo Server and click Subscribe. When that subscription is successful, refresh the page again to verify that both Hyak Server and Lolo Server are now listed under your Active Services.

More information can be found here

How to Connect to Klone Hyak 

Mac Terminal SSH to Klone Hyak 

Hyak runs Linux, a UNIX-like operating system. Windows PCs are not UNIX-based, so the built-in Windows Command Prompt does not understand UNIX commands (it understands DOS-style commands instead). All Macintosh computers include a Terminal application that provides a UNIX shell. Terminal on the Mac can be found in the Applications/Utilities folder.

To SSH into Klone Hyak, first open a terminal window.  At the prompt, type in: 

ssh -X yourUWNETID@klone.hyak.uw.edu

Hit the <ENTER> key. The first time you connect, you will see a message stating that the authenticity of the host can’t be established and asking if you are sure you want to continue; type ‘yes’.

You will get a Password: prompt.  Please type in the password for your UWNETID and hit <enter>. 

At this point, you will be given options for Duo.  Use whatever option you have set up for your Duo connections. 

If your Duo connection was successful, your terminal will be logged in to Klone Hyak. 

Creating a Mac Terminal shortcut to Klone Hyak.

Macintosh Terminal shortcuts are a great way to save your fingers from typing out long, repeated commands. In this example, we are going to create a terminal shortcut command called “klone” (minus the quotes) that will quickly SSH you into the Klone Hyak machine.

Open a new Terminal window. At the prompt, type the command: nano klone

In the nano editor, type the following line:

ssh -X youruwnetid@klone.hyak.uw.edu

Hit the CTRL and X keys to exit nano, press Y when asked whether to save the changes, and hit <enter> to keep the file name: klone

You just created a file called “klone”. That file contains a single line of SSH code that will log your UWNETID in to Klone. You can type: more klone to verify the contents of the klone file.

 In order to get this file to work at your terminal prompt, type: 

chmod +x klone 

Hit the <enter> key 

Your newly created klone shortcut file should now be executable. You can verify its permissions by typing: ls -l klone
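
The output should show execute permissions (the x bits) similar to the illustrative line below; the owner, group, size, and date will differ on your machine:

-rwxr-xr-x  1 youruwnetid  staff  45 Jun  1 10:30 klone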

To activate or use your Klone shortcut, at your terminal prompt, type in: 

./klone 

You should be asked for your UWNETID password and then on to Duo and finally into Klone. 

Creating an alias to Klone Hyak. 

From the command line in the terminal: 

% cd  # (just to make sure we're in the home directory) 
% nano .bashrc  

type:

alias klone='ssh -X youruwnetid@klone.hyak.uw.edu' 

exit nano (ctrl X and follow directions to write and save). 
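
The alias takes effect in new terminal windows. To use it immediately in your current shell, reload the file and then run the alias; note that recent versions of macOS use zsh as the default shell, in which case the same alias line should go in ~/.zshrc instead. Assuming bash:

source ~/.bashrc
klone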

Windows OpenSSH connection to Klone Hyak 

Windows 10 and later offer an optional OpenSSH client. To install it, click the Start menu and type ‘Settings’ to open the Windows Settings app.

Click the Apps section, and select ‘Optional Features’ 

Click ‘View Features’ and select ‘OpenSSH Client’ from the list. Click ‘Next’ to install the SSH client. To launch the ssh command from the Windows command line, use the command:

ssh -X youruwnetid@klone.hyak.uw.edu 

Node Types 

There are 2 main types of Hyak nodes: 

  • Login 
  • Compute 

When you first ssh into KLONE (e.g., klone.hyak.uw.edu) you land on one of the two login nodes (i.e., klone1 or klone2). Login nodes are shared among all users and are intended for transferring data, navigating the file system, and requesting resource slices for heavy-duty computing. You should not use login nodes for heavy computation; automated mechanisms exist to monitor usage and enforce this policy.

Login Node 

  • Shell prompt looks like [UWNetID@klone1 ~]$ 
  • The first node you encounter upon logging in. 
  • For file transfers and manipulation. 
  • This node has internet connectivity. 
  • To submit jobs from the login node, see The Klone Job Scheduler below.

Compute Node 

  • Build nodes have been removed in Klone; use any compute node available to your group to compile code
  • Compute nodes have limited internet access and should not be used for data transfers 
  • Can be used to run interactive or batch-mode compute jobs; see The Klone Job Scheduler below.

Managing Files on Hyak 

Storage on Klone consists of several different areas with different quotas, access permissions and intended uses. It is important to understand how each of these file areas is allocated, and what it can and should not be used for. Every user has a personal home directory, which is limited in size, but contains files that are accessible only to that individual user. 

The primary storage location for computational data is an area called ‘gscratch’. Its name refers to the space’s intended use as ‘scratch’ space. In other words, this is a large pool of storage, accessible to all users, where researchers can keep large amounts of data while computation is being performed. It should not be used as long-term storage, and it is important for users to back up their data regularly.

Storage quotas are expressed in two types: Block quotas (the amount of total file space used) and inode quotas (the number of individual files used). Use the command hyakstorage on Klone to view your current quota usage by user and group. 

Home Directory 

Each Hyak user has a home directory which is shared among all Hyak nodes. Quotas are set to 10GB with a limit of 250,000 files (inodes). Hyak home directories are intended for files you want to keep completely private (ssh keys), or which are completely unique to you (your login scripts). They are not for storing big code source trees or doing any computation. 

Upon login to Klone, your prompt’s working directory will be your home directory. If you’d like to find out what your home directory path is, type: pwd directly after logging on.

For example, the output of the pwd command for a user named parker shows that their home directory path is: /usr/lusers/parker

gscratch 

It is recommended that Hyak users read the Hyak Storage documentation before proceeding, as a better understanding of gscratch will make the steps below easier to follow.

The first thing you will want to do is create your gscratch directory on Hyak Klone. To do so, you will need to know what group(s) you belong to. If you are a student and have joined the Research Computing Club, you will belong to the “stf” group. You may also belong to the “coenv” group. To see your list of groups, type the command: groups

To begin, log in to Hyak per the instructions above.  Once logged in, go to the gscratch directory by typing:

cd /gscratch 

Type: 

ls -a  

This will show you a listing of all the group directories in /gscratch.

In our example, we want to go to the directory called coenv and create our own directory called “parker”.  It is STRONGLY recommended that you create a directory that matches your UWNETID.   

To get into the coenv directory, type:

cd coenv 

To see the entire list of directories, type:

ls -a 

To create your own directory, type: mkdir YOURUWNETID (in this example, I typed mkdir parker. Use your OWN UWNETID, not mine.)

To get into your newly created directory, type: cd YOURUWNETID

To verify or find the path to this directory, type: pwd 

You will see the path to this newly created directory is: /gscratch/coenv/parker 

You will use this path in your jobs/code as the location where your temporary working data is stored.
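
As a condensed sketch of this process, assuming you belong to the coenv group (substitute your own group name and UWNETID):

cd /gscratch/coenv
mkdir youruwnetid
cd youruwnetid
pwd
# prints: /gscratch/coenv/youruwnetid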

File Transfers 

The main method to transfer data to/from Hyak is to use rsync. rsync is a file transfer program. It copies specified files/folders from one location to another. Additionally, it verifies data integrity after the files have been transferred. This feature is critical, due to the large file sizes we frequently work with. 

Copy files to Klone: 

rsync --archive --progress --verbose username@remote_computer_IP:/path/to/remote/file /path/to/klone/directory

Copy entire folder to Klone (it is important to make sure there is no / at the end of the remote path): 

rsync --archive --progress --verbose username@remote_computer_IP:/path/to/remote/directory /path/to/klone/directory

Copy files from Klone (run command on the Klone login node): 

rsync --archive --progress --verbose /path/to/local/file username@remote_computer_IP:/path/to/remote/directory 

Copy entire folder from Klone: 

Navigate to the directory immediately above the one you are interested in copying, and then run the following command:

rsync --archive --progress --relative --verbose ./directory_to_copy username@remote_computer_IP:/path/to/remote/directory 
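
As a concrete sketch, suppose you are working on your own computer and want to push a folder named mydata into the gscratch directory created above; the UWNETID parker and the folder name are illustrative:

rsync --archive --progress --verbose mydata parker@klone.hyak.uw.edu:/gscratch/coenv/parker/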

The Klone Job Scheduler 

How to Run a Job 

See the Hyak Documentation page for complete information on scheduling jobs in Hyak. 

Klone compute resources are organized by Account and Partition. An account is similar to the ‘group’ you have joined and a Partition represents a specific type of compute resource (for instance CPU or GPU). 

To see the resources that you have access to, issue the hyakalloc command from the command prompt. It prints a table summarizing the resources available to each of your accounts.

For example, a user might have access to resources from two groups, coenv and goaclim, both with nodes in the ‘compute’ partition, which is the standard high-performance computing resource type. Other partitions a group may have access to include high-memory and GPU partitions. Run the sinfo command to see a complete list of all partitions.

Klone uses the SLURM scheduler to manage access to the compute resources. SLURM jobs can be submitted from the initial login nodes and will run on resources from the compute nodes. In general, you will be submitting ‘batch’ or non-interactive jobs directly to the compute nodes, however you can also launch ‘interactive’ jobs which give you access to the command line on the compute nodes so you can interact with your compute processes. You may also create ‘recurring’ jobs that will run on a predefined schedule. When a non-interactive batch job completes, Hyak will send you a notification email. 

Interactive Jobs 

To run an interactive job from the login node, use the salloc command: 

salloc -A coenv -p compute -N 1 -c 4 --mem=10G --time=1:00:00

This command will allocate a single compute node (-N 1) with 4 processor cores (-c 4) and 10 gigabytes of memory (--mem=10G) for one hour (--time=1:00:00). Once the allocation is granted, you will find yourself in a command-line shell on the allocated compute node. You may use this shell to run your compute processes.
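
A typical interactive session might look like the sketch below; the compute node name, module, and program are illustrative, and typing exit ends the session and releases the allocation:

[UWNetID@n3088 ~]$ module load gcc          # load any software you need
[UWNetID@n3088 ~]$ ./my_program input.txt   # run your compute process
[UWNetID@n3088 ~]$ exit                     # release the allocation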

Batch Jobs 

Batch jobs are submitted from the login node using the sbatch command. sbatch executes a job script, which you must provide. An example script to set up a batch job looks like this:

#!/bin/bash 
#SBATCH --job-name=<name> 
#SBATCH --mail-type=<status> 
#SBATCH --mail-user=<email> 
#SBATCH --account=coenv 
#SBATCH --partition=compute 
#SBATCH --nodes=<num_nodes> 
#SBATCH --ntasks-per-node=<cores_per_node> 
#SBATCH --mem=<size[unit]> 
#SBATCH --gpus=<type:quantity>  
#SBATCH --time=<time> # Max runtime in DD-HH:MM:SS format. 
#SBATCH --chdir=<working directory> 
#SBATCH --export=all 
#SBATCH --output=<file> # where STDOUT goes 
#SBATCH --error=<file> # where STDERR goes

# Modules to use (optional). 
<e.g., module load apptainer> 

# Your programs to run. 
<my_programs> 

Fields in angle brackets should be replaced with your custom values. Save the script with a name like mybatch.slurm. Then run the batch job with the command: sbatch mybatch.slurm

Note that there are two lines in the script that were not necessary under the previous cluster, mox: 

#SBATCH --account=coenv 
#SBATCH --partition=compute 
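
For reference, here is a minimal filled-in sketch of such a script for the coenv account; the job name, email address, resource sizes, and program are illustrative:

#!/bin/bash
#SBATCH --job-name=test_job
#SBATCH --mail-type=END,FAIL
#SBATCH --mail-user=youruwnetid@uw.edu
#SBATCH --account=coenv
#SBATCH --partition=compute
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=4
#SBATCH --mem=10G
#SBATCH --time=0-02:00:00
#SBATCH --chdir=/gscratch/coenv/youruwnetid
#SBATCH --export=all
#SBATCH --output=test_job.out
#SBATCH --error=test_job.err

# Your program to run (illustrative).
./my_program input.txt

Save it as mybatch.slurm and submit it with: sbatch mybatch.slurm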

Modules

Modules provide an easy ‘plug-in’ method for making additional software packages available on Hyak. To see what modules are currently available, use the command module avail from a compute node interactive shell. Note that this command will not work on the login node and will instead give you a warning message.

By default, the Research Computing group maintains a large variety of useful modules such as compilers (gcc, g++, gfortran), programming and data languages (R, Python), and libraries. Additional modules are also supplied by the community at large, but these may not have full support available. When multiple versions of a module exist, the default version is marked with a (D) when you run the module avail command.

To load a module for use, run the command module load <modulename>; to unload a module, run module unload <modulename>.

These commands can be included at the beginning of your batch scripts to ensure that any required software is available when your batch job runs.
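
For example, from an interactive shell on a compute node (the gcc module here is illustrative; run module avail to see what is actually installed):

module avail          # list the modules that are available
module load gcc       # load a compiler module
module list           # confirm which modules are currently loaded
module unload gcc     # unload the module when you are finished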

Help & Resources 

Help is only a click away.  Here are some common Help & Resource avenues you may want to pursue. 

Hyak-users Mailman list. 

The UW maintains a Mailman Email List for Hyak Users. 

The UW Research Computing Hyak Documentation 

The Research Computing group has published detailed documentation on the usage of the Hyak Klone compute cluster at https://hyak.uw.edu/docs/

Helpful Hyak Commands 

groups # shows the groups you belong to in Hyak
hyakalloc # shows the compute resources available to each of your accounts
sacct # shows accounting details about your jobs
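
For example, to check the status of a specific job after it has run (the job ID is illustrative):

sacct -j 1234567 --format=JobID,JobName,State,Elapsed,MaxRSS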