Getting started on FASSE#
The following are instructions for logging in to FASSE and setting up your own workspace.
Prerequisites. Join our project group#
Get a FASRC account by requesting it here.
Navigate to the Add Grants page in the portal. You will need to log in with your FASRC account.
Expand the plus sign next to “Other”
Find the project group you want to be added to:
dominici_nsaph
Select the checkbox for the project group you want to be added to
Your PI will have to approve the addition. Once you’re notified of the approval, it can take up to an hour for your permissions to be configured. If you’re not able to access the VPN or your home directory, try waiting an hour and logging in again.
Step 1. Connect to Harvard’s VPN#
Install the Cisco AnyConnect client to connect to the FASRC VPN. Install a two-factor authentication (2FA) app for FASRC, e.g., Google Authenticator. Set it up as explained here.
Type vpn.rc.fas.harvard.edu in the Cisco AnyConnect text box (see figures). Then type your username in the format username@fasse, followed by your password and verification code (the same as for FASRC).


Warning
CMS prohibits accessing data from outside the U.S. This includes not only opening data files but also submitting code or jobs that run on the data.
Step 2. Access FASSE#
There are a few ways to access FASSE. You can access it via VDI/OoD (in the web browser) by clicking the link here: https://fasseood.rc.fas.harvard.edu/
You can also access it via the command line (Terminal) by typing: ssh username@fasselogin.rc.fas.harvard.edu
To learn more about working in the command line, check out this Unix Shell tutorial.
Note
The username, password and verification code are the same as in the previous step (and the same as for FASRC).
Tip
For more information, see the official documentation.
Step 3. Project workspace#
Your project name should be informative for the group members and outsiders. Think of a project name in the following format:
<exposure>-<outcome>-<method>
Exposure examples: pm-components, pm-no2, pm-no2-o3, heat-alert
Outcome examples: cardiovascular, respiratory, adrd
Method examples: reinforcement-learning, causalgps
For example: heat_alert-mortality-reinforcement_learning or, shorter, heat_alert-mortality-rl.
In practice, you may have multiple exposures and outcomes. In that case, use your best judgement for your project name based on the guidelines. Avoid adding information such as usernames and current date or year.
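The naming convention above can be sketched as a small shell helper. This is only an illustration: the function name make_project_name is hypothetical, and joining the words inside a component with underscores is an assumption based on the heat_alert-mortality-rl example.

```shell
# Hypothetical helper: join exposure, outcome and method with hyphens.
# Assumption: words inside one component are joined with underscores,
# as in the heat_alert-mortality-rl example.
make_project_name() (
    if [ "$#" -ne 3 ]; then
        echo "usage: make_project_name <exposure> <outcome> <method>" >&2
        exit 1
    fi
    # Lower-case the name and turn spaces into underscores.
    printf '%s-%s-%s\n' "$1" "$2" "$3" | tr 'A-Z ' 'a-z_'
)

make_project_name "heat alert" mortality rl
# prints: heat_alert-mortality-rl
```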
Next, you should create a folder with your project name in the NSAPH projects folder at /n/dominici_nsaph_l3/Lab/projects.
You can do that by opening “File System” in FAS-RC Remote Desktop and navigating to the projects folder (see Fig.).

Create a new folder there with your project name (e.g., heat_alert-mortality-rl).
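If you prefer the command line to the file browser, the same step might look like the sketch below. The helper name create_project_dir and the PROJECTS_ROOT variable are hypothetical. The default path is the projects folder named in the text.

```shell
# Default location taken from the text; override PROJECTS_ROOT to try this elsewhere.
PROJECTS_ROOT="${PROJECTS_ROOT:-/n/dominici_nsaph_l3/Lab/projects}"

# Hypothetical helper: create the project folder unless it already exists.
create_project_dir() (
    project_dir="$PROJECTS_ROOT/$1"
    if [ -e "$project_dir" ]; then
        echo "already exists: $project_dir" >&2
        exit 1
    fi
    # Plain mkdir (no -p) fails if the projects folder itself is missing.
    mkdir "$project_dir" && echo "$project_dir"
)

# Usage on FASSE (the project name is only an example):
# create_project_dir heat_alert-mortality-rl
```

Refusing to reuse an existing folder is a design choice here: it avoids silently sharing a workspace with another project of the same name.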
Note
Use your project name folder in /n/dominici_nsaph_l3/Lab/projects as a workspace for your analysis data and code.
Step 4. Create a git repository on GitHub#
Navigate to the NSAPH Projects GitHub organization in your web browser. The NSAPH Projects GitHub organization is a shared account where all NSAPH members can collaborate across many projects at once. If you are not already a member of NSAPH Projects, ask one of the admins to add you to the organization.
Create a new git repository under NSAPH Projects and name it with your project name.
Going forward, make sure to update your GitHub repository daily with your analysis code and documentation.
If you are not familiar with using git, check out this git tutorial.
Also, check out our guidelines for collaborative work on GitHub.
Note
You should link your GitHub account to the FASSE workspace by typing the commands below in FASSE’s command line. By doing this, all code contributions (commits) from FASSE will be linked to your GitHub account.
git config --global user.name "Mona Lisa"
git config --global user.email "email@example.com"
While you may find FASRC documents suggesting the use of SSH to reach your repositories, FASSE environments are configured specifically so that the port used for SSH is blocked. Therefore, you should use the HTTPS version of the git repository address when putting your projects under version control. This means that you are required to enter a username and password each time you sync the local repository with the remote.
Note
When prompted for a password, do NOT enter your actual GitHub password. Instead, enter a generated personal access token in place of the password. See how to generate a token here.
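If a repository was first set up with an SSH-style address, its remote can be switched to HTTPS. The sketch below assumes a GitHub remote named origin. The helper name to_https_url and the ORG/REPO placeholder are hypothetical.

```shell
# Hypothetical helper: rewrite an SSH-style GitHub address as its HTTPS form.
# Addresses that are not recognised are printed unchanged.
to_https_url() {
    case "$1" in
        git@github.com:*)
            printf 'https://github.com/%s\n' "${1#git@github.com:}" ;;
        ssh://git@github.com/*)
            printf 'https://github.com/%s\n' "${1#ssh://git@github.com/}" ;;
        *)
            printf '%s\n' "$1" ;;
    esac
}

to_https_url git@github.com:ORG/REPO.git
# prints: https://github.com/ORG/REPO.git

# Inside an existing clone, switch the remote named "origin" to HTTPS:
# git remote set-url origin "$(to_https_url "$(git remote get-url origin)")"

# Optional: let git keep the token in memory for an hour instead of asking
# on every sync (built-in credential cache; check that this fits your
# group's security rules before enabling it):
# git config --global credential.helper 'cache --timeout=3600'
```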
Step 5. Analytic Data#
Much of the NSAPH data is already available on FASSE. Check out the data catalogue here.
If you’d like to use any of the analytic datasets, create a symbolic link (symlink) of that dataset instead of creating a new copy. A symbolic link is a reference to another file or directory that the operating system interprets as a path to that file or directory (a shortcut).
This is how you create a symlink from your data folder (in the command line):
cd data
ln -s ../analytic/DATA_FOLDER .
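A slightly more defensive version of the same step is sketched below. The helper name link_dataset is hypothetical, and both paths are arguments you supply. Passing an absolute path for the dataset avoids a dangling link, because a relative target is resolved from the folder that holds the link.

```shell
# Hypothetical helper: link a dataset folder into a data folder and
# print the path the new symlink points to.
link_dataset() (
    source_dir="$1"   # existing dataset folder (preferably an absolute path)
    data_dir="$2"     # your project's data folder
    if [ ! -d "$source_dir" ]; then
        echo "no such dataset: $source_dir" >&2
        exit 1
    fi
    ln -s "$source_dir" "$data_dir/" || exit 1
    # readlink shows where the link points, confirming that no copy was made.
    readlink "$data_dir/$(basename "$source_dir")"
)

# Usage (both paths are placeholders):
# link_dataset /path/to/analytic/DATA_FOLDER /path/to/your-project/data
```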
Step 6. Setting up R and RStudio#
To load R and install packages, follow these directions.
If you’re using RStudio, you’ll need your R_LIBS_USER path to set up the interactive session.
In RStudio, if you want to see files outside of your home directory, you can click the three dots on the upper right-hand side of the Files window (under the refresh arrow) and type in the directory path you want. If you want to save files outside your home directory, you can change your working directory using the command setwd([directory path]) in the Console.
Step 7. Organize your folder#
Consider organizing your project folder (and repository) as follows:
project-name
├── README.md
├── data/
├── code/
├── figures/
├── reports/
├── results/
└── .gitignore
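The layout above could be created in one go with a helper like the following sketch. The function name scaffold_project is hypothetical, and the starter README content is a placeholder.

```shell
# Hypothetical helper: create the suggested layout inside a project folder.
scaffold_project() (
    cd "$1" || exit 1
    mkdir -p data code figures reports results
    # Create the two special files only if they are missing,
    # so existing content is never overwritten.
    [ -e README.md ] || printf '# %s\n' "$(basename "$PWD")" > README.md
    [ -e .gitignore ] || : > .gitignore
)

# Usage (the path is a placeholder):
# scaffold_project /path/to/your-project
```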
Tip
Have a look at the NSAPH Project Template. Also, here is another template example for new research projects: djnavarro/newproject
Make sure to use the README.md and .gitignore special files. A README.md file is a standard documentation file where you should put information about the content of your repository. A .gitignore file tells Git which files to ignore when committing your project to the GitHub repository. It should be located in the root directory of your repo. Large data files and sensitive data should be ignored by Git.
Warning
Be careful not to push sensitive data to GitHub. Don’t forget that Medicare/Medicaid data should not leave Harvard, but your analysis code should be versioned with Git. A .gitignore file helps with that.
Add the paths and/or file names of your data to the .gitignore file. You can ignore the data sub-folder and/or all files of a certain format like .csv, .nc or .rst. Add these as new lines in .gitignore. For example:
data/
*.csv
*.nc
*.rst
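These lines can be added by hand in a text editor. As an alternative, the sketch below appends a pattern only when it is not already listed. The helper name add_ignore_pattern and the file name data/example.csv are hypothetical.

```shell
# Hypothetical helper: append a pattern to .gitignore in the current
# directory unless that exact line is already present.
add_ignore_pattern() {
    # -x matches whole lines; -F treats the pattern literally, so "*" is not a wildcard here.
    grep -qxF -e "$1" .gitignore 2>/dev/null || printf '%s\n' "$1" >> .gitignore
}

# Usage from the root of your repository:
# add_ignore_pattern 'data/'
# add_ignore_pattern '*.csv'
# git check-ignore -v data/example.csv   # reports which rule ignores the file
# git status --short                     # ignored files should no longer be listed
```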
Note:#
The terminology of CPUs, nodes, and cores can be confusing in FASSE. One node is like a computer, and it contains one or more CPUs, each with many cores. The multiple cores can do calculations at the same time. When requesting a job in FASSE, you request a portion of a node. However, the Slurm job management system uses the term “CPUs” to refer to the cores.
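To make the terminology concrete, here is a sketch of a Slurm batch script that requests four cores on one node. All values are placeholders, and the partition name fasse is an assumption that should be checked against the FASRC documentation. The helper name cores_granted is hypothetical.

```shell
#!/bin/bash
# Sketch of a batch script; every value below is a placeholder.
# Assumption: a partition named "fasse" exists; verify before use.
#SBATCH --partition=fasse
# One node, i.e. one computer:
#SBATCH --nodes=1
# One process (task):
#SBATCH --ntasks=1
# Four cores for that task; Slurm calls them "CPUs":
#SBATCH --cpus-per-task=4
#SBATCH --mem=8G
#SBATCH --time=01:00:00

# Slurm sets SLURM_CPUS_PER_TASK when --cpus-per-task is given;
# fall back to 1 when the script runs outside Slurm.
cores_granted() {
    echo "${SLURM_CPUS_PER_TASK:-1}"
}

echo "Cores available to this job: $(cores_granted)"
```

Passing the value of cores_granted to your analysis code (for example as the number of parallel workers) keeps the code in step with what was actually requested.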
Note
Make sure to acknowledge the use of FASRC in your publications. From the FASRC website, “Please use the following text as a guideline: ‘The computations in this paper were run on the FASRC Cannon cluster supported by the FAS Division of Science Research Computing Group at Harvard University.’”