New Users - Welcome
A warm welcome to new users of the Biomedical Research Computing (BMRC) Facility. This short guide aims to help you orientate yourself and get started quickly.
Please join our mailing list
Please follow the instructions here to join our mailing list. All users are warmly encouraged to join. The mailing list is reserved for important news and service announcements (it is not a 'chat' list).
Your First Login
Once your account has been created, you will receive a username and temporary password for accessing our systems. The temporary password must be changed on first login - please follow the instructions in the welcome email. See our login guide for general information about how to connect, and note that you must be connected to the University of Oxford network, either via a physical connection or via VPN, in order to connect via ssh. If you have problems with your first login, please email us at email@example.com. If you have problems on subsequent logins, please first see our Frequently Asked Questions page for advice on self-diagnosing the issue.
Using the Linux Shell
BMRC systems use the BASH shell. If you are not familiar with shells, you can find numerous tutorials on the internet. The website HPC Carpentry offers a good introduction here.
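To give a flavour of what such tutorials cover, here are a few core Bash commands you will use constantly (a sketch only; the directory name demo_dir and the variable are just examples):

```shell
# Print the current working directory
pwd
# Create a directory (demo_dir is just an example name) and move into it
mkdir -p demo_dir
cd demo_dir
# Shell variables and quoting
name="world"
echo "hello, $name"    # prints: hello, world
# Go back up one level
cd ..
```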
Where should I put my files?
On the BMRC cluster, you have two folders for your dedicated personal use.
- Your home folder will be located at /users/<group>/<username> - with <group> and <username> being your group name and username from your welcome email. This folder is intentionally very small (max 10GB) - you should use it only for storing essential configuration files, which software often expects to find there, such as the Bash configuration file .bashrc.
- Your group home folder will be located at /well/<group>/users/<username>. This folder draws on your group's disk-space allocation, so please use it to store all your data, code and other files.
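As a quick sanity check, you can see how much of your 10GB home allocation you are currently using with du (a sketch; the commented path uses the <group> and <username> placeholders from your welcome email):

```shell
# Summarise total usage of your home folder (the 10GB-capped one)
du -sh "$HOME"
# Day-to-day data, code and results belong in your group home folder, e.g.:
# cd /well/<group>/users/<username>
```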
Note for Conda/Anaconda users
By default, conda will store your environments and downloaded packages in your home directory under ~/.conda - this will quickly cause your home directory to run out of space. To prevent this from happening we recommend the following:
- Create a dedicated conda folder in your group home folder with subdirectories for packages and environments e.g.
cd /well/<group>/users/<username>
mkdir -p conda/pkgs conda/envs
- Create the file ~/.condarc containing the following configuration (NB indented lines are indented two spaces):
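A minimal ~/.condarc along these lines would redirect conda's package cache and environments to the folders created above (a sketch using the <group> and <username> placeholders from your welcome email; pkgs_dirs and envs_dirs are conda's standard configuration keys):

```yaml
pkgs_dirs:
  - /well/<group>/users/<username>/conda/pkgs
envs_dirs:
  - /well/<group>/users/<username>/conda/envs
```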
Using the Cluster
To learn how to submit your code to the cluster, please read our guide to Using the Cluster. This guide introduces the concepts of cluster computing so that you can understand what the cluster is, how to submit your jobs, and why you should not run your code directly on rescomp1-3.
When using the BMRC cluster, you do not run your code directly. In this respect, using the BMRC cluster is different to using your own computer and it is important that you understand why this is.
After logging in to the BMRC cluster, you will arrive on one of the computers named rescomp1-3. The purpose of rescomp1-3 is not to run your code - they exist to let you submit your code to the computing cluster, either via a pre-written script using qsub or via an interactive cluster session using qlogin, as explained in the guide mentioned above.
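As an illustration, a qsub submission might look like the following minimal job script (a sketch, not a tested site configuration: the job name, filename and echoed message are placeholders, and the #$ directive lines follow the Grid Engine convention - see the Using the Cluster guide for the options appropriate to your work):

```shell
# Write a minimal Grid Engine-style job script (placeholder content)
cat > hello_job.sh <<'EOF'
#!/bin/bash
#$ -N hello    # job name (placeholder)
#$ -cwd        # run the job from the submission directory
echo "Running on $(hostname)"
EOF
# On rescomp1-3 you would then submit it with:
# qsub hello_job.sh
```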
All CPU-intensive, RAM-intensive or disk-intensive code run directly on rescomp1-3 is considered misuse and is liable to be terminated without warning in order to prevent adverse effects on other users.
Monitoring your cluster jobs
You can check the status of your currently queued and running jobs with qstat, and the status of completed jobs with the qacct command. It is VITAL that these commands are not overused.
Overusing qstat or qacct can overload the scheduling software. This would make the cluster unusable for everyone - hundreds and potentially thousands of other users, including yourself - a catastrophic result.
Using qstat or qacct manually (i.e. typing the command yourself every so often) is harmless. Problems are likely to arise, however, if these commands are repeatedly called in an automated way. In order to prevent catastrophe, please ensure that any software you use (scripts, pipelining tools, etc.) runs these commands with a delay of at least 100 seconds between calls.
If you must use the watch command, then use e.g. watch -n 100 qstat (i.e. set the value of n to at least 100). However, the circumstances where it would make sense to use watch with qstat or qacct are rare. If you want to start one job only when another has finished, or if you want a notification when a job starts or ends, there are better ways to achieve that - please email us for advice.
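If you do script around these commands yourself, the pattern is simply to sleep between queries. A sketch (the function name poll_jobs is ours, the loop is deliberately endless like watch, and the function is only defined here, not called, since qstat exists only on the cluster):

```shell
# Query the scheduler at most once every 100 seconds.
# Defined but not invoked: qstat/qacct are only available on BMRC systems.
poll_jobs() {
  while true; do
    qstat              # or: qacct -j <jobid> for a finished job
    sleep 100          # at least 100 s between scheduler queries
  done
}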
Users of snakemake, a Python pipelining tool, should note that the default settings of snakemake will cause a catastrophic incident, because by default snakemake runs the monitoring commands ten times per second. To use snakemake safely, you MUST therefore set --max-status-checks-per-second 0.01, which limits these commands to at most one call every 100 seconds.
Accessing the Internet
Please note that internet access is possible only from rescomp1-3. By design, internet access is not available from the cluster nodes themselves, and by extension, it is not available to your code when running as a cluster job. For this reason, all data required for your cluster jobs must be downloaded to disk in advance of submitting your jobs.
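In practice this means staging any downloads from rescomp1-3 before you submit. A sketch (the directory name, URL and archive name are all placeholders):

```shell
# Create a directory for input data in advance ('data' is just an example name)
mkdir -p data
# While on rescomp1-3, fetch the inputs your jobs will need, e.g.:
# wget -P data https://example.org/dataset.tar.gz
# Cluster jobs then read the staged files from disk only:
ls data
```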
Installing your own R packages
When you need to install your own R packages, please follow our dedicated guide.