Cluster Software and Modules

Introduction

Scientific research computing often requires specialist software. This guide explains how to use the standard software packages most commonly needed by research scientists, which are provided via the modules system (explained below).

Software modules are one excellent source of research software. Often, however, users will find they have a need for a piece of software that is not yet available on our system - perhaps because it is new or because they are the first to consider using it on our system. If you need to use software that is not currently installed, please email us with details.

Scientific Software Directory

Please see our Scientific Software Directory for a list of all available software, including software modules and other software available as singularity images.

The Need for Software Modules

Providing research-specific software to meet scientific needs on a cluster system raises a number of challenges.

  • For reasons of stability and reproducibility in your scientific work, different users may need different versions of specific software. For example, if you wish to verify an analysis run with version N of a particular software, or if you wish to continue an analysis begun with version N, then version N needs to remain installed even if it is now several years old and the current version of that software has moved on.
  • For reasons of different project requirements, you may need to use different versions of a particular piece of software for different projects. For example, you may need to use version N with one project but need to use version M with a different project.
  • For reasons of different computing environments, some users may need a particular software to be compiled for a specific CPU or GPU while other users need it compiled for a different CPU or GPU.
  • For reasons of different software requirements, version N of software A may require version X of software B to be installed, while version N+1 of software A may require version X+1 of software B to be installed.
  • For reasons of software development, software developers and testers will often need to install new versions of software for testing purposes while keeping the existing installed versions so that users of the existing versions can continue their work.

For all of these reasons, users need to be able to choose between multiple versions of the same software and they also need a simple way - ideally, an automatic way! - to get the correct versions of any supporting software packages. The modules system is the standard way to provide multiple versions of the same software with automatic configuration of supporting software.

Modules for Different CPU Architectures

The BMRC cluster currently comprises computers with two different Intel CPU architectures, ivybridge and skylake. For this reason, we maintain separate versions of each piece of software for each CPU architecture:

  • Ivybridge software and modules are located in /apps/eb/ivybridge/...
  • Skylake software and modules are located in /apps/eb/skylake/...

In general, users need not worry about these details - all the relevant configuration happens automatically. However, when submitting jobs using qsub (see our cluster guide), use of the -V parameter can interfere with this process, so we ask users to avoid that parameter and instead do all necessary environment configuration within their job submission script.

We also have a number of older software installs and modules installed on the system, but we encourage all users to switch to using the newer dedicated ivybridge/skylake modules wherever possible.

Listing and Choosing Available Modules

First, log in to the cluster.

Once logged in, run module avail to see the list of available modules. This will list all the modules available to you, grouped by their path on the file system. For example, on rescomp1 the output you see might look like this:

$ module avail
------------- /apps/eb/skylake/modules/all ------------
ANTs/2.3.1-foss-2018b-Python-3.6.6
Anaconda3/2019.10
Anaconda3/5.1.0
Anaconda3/5.3.0
Autoconf/2.69-GCCcore-6.4.0
Autoconf/2.69-GCCcore-7.3.0
Autoconf/2.69-GCCcore-8.2.0
[More packages here...]

The output here shows modules that are being sourced from our skylake repository at /apps/eb/skylake/...

Loading Modules

To load a module, use the module load or module add command with the full name of the package. For example, to load a recent version of the R software you can run:

$ module load R/3.6.2-foss-2019b

It is strongly recommended that you use the full name of the module that you wish to load in your module add or module load command. If you run simply module load R, then the version of R you get may change unpredictably over time, and pipelines that previously ran correctly may start to fail. It is better to load modules using their full name.
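
One way to enforce this recommendation in a pipeline is to check module names before loading them. The following is a minimal sketch only: has_version is a hypothetical helper written for this illustration and is not part of the modules system. It simply tests whether a name has the form NAME/VERSION.

```shell
#!/bin/bash
# Hypothetical helper (not part of the modules system): succeed only if the
# module name has the form NAME/VERSION, e.g. R/3.6.2-foss-2019b.
has_version() {
    case "$1" in
        ?*/?*) return 0 ;;   # at least one character on each side of the slash
        *)     return 1 ;;   # bare name such as "R", or an empty version
    esac
}

# Check a full name and a bare name.
for name in "R/3.6.2-foss-2019b" "R"; do
    if has_version "$name"; then
        echo "ok to load: $name"
    else
        echo "refusing unversioned module name: $name" >&2
    fi
done
```

A pipeline could call has_version before each module load and stop if the check fails, so that an unversioned name is caught at once rather than when a default version changes.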

Automatic Dependency Loading

When you load a module, the system automatically loads the correct versions of any dependencies by loading the relevant versions of their modules. To see the full list of modules you have loaded, run module list. For example, let’s see what happens when we load the Python/3.7.4-GCCcore-8.3.0 module. The commands below first clear your currently loaded modules with module purge and then load the module:

$ module purge
$ module list
No Modulefiles Currently Loaded.
$ module load Python/3.7.4-GCCcore-8.3.0
$ module list
Currently Loaded Modulefiles:
1) GCCcore/8.3.0 3) binutils/2.32-GCCcore-8.3.0 5) ncurses/6.1-GCCcore-8.3.0 7) Tcl/8.6.9-GCCcore-8.3.0 9) XZ/5.2.4-GCCcore-8.3.0 11) libffi/3.2.1-GCCcore-8.3.0
2) zlib/1.2.11-GCCcore-8.3.0 4) bzip2/1.0.8-GCCcore-8.3.0 6) libreadline/8.0-GCCcore-8.3.0 8) SQLite/3.29.0-GCCcore-8.3.0 10) GMP/6.1.2-GCCcore-8.3.0 12) Python/3.7.4-GCCcore-8.3.0

As you can see, the Python/3.7.4-GCCcore-8.3.0 module required a number of other supporting pieces of software, each with specific version requirements. The modules system automatically takes care of this for you. In this example, a request to load the Python/3.7.4-GCCcore-8.3.0 module automatically loaded a total of twelve separate module files.

Removing/Unloading Individual Modules

To remove or unload a specific module, use the module rm or module unload command:

$ module unload Python/3.7.4-GCCcore-8.3.0
$ module list
Currently Loaded Modulefiles:
1) GCCcore/8.3.0 3) binutils/2.32-GCCcore-8.3.0 5) ncurses/6.1-GCCcore-8.3.0 7) Tcl/8.6.9-GCCcore-8.3.0 9) XZ/5.2.4-GCCcore-8.3.0 11) libffi/3.2.1-GCCcore-8.3.0
2) zlib/1.2.11-GCCcore-8.3.0 4) bzip2/1.0.8-GCCcore-8.3.0 6) libreadline/8.0-GCCcore-8.3.0 8) SQLite/3.29.0-GCCcore-8.3.0 10) GMP/6.1.2-GCCcore-8.3.0

This example demonstrates that while the module system takes care of automatically loading supporting software, it does not automatically unload supporting software. In the above example, running module load Python/3.7.4-GCCcore-8.3.0 automatically also loaded eleven other supporting modules. However, running module unload Python/3.7.4-GCCcore-8.3.0 unloads only that specific software module. If you wish to unload these other modules, you can do so either individually or by unloading all modules with the module purge command (see below).

Removing/Unloading All Loaded Modules

To unload all loaded modules, run:

$ module purge

Using Modules in Scripts and Cluster Jobs

You can use modules in scripts, including those you will submit as cluster jobs, exactly as you would use them at the command line. For example, you can simply add the line module load R/3.6.2-foss-2019b to your script and then proceed to use R in your script.
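
As an illustration, the commands below write a small job script that configures its own environment rather than inheriting one from the submitting shell. This is a sketch under stated assumptions: the file names r_job.sh and analysis.R are hypothetical, and no scheduler options are shown (see our cluster guide for those).

```shell
#!/bin/bash
# Write a small example job script to r_job.sh (hypothetical file name).
cat > r_job.sh <<'EOF'
#!/bin/bash
# Configure the environment inside the script itself.
module purge                      # start from a clean set of modules
module load R/3.6.2-foss-2019b    # load by full name for reproducibility
Rscript analysis.R                # analysis.R is a hypothetical R script
EOF
chmod +x r_job.sh

# Show the script that was written.
cat r_job.sh
```

The resulting r_job.sh could then be submitted with qsub. Because all module commands live in the script, the job does not depend on whatever happened to be loaded in your login session.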

How Modules Work

Sometimes, it is helpful to know in detail how software modules work behind the scenes. Software modules are also called environment modules because they work by setting a number of options in the user’s terminal environment. Most modules, for example, alter the user’s executable search path (the PATH environment variable) in order to make the relevant software available. For example, if you compare the output of echo $PATH before and after running module load Python/3.7.4-GCCcore-8.3.0 you will see that loading this module adds /apps/eb/skylake/software/Python/3.7.4-GCCcore-8.3.0/bin to the beginning of your PATH. This ensures that your terminal knows where to find this specific version of python and uses it in preference to any other python version which may be in your path.
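
The effect of prepending a directory to PATH can be reproduced without the modules system at all. The sketch below is a simulation only: it uses a temporary directory and a stand-in executable named mytool_demo (both hypothetical) to show how putting a bin directory at the front of PATH changes which program the shell finds. Real modules may set other environment variables as well.

```shell
#!/bin/bash
# Simulate what loading a module does to PATH, using a throwaway directory.
demo_root="$(mktemp -d)"            # stands in for a module's install directory
mkdir -p "$demo_root/bin"

# Create a stand-in executable called mytool_demo (hypothetical name).
printf '#!/bin/sh\necho "mytool_demo from demo module"\n' > "$demo_root/bin/mytool_demo"
chmod +x "$demo_root/bin/mytool_demo"

saved_path="$PATH"
PATH="$demo_root/bin:$PATH"         # "load": prepend the bin directory
hash -r                             # forget any cached command locations
resolved="$(command -v mytool_demo)"
echo "after load:   $resolved"

PATH="$saved_path"                  # "unload": restore the previous PATH
hash -r
after_unload="$(command -v mytool_demo || true)"
echo "after unload: ${after_unload:-not found}"

rm -rf "$demo_root"                 # clean up the throwaway directory
```

Because the shell searches PATH from left to right, the directory added at the front takes precedence over any other location that provides a program of the same name.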