Cluster Software & Modules
Quick Links
- Why software modules are needed
- Modules for different CPU architectures
- Using Modules
- Using "In Development" Modules
- How Modules work
Introduction
Scientific research computing often requires specialist software. This guide explains how to use the most commonly required research software packages through the modules system (explained below).
Software modules are one excellent source of research software. Often, however, users will find they have a need for a piece of software that is not yet available on our system - perhaps because it is new or because they are the first to consider using it on our system. If you need to use software that is not currently installed, please email us with details.
Scientific Software Directory
Please see our Scientific Software Directory for a list of all available software, including software modules and other software available as Singularity images.
Why Software Modules are needed
Providing research-specific software to meet scientific needs on a cluster system raises a number of challenges.
- For reasons of stability and reproducibility in your scientific work, different users may need different versions of specific software. For example, if you wish to verify an analysis run with version N of a particular software, or if you wish to continue an analysis begun with version N, then version N needs to remain installed even if it is now several years old and the current version of that software has moved on.
- For reasons of different project requirements, you may need to use different versions of a particular piece of software for different projects. For example, you may need to use version N with one project but need to use version M with a different project.
- For reasons of different computing environments, some users may need a particular software to be compiled for a specific CPU or GPU while other users need it compiled for a different CPU or GPU.
- For reasons of different software requirements, version N of software A may require version X of software B to be installed, while version N+1 of software A may require version X+1 of software B to be installed.
- For reasons of software development, software developers and testers will often need to install new versions of software for testing purposes while keeping the existing installed versions so that users of the existing versions can continue their work.
For all of these reasons, users need to be able to choose between multiple versions of the same software and they also need a simple way - ideally, an automatic way! - to get the correct versions of any supporting software packages. The modules system is the standard way to provide multiple versions of the same software with automatic configuration of supporting software.
Modules for Different CPU Architectures
The BMRC cluster currently comprises computers with two different Intel CPU architectures, ivybridge and skylake. For this reason, we maintain separate versions of each piece of software for each CPU architecture.
- Ivybridge software and modules are located in /apps/eb/ivybridge/...
- Skylake software and modules are located in /apps/eb/skylake/...
In general, users need not worry about these details - all the relevant configuration happens automatically. However, when submitting jobs using qsub (see our cluster guide), use of the -V parameter can interfere with this process, so we ask users to avoid that parameter and instead do all necessary environment configuration within their job submission script.
We also have a number of older software installs and modules installed on the system, but we encourage all users to switch to using the newer dedicated ivybridge/skylake modules wherever possible.
Using Modules
Listing and Choosing Available Modules
First, log in to the cluster.
Once logged in, run module avail to see the list of available modules. This will list all the modules available to you, grouped by their path on the file system. For example, on rescomp1 the output might look like this:
$ module avail
------------- /apps/eb/skylake/modules/all ------------
ANTs/2.3.1-foss-2018b-Python-3.6.6
Anaconda3/2019.10
Anaconda3/5.1.0
Anaconda3/5.3.0
Autoconf/2.69-GCCcore-6.4.0
Autoconf/2.69-GCCcore-7.3.0
Autoconf/2.69-GCCcore-8.2.0
[More packages here...]
The output here shows modules that are being sourced from our skylake repository at /apps/eb/skylake/...
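Because the full listing can be long, it is often convenient to narrow it to one package. The following is a minimal sketch, assuming a bash shell; find_module is a hypothetical helper name, not part of the modules system. Some versions of the module command print the listing on stderr, so the sketch merges stderr into stdout before filtering.

```shell
#!/bin/bash
# find_module NAME: list available modules whose name matches NAME.
# find_module is a hypothetical helper, not a command provided by the cluster.
# Some versions of the module command write the listing to stderr,
# so stderr is merged into stdout before filtering.
find_module() {
    module avail "$1" 2>&1 | grep -i -- "$1"
}

# Only call the helper where the module command is actually defined.
if type module >/dev/null 2>&1; then
    find_module Anaconda3
fi
```

On a host where the listing above applies, this would be expected to show only the Anaconda3 entries.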
Loading Modules
To load a module, use the module load or module add command with the full name of the package. For example, to load a recent version of the R software you can run:
$ module load R/3.6.2-foss-2019b
Automatic Dependency Loading
When you load a module, the system automatically loads the correct versions of any dependencies by loading the relevant versions of their modules. To see the full list of modules you have loaded, run module list. For example, let’s see what happens when we load the Python/3.7.4-GCCcore-8.3.0 module. First clear your currently loaded modules with module purge, then load the module and compare the output of module list before and after:
$ module purge
$ module list
No Modulefiles Currently Loaded.
$ module load Python/3.7.4-GCCcore-8.3.0
$ module list
Currently Loaded Modulefiles:
1) GCCcore/8.3.0 3) binutils/2.32-GCCcore-8.3.0 5) ncurses/6.1-GCCcore-8.3.0 7) Tcl/8.6.9-GCCcore-8.3.0 9) XZ/5.2.4-GCCcore-8.3.0 11) libffi/3.2.1-GCCcore-8.3.0
2) zlib/1.2.11-GCCcore-8.3.0 4) bzip2/1.0.8-GCCcore-8.3.0 6) libreadline/8.0-GCCcore-8.3.0 8) SQLite/3.29.0-GCCcore-8.3.0 10) GMP/6.1.2-GCCcore-8.3.0 12) Python/3.7.4-GCCcore-8.3.0
As you can see, the Python/3.7.4-GCCcore-8.3.0 module required a number of other supporting pieces of software, each with specific version requirements. The modules system automatically takes care of this for you. In this example, a request to load the Python/3.7.4-GCCcore-8.3.0 module automatically loaded a total of twelve separate module files.
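For use in scripts, the same information can be read from the LOADEDMODULES environment variable, which the module command maintains as a colon-separated list of loaded modules. A minimal sketch, assuming a bash shell; loaded_modules is a hypothetical helper name introduced here for illustration.

```shell
#!/bin/bash
# loaded_modules: print each currently loaded module on its own line.
# LOADEDMODULES is a colon-separated list maintained by the module command;
# loaded_modules itself is a hypothetical helper, not part of the modules system.
loaded_modules() {
    if [ -n "${LOADEDMODULES:-}" ]; then
        printf '%s\n' "$LOADEDMODULES" | tr ':' '\n'
    fi
}

# Count the loaded modules (prints 0 when nothing is loaded).
loaded_modules | wc -l
```

Run after the module load in the example above, the count should match the number of entries shown by module list.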
Removing/Unloading Individual Modules
To remove or unload a specific module, use the module rm or module unload command:
$ module unload Python/3.7.4-GCCcore-8.3.0
$ module list
Currently Loaded Modulefiles:
1) GCCcore/8.3.0 3) binutils/2.32-GCCcore-8.3.0 5) ncurses/6.1-GCCcore-8.3.0 7) Tcl/8.6.9-GCCcore-8.3.0 9) XZ/5.2.4-GCCcore-8.3.0 11) libffi/3.2.1-GCCcore-8.3.0
2) zlib/1.2.11-GCCcore-8.3.0 4) bzip2/1.0.8-GCCcore-8.3.0 6) libreadline/8.0-GCCcore-8.3.0 8) SQLite/3.29.0-GCCcore-8.3.0 10) GMP/6.1.2-GCCcore-8.3.0
Recall that running module load Python/3.7.4-GCCcore-8.3.0 automatically also loaded eleven other supporting modules. However, running module unload Python/3.7.4-GCCcore-8.3.0 unloads only that specific software module. If you wish to unload these other modules, you can do so either individually or by unloading all modules with the module purge command (see below).
Removing/Unloading All Loaded Modules
To unload all loaded modules, run:
$ module purge
Using Modules in Scripts and Cluster Jobs
You can use modules in scripts, including those you will submit as cluster jobs, exactly as you would use them at the command line. For example, you can simply add the line module load R/3.6.2-foss-2019b to your script and then proceed to use R in your script.
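A job script along these lines keeps all environment configuration inside the script itself, in line with the advice to avoid the -V parameter. This is a sketch only: load_job_modules and my_analysis.R are hypothetical names, scheduler options are omitted (see the cluster guide), and the check for the module command is a precaution rather than a documented requirement.

```shell
#!/bin/bash
# load_job_modules: start from a clean module environment and load what the job needs.
# The function name is hypothetical; the module name is the one used in this guide.
load_job_modules() {
    if ! type module >/dev/null 2>&1; then
        echo "module command not available in this shell" >&2
        return 1
    fi
    module purge
    module load R/3.6.2-foss-2019b
}

# Run the analysis only if the modules were set up.
if load_job_modules; then
    Rscript my_analysis.R   # my_analysis.R is a placeholder for your own script
fi
```

Starting with module purge means the job does not depend on whatever happened to be loaded in the shell that submitted it.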
Using "In Development" Modules
We maintain a "development" tree containing modules that are newer than those available by default; note that these are often considered "in testing". To activate your access to these modules, you may use the following within a script or at the command line:
module use -a /apps/eb/dev/${MODULE_CPU_TYPE}/modules/all
You can also add this line to your ~/.bash_profile file in order to automatically activate the development modules.
Once activated, the development modules can be used in the normal way: module avail shows what is available and module load loads a module.
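If the line is placed in ~/.bash_profile, a slightly more defensive form may be useful on hosts where MODULE_CPU_TYPE is unset or the development tree is missing. The following is a sketch under those assumptions; enable_dev_modules is a hypothetical helper name, and its optional argument exists only so the function can be tried against a different root directory.

```shell
#!/bin/bash
# enable_dev_modules [ROOT]: add the development module tree to the module search path.
# ROOT defaults to /apps/eb/dev; the function name and the optional argument are
# hypothetical conveniences, not part of the cluster configuration.
enable_dev_modules() {
    local root="${1:-/apps/eb/dev}"
    local tree="${root}/${MODULE_CPU_TYPE:-}/modules/all"
    # Do nothing unless the CPU type is known and the tree exists on this host.
    if [ -z "${MODULE_CPU_TYPE:-}" ] || [ ! -d "$tree" ]; then
        return 0
    fi
    module use -a "$tree"
}

enable_dev_modules
```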
How Modules Work
Sometimes it is helpful to know in detail how software modules work behind the scenes. Software modules are also called environment modules because they work by setting a number of options in the user’s terminal environment. Most modules, for example, alter the user's executable search path (the PATH environment variable) in order to make the relevant software available. For example, if you compare the output of echo $PATH before and after running module load Python/3.7.4-GCCcore-8.3.0 you will see that loading this module adds /apps/eb/skylake/software/Python/3.7.4-GCCcore-8.3.0/bin to the beginning of your PATH. This ensures that your terminal knows where to find this specific version of python and uses it in preference to any other python version which may be in your path.
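The effect on PATH can be imitated by hand, which may help make the mechanism concrete. This is an illustration only, assuming a bash shell: prepend_to_path is a hypothetical helper, and a real module may set other environment variables as well, so this is not a substitute for module load.

```shell
#!/bin/bash
# prepend_to_path DIR: put DIR at the front of PATH, as a module does with its bin directory.
# prepend_to_path is a hypothetical helper used only to illustrate the mechanism.
prepend_to_path() {
    PATH="$1:$PATH"
    export PATH
}

# Directory taken from the example in this guide; it only exists on the cluster.
python_bin="/apps/eb/skylake/software/Python/3.7.4-GCCcore-8.3.0/bin"
if [ -d "$python_bin" ]; then
    prepend_to_path "$python_bin"
    # The shell resolves commands to the first match on PATH.
    command -v python
fi
```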