Installing Local R Packages
The versions of R pre-installed on the cluster come with a variety of common packages built in, so when looking for a particular package, we recommend double-checking first whether it is included by default.
When a package is not available by default, users are able to install it for their personal use by using a folder on disk as their user package library. This short guide takes you through how to do this on the BMRC cluster.
NB The method described below only works for R >= 3.2, which was released in 2015. Users of earlier versions of R are strongly encouraged to move to a more recent version.
The need for multiple local R Package Repositories
At any one time, the BMRC cluster comprises nodes with different generations of Intel CPU architecture. As of April 2020, for example, our cluster nodes fall into two broad Intel CPU families, ivybridge and skylake. Software built for one CPU family may not be compatible with another, and this is also true for R packages. If you try to use an R package built for one CPU family on a different CPU family, your code will abort with an "Illegal instruction" error.
For this reason, it is necessary when installing your own R packages to ensure that you install each package TWICE - one version for ivybridge nodes and a second version for skylake nodes. You also need to ensure that the correct version is selected when running your jobs on the cluster. The instructions below explain how this can be achieved with minimal effort and maintenance.
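As a rough illustration of how a node's CPU family can be told apart, the shell sketch below inspects the CPU flags that Linux lists in /proc/cpuinfo. It is not the method used by the BMRC setup. The function name cpu_family is hypothetical, and the rule it applies (AVX-512 support means skylake, AVX without AVX-512 means ivybridge) is an assumption that holds only if the skylake nodes are server-class parts exposing the avx512f flag and no other CPU generations are present.

```shell
# Hypothetical helper: guess the CPU family from a space-separated flag list.
# Assumption: skylake nodes expose avx512f; ivybridge nodes expose avx only.
cpu_family() {
    case " $1 " in
        *" avx512f "*) echo "skylake" ;;
        *" avx "*)     echo "ivybridge" ;;
        *)             echo "unknown" ;;
    esac
}

# Read the flags of the first CPU listed in /proc/cpuinfo (Linux only).
if [ -r /proc/cpuinfo ]; then
    flags=$(awk -F': ' '/^flags/ {print $2; exit}' /proc/cpuinfo)
    echo "CPU family guess: $(cpu_family "$flags")"
fi
```

Running this on a login node and on a compute node is a quick way to see whether the two would need separately built packages.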
Notes for existing R users
If you have already installed your own local R packages then you will already have an existing R user library. Where this is the case, we recommend starting afresh with a new user library location and following the method below to re-install your existing packages. This will ensure that your R packages are compatible with all of our cluster queues and nodes.
Users who have an existing ~/.Rprofile file should normally be able to add the extra code shown below to the beginning of their existing file. In the unlikely event of this causing a conflict, please contact us.
Please also note that R only ever sources one .Rprofile file for the user's local settings. So if you are making use of project-specific or directory-specific .Rprofile files, then you will need to copy the code below into each of them or place it in a central location and source it from each of your .Rprofile files.
Notes for users of "R --vanilla" including snakemake users
The method explained below involves adding some code to your ~/.Rprofile file, which R reads by default on startup. However, using the command R --vanilla explicitly tells R not to read any of its startup files - hence the method described below will not work in those cases. To solve this particular issue, you would need to add the code below to the R script file you are running with R --vanilla. There is no harm in running the code below multiple times, so you can include it anywhere it might be needed.
Please note that when you call an R script from within snakemake as an external script, your script is run with R --vanilla, so the advice above also applies.
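One way to see the effect of --vanilla is to compare the first library path R reports with and without its startup files. The shell sketch below does this with Rscript, which reads ~/.Rprofile by default and skips it when given --vanilla. The function name show_first_lib_path is hypothetical, and the sketch assumes an R module has already been loaded so that Rscript is on your PATH.

```shell
# Hypothetical helper: print the first library path R reports,
# once with startup files and once with --vanilla.
show_first_lib_path() {
    if ! command -v Rscript >/dev/null 2>&1; then
        echo "Rscript not found; load an R module first" >&2
        return 0
    fi
    echo "With startup files:"
    Rscript -e 'cat(.libPaths()[1], "\n")'
    echo "With --vanilla:"
    Rscript --vanilla -e 'cat(.libPaths()[1], "\n")'
}

show_first_lib_path
```

If the two paths differ, a script run with --vanilla will not see your local package library unless the setup code is included in the script itself.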
How to Install Local R Packages
- First choose a new folder to be the main repository directory where your locally installed R packages will live. We recommend a directory in your group's space, e.g. /well/<group>/users/<username>/R. This should be a new folder, i.e. please do not re-use an existing folder of R packages.
- If it doesn't already exist, create the file ~/.Rprofile - i.e. this is a file named .Rprofile in your home directory. You can reach your home directory by running the cd command on its own. The leading dot in the filename is essential.
- At the beginning of your ~/.Rprofile file, add the following two lines of code:
You need to customise the first line by setting R_LIBS_BASE to the directory you chose in Step 1. Everything up to but not including the last directory of R_LIBS_BASE must already exist.
- Now you are ready to test.
Load a recent R module, e.g. module load R/3.6.2-foss-2019b, and then start R with the R command. If everything is working, you should see some output that looks like this:
 "[BMRC] You have sourced the BMRC Rprofile provided at /apps/misc/R/rprofile/Rprofile"
 "[BMRC] Messages coming from this file (like this one) will be prefixed with [BMRC]"
 "[BMRC] You are running R on host <XXX> with CPU <YYY>"
 "[BMRC] While running on this host, local R packages will be sourced from and installed to /well/<group>/users/<username>/R"
The path in the final line should be an extension of your R_LIBS_BASE variable, with sub-directories for the R version and the CPU architecture, i.e. /<R_LIBS_BASE>/<R version>/<architecture>
- As a final check, from within R run the .libPaths() command and check that the first entry shows the same directory as the last line of the startup output above.
- Once you have tested your setup using the instructions above, you are ready to start installing your own packages. The most important point to remember is that:
For each local package that you want to install, you will need to install the package TWICE. In particular, you will need to run the install procedure ONCE on either rescomp1 or rescomp2 (which both use the skylake architecture) and ONCE on rescomp3 (which uses the ivybridge architecture).
NB1 To connect to rescomp3 you must first connect as normal to rescomp1 or rescomp2 and then ssh to rescomp3.
NB2 R packages can be installed by running e.g. install.packages("zip") within R to install the zip package.
- After installing packages, you can safely submit R jobs to the cluster which use your locally installed packages. The above method will ensure that the correct CPU-specific version of each package is loaded. NB Please note that in order to avoid similar problems with our pre-installed software, we strongly recommend against using the -V parameter with qsub - see our software guide for further info.
- If something goes wrong in this setup, an informative error message should be printed with the "[BMRC] ..." prefix and R will automatically quit - it will quit immediately on startup and no R jobs will run. Please let us know if you are unable to resolve these errors yourself.
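The prerequisites from the first steps above (a new library folder whose parent directory already exists, and a ~/.Rprofile file) can be checked from the shell before R is started for the first time. The sketch below is one possible pre-flight check, not part of the BMRC setup. The function name prepare_r_setup and its optional second argument are hypothetical, and it deliberately does not write any R code into the profile.

```shell
# Hypothetical helper: check the prerequisites before first use.
# $1 is the chosen library base directory (your R_LIBS_BASE).
# $2 is the profile file to create if missing (defaults to ~/.Rprofile).
prepare_r_setup() {
    base=$1
    profile=${2:-$HOME/.Rprofile}
    parent=$(dirname "$base")

    # Everything up to the last directory of the base must already exist.
    if [ ! -d "$parent" ]; then
        echo "Parent directory $parent does not exist" >&2
        return 1
    fi

    # The base itself should be a new folder, not an existing library.
    if [ -e "$base" ]; then
        echo "$base already exists; please choose a new folder" >&2
        return 1
    fi

    # Create an empty profile only if none exists; never overwrite one.
    [ -e "$profile" ] || : > "$profile"
    echo "OK: $base can be used as R_LIBS_BASE"
}
```

A call would pass the directory chosen in the first step as the only argument. Because the check rejects a base folder that already exists, it is only meaningful before the first R session has created that folder.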