GPU Resources 2026
How to access the BMRC GPU resources.
Before running jobs on the BMRC GPU cluster you should know:
1 Whether your Slurm project has GPU shares
2 Which GPUs you want your jobs to run on
3 How long your jobs will take
Jump to 'Submitting Jobs' for example Slurm commands.
Overview for PIs
Since 2011, BMRC has sold access to the cluster based on the idea of a “share”. In its simplest form, you could imagine that each group was buying a share of the cluster platform. If one group bought twice the share of another group then, at any time, the scheduler would try to be running twice the number of jobs for the first group as the second group. However, if only one group was submitting jobs at a certain time, then all resources would be handed to that group: use it or lose it, credits could not be stored up. While the details of the implementation have changed and BMRC has become immensely more complex, the basic approach remains the same. You can find out more about the BMRC share philosophy and the fairshare calculation at: https://www.medsci.ox.ac.uk/for-staff/resources/bmrc/cluster-shares.
When there were only a few GPUs in the cluster, and only a few groups using them, we were able simply to extend the existing share model to cover them without too many issues. However, over the past few years GPU methods have become mainstream and GPUs have become incredibly expensive: BMRC currently invests much more in GPU-accelerated servers than in CPU-only servers, but even so it has many fewer GPUs than CPU cores. This has led to severe scheduling challenges with the current approach and means that BMRC cannot recover enough through its charging to replace GPUs as they age without unfairly charging CPU users.
BMRC is now charging separately for CPU shares and GPU shares. To run CPU-only jobs you will need CPU shares and to run GPU jobs you will need GPU shares – to run both types of compute you will need both types of shares. This has meant that CPU shares are now significantly cheaper than they were previously. Note that the cost of the GPU share includes the costs of the CPUs and memory of the servers hosting the GPUs: you don’t need CPU shares to run a GPU job.
Having looked at GPU usage patterns we have decided we can offer entry-level continuous access to GPUs for small-to-moderate use at just over £1000 per project per year. Groups with heavier usage will simply need to buy multiple shares. Just as with CPU shares and to avoid overcommitting, there is a cap on the total number of GPU shares that will be sold for the GPU partitions. The cap is related to the total number of physical GPUs that we have available.
Extending the share approach to GPUs has been complicated, since there are many different kinds of GPU. To solve this, we have defined a weighting for each type of GPU depending on the cost of providing it (buying, powering and administering it). When accounting is performed, the runtime of a job is multiplied by the relevant GPU weight, meaning that users can use more time on cheaper GPUs for the same cost to the project. This also means that only one type of GPU share is needed: it covers all our GPU types, from the RTX6000 to the H200, dramatically simplifying the accounting.
Shares
BMRC sells both CPU and GPU shares to projects to enable use of the Slurm cluster. CPU and GPU shares are separate: CPU shares do not give access to GPU-accelerated nodes and vice versa. If you wish to enquire about GPU shares please send a request to bmrc-help@medsci.ox.ac.uk.
A GPU share is an abstract quantity which affects the scheduling priority of the job. Once a job is running it gets all the resources that it has requested and it is not competing with other jobs. More shares mean higher priority for jobs in the queue. Each type of GPU has been given a weighting depending on the cost of providing them (buying, powering and administering them). For reference, an 80GB A100 GPU is defined to have a weighting of 1.00.
Based on an analysis of previous usage we defined an appropriate GPU share price based on the cost of continuous usage of up to 1/3 of a GPU. If there is no GPU usage for any particular quarter then there will be no charge for the GPU shares for that quarter.
Selecting the GPU type
BMRC supports a number of GPU types. There is a partition for each GPU type. You must select the partitions corresponding to the GPUs that you wish to run your jobs on.
Important factors are:
1 Billing weight
2 GPU memory
3 Numerical precision
Each GPU type has a billing weight reflecting the cost of running the GPU nodes. This is factored into usage calculations by the scheduler. More expensive GPUs have a higher weight than cheaper GPUs; if the weight is <1 then less usage is accounted for. This affects how jobs are prioritised in the scheduler: less accounted usage leads to a higher priority. Note that Slurm calls this a 'billing' weight; it is *not* a financial weighting on the cost of a GPU share.
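As an illustration of how the weights scale accounted usage (the weights are taken from the partition table below; the 100-hour runtime is just an example figure):

```shell
# Illustration only: a job's runtime is multiplied by the partition's
# billing weight when usage is accounted.
hours=100

# A100 80GB is the reference GPU with weight 1.00
billed_a100=$(awk -v h="$hours" 'BEGIN { printf "%.1f", h * 1.00 }')

# RTX8000 has weight 0.72, so the same runtime accounts for less usage
billed_rtx8000=$(awk -v h="$hours" 'BEGIN { printf "%.1f", h * 0.72 }')

echo "A100 80GB: $hours GPU-hours -> $billed_a100 weighted GPU-hours"
echo "RTX8000:   $hours GPU-hours -> $billed_rtx8000 weighted GPU-hours"
```

Because less accounted usage means higher queue priority, the same project can run roughly 40% more hours on RTX8000 nodes than on A100 80GB nodes before accruing the same accounted usage.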
To maximise throughput of your jobs you should have an estimate of the GPU memory required by your job and select partitions that satisfy the memory requirement. You can find information about GPU memory in the tables below.
All GPUs provide excellent FP32 (default single precision) performance, and most applications work at this precision. If your jobs require double precision (FP64), a lower precision (e.g. FP16, FP8), or more sophisticated cores for e.g. AI workloads, then you should carefully select the appropriate GPUs for your task (A100, V100, P100).
There is a partition (gpu_inter) for users who need shell access to a GPU node for development or analysis.
Partition Table
| PARTITION | NUM_GPU | GPU_MEMORY(GB) | WEIGHT | MAX_RUNTIME(Hours) | NUM_CPU_DEFAULT | MEM(GB)_DEFAULT |
| Batch Partitions | ||||||
| gpu_a100_80gb | 24 | 80 | 1 | 60 | 11 | 120 |
| gpu_rtx8000_48gb | 12 | 48 | 0.72 | 60 | 7 | 185 |
| gpu_a100_40gb | 16 | 40 | 0.89 | 60 | 7 | 90 |
| gpu_v100_32gb | 2 | 32 | 0.89 | 60 | 7 | 750 |
| gpu_p100_16gb | 12 | 16 | 0.66 | 60 | 5 | 90 |
| gpu_v100_16gb | 4 | 16 | 0.7 | 60 | 11 | 60 |
| Interactive Partitions | ||||||
| gpu_inter | 18 | 24 | 0.59 | 12 | 7 | 80 |
Selecting runtime
The maximum runtime for most of the GPU partitions is 60hrs, some are shorter. If you know your jobs will finish sooner than 60hrs then you can apply a Slurm QOS (Quality of Service) to your jobs which will significantly boost the priority of the job in the queue and apply an appropriate runtime limit to the job. The priority boost is most significant for jobs that run under a 4 hour time limit, followed by a 24hr QOS, with 60hr jobs getting no priority boost at all.
GPU QOS Table
| QOS Name | Runtime (hrs) | Priority Boost |
| Partition QOS | ||
| gpu_bmrc_partition_limits | 60 | 0 |
| gpu_bmrc_interactive_limits | 12 | 0 |
| User selectable QOS | ||
| gpu_bmrc_4hr | 4 | 20000000 |
| gpu_bmrc_24hr | 24 | 10000000 |
Partition QOS are applied automatically when you select a partition for your job.
User selectable QOS can be applied at job submission (e.g. --qos gpu_bmrc_4hr).
Note about limits
As GPUs are a limited resource under considerable demand we need to apply limits to usage to ensure that there is throughput for jobs from all projects and to allow for essential regular maintenance activities to be completed.
We cannot extend the 60hr runtime for jobs in normal operation.
Checkpointing and increasing parallelisation by breaking work into smaller chunks are two common ways to complete your workloads within shorter runtimes. They will also improve the resilience of your workload to interruption.
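One common way to break work into smaller chunks is a Slurm job array, where each array task handles one chunk and fits within a shorter QOS. A minimal sketch (the account name and the workload command are placeholders):

```shell
#!/bin/bash
# Sketch: split one long workload into 10 shorter array tasks so that each
# task finishes within the 24hr QOS. Account name and the commented-out
# workload command are placeholders for illustration.
#SBATCH --account=gpu_mygroup.prj
#SBATCH --partition=gpu_a100_40gb
#SBATCH --gres=gpu:1
#SBATCH --qos=gpu_bmrc_24hr
#SBATCH --array=0-9

# Slurm sets SLURM_ARRAY_TASK_ID for each task (0..9 here)
chunk="${SLURM_ARRAY_TASK_ID:-0}"
echo "Processing chunk $chunk of 10"
# e.g. ./process_data --chunk "$chunk"   (hypothetical workload command)
```

An interrupted array resumes cheaply: only the failed tasks need to be resubmitted, rather than restarting one long monolithic job from scratch.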
The per-project limit on the number of GPUs that can be in use at any time is 24. This applies across all partitions.
At least 1 GPU must be requested (--gres gpu:1) for a job to run.
For all jobs: max 24 GPUs per project, min 1 GPU per job.
For jobs on the batch partitions: 60 hours max runtime.
For sessions on the interactive partition: max 1 GPU per user, 12 hour max runtime.
If you require GPU resources for jobs that must run longer than 60hrs, need direct and immediate access to a GPU, or want to maintain persistent sessions over a long period, BMRC also provides GPU-accelerated VMs in the BMRC private cloud. If you would like to discuss access to the cloud resources please send a request to bmrc-help@medsci.ox.ac.uk.
Hardware
| Node | GPU Type | Slurm Features | Num GPU Cards | GPU RAM (GB) per card | CPU Cores per GPU | RAM (GB) per GPU | CPU Compatibility |
| compg009 | p100-sxm2-16gb | flash | 4 | 16 | 6 | 91.2 | Skylake |
| compg010 | p100-sxm2-16gb | flash | 4 | 16 | 6 | 91.2 | Skylake |
| compg011 | p100-sxm2-16gb | flash | 4 | 16 | 6 | 91.2 | Skylake |
| compg013 | p100-sxm2-16gb | | 4 | 16 | 6 | 91.2 | Skylake |
| compg016 | v100-pcie-32gb | flash | 2 | 32 | 6 | 750 | Skylake |
| compg019 | quadro-rtx6000 | flash | 4 | 24 | 8 | 91.2 | Skylake |
| compg020 | quadro-rtx6000 | flash | 4 | 24 | 8 | 91.2 | Skylake |
| compg021 | quadro-rtx6000 | flash | 4 | 24 | 8 | 91.2 | Skylake |
| compg026 | p100-pcie-16gb | flash | 4 | 16 | 10 | 91.2 | Skylake |
| compg027 | v100-pcie-16gb | | 4 | 16 | 12 | 60.8 | Skylake |
| compg028 | quadro-rtx8000 | flash | 4 | 48 | 8 | 187.2 | Cascadelake |
| compg029 | quadro-rtx8000 | flash | 4 | 48 | 8 | 187.2 | Cascadelake |
| compg030 | quadro-rtx8000 | flash | 4 | 48 | 8 | 187.2 | Cascadelake |
| compg031 | a100-pcie-40gb | flash | 4 | 40 | 8 | 91.2 | Cascadelake |
| compg032 | a100-pcie-40gb | flash | 4 | 40 | 8 | 91.2 | Cascadelake |
| compg033 | a100-pcie-40gb | flash | 4 | 40 | 8 | 91.2 | Cascadelake |
| compg034 | a100-pcie-40gb | flash | 4 | 40 | 8 | 91.2 | Cascadelake |
| compg035 | a100-pcie-80gb | flash | 4 | 80 | 8 | 91.2 | Icelake |
| compg036 | a100-pcie-80gb | flash | 4 | 80 | 8 | 91.2 | Icelake |
| compg037 | a100-pcie-80gb | flash | 2 | 80 | 24 | 256 | Icelake |
| compg038 | a100-pcie-80gb | flash | 2 | 80 | 24 | 256 | Icelake |
| compg039 | a100-pcie-80gb | flash | 4 | 80 | 12 | 128 | Icelake |
| compg040 | a100-pcie-80gb | flash | 4 | 80 | 12 | 128 | Icelake |
| compg041 | a100-pcie-80gb | flash | 4 | 80 | 12 | 128 | Icelake |
| compg042 | a100-pcie-80gb | flash | 4 | 80 | 12 | 128 | Icelake |
| compg047 | l4 | flash | 6 | 24 | 10 | 80 | Emerald Rapids |
Legacy and dedicated hardware
We maintain a number of GPU nodes which are dedicated to specific projects and experimental instrument workflows. Please email us with any questions regarding these dedicated nodes.
There are a small number of partitions dedicated to specific projects or instrument workflows:
gpu_strubi
gpu_cryosparc
| Node | GPU Type | Num GPU cards | GPU RAM (GB) per card | CPU Cores | Total RAM (GB) | CPU Compatibility |
| compg017 | v100-pcie-32gb | 2 | 32 | 24 | 1500 | Skylake |
| compg018 | quadro-rtx6000 | 4 | 24 | 32 | 384 | Skylake |
| compg022 | v100-pcie-16gb | 4 | 16 | 32 | 384 | Skylake |
| compg024 | quadro-rtx6000 | 4 | 24 | 32 | 384 | Skylake |
| compg025 | quadro-rtx8000 | 4 | 48 | 32 | 384 | Skylake |
| compg043 | l40s | 4 | 48 | 12 | 128 | Sapphire Rapids |
| compg044 | l40s | 4 | 48 | 12 | 128 | Sapphire Rapids |
| compg045 | l40s | 4 | 48 | 12 | 128 | Sapphire Rapids |
| compg046 | l40s | 4 | 48 | 12 | 128 | Sapphire Rapids |
Submitting jobs
Jobs are submitted using sbatch in a similar way to a non-GPU job; however, you must supply some extra parameters to indicate your GPU requirements, as follows:
sbatch --account gpu_<X>.prj --partition gpu_p100_16gb --gres gpu:<N> <JOBSCRIPT>
gpu_<X>.prj is the name of the research group/project Slurm GPU account.
<N> is the number of GPUs required for each job.
The default number of CPU cores per GPU depends on the partition (see 'Selecting the GPU type'). You can request more (or fewer) CPU cores for your job with --cpus-per-gpu <N>. Alternatively, you can set the total number of cores required for the job with -c <N>. <N> is the number of cores.
The default system memory available per GPU depends on the partition (see 'Selecting the GPU type'). You can request more (or less) system memory for your job with --mem-per-gpu <M>G. Alternatively, you can specify the total memory requirement for your job with --mem <M>G. <M> is the number of GB of memory required.
Examples:
Submit a job requiring a single A100 80GB GPU:
sbatch -A gpu_<X>.prj -p gpu_a100_80gb --gres gpu:1 <SCRIPT>
Submit a job requiring 2 GPUs, can be RTX8000 or A100 40GB, that will finish in under 24 hours:
sbatch -A gpu_<X>.prj -p gpu_rtx8000_48gb,gpu_a100_40gb --gres gpu:2 --qos gpu_bmrc_24hr <SCRIPT>
Submit a job for an interactive session:
srun -A gpu_<X>.prj -p gpu_inter --gres gpu:1 --pty bash
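The CPU and memory options described above can also be set as #SBATCH directives inside the job script itself. A sketch, with a placeholder account name and illustrative values for the gpu_a100_80gb partition:

```shell
#!/bin/bash
# Sketch: request 2 A100 80GB GPUs with non-default CPU and memory
# allocations. The account name and the commented-out workload line
# are placeholders for illustration.
#SBATCH --account=gpu_mygroup.prj
#SBATCH --partition=gpu_a100_80gb
#SBATCH --gres=gpu:2
#SBATCH --cpus-per-gpu=16    # partition default is 11 cores per GPU
#SBATCH --mem-per-gpu=200G   # partition default is 120 GB per GPU

host=$(hostname)
echo "Job running on $host"
# ./my_gpu_application   (hypothetical)
```

Submitting the script is then just `sbatch <SCRIPT>`, with no extra command-line flags needed.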
Using fast local scratch space
A number of nodes have fast local NVMe drives for jobs that require a lot of I/O. This space can be accessed from:
/flash/scratch
or from project specific folders in /flash on the nodes.
It is the user's responsibility to create a project folder in scratch for their job.
In Slurm you can select nodes with a scratch folder with:
sbatch -A gpu_<X>.prj -p gpu_p100_16gb --gres gpu:1 --constraint "flash" <JOBSCRIPT>
The scratch folder is open to all jobs, so care should be taken to protect your data by placing it in subfolders with the correct permissions.
As the space on these drives is limited you should remove any data from the scratch space when the job is complete. A scheduled automatic deletion from /flash/scratch will be introduced.
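A sketch of this workflow inside a job script ('myproject' is a placeholder project name; the directory check lets the snippet exit cleanly on machines without /flash):

```shell
#!/bin/bash
# Sketch: create a private per-job folder under /flash/scratch, work in it,
# and remove it when the job ends. 'myproject' is a placeholder name.
SCRATCH_ROOT=/flash/scratch
JOB_DIR="$SCRATCH_ROOT/myproject/${SLURM_JOB_ID:-demo}"

if [ -d "$SCRATCH_ROOT" ]; then
    mkdir -p "$JOB_DIR"
    chmod 700 "$JOB_DIR"            # keep other users away from your data
    trap 'rm -rf "$JOB_DIR"' EXIT   # tidy up scratch when the job finishes
    # ... stage input into $JOB_DIR and run the I/O-heavy work there ...
fi
```

The chmod protects the data in the shared scratch area, and the trap removes the folder even if the job fails partway through.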
Monitoring
In an interactive session you should use the nvidia-smi command to check what processes are running on the GPUs and top to check what is running on the CPUs.
You can attach an interactive session to a running job to run nvidia-smi, top or ps to monitor your running jobs with:
srun --jobid <JOB_ID> --pty bash
On the scheduled nodes, from a login node you should run e.g.
squeue -p gpu_rtx8000_48gb,gpu_a100_40gb
to see the jobs running and waiting in those GPU partitions.
You can see the occupancy of the GPUs for a partition with
sinfo -N -O "Nodelist:16,Partition,Available:6,Timelimit,CPUsState,StateCompact:8,Gres:32,GresUsed:32" -p gpu_a100_80gb
