Memory allocation on GPU partitions

The Slurm options --mem, --mem-per-cpu and --mem-per-gpu cannot currently be used to configure the memory allocation of your job on GPU servers. The memory allocation is determined automatically from the number of reserved CPUs.

To adjust the amount of memory allocated to your job, you must adjust the number of CPUs reserved per task (or GPU) by specifying the following option in your batch scripts, or when using salloc in interactive mode:

   --cpus-per-task=...      # --cpus-per-task=1 by default

Be careful: the default value is --cpus-per-task=1. If you do not change it as explained below, you will have access to less memory per GPU than is available, and your job could quickly run out of memory.

Example: memory allocation on Turing

On Turing, a node of the default gpu partition offers 384 GB of usable memory. The memory allocation is computed automatically on the basis of:

  • 8 GB per reserved CPU core if hyperthreading is deactivated (Slurm option --hint=nomultithread).

A node of the default gpu partition has 4 GPUs and 48 CPU cores: you can, for instance, reserve 1/4 of the node memory per GPU by reserving 12 CPU cores (i.e. 1/4 of the 48 CPU cores) per GPU. However, a slightly smaller value (9 or 10) is suggested, as the server does not allow the full utilisation of the GPUs.

 --cpus-per-task=10     # reserves ~1/4 of the node memory per GPU (default gpu partition)

With --cpus-per-task=10, you have access to 80 GB of memory per GPU (10 cores × 8 GB) if hyperthreading is deactivated, and to half of that memory otherwise.
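The settings of this example can be put together in a batch script. The following is a minimal sketch, assuming the figure of 8 GB per reserved physical core given above; the helper expected_mem_gb is a hypothetical name introduced for illustration and is not part of Slurm.

```shell
#!/bin/bash
#SBATCH --ntasks=1              # one task
#SBATCH --gres=gpu:1            # one GPU for that task
#SBATCH --cpus-per-task=10      # 10 physical cores, hence about 80 GB
#SBATCH --hint=nomultithread    # reserve physical cores (no hyperthreading)

# Hypothetical helper: expected memory in GB for a given number of reserved
# physical cores, assuming 8 GB per core (default gpu partition, no hyperthreading).
expected_mem_gb() {
    local cpus="$1"
    echo $(( cpus * 8 ))
}

# SLURM_CPUS_PER_TASK is set by Slurm when --cpus-per-task is specified;
# fall back to 1 (the default value) when it is not set.
cpus="${SLURM_CPUS_PER_TASK:-1}"
echo "Expected memory: $(expected_mem_gb "$cpus") GB for $cpus core(s)"
```

Run outside Slurm, the script simply reports the memory expected for the default of 1 core, which makes the cost of leaving --cpus-per-task unchanged easy to see.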

Be careful: other gpu servers (full list) might have fewer CPUs. Choose the number of CPUs in proportion to the total number of CPUs of the chosen server.
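As a rough guide for other servers, the upper bound on CPU cores per GPU is the node's core count divided by its GPU count. The sketch below illustrates this arithmetic; cpus_per_gpu is a hypothetical helper name, and the second set of values (24 cores, 8 GPUs) is an invented example, not the description of an actual server.

```shell
#!/bin/bash
# Hypothetical helper: largest whole number of CPU cores that can be
# reserved per GPU on a node with the given core and GPU counts.
cpus_per_gpu() {
    local total_cores="$1"
    local gpus="$2"
    echo $(( total_cores / gpus ))   # integer division, rounds down
}

# Default gpu partition described above: 48 cores, 4 GPUs.
echo "default gpu partition: $(cpus_per_gpu 48 4) cores per GPU"

# Invented node for illustration only: 24 cores, 8 GPUs.
echo "illustrative node:     $(cpus_per_gpu 24 8) cores per GPU"
```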

Comments

You can ask for more memory per GPU by increasing the value of --cpus-per-task, as long as the request does not exceed the total amount of memory available on the node. Be careful: the computing hours are counted proportionately. For example, if you ask for 1 GPU on the default gpu partition by specifying --ntasks=1 --gres=gpu:1 --cpus-per-task=24, you will be billed as for a job running on 2 GPUs, because 24 CPU cores correspond to half of the node.
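The billing example above can be sketched as a small calculation. This is only an illustration under stated assumptions: the job is billed for the larger of the GPUs requested and the GPU share implied by the reserved cores (12 cores per GPU on the default gpu partition), and rounding up to a whole GPU is a further assumption not confirmed by this page. The name billed_gpus is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: number of GPUs a job is billed for, ASSUMING
# 12 cores per GPU and rounding the core share up to a whole GPU.
billed_gpus() {
    local gpus="$1"
    local cpus="$2"
    local cores_per_gpu=12
    # Ceiling division: GPU share implied by the reserved cores.
    local from_cpus=$(( (cpus + cores_per_gpu - 1) / cores_per_gpu ))
    if [ "$from_cpus" -gt "$gpus" ]; then
        echo "$from_cpus"
    else
        echo "$gpus"
    fi
}

# 1 GPU with 24 cores reserved: billed like a 2-GPU job.
echo "1 GPU, 24 cores -> billed as $(billed_gpus 1 24) GPU(s)"
# 1 GPU with 10 cores reserved: billed as 1 GPU.
echo "1 GPU, 10 cores -> billed as $(billed_gpus 1 10) GPU(s)"
```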

If you reserve a node in exclusive mode, you have access to the entire memory capacity of the node, regardless of the value of --cpus-per-task. You will be billed as for a job running on an entire node.

The amount of memory allocated to your job can be seen by running the command:

  $ scontrol show job $JOBID     # look for the value of "mem" in the output
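To avoid reading the whole output, the "mem" value can be filtered out with standard text tools. The following is a minimal sketch: job_mem is a hypothetical helper name, and the sample line is an invented illustration of what the output may contain, since the exact layout depends on the Slurm version installed.

```shell
#!/bin/bash
# Hypothetical helper: read "scontrol show job" output on stdin and print
# the first value attached to "mem=" (for example 80G).
job_mem() {
    grep -oE 'mem=[0-9.]+[KMGT]?' | head -n 1 | cut -d= -f2
}

# Intended usage on the cluster:
#   scontrol show job "$JOBID" | job_mem

# Invented sample line, for illustration only.
sample='   TRES=cpu=10,mem=80G,node=1,billing=10,gres/gpu=1'
printf '%s\n' "$sample" | job_mem
```

Only the first match is printed; if the output lists the memory more than once, inspect the full output instead.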

Important: While the job is in the wait queue (PENDING), Slurm estimates the memory allocated to the job on the basis of logical cores. Therefore, if you have reserved physical cores (with --hint=nomultithread), the value displayed can be half of the expected value. It is updated and becomes correct once the job has started.

To reserve resources on the prepost partition, you may refer to: Memory allocation with Slurm on CPU partitions. The GPU which is available on each node of the prepost partition is automatically allocated to you without needing to specify the --gres=gpu:1 option.