Slurm GPU partition and QOS#

Job Submission#

When you submit a job with Slurm on Liger, you must specify:

  • A partition which defines the type of compute nodes you wish to reserve.
  • A QoS (Quality of Service) which calibrates your resource needs (number of nodes, execution time, ...). If not specified, the default QoS defined for your account is used (usually qos_gpu).

There is a single partition on Liger for GPU resources (including turing01): it is called gpus.
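
For example, a minimal submission to this partition might look like the following sketch (my_gpu_job.slurm is a placeholder script name):

    # submit a batch script to the GPU partition with the default QoS
    sbatch -p gpus --qos=qos_gpu my_gpu_job.slurm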

Partition#

The Slurm partition configured for the GPU nodes:

PartitionName=gpus
   AllowGroups=ALL AllowAccounts=gpu-coquake,gpu-milcom,gpu-others,gpu-ici AllowQos=ALL
   AllocNodes=ALL Default=NO
   DefaultTime=01:00:00 DisableRootJobs=NO GraceTime=0 Hidden=NO
   MaxNodes=UNLIMITED MaxTime=4-12:00:00 MinNodes=1 LLN=NO MaxCPUsPerNode=UNLIMITED
   Nodes=viz[01-04],turing01
   Priority=1 RootOnly=NO ReqResv=NO Shared=YES:4 PreemptMode=OFF
   State=UP TotalCPUs=120 TotalNodes=4 SelectTypeParameters=N/A
   DefMemPerCPU=8192 MaxMemPerNode=368640

That means here we have:

  • 8192 MB of RAM per core
  • 12 cores per GPU
  • a total of 368 GB of RAM per node

Note: DefMemPerCPU and MaxMemPerNode correspond to the maximum memory of the nodes with the largest capacity in Liger. Other GPU nodes have less memory and will therefore return an error if you try to reserve more memory than they have.
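
Because of this, it is safest to request memory explicitly when you need more than the default per-core allocation. A minimal sketch, with illustrative values that must fit the node you land on:

    # request 4 cores with the default 8192 MB each (32 GB total) -- illustrative values
    #SBATCH --cpus-per-task=4
    #SBATCH --mem-per-cpu=8192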

QoS policy#

Partition   QoS            Time Limit   MaxJobsPerUser
gpus        qos_gpu        20 hours     3
gpus        qos_gpu-long   100 hours    2
gpus        qos_gpu-dev    2 hours      2
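
For example, to run a short debugging job under the development QoS (a sketch; the time limit is illustrative and must stay within the 2-hour cap):

    # short debug job under the development QoS
    #SBATCH --partition=gpus
    #SBATCH --qos=qos_gpu-dev
    #SBATCH --time=01:00:00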

Requesting GPUs#

To request GPU nodes:

  • 1 node with 1 core and 1 GPU card

    --gres=gpu:1

  • 1 node with 2 cores and 2 GPU cards

    --gres=gpu:2 -c2

  • 1 node with 3 cores and 3 GPU cards, specifically requesting Tesla V100 cards. Note that it is always best to request at least as many CPU cores as GPUs.

    --gres=gpu:V100:3 -c3

The available GPU node configurations are shown here.
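
Putting these options together, a complete batch script might look like the following sketch (the job name, wall time, and application are placeholders; adjust them to your case):

    #!/bin/bash
    #SBATCH --job-name=gpu_test        # placeholder job name
    #SBATCH --partition=gpus           # GPU partition on Liger
    #SBATCH --qos=qos_gpu              # default GPU QoS (20-hour limit)
    #SBATCH --gres=gpu:2               # 2 GPU cards
    #SBATCH --cpus-per-task=2          # at least as many cores as GPUs
    #SBATCH --time=04:00:00            # wall time, within the QoS limit

    srun ./my_gpu_application          # placeholder application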

When you request GPUs, the system sets two environment variables; we strongly recommend that you do not change or unset them:

CUDA_VISIBLE_DEVICES
GPU_DEVICE_ORDINAL
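
You can inspect their values from inside a job step, for example (a sketch, reusing the request options shown above):

    # print the GPUs visible to the job step
    srun -p gpus --gres=gpu:2 -c2 bash -c 'echo $CUDA_VISIBLE_DEVICES'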

To your application, it will look like you have GPUs 0, 1, ... (up to as many GPUs as you requested). So if, for example, there are two jobs from different users, the first one requesting 1 GPU card and the second 3 GPU cards, and they happen to land on the same node gpu-08: