Slurm has two commands which you can use to launch jobs: srun and sbatch.
srun is straightforward: srun <arguments> <program>
You can use various arguments to control the resource allocation of your job, as well as other settings your job requires.
You can find the full list of available arguments here. Some of the most common arguments are:
-c # – allocate # CPUs per task.
‑‑gres=gpu:# – allocate # GPUs.
‑‑mpi=<mpi_type> – define the type of mpi to use. The default (when isn’t explicitly specified) is pmi2.
‑‑pty – run job in pseudo terminal mode. Useful when running bash.
1. To get a shell with two GPUs, run:
srun ‑‑gres=gpu:2 ‑‑pty /bin/bash
run ‘nvidia-smi’ to verify the job received two GPUs.
2. Run the script ‘script.py’ using python3:
srun python3 script.py
sbatch lets you use a Slurm script to run jobs. You can find more information here.