====== Slurm: A quick start tutorial ======
Slurm is **free** software distributed under the [[http://www.gnu.org/licenses/gpl.html|GNU General Public License]].
  
==== What is parallel computing? ====
  
//A parallel job consists of tasks that run simultaneously.// Parallelization can be achieved in different ways, among which:
  * by running a multi-process program, for example using [[https://www.open-mpi.org/|OpenMPI]];
  * by running a multi-threaded program, for example using [[http://en.wikipedia.org/wiki/Pthreads|pthreads]].

A multi-process program consists of multiple tasks orchestrated by MPI and possibly executed on different nodes. A multi-threaded program, on the other hand, consists of a single task that uses several CPUs on the same node. See the [[https://en.wikipedia.org/wiki/Parallel_computing|Wikipedia page on parallel computing]] to know more.

Slurm's command ''srun'' (see below) allows users to create tasks and/or request CPUs for a particular task, so that both types of parallelization mentioned above can be achieved easily. For instance, the //--ntasks n (-n)// option creates **n processes**, while the //--cpus-per-task n (-c)// option creates a single **n-threaded process**. A task cannot be split across several compute nodes. See the examples below.
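To make the distinction concrete, here is a minimal sketch; the program names ''my_mpi_prog'' and ''my_threaded_prog'' are hypothetical placeholders, not programs installed on the cluster:

<code>
# 4 tasks (processes), possibly spread over several nodes
srun --ntasks=4 ./my_mpi_prog

# 1 task with 4 CPUs, i.e. a single 4-threaded process on one node
srun --ntasks=1 --cpus-per-task=4 ./my_threaded_prog
</code>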
==== Slurm's architecture ====
  
<code>
$ sinfo
PARTITION   AVAIL  TIMELIMIT  NODES  STATE NODELIST
playground*    up   infinite      4  alloc maris[029-032]
playground*    up   infinite     32   idle maris[004-022,033,035-046]
lowmem         up 7-00:00:00      1    mix maris047
lowmem         up 7-00:00:00     20   idle maris[048-050,052-068]
lowmem-inf     up   infinite      1    mix maris047
lowmem-inf     up   infinite     20   idle maris[048-050,052-068]
highmem        up 7-00:00:00      6   idle maris[069-074]
highmem-inf    up   infinite      6   idle maris[069-074]
notebook       up   infinite      2    mix maris[023-024]
notebook       up   infinite      4   idle maris[025-028]
</code>
A * near a partition name indicates the default partition. See ''man sinfo''.
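To restrict the listing to a single partition, ''sinfo'' accepts ''-p''; the partition name below is just one of those shown above:

<code>
# show only the nodes of the given partition
$ sinfo -p notebook
</code>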
  
**What jobs exist on the system?**

<code>
$ squeue
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
             12277    lowmem CFnumder marxxxel  R    1:23:18      1 maris047
             12276 playgroun pkequal_ maxxxxel  R    1:24:57      1 maris032
              8439  notebook obrien-j   oxxxen  R   18:10:55      1 maris024
              8749 playgroun slurm_co oxxxxxkh  R   17:28:55      1 maris029
              5801  notebook ostroukh oxxxxxkh  R 4-05:02:54      1 maris023
              8750 playgroun slurm_en oxxxxxkh  R   17:28:52      4 maris[029-032]
</code>

To display only the active jobs of a particular user, use ''squeue -u <username>''.
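A more selective query is also possible; the state filter and output format below are only one possible choice:

<code>
# list only the running jobs of a given user, with a custom output format
$ squeue -u <username> -t RUNNING -o "%.10i %.10P %.20j %.4T %.10M %R"
</code>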
  
<code>
$ scontrol show partition notebook
PartitionName=notebook
   AllowGroups=ALL AllowAccounts=ALL AllowQos=ALL
   AllocNodes=ALL Default=NO QoS=notebook
   DefaultTime=NONE DisableRootJobs=NO ExclusiveUser=NO GraceTime=0 Hidden=NO
   MaxNodes=UNLIMITED MaxTime=UNLIMITED MinNodes=1 LLN=NO MaxCPUsPerNode=UNLIMITED
   Nodes=maris0[23-28]
   PriorityJobFactor=1 PriorityTier=1 RootOnly=NO ReqResv=NO OverSubscribe=NO
   OverTimeLimit=NONE PreemptMode=OFF
   State=UP TotalCPUs=48 TotalNodes=6 SelectTypeParameters=NONE
   DefMemPerNode=UNLIMITED MaxMemPerCPU=4096
</code>
  
<code>
$ scontrol show node maris004
NodeName=maris004 Arch=x86_64 CoresPerSocket=4
   CPUAlloc=8 CPUErr=0 CPUTot=8 CPULoad=0.01
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=(null)
   NodeAddr=maris004 NodeHostName=maris004 Version=16.05
   OS=Linux RealMemory=16046 AllocMem=16000 FreeMem=2082 Sockets=2 Boards=1
   State=ALLOCATED ThreadsPerCore=1 TmpDisk=9951 Weight=1 Owner=N/A MCS_label=N/A
   BootTime=2016-12-22T12:08:05 SlurmdStartTime=2017-02-17T09:19:46
   CapWatts=n/a
   CurrentWatts=0 LowestJoules=0 ConsumedJoules=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
</code>
  
<code>
novamaris [1087] $ scontrol show jobs 1052
JobId=1052 JobName=slurm_engine.sbatch
   UserId=xxxxxxx(1261909) GroupId=lorentz(9999) MCS_label=N/A
   Priority=1 Nice=0 Account=zzzzzz QOS=normal
   JobState=RUNNING Reason=None Dependency=(null)
   Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
   RunTime=00:49:06 TimeLimit=UNLIMITED TimeMin=N/A
   SubmitTime=2017-02-23T12:17:34 EligibleTime=2017-02-23T12:17:34
   StartTime=2017-02-23T12:17:36 EndTime=Unknown Deadline=N/A
   PreemptTime=None SuspendTime=None SecsPreSuspend=0
   Partition=average-computation AllocNode:Sid=maris004:20658
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=maris[024-033,035-040]
   BatchHost=maris024
   NumNodes=16 NumCPUs=128 NumTasks=128 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
   TRES=cpu=128,mem=514784M,node=16
   Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
   MinCPUsNode=1 MinMemoryNode=32174M MinTmpDiskNode=0
   Features=(null) Gres=(null) Reservation=(null)
   OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
   Command=./slurm_engine.sbatch
   WorkDir=/marisdata/%u/
   StdErr=/marisdata/xxxxxxx/.log/abcd.err
   StdIn=/dev/null
   StdOut=/marisdata/xxxxxxx/.log/abcd.out
   Power=
</code>
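As a small sketch combining the two commands (assuming you have at least one job in the queue):

<code>
# take the id of your first listed job and show its full details
$ jobid=$(squeue -u $USER -h -o %i | head -n 1)
$ scontrol show job $jobid
</code>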
  
<code>
srun -N2 -l /bin/hostname
0: maris005
1: maris006
</code>
  
**Create three tasks running on the same node**
<code>
srun -n3 -l /bin/hostname
</code>
**Create three tasks running on different nodes, specifying which nodes should __at least__ be used**

<code>
srun -N3 -w "maris00[5-6]" -l /bin/hostname
</code>

**Create a job script and submit it to Slurm for execution**
  
Suppose ''batch.sh'' has the following contents:
<code>
#!/usr/bin/env bash
#SBATCH -n 2
#SBATCH -w maris00[5-6]
srun hostname
</code>

then submit it with ''sbatch batch.sh''.
  
See ''man sbatch''.
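A slightly fuller sketch of a batch script; the job name, partition and resource values below are only illustrative, not site defaults:

<code>
#!/usr/bin/env bash
#SBATCH --job-name=example        # illustrative job name
#SBATCH --partition=playground    # pick a partition from the sinfo listing
#SBATCH --ntasks=4                # number of tasks (processes)
#SBATCH --cpus-per-task=1         # CPUs per task
#SBATCH --time=01:00:00           # wall-time limit
#SBATCH --output=job-%j.out       # stdout file; %j = job id

srun hostname
</code>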
  
==== Less-common user commands ====
=== sacctmgr ===

''sacctmgr'' displays (and, for administrators, modifies) Slurm accounting information such as accounts and QOS limits. For instance:
<code>
$ sacctmgr show qos format=Name,MaxCpusPerUser,MaxJobsPerUser,Flags
      Name MaxCPUsPU MaxJobsPU                Flags
---------- --------- --------- --------------------
    normal
playground        32                    DenyOnLimit
  notebook                              DenyOnLimit
</code>
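To see which account, partitions and QOS your own user is associated with, something along these lines should work; the field selection is only an example:

<code>
$ sacctmgr show associations user=<username> format=Account,User,Partition,QOS
</code>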
  
=== sstat ===

<code>
$ sstat -j 8749.batch
</code>
  
:!: Note that in the example above the job is identified by the id ''8749.batch'', in which the word ''batch'' is appended to the id displayed by ''squeue''. This addition is necessary whenever the running program is not parallel, i.e. it does not use ''srun''; for parallel jobs ''sstat <jobid>'' works as-is.
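For example, a sketch of querying the memory usage of the running batch step above; the chosen fields are only a common selection, see ''man sstat'' for the full list:

<code>
$ sstat -j 8749.batch --format=JobID,MaxRSS,AveRSS
</code>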
  
=== sshare ===
  
<code>
$ sshare -U -u <username>
             Account       User  RawShares  NormShares    RawUsage  EffectvUsage  FairShare
-------------------- ---------- ---------- ----------- ----------- ------------- ----------
xxxxx                    yyyyyy          1    0.024390    37733389      0.076901   0.112428
</code>
  
:!: On maris, usage parameters decay over time according to a PriorityDecayHalfLife of 14 days.
  
=== sprio ===
''sprio'' shows the factors that determine a pending job's scheduling priority; see ''man sprio''.

=== sacct ===

<code>
$ sacct
</code>
:!: Use ''--noconvert'' if you want ''sacct'' to display consistent units across jobs.
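For instance, a sketch of an accounting query for a finished job (replace ''<jobid>''; the field selection is just one reasonable choice):

<code>
$ sacct -j <jobid> --noconvert --format=JobID,JobName,Elapsed,MaxRSS,State
</code>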
===== Tips =====
  
To minimize the time your job spends in the queue, you can specify multiple partitions so that the job starts as soon as any of them has free resources. Use, for instance, ''--partition=notebook,playground,computation''.
  
To get a rough estimate of when your queued job will start, type ''squeue --start''.
  
<code>
sinfo -i 5 -S"-O" -o "%.9n %.6t %.10e/%m %.10O %.15C"
</code>
  
 For instance ''#SBATCH --nodelist=maris0xx'' For instance ''#SBATCH --nodelist=maris0xx''
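In a batch script the corresponding directives look as follows; the node names are placeholders, and ''--exclude'' is the complementary option for avoiding particular nodes:

<code>
#SBATCH --nodelist=maris005      # request this specific node (placeholder name)
#SBATCH --exclude=maris006       # or avoid this node instead
</code>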
  
=== Environment variables available to slurm jobs ===
  
You can use any of the following variables in your jobs. Type ''printenv | grep -i slurm'' inside a job to display them:
  
<code>
$ salloc -p playground -N 10
salloc: Granted job allocation 13709
$ printenv | grep -i slurm_
SLURM_NODELIST=maris[031-033,035-041]
SLURM_JOB_NAME=bash
SLURM_NODE_ALIASES=(null)
SLURM_JOB_QOS=normal
SLURM_NNODES=10
SLURM_JOBID=13709
SLURM_TASKS_PER_NODE=1(x10)
SLURM_JOB_ID=13709
SLURM_SUBMIT_DIR=/tmp/bla-bla
SLURM_JOB_NODELIST=maris[031-033,035-041]
SLURM_CLUSTER_NAME=maris
SLURM_JOB_CPUS_PER_NODE=1(x10)
SLURM_SUBMIT_HOST=novamaris.lorentz.leidenuniv.nl
SLURM_JOB_PARTITION=playground
SLURM_JOB_ACCOUNT=yuyuysu
SLURM_JOB_NUM_NODES=10
SLURM_MEM_PER_NODE=32174
</code>
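These variables can also be used inside batch scripts. A minimal sketch, assuming a multi-threaded program (''my_threaded_prog'' is a placeholder) that takes its thread count from OpenMP:

<code>
#!/usr/bin/env bash
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8

# match the number of threads to the CPUs allocated by Slurm
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

echo "Job $SLURM_JOB_ID runs on $SLURM_JOB_NODELIST"
srun ./my_threaded_prog
</code>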