add first version of training materials

Vang Le-Quy 4 years ago
parent e91b5fec06
commit 79165e32ad

@ -0,0 +1,2 @@
pandoc -o SlurmAndSingularityTraining.pdf -t beamer --slide-level=2 --pdf-engine=xelatex --pdf-engine-opt=-shell-escape

@ -0,0 +1,235 @@
---
title: Slurm and Singularity Training for AI cloud
date: April 2019
author:
- Mads Boye
- Tobias Lindstrøm Jensen
- Vang Le-Quy
affiliation: CLAAUDIA, Aalborg University
theme: AAUsimple
aspectratio: 169
header-includes:
- \usepackage{graphicx}
- \usepackage{amsmath}
- \usepackage{minted}
---
## Agenda
# System design and intended uses
- Job partitions: high priority vs. normal; high, normal, and low resource use
- Run modes: batch, interactive

Introduce the system setup, how jobs are run, and how users are expected to use the resources.
# Slurm basics
## Why Slurm?
General introduction about [Slurm](
### Resource management
### Queue system
## Query commands
- sinfo: `sinfo -p batch`, `sinfo --Node`
`sinfo -o "%D %e %E %r %T %z" -p batch`
- squeue: `squeue -u $USER -i60` # query every 60 seconds
- smap - report system, job or step status
- sview
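The query commands above can be combined into a quick status check; a sketch using the `sinfo -o` format string from this slide (the commands run only if Slurm is installed):

```bash
# Compose the queries first so they can be inspected before running.
SINFO_CMD='sinfo -o "%D %e %E %r %T %z" -p batch'
SQUEUE_CMD="squeue -u $USER"
echo "$SINFO_CMD"
echo "$SQUEUE_CMD"
# Guard: only execute where the Slurm client tools exist.
if command -v sinfo >/dev/null 2>&1; then
    eval "$SINFO_CMD"
    eval "$SQUEUE_CMD"
fi
```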
## Accounting commands
- sacct - report accounting information by individual job and job step
- sstat - report accounting information about currently running jobs and job steps (more detailed than sacct)
- sreport - report resources usage by cluster, partition, user, account, etc
- `sacct -A claaudia -u`
- `sreport cluster AccountUtilizationByUser cluster=ngc account=claaudia start=4/01/19 end=4/17/20 format=Accounts,Cluster,TresCount,Login,Proper,Used`
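The accounting commands above can be composed into a per-user query; a minimal sketch (the date and format fields are illustrative, and the command only executes where the Slurm client tools are installed):

```bash
# Build the query first so it can be inspected; dates/fields are examples.
SACCT_CMD="sacct -u $USER --starttime=2019-04-01 --format=JobID,JobName,Partition,Elapsed,State"
echo "$SACCT_CMD"
# Only run on a machine that actually has sacct on PATH (i.e. the cluster).
if command -v sacct >/dev/null 2>&1; then
    eval "$SACCT_CMD"
fi
```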
## Essential commands
- sbatch -- Submit job script (Batch mode)
- salloc -- Create job allocation and start a shell (Interactive Batch mode)
- srun -- Run a command within a batch allocation that was created by sbatch or salloc
- scancel -- Delete jobs from the queue
- squeue -- View the status of jobs
- sinfo -- View information on nodes and queues
- sattach -- Connect stdin/out/err for an existing job or job step
## Interactive jobs
- `salloc --time=<hh:mm:ss> --gres=gpu:2`
- `ssh <email>`
- `srun --time=<hh:mm:ss> --gres=gpu:2`
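Putting the flags above together, an interactive request can be sketched as follows (the walltime and GPU count are example values, not site defaults):

```bash
# Request 2 GPUs for one hour (illustrative values).
WALLTIME=01:00:00
NGPUS=2
ALLOC_CMD="salloc --time=$WALLTIME --gres=gpu:$NGPUS"
echo "$ALLOC_CMD"
# On the cluster this opens a shell inside the allocation.
if command -v salloc >/dev/null 2>&1; then
    $ALLOC_CMD
fi
```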
## Slurm Batch job script
```bash
#!/usr/bin/env bash
#SBATCH --job-name MySlurmJob
#SBATCH --partition batch           # equivalent to PBS batch
#SBATCH --dependency=aftercorr:498  # more info on the Slurm head node: `man --pager='less -p \--dependency' sbatch`
#SBATCH --time 24:00:00             # run for 24 hours
#SBATCH --gres=gpu:2
```
## Slurm Batch job script (cont')
```bash
sbcast --force my.prog /tmp/my.prog
srun --ntasks-per-node=2 /tmp/my.prog
```
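The directives and commands from the two slides above can be assembled into one complete, submittable script; a sketch (the job name, walltime and `my.prog` path come from the slides and are illustrative):

```bash
# Write a complete batch script to disk; submit it with: sbatch myjob.sh
cat > myjob.sh <<'EOF'
#!/usr/bin/env bash
#SBATCH --job-name MySlurmJob
#SBATCH --partition batch
#SBATCH --time 24:00:00
#SBATCH --gres=gpu:2
# Stage the program to local disk on every allocated node, then run it.
sbcast --force my.prog /tmp/my.prog
srun --ntasks-per-node=2 /tmp/my.prog
EOF
chmod +x myjob.sh
```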
## Control job status
- scancel - signal/cancel jobs or job steps
`scancel --user="" --state=pending`
- strigger - event trigger management tools
## Other useful commands
- sbcast - transfer a file to the compute nodes allocated to a job
# Slurm admin
Important readings:
- [Multifactor Priority Plugin](
- [Trackable Resource](
- [Accounting](
- [Resource Limit](
## Slurm admin commands
- `scontrol show job <jobID>`; admin tool: `scontrol show partition`
- `scontrol show partition <partitionName>`
- `scontrol write batch_script job_id optional_filename`
- `scontrol update qos=short jobid=525`
## Slurm admin commands (cont')
- sacctmgr - database management tool
- sprio - view factors comprising a job priority
- sshare - view current hierarchical fair-share information
- sdiag - view statistics about scheduling module operations
## sacctmgr
- sudo sacctmgr modify QOS normal set MaxTRESPerUser=gres/gpu=2
- sacctmgr show qos format=name,priority,maxtresperuser,MaxWall
- `sacctmgr show assoc format=account,user,qos,tres,maxtresperuser,grptres`
# Singularity basics
## Why singularity
To overcome Docker's drawbacks while still working well with Docker:
1. Security
- root access
- resource exposure
2. Compatibility with `slurm`
- resource policy
3. Simplicity
4. HPC-geared
## Commands to learn
`srun singularity -h`
See `singularity help <command>` for details on each:
exec Execute a command within container
run Launch a runscript within container
shell Run a Bourne shell within container
test Launch a testscript within container
apps List available apps within a container
bootstrap *Deprecated* use build instead
build Build a new Singularity container
check Perform container lint checks
inspect Display container's metadata
mount Mount a Singularity container image
pull Pull a Singularity/Docker container to $PWD
## Commands to learn (cont')
1. search
2. build
3. exec
4. inspect
5. pull
6. run
7. shell
8. image.*
9. instance.*
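A typical first workflow with the commands above, sketched with an example image (`docker://ubuntu:18.04` is an assumption; `pull` names the resulting file `ubuntu_18.04.sif` by default):

```bash
# Pull an image to the current directory, then run a command inside it.
PULL_CMD='singularity pull docker://ubuntu:18.04'
EXEC_CMD='singularity exec ubuntu_18.04.sif cat /etc/os-release'
echo "$PULL_CMD"
echo "$EXEC_CMD"
# Guard: only execute where Singularity is actually installed.
if command -v singularity >/dev/null 2>&1; then
    eval "$PULL_CMD"
    eval "$EXEC_CMD"
fi
```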
## build
::: columns
:::: column
![Build IO](images/build_input_output.svg "Build Input Output"){height=65%}
::::
:::: column
```bash
sudo singularity build \
  lolcow.sif \
```
::::
:::
# Scenarios/ Use cases
## Run stock docker images
```bash
srun --gres=gpu:2 bash -c 'mkdir -p $HOME/data;
source $HOME/.bashrc;
singularity run --nv -B $HOME/data:/data \
docker:// nvidia-smi'
```
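A concrete variant of the command above, with a public image standing in for the elided one (`docker://ubuntu:18.04` is only an illustration; `--nv` still exposes the host GPUs inside the container):

```bash
# Compose the command so it can be shown even off-cluster.
RUN_CMD='srun --gres=gpu:1 singularity exec --nv docker://ubuntu:18.04 nvidia-smi'
echo "$RUN_CMD"
# Guard: only execute on a node with the Slurm client tools.
if command -v srun >/dev/null 2>&1; then
    eval "$RUN_CMD"
fi
```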
## Run stock singularity images
`srun --gres=gpu:1 singularity run shub://`
## Build and run NVIDIA’s stock Docker images
- Setup env variables
- build and run
## Write Singularity definition and build
## Build and run customized Singularity images
Singularity [definition file](
This needs an introduction to the Singularity recipe format and the `build` command.
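As a starting point, a minimal recipe can be sketched like this (`demo.def` is a hypothetical filename; build it with `sudo singularity build demo.sif demo.def` on a machine where you have root):

```bash
# Write a tiny two-section definition file to disk.
cat > demo.def <<'EOF'
BootStrap: docker
From: ubuntu:16.04

%post
    # commands here run once, at build time
    apt-get -y update && apt-get -y install curl

%runscript
    # executed by `singularity run`
    echo "Arguments received: $*"
EOF
```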
# QnA
## Questions
- How to run distributed jobs with Singularity?

@ -0,0 +1,160 @@
# This file was downloaded from
BootStrap: docker
From: ubuntu:16.04

%post
# install some system deps
apt-get -y update
apt-get -y install locales curl bzip2 less unzip
# this is a X11 dep for IGV
apt-get -y install libxext6
# tools to open PDF and HTML files
apt-get -y install firefox xpdf
# some extra devel libs
apt-get -y install zlib1g-dev libssl-dev
locale-gen en_US.UTF-8
apt-get clean
# download and install miniconda3
curl -sSL -O
bash -p /opt/miniconda3 -b
rm -fr
export PATH=/opt/miniconda3/bin:$PATH
conda update -n base conda
conda config --add channels conda-forge
conda config --add channels bioconda
# install some bioinfo tools from Bioconda
conda install --yes -c bioconda samtools==1.7
conda install --yes -c bioconda bwa==0.7.17
conda install --yes -c bioconda trimmomatic==0.36
conda install --yes -c bioconda perl-findbin==1.51
conda install --yes -c bioconda fastqc==0.11.7
conda install --yes -c bioconda seqprep==1.2
conda install --yes -c bioconda gatk4==
conda install --yes -c bioconda igv=2.3.98
conda install --yes -c bioconda vcftools==0.1.15
conda install --yes -c bioconda snpeff=4.3.1t-0
conda install --yes -c bioconda varscan==2.4.3
conda install --yes -c bioconda muscle==3.8.1551
conda install --yes -c bioconda mafft==7.313
conda install --yes -c bioconda raxml==8.2.10
conda install --yes -c bioconda beast==1.8.4
conda install --yes -c bioconda phylip==3.696
conda install --yes -c bioconda paml==4.9
conda install --yes -c bioconda qualimap==2.2.2a
conda install --yes -c bioconda picard==2.18.3
conda install --yes -c bioconda biopython==1.71
# install the R programming language
conda install --yes -c conda-forge r-base==3.4.1
# install some dependencies to build R packages
apt-get -y install build-essential gfortran
#conda install --yes -c conda-forge make
#conda install --yes gfortran_linux-64
#conda install --yes gxx_linux-64
#conda install --yes gcc_linux-64
# install some extra R packages
Rscript -e "source (''); biocLite(c('ape', 'pegas', 'adegenet', 'phangorn', 'sqldf', 'ggtree', 'ggplot2', 'phytools'))"
# install the jupyter notebook
conda install --yes jupyter
# install R kernel for jupyter
Rscript -e "source (''); biocLite(c('repr', 'IRdisplay', 'evaluate', 'crayon', 'pbdZMQ', 'git2r', 'devtools', 'uuid', 'digest'))"
ln -s /bin/tar /bin/gtar
Rscript -e "devtools::install_url('')"
#Rscript -e "devtools::install_github('IRkernel/IRkernel')" # this one doesnt work
Rscript -e "IRkernel::installspec(user = FALSE)"
# install TNT
curl -sSL -O
unzip -p tnt > /usr/local/bin/tnt
chmod +x /usr/local/bin/tnt
# download and uncompress figtree to /opt/FigTree_v1.4.3/
# also create a wrapper script in /usr/local/bin
curl -sSL -o /opt/figtree.tgz ""
tar -xvf /opt/figtree.tgz -C /opt/
chmod +x /opt/FigTree_v1.4.3/bin/figtree
# quoted 'EOF' keeps $* from being expanded at build time
cat <<'EOF' >/usr/local/bin/figtree
#!/bin/bash
cd /opt/FigTree_v1.4.3/
java -Xms64m -Xmx512m -jar lib/figtree.jar $*
EOF
chmod +x /usr/local/bin/figtree

%environment
export LANG=en_US.UTF-8
export LANGUAGE=en_US:en
export LC_ALL=en_US.UTF-8
export PATH=/opt/miniconda3/bin:$PATH
%apprun samtools
samtools "$@"
%apprun bwa
bwa "$@"
%apprun trimmomatic
trimmomatic "$@"
%apprun fastqc
fastqc "$@"
%apprun seqprep
seqprep "$@"
%apprun gatk4
gatk-launch "$@"
%apprun vcftools
vcftools "$@"
%apprun snpeff
snpeff "$@"
%apprun varscan
varscan "$@"
%apprun muscle
muscle "$@"
%apprun mafft
mafft "$@"
%apprun raxml
raxmlHPC-PTHREADS "$@"
%apprun beast
beast "$@"
%apprun phylip
phylip "$@"
%apprun paml
codeml "$@"
%apprun picard
picard "$@"
%apprun qualimap
qualimap "$@"
%apprun R
R "$@"
%apprun jupyter
jupyter "$@"
%apprun tnt
tnt "$@"
%apprun figtree
figtree "$@"

@ -0,0 +1,45 @@
# Read more about this definition at
Bootstrap: library
From: ubuntu:18.04
%setup
    touch /file1

%files
    /file1 /opt

%environment
    export LISTEN_PORT=12345
    export LC_ALL=C

%post
    apt-get update && apt-get install -y netcat
    # record build time so %runscript can report it
    NOW=`date`
    echo "export NOW=\"${NOW}\"" >> $SINGULARITY_ENVIRONMENT

%runscript
    echo "Container was created $NOW"
    echo "Arguments received: $*"
    exec echo "$@"

%test
    grep -q NAME=\"Ubuntu\" /etc/os-release
    if [ $? -eq 0 ]; then
        echo "Container base is Ubuntu as expected."
    else
        echo "Container base is not Ubuntu."
        exit 1
    fi

%labels
    Version v0.0.1

%help
    This is a demo container used to illustrate a def file that uses all
    supported sections.
