institute_lorentz:xmaris [2024/02/29 14:16] (current) – jansen
====== Xmaris ======
Xmaris is a small computational cluster at the [[https://www.lorentz.leidenuniv.nl|Lorentz Institute]] financed by external research grants. As such, access is granted **primarily** to the research groups that have been awarded the grants. Other research groups wishing to use xmaris can enquire whether there is any left-over computing time by getting in touch with either
  
|Xavier Bonet Monroig|Oort 260|
|Carlo Beenakker|Oort 261|
  
to discuss what resources can be made available for their needs. After a preliminary assessment and approval, access to xmaris will be granted by the IT staff. Any technical questions should be addressed via https://helpdesk.lorentz.leidenuniv.nl to

|Leonardo Lenoci | HL409b|
  
:!: Research groups external to the Lorentz Institute are strongly encouraged to explore other HPC possibilities, such as the [[https://wiki.alice.universiteitleiden.nl/index.php?title=ALICE_User_Documentation_Wiki|ALICE HPC cluster]] of the University of Leiden.
  
Xmaris is optimised for [[https://en.wikipedia.org/wiki/Thread_(computing)#Multithreading|multithreading applications]] and [[https://en.wikipedia.org/wiki/Embarrassingly_parallel|embarrassingly parallel problems]], but there have been some recent investments to improve node interconnect communication to enable [[institute_lorentz:xmaris#parallelism_101|multiprocessing]]. Currently, multiprocessing is possible on the nodes of the ''ibIntel'' partition, which are interconnected via an **InfiniBand EDR** switch. Each one of these nodes is capable of a practical __9.6 TFLOPS__.
  
  
Xmaris is the successor of the maris cluster, renamed with the prefix ''x'' because its node deployment is automated using the [[https://www.xcat.org/|xCAT]] software. Less formally, the ''x'' prefix also suggests the time of year when xmaris was first made available to IL users, that is Christmas (Xmas).
  
[[https://www.gnu.org/|{{https://www.gnu.org/graphics/heckert_gnu.transp.small.png?50 }}]][[https://wiki.centos.org/|{{https://wiki.centos.org/ArtWork/Brand/Logo?action=AttachFile&do=get&target=centos-logo-light.png?200 }}]] [[https://openondemand.org/|{{https://www.osc.edu/sites/default/files/OpenOnDemand_horiz_RGB.png?200  }}]] [[https://slurm.schedmd.com|{{https://slurm.schedmd.com/slurm_logo.png?60  }}]] [[https://easybuild.readthedocs.io/en/latest/|{{https://docs.easybuild.io/img/easybuild_logo.png?100  }}]]
===== Xmaris features and expected cluster lifetime =====
  
Xmaris runs CentOS v7 and consists, for historical reasons, of heterogeneous computation nodes. A list of configured nodes and partitions on the cluster can be obtained on the command line using slurm's ''sinfo''.
  
:!: Because Xmaris features different CPU types that understand different types of instructions (see [[https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html|here]]), we have associated with each computation node a list of slurm ''Features'' that, among other things, describe the type of CPUs mounted in that node. To request allocation of specific features from the resource manager, see [[institute_lorentz:xmaris#request_nodes_with_particular_features|this example]].
  
You can display the specs and features of a single node, or of all nodes, with ''sinfo''
  
<code bash>
# specific node
sinfo -o " %n  %P %t %C %z %m %f" -N -n maris077
# all nodes
sinfo -o " %n  %P %t %C %z %m %f" -N
# all nodes (more concise)
sinfo -Nel
</code>
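As a sketch of how node features can be used (the feature names below, e.g. ''broadwell'' and ''10Gb'', are illustrative; check the ''AVAIL_FEATURES'' column of the ''sinfo'' output for the ones actually advertised), slurm's ''--constraint'' option restricts a job to nodes with a given feature:

<code bash>
# run only on nodes whose Features list includes "broadwell"
sbatch --constraint=broadwell myjob.sh
# features can be combined, e.g. broadwell AND a 10Gb link
srun --constraint="broadwell&10Gb" --pty bash -i
</code>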
  
Xmaris aims to offer a //stable// computational environment to its users in the period **Dec 2019 -- Jan 2024**. Within this period, the OS might be patched only with important security updates. Past January 2024, all working xmaris nodes will be re-provisioned from scratch with newer versions of the operating system and software infrastructure. At this time all data stored in the [[institute_lorentz:xmaris#compute_nodes_data_disks|temporary scratch disks]] will be destroyed and the disks reformatted.
  
==== Compute nodes data disks ====
  
All compute nodes have access to **//at least//** the following data partitions
  
^Mount Point^ Type ^Notes^
|/scratch | HD | **temporary**, local|
|/marisdata |NetApp| 2TB/user quota, medium-term storage, remote|
|/home |NetApp| 10GB/user quota, medium-term storage, remote|
|/ilZone/home | [[institute_lorentz:irods_fair_storage|iRODS]]| 20GB/user quota, archive storage, remote|
  
  
Additional, faster scratch spaces are available to all nodes on the InfiniBand network (''ibIntel'' partition)
 + 
^Mount Point^ Type^ Notes^
|/IBSSD| SSD |**DISCONTINUED**, InfiniBand/iSER((iSER stands for “iSCSI Extensions for RDMA”. It is an extension of the iSCSI protocol that includes RDMA (Remote Direct Memory Access) support. BeeGFS is a parallel filesystem. IBSSD will be discontinued by the end of 2022 in favour of PIBSSD.))|
|/PIBSSD| SSD|**temporary**, InfiniBand/BeeGFS|
 + 
 + 
Backup snapshots of ''/home'' are taken hourly, daily, and weekly and stored in ''/home/.snapshot/''.
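For instance, an accidentally deleted file can usually be copied back from a snapshot; a sketch (the snapshot name and file paths are illustrative, list ''/home/.snapshot/'' to see what is actually available):

<code bash>
# list the available snapshots
ls /home/.snapshot/
# restore a file from, e.g., the most recent hourly snapshot
cp /home/.snapshot/hourly.0/myfile.txt ~/myfile.txt
</code>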
  
Xmaris users are strongly advised to delete their data (or at least move it to the shared data disk) from the compute nodes' scratch disks upon completion of their calculations. All data on the scratch disks __might be deleted without prior notice__.
  
Note that **disk policies might change at any time at the discretion of the cluster owners**.
  
  
----
  
Please also note the following
  * xmaris' home disk is different from your [[institute_lorentz:gnulinux_workstations|IL workstation]] or [[institute_lorentz:remote_workspace|remote workspace]] home disk.
  * The **OLD** (as in the old maris) ''/clusterdata'' is deliberately made unavailable on xmaris, because it is **no longer** maintained. If you have any data on it, **it is your responsibility** to create backups. All data on ''/clusterdata'' will get permanently lost in case of hardware failure.
  
==== Xmaris usage policies ====
  
Usage policies are updated regularly in accordance with the needs of the cluster owners and **may change at any time without notice**. At the moment there is an __enforced usage limit of 128 CPUs per user__ that does not apply to the owners. Job execution priorities are defined via a complex [[https://slurm.schedmd.com/archive/slurm-21.08.8-2/priority_multifactor.html|multi-factor algorithm]] whose parameters can be displayed on the command line via
<code bash>
scontrol show config | grep -i priority
</code>
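To see how these factors rank your own pending jobs, slurm's ''sprio'' utility prints the per-factor priority breakdown, for instance:

<code bash>
# per-factor priority contributions of all pending jobs
sprio -l
# restrict the output to your own jobs
sprio -l -u $USER
</code>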
  
===== Xmaris live state =====
To monitor live usage of Xmaris you can either
  
  * execute slurm's ''sinfo'' (this requires shell access to the cluster, see below)
  * browse to https://xmaris.lorentz.leidenuniv.nl/ganglia/ from any IL workstation
  
  
===== How to access Xmaris =====
Access to Xmaris is not granted automatically to all Lorentz Institute members. Instead, a preliminary approval must be granted to you by the cluster owners (read [[|here]]).
  
Once you have been authorised to use Xmaris, there are two ways to access its services:
  
  - using a web browser (**strongly advised** for the novice HPC user)
  - using an SSH client (expert users)
  
Both methods can provide terminal access, but connections via web browsers offer extra services such as sftp (drag-and-drop file transfers), jupyter interactive notebooks, virtual desktops and more at the click of your mouse. **We advise** all users either unfamiliar with the GNU/Linux terminal or new to HPC to use a web browser to interact with Xmaris.
  
==== Access via an SSH client ====
  
The procedure differs depending on whether the client you connect from is inside the IL intranet or not.
  
  * When within the IL network, for instance if you are using a Lorentz Institute workstation, you have direct access to Xmaris. Open a terminal and type the command below (substitute username with your own IL username)
  
<code bash>
ssh xmaris.lorentz.leidenuniv.nl -l username
</code>

  * When outside the IL network, for instance from home or using a wireless connection, you must first initiate an ssh tunnel to our SSH server and then connect to Xmaris

<code bash>
ssh -o ProxyCommand="ssh -W %h:%p username@styx.lorentz.leidenuniv.nl" username@xmaris.lorentz.leidenuniv.nl
</code>
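For convenience, the hop can be stored in your local ''~/.ssh/config'' (a sketch; the ''xmaris'' host alias is arbitrary, substitute username with your IL username, and ''ProxyJump'' requires a reasonably recent OpenSSH):

<code bash>
# ~/.ssh/config
Host xmaris
    HostName xmaris.lorentz.leidenuniv.nl
    User username
    ProxyJump username@styx.lorentz.leidenuniv.nl
</code>

after which simply typing ''ssh xmaris'' suffices from anywhere.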
  
  
|:!: If you were a maris user prior to the configuration switch to xmaris, you might find that many terminal functions and programs do not work as expected. This is due to the presence in your xmaris home directory of old shell initialisation scripts still tied to the STRW sfinx environment. You can override them (preferably after making a backup copy) by replacing their contents with the default CentOS shell initialisation scripts; for bash, for instance, these are located in ''/etc/skel/.bashrc'' and ''/etc/skel/.bash_profile''.|
==== Web access ====
  
Xmaris services, that is terminal, scheduler/resource manager, jupyter notebooks and monitoring facilities, can be accessed easily via a browser, without the need of additional plugins, by navigating to [[https://xmaris.lorentz.leidenuniv.nl:4433|xmaris OpenOnDemand]].
  
{{ :institute_lorentz:oodxmaris1.png?direct&1000 }}

Similarly to a traditional shell access, Xmaris OpenOnDemand is available only for connections within the __IL intranet__. IL users who wish to access OpenOnDemand from their remote home locations could for example use the [[:vpn#lorentz_institute|IL VPN]] or instruct their browsers to SOCKS-proxy their connections via our SSH server.
Open a local terminal and type (substitute username with your IL username)

<code bash>
ssh -ND 7777 username@ssh.lorentz.leidenuniv.nl
</code>
  
then, in your browser's connection settings, instruct the browser to use the SOCKS proxy located at ''localhost:7777'' to connect to the internet. Alternatively, use the [[institute_lorentz:remote_workspace|Lorentz Institute Remote Workspace]].
  
Xmaris OnDemand allows you to
  
  * Create/edit files and directories.
  * Submit batch jobs to the slurm scheduler/resource manager.
  * Open a terminal.
  * Launch interactive applications such as jupyter notebooks, tensorboard, virtual desktops, etc.
  * Monitor cluster usage.
  * Create and launch your very own OnDemand application (read [[https://osc.github.io/ood-documentation/master/app-development/tutorials-passenger-apps.html|here]]).
  
:!: Please bookmark only https://xmaris.lorentz.leidenuniv.nl:4433 to connect to OpenOnDemand; bookmarking any other URL can result in connection errors.
===== Xmaris Partitions =====
  
^Partition^ Number of nodes ^ Timelimit ^ Notes^
|compAMD*| 6 | 15 days | |
|compAMDlong| 3 | 60 days | |
|compIntel| 2 | 5 days and 12 hours| |
|gpuIntel| 1 | 3 days and 12 hours | GPU |
|ibIntel | 8 | 7 days | InfiniBand, Multiprocessing |
  
*: default partition

===== Xmaris GPUs =====
 + 
^Node^Partition^GPUs^CUDA compatibility^
|maris075 |gpuIntel|2 x Nvidia Tesla P100 16GB | 6.0|

Xmaris GPUs must be allocated using slurm's ''--gres'' option, for instance
<code bash>
srun -p gpuIntel --gres=gpu:1 --pty bash -i
</code>
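The same ''--gres'' request works in batch scripts; a minimal sketch (the job name and the final command are illustrative):

<code bash>
#!/bin/bash
#SBATCH -p gpuIntel
#SBATCH --gres=gpu:1
#SBATCH -J gpu-test

# show the GPU(s) slurm has made visible to the job
nvidia-smi
</code>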
 +===== Xmaris scientific software =====
  
Xmaris uses [[https://easybuild.readthedocs.io/en/latest/|EasyBuild]] to provide a build environment for its (scientific) software. Pre-installed software can be explored by means of the ''module spider'' command. For instance, you can query the system for all modules whose name starts with ''mpi'' by executing ''module -r spider '^mpi'''. Installed software packages include
  
  
|OpenMPI    | Open Source Message Passing Interface Implementation |
|Python     | Programming Language |
|PyCUDA     | Python wrapper to CUDA |
|Perl       | Programming Language |
|R          | R is a Language and Environment for Statistical Computing and Graphics |
|Singularity | Containers software |
|TensorFlow | Machine Learning Platform |
|plc        | The Planck Likelihood Code |
|cobaya | A code for Bayesian analysis in Cosmology |
|Clang      | C language family frontend for LLVM |
|Graphviz | Graph visualization software |
|Octave     | GNU Programming language for scientific computing |
| Mathematica* | Technical computing system |
  
* Usage of proprietary software is discouraged.
  
For an up-to-date list of installed software use the ''module avail'' command.
Any pre-installed software can be made available in your environment via the ''module load <module_name>'' command.
  
It is possible to save a list of modules you use often in a //module collection//, so that you can load them all with one command
<code bash>
module load mod1 mod2 mod3 mod4 mod5 mod6
module save collection1
module savelist
</code>
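A saved collection can later be reloaded with a single command using Lmod's ''module restore'':

<code bash>
# load every module saved in collection1
module restore collection1
</code>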
==== TensorFlow Notes ====
Xmaris has multiple modules that provide TensorFlow. See ''ml avail TensorFlow''.

^Module^Hardware^Partition^Additional Ops^
|TensorFlow/2.1.0-fosscuda-2019b-Python-3.7.4 | CPU, GPU  |gpuIntel|TensorFlow Quantum|
|TensorFlow/1.12.0-fosscuda-2018b-Python-3.6.6 | CPU, GPU |gpuIntel| |
|TensorFlow-1.15.0-Miniconda3/4.7.10| CPU| All| |

The following example shows how you can create a tensorflow-aware jupyter notebook kernel that you can use, for instance, via the OpenOnDemand interface

<code bash>
# We use maris075 (GPU node) and load the optimised tf module
ml load TensorFlow/2.1.0-fosscuda-2019b-Python-3.7.4

# We install ipykernel, which is necessary to run python notebooks
python -m pip install ipykernel --user

# We create a kernel called TFQuantum based on python from TensorFlow/2.1.0-fosscuda-2019b-Python-3.7.4
python -m ipykernel install --name TFQuantum --display-name "TFQuantum" --user

# We edit the kernel such that it does not execute python directly
# but via a custom wrapper script
cat $HOME/.local/share/jupyter/kernels/tfquantum/kernel.json

{
 "argv": [
  "/home/lenocil/.local/share/jupyter/kernels/tfquantum/wrapper.sh",
  "-m",
  "ipykernel_launcher",
  "-f",
  "{connection_file}"
 ],
 "display_name": "TFQuantum",
 "language": "python",
 "metadata": {
  "debugger": true
 }
}

# The wrapper script will call python but only after loading any
# appropriate module
cat /home/lenocil/.local/share/jupyter/kernels/tfquantum/wrapper.sh

#!/usr/bin/env bash
ml load TensorFlow/2.1.0-fosscuda-2019b-Python-3.7.4

exec python "$@"

# DONE. tfquantum will appear in the dropdown list of kernels
# upon creating a new notebook
</code>

=== TensorFlow with Graphviz ===
<code bash>
ml load TensorFlow/2.1.0-fosscuda-2019b-Python-3.7.4
pip install --user pydot
ml load Graphviz/2.42.2-foss-2019b-Python-3.7.4
python -c "import tensorflow as tf;m = tf.keras.Model(inputs=[], outputs=[]);tf.keras.utils.plot_model(m, show_shapes=True)"
</code>

==== Installing extra software ====

If you need to run software that is not present on Xmaris, you can:

  - Request its installation via https://helpdesk.lorentz.leidenuniv.nl
  - Install it yourself
    * via EasyBuild (see instructions below on how to set up your personal EasyBuild environment)
    * via a traditional //configure/make// procedure

Whichever installation method you choose, please note that you **do not have** administrative rights on the cluster.
  
=== Installing software via EasyBuild ===
  
:!: See also [[:easybuild_environment|Working with EasyBuild]].

In order to use EasyBuild to build software, you must first set up your development environment. This is usually done by

  * Loading the EasyBuild module
  * Indicating a directory in which to store your EasyBuild-built software
  * Specifying EasyBuild's behaviour via EASYBUILD_* environment variables
  * Building the software

In their simplest form, the steps outlined above translate into the following shell commands
 + 
<code bash>
module load EasyBuild
mkdir /marisdata/<uname>/easybuild
export EASYBUILD_PREFIX=/marisdata/<uname>/easybuild
export EASYBUILD_OPTARCH=GENERIC

eb -S ^Miniconda
eb Miniconda2-4.3.21.eb -r
</code>
  
|:!: The environment variable ''EASYBUILD_OPTARCH'' instructs EasyBuild to compile software in a generic way so that it can be used on different CPUs. This is rather convenient in heterogeneous clusters such as xmaris because it avoids recompiling the same software on different compute nodes. This convenience comes at a cost: the executables so produced will not be as efficient as CPU-specific builds. For more info read [[https://easybuild.readthedocs.io/en/latest/Controlling_compiler_optimization_flags.html|here]].|
  
|:!: When compiling OpenBLAS it is not sufficient to set ''EASYBUILD_OPTARCH'' to ''GENERIC'' to achieve portability of the executables. Some extra steps must be taken, as described in https://github.com/easybuilders/easybuild/blob/master/docs/Controlling_compiler_optimization_flags.rst. A list of targets supported by OpenBLAS can be found [[https://github.com/xianyi/OpenBLAS/blob/develop/TargetList.txt|here]].|
  
Then execute
  
<code>
module use /marisdata/<uname>/easybuild/modules/all
</code>
  
to make any of the software built in your EasyBuild userspace available to the ''module'' command.

|:!: ''module use <path>'' will prepend <path> to your ''MODULEPATH''. Should you want to append it instead, then add the option ''-a''. To remove <path> from ''MODULEPATH'' execute ''module unuse <path>''.|
  
Should you want to customise the building process of a given software, please read how to implement [[https://easybuild.readthedocs.io/en/latest/Implementing-easyblocks.html|EasyBlocks]] and write [[https://easybuild.readthedocs.io/en/latest/Writing_easyconfig_files.html|EasyConfig]] files, or contact Leonardo Lenoci (HL409b).
  
== Working with conda modules ==
  
Several conda modules are ready to use on xmaris. A possible use of these is to clone and extend them with your packages of choice. Mind, though, that if you run ''conda init'', conda will modify your shell initialisation scripts (e.g. ''~/.bashrc'') to automatically load the chosen conda environment. This causes problems in all cases in which you are supposed to work in a clean environment.

The steps below show, as an example, how you can skip ''conda init'' when activating a conda environment.
 + 
<code>
> ml load Miniconda3/4.7.10
> # note that if you specify a prefix, you cannot specify a name
> conda create [--prefix <location where there is plenty of space and you can write to>] [--name TEST]

# the following fails
> conda activate TEST

CommandNotFoundError: Your shell has not been properly configured to use 'conda activate'.
To initialize your shell, run

    $ conda init <SHELL_NAME>

Currently supported shells are:
  - bash
  - fish
  - tcsh
  - xonsh
  - zsh
  - powershell

See 'conda init --help' for more information and options.

IMPORTANT: You may need to close and restart your shell after running 'conda init'.

## do this instead
> source activate TEST
> # or, if you used the --prefix option to create the env
> # source activate <location where there is plenty of space and you can write to>
> ...
> conda deactivate
</code>
===== How to run a computation on Xmaris =====
 + 
Xmaris runs the slurm scheduler and resource manager. Computation jobs must be submitted as batch jobs or run interactively via slurm. Any other jobs will be terminated **without prior notice**. Because this is not a slurm manual, you are encouraged to learn the basics by reading [[https://slurm.schedmd.com/archive/slurm-18.08.6/|the slurm manual]]. Here we only give a few simple examples.
  
==== Batch jobs ====
Batch jobs are computation jobs that do not execute interactively.

To submit a batch job to Xmaris' slurm, you must first create a shell script which contains enough instructions to request the needed resources and to execute your program. The script can be written in **any** interpreter known to the system. Slurm instructions are prefixed by the chosen interpreter's comment symbol and the word ''SBATCH''.
An example bash batch script that requests Xmaris to execute the program ''hostname'' on one node is
<code>
cat test.sh
#!/bin/bash
#SBATCH --job-name=test
#SBATCH -N 1
hostname
</code>
  
Batch scripts are then submitted for execution via ''sbatch''
<code>
sbatch test.sh
sbatch: Submitted batch job 738279474774293
</code>

and their status (PENDING, RUNNING, FAILED, COMPLETED) checked using ''squeue''. You can use the command ''sstat'' to display useful information about your running job, such as its memory consumption.
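For instance (a sketch only: the job ID ''12345'' is a placeholder, and the ''--format'' fields are standard slurm accounting fields):

<code bash>
# list your own pending and running jobs
squeue -u $USER

# show live memory statistics of a running job, e.g. job 12345
sstat --format=JobID,AveRSS,MaxRSS -j 12345
</code>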
  
-ssh shell access to an executing node is automatically granted by slurm and can also be used for debugging purposes.+ssh __shell access to an executing node is__ automatically __granted__ by slurm and can also be used for debugging purposes
 + 
 +Please consult the [[https://slurm.schedmd.com/archive/slurm-18.08.6/|slurm manual]] for all possible ''sbatch'' options
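For reference, a slightly fuller batch script might look like the sketch below; the partition name, memory, time limit and output pattern are illustrative assumptions that you should adapt to your own job:

<code bash>
#!/bin/bash
#SBATCH --job-name=myjob        # a name for the job
#SBATCH --partition=compIntel   # assumed partition name; pick one that suits your job
#SBATCH --nodes=1               # number of nodes
#SBATCH --ntasks=1              # number of tasks
#SBATCH --mem=2000              # memory per node in MB
#SBATCH --time=01:00:00         # wall-clock time limit
#SBATCH --output=%x-%j.out      # stdout/stderr file (%x = job name, %j = job id)

# the actual computation: print the name of the executing node
hostname
</code>

The ''#SBATCH'' lines are plain comments to bash, so the same file also runs as an ordinary shell script.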
  
==== Interactive jobs ====

Interactive jobs are usually used for debugging purposes and in those cases in which the computation requires human interaction. Using an interactive session you can gain shell access to a compute node

<code>
srun -p compIntel -N1 -n1 --mem=4000 --pty bash -i
</code>
  
or execute an interactive program

<code>
srun  -p compIntel -N1 -n 1 --mem=4000 --pty python -c "import sys; data = sys.stdin.readlines(); print(data)" -i
Hello world
^D
['Hello world\n']
</code>

===== Parallelism 101 =====
  
A parallel job runs a calculation whose computational subtasks are run simultaneously. The underlying principle is that large problems can often be divided into smaller ones that are then solved at the same time.
==== Multiprocessing ====
  
Multiprocessing usually refers to computations subdivided into tasks that run on multiple nodes. This type of programming increases the resources available to your computation (e.g. more memory) by employing several nodes at the same time. ''MPI (Message Passing Interface)'' defines the standards (in terms of syntax and rules) to implement multiprocessing in your codes.
MPI-enabled applications spawn multiple copies of the program, also called //ranks//, mapping each one of them to a processor. A compute node usually has multiple processors. The MPI interface lets you manage the allocated resources and the communication and synchronisation of the ''ranks''.
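As an illustration of the rank model, the session sketch below writes, compiles and launches a minimal MPI hello-world; the module name and partition are assumptions taken from the OpenMPI example further down this page, and ''mpicc'' is the standard MPI compiler wrapper:

<code bash>
# write a minimal MPI program: each rank reports its number
cat > hello.c <<'EOF'
#include <mpi.h>
#include <stdio.h>
int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this copy's rank id */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of ranks */
    printf("Hello from rank %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}
EOF

# load an MPI toolchain (module name is an assumption; check `ml spider OpenMPI`)
ml load OpenMPI/4.1.1-GCC-10.3.0

# compile with the MPI wrapper and launch 4 ranks through slurm
mpicc -o hello hello.c
srun -p ibIntel -N4 -n4 ./hello
</code>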
  
===== How to launch a jupyter notebook =====
  
To launch a jupyter notebook login to [[https://xmaris.lorentz.leidenuniv.nl:4433|xmaris OnDemand]], select ''Interactive Apps --> Jupyter Notebook'', specify the resources needed in the form provided, push ''Launch'' and wait until the notebook has launched.
Now you can interact with your notebook (click on ''Connect to Jupyter''), open a shell on the executing node (click on ''Host >_hostname''), and analyse the notebook log files for debugging purposes (click on ''Session ID xxxx-xxxx-xxxxx-xxxxxxx-xxx-xx'').

If your notebook does not launch in a few seconds, take the following actions:

  * Check the status of your jobs in the queue with ''squeue -u <username>''.
  * Examine the notebook log files (click on ''Session ID xxxx-xxxx-xxxxx-xxxxxxx-xxx-xx'').
  * If the suggestions above do not help, contact ''support''.
  
==== How to launch a jupyter notebook that uses GPUs ====

Repeat the steps above but make sure you select an appropriate GPU partition. Moreover, you must add an appropriate CUDA module to the field ''Extra modules needed'', otherwise the connection to the GPUs might not work as expected. For a list of CUDA modules, type ''ml spider CUDA'' in a terminal.
The form field ''Extra modules needed'' can accept more than one module as long as the module names are separated by spaces.
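For example, to inspect the available CUDA modules from a terminal (the exact output depends on the modules installed on Xmaris):

<code bash>
# list all versions of the CUDA module known to the module system
ml spider CUDA
</code>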
  
:!: NOTE1: If you want your notebook directory to be different from $HOME, put ''export NOTEBOOKDIR=/marisdata/$LOGNAME'' in your .bashrc.

:!: NOTE2: Any form fields left empty will assume pre-programmed values. For instance, you do not need to specify your slurm account because it will default to the account of your PI.

==== Launching jupyterlab instead of jupyter notebook ====
  
To use jupyterlab instead of the classic notebook, replace ''tree'' with ''lab'' in your running notebook's URL, for instance

<code>
https://xmaris.lorentz.leidenuniv.nl:4433/node/maris051.lorentz.leidenuniv.nl/26051/lab?
</code>
  
To remove a conda environment you no longer need, run for example

<code>
conda env remove --name py35
</code>

=== Install a mathematica (wolfram) kernel ===

You can set it up following the notes at https://github.com/WolframResearch/WolframLanguageForJupyter, or follow these steps for a preconfigured setup:

  * Open an SSH connection to xmaris
  * Run ''/marisdata/WOLFRAM/WolframLanguageForJupyter/configure-jupyter.wls add''
  * The ''wolframlanguage'' kernel is now available among your jupyter kernels
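As a quick check (assuming ''jupyter'' is available on your session's PATH), you can list the kernels visible to your account; ''wolframlanguage'' should appear after the step above:

<code bash>
# list all jupyter kernels installed for this account
jupyter kernelspec list
</code>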
  
==== Debugging jupyter lab/notebook sessions ====
===== xmaris slurm tips =====
  
:!: This is not a slurm manual; you should always refer to the official documentation (see link below).

xmaris runs the scheduler and resource manager slurm **v18.08.6-2**. Please consult the [[https://slurm.schedmd.com/archive/slurm-18.08.6/|official manual]] for detailed information.

:!: The headnode (xmaris.lorentz.leidenuniv.nl) is not a compute node. Any user applications running on it will be terminated without notice.

Here we report a few useful commands and their outputs to get you started. For the impatient, there is a [[https://www.lorentz.leidenuniv.nl/RUL-only/generator/|slurm batch-script generator]] (__NO RESPONSIBILITY assumed__: study the generated script before submitting it), which is available only from within UL IP addresses.
  
==== Determine your slurm account name ====
For example, to request an interactive shell on a node with specific hardware features you can use the ''--constraint'' option:

<code>
srun --constraint="opteron&highmem" --pty bash -i
</code>

==== Run a multiprocessing application ====

Xmaris supports OpenMPI in combination with slurm's ''srun'' and infiniband on all nodes in the ''ibIntel'' partition.
First of all, make sure that the ''max locked memory'' limit is set to ''unlimited'' for your account by executing

<code bash>
$ ulimit -l
unlimited
</code>

If that is NOT the case, please contact IT support.

To run an MPI application, see the example session below

<code bash>
# login to the headnode and request resources
$ salloc  -N6  -n6 -p ibIntel --mem 2000
salloc: Granted job allocation 564086
salloc: Waiting for resource configuration
salloc: Nodes maris[078-083] are ready for job
# load the needed modules for the app to run
$ ml load OpenMPI/4.1.1-GCC-10.3.0 OpenBLAS/0.3.17-GCC-10.3.0
# execute the app (note that the default MPI is set to pmi2)
$ srun  ./mpi_example
Hello world!  I am process number: 5 on host maris083.lorentz.leidenuniv.nl
11.000000 -9.000000 5.000000 -9.000000 21.000000 -1.000000 5.000000 -1.000000 3.000000
Hello world!  I am process number: 4 on host maris082.lorentz.leidenuniv.nl
11.000000 -9.000000 5.000000 -9.000000 21.000000 -1.000000 5.000000 -1.000000 3.000000
Hello world!  I am process number: 2 on host maris080.lorentz.leidenuniv.nl
11.000000 -9.000000 5.000000 -9.000000 21.000000 -1.000000 5.000000 -1.000000 3.000000
Hello world!  I am process number: 1 on host maris079.lorentz.leidenuniv.nl
11.000000 -9.000000 5.000000 -9.000000 21.000000 -1.000000 5.000000 -1.000000 3.000000
Hello world!  I am process number: 3 on host maris081.lorentz.leidenuniv.nl
11.000000 -9.000000 5.000000 -9.000000 21.000000 -1.000000 5.000000 -1.000000 3.000000
Hello world!  I am process number: 0 on host maris078.lorentz.leidenuniv.nl
11.000000 -9.000000 5.000000 -9.000000 21.000000 -1.000000 5.000000 -1.000000 3.000000
</code>
  
===== Suggested readings =====
  
  * https://slurm.schedmd.com/archive/slurm-21.08.8-2/
  * https://osc.github.io/ood-documentation/master/
  * https://www.gnu.org/gnu/linux-and-gnu.en.html
  * https://www.centos.org/
  * https://www.gnu.org/software/bash/
  * https://jupyter.org/
  * http://www.tldp.org/LDP/GNU-Linux-Tools-Summary/GNU-Linux-Tools-Summary.pdf
  * https://easybuild.readthedocs.io/en/latest/
  * https://docs.conda.io/en/latest/
  * https://en.wikipedia.org/wiki/Parallel_computing

===== Requesting help =====

Please use this [[https://helpdesk.lorentz.leidenuniv.nl|helpdesk]] form or email ''support''.
  
===== Recent Scientific Publications from maris =====

==== Quantum physics and Quantum computing ====

  * [[https://journals.aps.org/pra/pdf/10.1103/PhysRevA.100.010302|Experimental error mitigation via symmetry verification in a variational quantum eigensolver]]
  * [[https://journals.aps.org/pra/pdf/10.1103/PhysRevA.98.062339|Low-cost error mitigation by symmetry verification]]
  * [[https://www.nature.com/articles/s41534-019-0213-4|Calculating energy derivatives for quantum chemistry on a quantum computer]]
  * [[https://www.nature.com/articles/s41534-017-0039-x|Density-matrix simulation of small surface codes under current and projected experimental noise]]
  * [[https://iopscience.iop.org/article/10.1088/1367-2630/aafb8e/pdf|Quantum phase estimation of multiple eigenvalues for small-scale (noisy) experiments]]
  * [[https://onlinelibrary.wiley.com/doi/abs/10.1002/qute.201870015|Adaptive Weight Estimator for Quantum Error Correction in a Time-Dependent Environment (Adv. Quantum Technol. 1/2018)]]
  * [[https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.123.120502|Fast, High-Fidelity Conditional-Phase Gate Exploiting Leakage Interference in Weakly Anharmonic Superconducting Qubits]]
  * [[https://arxiv.org/abs/2002.07119|Leakage detection for a transmon-based surface code]]
  * [[https://journals.aps.org/prb/abstract/10.1103/PhysRevB.103.094518|Voltage staircase in a current-biased quantum-dot Josephson junction]]
  
==== Statistical, nonlinear, biological, condensed matter and soft matter physics ====

  * [[https://journals.aps.org/pre/abstract/10.1103/PhysRevE.98.062101|Equivalent-neighbor percolation models in two dimensions: Crossover between mean-field and short-range behavior]]
  * [[https://iopscience.iop.org/article/10.1088/1742-6596/1163/1/012001|Medium-range percolation in two dimensions]]
  * [[https://journals.aps.org/pre/abstract/10.1103/PhysRevE.99.062133|Revisiting the field-driven edge transition of the tricritical two-dimensional Blume-Capel model]]
  * [[https://journals.aps.org/pre/abstract/10.1103/PhysRevE.101.012118|Three-state Potts model on the centered triangular lattice]]
  * [[https://www.pnas.org/content/118/4/e2020525118|Liquid-crystal-based topological photonics]]
  
==== Particles, fields, gravitation, and cosmology ====

  * [[https://journals.aps.org/prd/abstract/10.1103/PhysRevD.100.063540|Cosmological data favor Galileon ghost condensate over ΛCDM]]
  * [[https://journals.aps.org/prd/abstract/10.1103/PhysRevD.97.043519|Large-scale structure phenomenology of viable Horndeski theories]]
  * [[https://journals.aps.org/prd/abstract/10.1103/PhysRevD.97.063518|Do current cosmological observations rule out all covariant Galileons?]]
  * [[https://journals.aps.org/prd/abstract/10.1103/PhysRevD.96.063524|Impact of theoretical priors in cosmological analyses: The case of single field quintessence]]

==== String theory ====

  * [[https://link.springer.com/article/10.1007%2FJHEP01%282020%29151|Isolated zeros destroy Fermi surface in holographic models with a lattice]]
institute_lorentz/xmaris.1582098794.txt.gz · Last modified: 2020/02/19 07:53 by lenocil