institute_lorentz:xmaris, revisions 2021/03/19 14:20 (lenocil) to 2024/02/29 14:16 (jansen)
:!: Research groups external to the Lorentz Institute are strongly encouraged to explore other HPC possibilities,

Xmaris is optimised for [[https://

Xmaris is the successor of the maris cluster, renamed with the prefix ''X''.

[[https://
===== Xmaris features and expected cluster lifetime =====
Xmaris runs CentOS v7 and, for historical reasons, consists of heterogeneous computation nodes. A list of configured nodes and partitions on the cluster can be obtained on the command line using slurm's ''sinfo''.
:!: Because Xmaris features different CPU types that understand different types of instructions (see [[https://
You can display node features with
<code bash>
# specific node
sinfo -o " %n %P %t %C %z %m %f" -N -n maris077
# all nodes
sinfo -o " %n %P %t %C %z %m %f" -N
# all nodes (more concise)
sinfo -Nel
</code>
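Since the nodes are heterogeneous, it is often useful to pin a job to nodes that advertise a given feature via slurm's ''--constraint'' option. A minimal sketch; the feature name ''avx2'', the script name and the application are illustrative:

<code bash>
# job script restricted to nodes that advertise the avx2 feature
# (feature name is illustrative; use one reported in sinfo's features column)
cat > job.sh <<'EOF'
#!/bin/bash
#SBATCH --constraint=avx2
srun ./my_app
EOF
# submit on the cluster with: sbatch job.sh
</code>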
^Mount Point^ Type ^Notes^
|/scratch | HD | **temporary**, |
|/marisdata |NetApp| 2TB/user quota, medium-term storage, remote|
|/home |NetApp| 10GB/user quota, medium-term storage, remote|
|/ |
Extra efficient scratch spaces are available to all nodes on the InfiniBand network
^Mount Point^ Type^ Notes^
|/IBSSD| SSD |**DISCONTINUED**, InfiniBand/ |
|/PIBSSD| SSD|**temporary**, |
Backup snapshots of ''/
Xmaris users are strongly advised to delete their data, if any, from the compute nodes' scratch disks (or at least move them to the shared data disk) upon completion of their calculations. All data on the scratch disks __might be deleted without prior notice__.
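A sketch of such a cleanup, assuming your run directory on the node-local scratch disk is ''/scratch/$USER/myrun'' (both directory names are illustrative):

<code bash>
# copy results to the shared data disk, then free the node-local scratch
JOBDIR=/scratch/$USER/myrun
DEST=/marisdata/$USER
mkdir -p "$DEST"
cp -a "$JOBDIR" "$DEST/"
rm -rf "$JOBDIR"
</code>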
Note that **disk policies might change at any time at the discretion of the cluster owners**.
----
Please also note the following:
* xmaris'
* The **OLD** (as in the old maris) ''/
==== Xmaris usage policies ====
Usage policies are updated regularly in accordance with the needs of the cluster owners and **may change at any time without notice**. At the moment there is an __enforced usage limit of 128 CPUs per user__ that does not apply to the owners. Job execution priorities are defined via a complex [[https://
<code bash>
scontrol show config | grep -i priority
</code>
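To inspect the priority values slurm actually assigns to pending jobs, the ''sprio'' utility can be used (to be run on the cluster):

<code bash>
# show a detailed breakdown (age, fair-share, ...) of job priorities
sprio -l
# restrict the output to your own jobs
sprio -l -u $USER
</code>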
* execute slurm'
* browse to https://
===== How to access Xmaris =====
Access to Xmaris is not granted automatically to all Lorentz Institute members. Instead, preliminary approval must be granted to you by the cluster owners (read [[|here]]).
Once you have been authorised to use Xmaris, there are two ways to access its services:
- using a web browser
- using an SSH client

Both methods can provide terminal access, but connections via web browsers offer you extra services such as sftp (drag-and-drop file transfers), jupyter interactive notebooks, virtual desktops and more at the click of your mouse. **We advise** all users either
==== Access via an ssh client ====
The procedure differs depending on whether the client you connect from is inside the IL intranet or not.

* When within the IL network, for instance if you are using a Lorentz Institute workstation,
<code bash>
# replace <username> with your IL username; the login node name is illustrative
ssh <username>@<xmaris-login-node>
</code>
|:!: If you were a maris user prior to the configuration switch to xmaris, you might find that many terminal functions and programs do not work as expected. This is due to the presence in your xmaris home directory of old shell initialisation scripts still tied to the STRW sfinx environment. You can override them (preferably after making a backup copy) by replacing their contents with the default CentOS shell initialisation scripts; for instance, for bash these are located in ''/ |
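For bash, a minimal sketch of that reset, assuming the CentOS defaults live in ''/etc/skel'' (the standard skeleton location):

<code bash>
# back up the sfinx-era init scripts, then restore the system defaults
for f in .bashrc .bash_profile; do
  # keep a backup of the old script if present
  if [ -f "$HOME/$f" ]; then cp "$HOME/$f" "$HOME/$f.sfinx.bak"; fi
  # copy the default from the system skeleton directory if it provides one
  if [ -f "/etc/skel/$f" ]; then cp "/etc/skel/$f" "$HOME/"; fi
done
</code>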
==== Web access ====
Xmaris services, that is terminal, scheduler/
{{ :

Similarly to traditional shell access, Xmaris OpenOnDemand is available only for connections within the __IL intranet__. IL users who wish to access OpenOnDemand from their remote home locations could for example instruct their browsers to SOCKS-proxy their connections via our SSH server.
Open a local terminal and type (substitute username with your IL username)
<code bash>
# starts a SOCKS proxy on local port 8080; the port and server name are illustrative
ssh -N -D 8080 username@<IL-ssh-server>
</code>
* Submit batch jobs to the slurm scheduler/
* Open a terminal.
* Launch interactive jupyter notebooks.
* Monitor cluster usage.
* Create and launch your very own OnDemand application (read [[https://
|compIntel| 2 | 5 days and 12 hours| |
|gpuIntel| 1 | 3 days and 12 hours | GPU |
|ibIntel | 8 | 7 days | InfiniBand, Multiprocessing |
*: default partition
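For example, a job that needs the longer ''ibIntel'' walltime can request that partition and limit explicitly (the script and application names are illustrative):

<code bash>
cat > ib_job.sh <<'EOF'
#!/bin/bash
#SBATCH -p ibIntel
#SBATCH -t 7-00:00:00
srun ./my_app
EOF
# submit on the cluster with: sbatch ib_job.sh
</code>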
|maris075 |gpuIntel|2 x Nvidia Tesla P100 16GB | 6.0|
Xmaris GPUs must be allocated using slurm's ''--gres'' option, for example
<code bash>
# request one GPU on the gpuIntel partition (adjust the count as needed)
srun -p gpuIntel --gres=gpu:1 --pty bash -i
</code>
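Inside such an allocation you can check which device slurm granted you (to be run on a GPU node):

<code bash>
# slurm exports the indices of the granted GPUs
echo "$CUDA_VISIBLE_DEVICES"
# query the device itself
nvidia-smi
</code>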
===== Xmaris scientific software =====
|cobaya | A code for Bayesian analysis in Cosmology |
|Clang |
|Graphviz | Graph visualization software |
|Octave |
| Mathematica* | Technical computing system |
Any pre-installed software can be made available in your environment via the ''module'' command.
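Typical usage of the module (Lmod) command, using ''Octave'' from the table above as an example:

<code bash>
ml av            # list all available modules
ml load Octave   # load a module into your environment
ml list          # show the modules currently loaded
ml unload Octave # remove it again
</code>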
It is possible to save a list of modules you use often in a //module collection//
<code bash>
module load mod1 mod2 mod3 mod4 mod5 mod6
module save collection1
</code>
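A saved collection can later be restored in one go; with Lmod:

<code bash>
module restore collection1   # reload every module saved in the collection
module savelist              # list the collections you have saved
</code>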
|TensorFlow/ |
|TensorFlow-1.15.0-Miniconda3/ |

The following example shows how you can create a tensorflow-aware jupyter notebook kernel that you can use for instance via the OpenOnDemand interface (the wrapper-script paths shown below are illustrative)

<code bash>
# We use maris075 (GPU node) and load the optimised tf module
ml load TensorFlow/

# We install ipykernel, which is necessary to run python notebooks
python -m pip install ipykernel --user

# We create a kernel called TFQuantum based on python from the TensorFlow module
python -m ipykernel install --user --name TFQuantum --display-name "TFQuantum"

# We edit the kernel such that it does not execute python directly
# but via a custom wrapper script
cat $HOME/.local/share/jupyter/kernels/tfquantum/kernel.json

{
 "argv": [
  "/path/to/python-wrapper.sh",
  "-m",
  "ipykernel_launcher",
  "-f",
  "{connection_file}"
 ],
 "display_name": "TFQuantum",
 "language": "python"
}

# The wrapper script will call python but only after loading any
# appropriate module
cat /path/to/python-wrapper.sh

#!/usr/bin/env bash
ml load TensorFlow/

exec python "$@"

# DONE. tfquantum will appear in the dropdown list of kernels
# upon creating a new notebook
</code>

=== TensorFlow with Graphviz ===
<code bash>
ml load TensorFlow/
pip install --user pydot
ml load Graphviz/
python -c "
</code>

==== Installing extra software ====
* via a traditional //

Whatever installation method you might choose, please note that you **do not have** administrative rights to the cluster.
=== Installing software via EasyBuild ===
+ | |||
+ | :!: See also [[: | ||
In order to use EasyBuild to build a software, you must first set up your development environment. This is usually done by | In order to use EasyBuild to build a software, you must first set up your development environment. This is usually done by | ||
- | * Loading the EasyBUild | + | * Loading the EasyBuild |
* Indicating a directory in which to store your EasyBuild-built softwares | * Indicating a directory in which to store your EasyBuild-built softwares | ||
* Specifying EasyBuild' | * Specifying EasyBuild' | ||
In their simplest form, the steps outlined above can be translated into the following shell commands
<code bash>
module load EasyBuild
# for example, keep built software and generated modules under one prefix
# (the path is illustrative; EASYBUILD_PREFIX is EasyBuild's standard setting)
export EASYBUILD_PREFIX=$HOME/easybuild
</code>
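With the environment set up, software is built from an //easyconfig// file. The commands below use real ''eb'' options; the OpenBLAS version matches a module mentioned elsewhere on this page but is otherwise illustrative:

<code bash>
# search the available easyconfig files matching a name
eb -S OpenBLAS
# build one, letting EasyBuild resolve dependencies (--robot)
eb OpenBLAS-0.3.17-GCC-10.3.0.eb --robot
</code>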
|:!: The environment variable '' ml|here]].|
|:!: When compiling OpenBLAS it is not sufficient to define ''|
Then execute
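EasyBuild places the modules it generates under ''<prefix>/modules/all''; assuming the illustrative prefix ''$HOME/easybuild'', the command is:

<code bash>
# register EasyBuild's generated modules with the module command
module use $HOME/easybuild/modules/all
</code>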
to make available to the ''module'' command the software you have just built.
|:!: ''
Should you want to customise the building process of a given software, please read how to implement [[https://
<code bash>
> ml load Miniconda3/
> # note that if you specify a prefix, you cannot specify a name
> conda create --name TEST

# the following fails
> conda activate TEST

## do this instead
> source activate TEST
> # or, if you used the --prefix option to create the env,
> # source activate <path/to/env>
> ...
> conda deactivate
</code>
==== Run a multiprocessing application ====
Xmaris supports OpenMPI in combination with slurm's ''srun''.
First of all, make sure that the max locked memory limit (''ulimit -l'') is set to ''unlimited''
<code bash>
# ulimit -l
unlimited
</code>
If that is NOT the case, please contact the IT support.

To run an MPI application see the example session below

<code bash>
# login to the headnode and request resources
$ salloc
salloc: Granted job allocation 564086
salloc: Waiting for resource configuration
salloc: Nodes maris[078-083] are ready for job
# load the needed modules for the app to run
$ ml load OpenMPI/4.1.1-GCC-10.3.0 OpenBLAS/0.3.17-GCC-10.3.0
# execute the app (note that the default MPI is set to pmi2)
$ srun ./
Hello world!
11.000000 -9.000000 5.000000 -9.000000 21.000000 -1.000000 5.000000 -1.000000 3.000000
Hello world!
11.000000 -9.000000 5.000000 -9.000000 21.000000 -1.000000 5.000000 -1.000000 3.000000
Hello world!
11.000000 -9.000000 5.000000 -9.000000 21.000000 -1.000000 5.000000 -1.000000 3.000000
Hello world!
11.000000 -9.000000 5.000000 -9.000000 21.000000 -1.000000 5.000000 -1.000000 3.000000
Hello world!
11.000000 -9.000000 5.000000 -9.000000 21.000000 -1.000000 5.000000 -1.000000 3.000000
Hello world!
11.000000 -9.000000 5.000000 -9.000000 21.000000 -1.000000 5.000000 -1.000000 3.000000
</code>

===== Suggested readings =====
* https://
* https://
* https://
===== Recent Scientific Publications from maris =====
==== Quantum physics and Quantum computing ====
* [[https://
* [[https://
* [[https://
* [[https://
==== Statistical,