institute_lorentz:institutelorentz_maris

institute_lorentz:institutelorentz_maris [2019/01/24 09:07]
lenocil
— (current)
Line 1: Line 1:
-====== Maris Cluster ====== 
- 
-~~NOCACHE~~ 
- 
-Maris is a small computational cluster at the Lorentz Institute, 
-financed by external research grants. Access is primarily for those 
-research groups who have purchased the machines, but there may well be 
-computing time available for others. If you would like to use the Maris 
-cluster, please get in touch with the [[https://​helpdesk.lorentz.leidenuniv.nl/​wiki/​doku.php?​id=institute_lorentz:​institutelorentz_maris#​help|local contact persons]] at the 
-Lorentz Institute to see 
-what resources can be made available for your needs. You can then 
-request access by **__sending an email to Carlo Beenakker__**. 
- 
-The cluster is optimised for [[https://en.wikipedia.org/wiki/Thread_(computing)#Multithreading|multithreaded applications]] and [[https://en.wikipedia.org/wiki/Embarrassingly_parallel|embarrassingly parallel problems]].
- 
-===== How to access Maris ===== 
-Once you have been granted access to the cluster, you must log in to its head node to use it. Within the IL network, the Maris head node is reachable at ''novamaris.lorentz.leidenuniv.nl'' or through its aliases ''maris'' and ''mariscluster''. For connections from outside the IL network, an [[institute_lorentz:institutelorentz_remoteaccess|ssh tunnel]] into the IL ssh server (''styx.lorentz.leidenuniv.nl'') is needed.
- 
-:!: Note that ssh.lorentz.leidenuniv.nl is an alias of styx. 
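As a sketch of the outside-network case: assuming your OpenSSH client supports ''-J'' (ProxyJump, OpenSSH 7.3 or newer) and that styx permits jumping through it, the two hops can be combined into one command. ''your_username'' is a placeholder, and the linked remote-access page remains the authoritative description of the tunnel setup.

```shell
# Placeholder account name: replace with your own Lorentz Institute username.
IL_USER="your_username"
# Hosts as named in the text above.
JUMP_HOST="styx.lorentz.leidenuniv.nl"
HEAD_NODE="novamaris.lorentz.leidenuniv.nl"

# Build the ssh command: hop through the IL ssh server, land on the head node.
maris_ssh_command() {
    printf 'ssh -J %s@%s %s@%s\n' "$IL_USER" "$JUMP_HOST" "$IL_USER" "$HEAD_NODE"
}

# Show the command; run the printed line from a machine outside the IL network.
maris_ssh_command
```

The snippet only prints the command, so it is safe to run anywhere; nothing connects until you execute the printed line yourself.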
- 
-Direct ssh access to any computation node is **disabled**. To open an interactive session on one of the nodes, use for example
- 
-<​code>​ 
-srun <​whatever srun options you'd like> --pty bash 
-</​code>​ 
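A more concrete variant of the command above might look as follows. The resource values are illustrative assumptions, not site defaults; the options themselves (''--ntasks'', ''--cpus-per-task'', ''--mem'', ''--time'', ''--pty'') are standard ''srun'' options.

```shell
# Illustrative resource request: 1 task, 4 cores, 8 GB of RAM, 1 hour.
# These values are assumptions; adjust them to your needs.
SRUN_OPTS="--ntasks=1 --cpus-per-task=4 --mem=8G --time=01:00:00"

# Open an interactive bash shell on whichever node slurm allocates.
# $SRUN_OPTS is deliberately unquoted so that it splits into separate options.
interactive_session() {
    srun $SRUN_OPTS --pty bash
}
```

Call ''interactive_session'' on the head node; leaving the shell with ''exit'' releases the allocation.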
- 
-Any calculation must be submitted to [[:slurm_tutorial|slurm]] for execution. Processes started outside slurm will be terminated without notice.
- 
-A summary of the current cluster load/usage is available [[http://slurm.lorentz.leidenuniv.nl|here]] (only within the IL network).
-===== About the cluster ===== 
- 
-The Maris head node and compute nodes run Fedora 29 (GNU/Linux). The number of nodes available for calculations changes over time because of regular maintenance work, so you are advised to run ''sinfo'' before submitting any job. The nodes are organized in groups according to their specs and configuration:
- 
-^ CPU node(s) ^ Server Type ^ CPU(s) ^ Clock Speed ^ Cores ^ Threads ^ RAM ^ CPU Family ^ 
-| maris0[04 - 22] | ASUS RS161-E5/​PA2| 2 x AMD 2350 | 2GHz | 8 | | 16 GB | AMD | 
-| maris0[23 - 46] | Supermicro H8DMT | 2 x AMD 2350 | 2GHz | 8 | | 32 GB | ::: | 
-| maris0[47 - 59] | Dell PowerEdge R815 | 4 x AMD 6174 | 2.2GHz | 48 | 48 |128 GB | ::: | 
-| maris060 ​ | Dell PowerEdge R815 | 4 x AMD 6276 | 3.2GHz | 64 | 64 |512 GB | ::: | 
-| maris061 | Dell PowerEdge R815 | 4 x AMD 6174 | 3.2GHz | 48 | 48 | 128 GB | ::: | 
-| maris062 ​ | Dell PowerEdge R815 | 4 x AMD 6174 | 3.2GHz | 48 | 48 | 256 GB | ::: | 
-| maris0[63,​64] | Dell PowerEdge R815 | 4 x AMD 6276 | 3.2GHz | 64 | 64 | 512 GB | ::: |  
-| maris0[65,​67,​68] | Dell PowerEdge R815 | 4 x AMD 6376 | 3.2GHz | 64 | 64 | 256 GB | ::: | 
-| maris0[69-73] | Dell PowerEdge R815 | 4 x AMD 6376 | 3.2GHz | 64 | 64 | 512 GB | ::: | 
-| maris066 | Dell PowerEdge R815 | 4 x AMD 6376 | 3.2GHz | 64 | 64 | 192 GB | ::: | 
-| <​del>​maris074</​del> ​ | Dell PowerEdge R815 | 4 x AMD 6276 | 3.2GHz | 64 | 64 | 192 GB | ::: | 
-| maris076 ​ | Dell PowerEdge R830 | 4 x Intel Xeon E5-4640 v4 | 2.1GHz | 96 | 96 | 512 GB | INTEL | 
-| maris077 ​ | Dell PowerEdge R830 | 4 x Intel Xeon E5-4640 v4 | 2.1GHz | 96 | 96 | 512 GB | ::: | 
-| InfiniBand Subcluster ^^^^^^| ::: | 
-^ CPU node(s) ^ Server Type ^ CPU(s), IB(s) ^ Clock Speed ^ Cores ^ Threads ^ RAM ^ ::: | 
-| maris078 ​ | Dell PowerEdge R840  | 4 x Intel Xeon Gold 6126 | 2.6 GHz | 96 | 96 | 512 GB  | ::: | 
-|:::|:::| 1 x Mellanox EDR |:::​|:::​|:::​|:::​| ::: | 
-| GPU Subcluster ^^^^^^| ::: | 
-^ GPU node(s) ^ Server Type ^ CPU(s), GPU(s) ^ Clock Speed ^ Cores ^ Threads ^ RAM ^ ::: | 
-| maris075 ​ | Dell PowerEdge R730 | 2 x  Intel(R) Xeon(R) CPU E5-2680 v4 | 2.4GHz | 56 | 56 | 256 GB | ::: | 
-|:::|:::| 2 x Tesla P100 16GB|:::​|:::​|:::​|:::​| ::: | 
- 
- 
-:!: The resources allocatable on each node might differ from those in the table above because of the resources necessary to run a node's OS. 
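To compare the table with what slurm currently reports, a node-oriented ''sinfo'' listing can help. This is a minimal sketch using the standard ''sinfo'' format fields ''%N'' (node name), ''%c'' (CPUs), ''%m'' (memory in megabytes) and ''%t'' (state); the function name is made up for illustration.

```shell
# Print one line per node: hostname, CPU count, memory in MB, and state.
node_summary() {
    sinfo -N -o "%N %c %m %t"
}

# sinfo only exists where slurm is installed, e.g. on the head node.
if command -v sinfo >/dev/null 2>&1; then
    node_summary
else
    echo "sinfo not found: run this on the Maris head node"
fi
```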
- 
-The nodes have been configured much like your workstation to maximize productivity, but there are substantial differences worth noting. One of these is the home directory. To keep network traffic low, the /home directory on the Maris cluster is different from the one on your Lorentz [[institute_lorentz:gnulinux_workstations|GNU/Linux Workstation]]. To access the latter from the head node you can use
-<code bash> 
-cd /​lorentz/​your_username 
-</​code>​ 
-In a similar fashion, you can access the data[1,..,n] disks of any Lorentz Institute workstation, using for instance
-<code bash> 
-/​net/​workstation_name/​data1 
-</​code>​ 
-In addition, ''novamaris'' and all maris nodes mount two storage devices at boot: one under the ''/clusterdata'' mount point and the other under ''/marisdata''. The former **is an old storage device which will be retired upon failure**; the latter is a newer storage device (installed 17/05/2016) which you are encouraged to use.
- 
-NOTE: Some devices are mounted by autofs only when they are accessed. Do not be surprised if ''df'' does not list them before their first use.
- 
- 
-==== Home directories ==== 
-Maris home directories are mounted under ''/home''. This is a dedicated 5.5TB storage system which keeps hourly, daily and weekly snapshots of the home disk. To recover a lost file, go to the ''.snapshot'' directory, pick a snapshot and copy the file you need back, for instance
-<​code>​ 
-$ ls /​home/​.snapshot 
-daily.2016-05-16_0010 ​ hourly.2016-05-17_0405 ​ hourly.2016-05-17_0605 ​ hourly.2016-05-17_0805 ​ weekly.2016-05-08_0015 daily.2016-05-17_0010 ​ hourly.2016-05-17_0505 ​ hourly.2016-05-17_0705 hourly.2016-05-17_0905 ​ weekly.2016-05-15_0015 
-$ cd /​home/​.snapshot/​hourly.2016-05-17_0805/<​username>​ 
-</​code>​ 
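The recovery step can be wrapped in a small helper. This is a sketch under the assumption that each snapshot contains one directory per user, as the ''cd'' example above suggests; the function name and the ''.restored'' suffix are made up for illustration.

```shell
# restore_from_snapshot SNAPSHOT RELATIVE_PATH [SNAPSHOT_ROOT] [HOME_ROOT]
# Copy a file from a snapshot back into your home directory. The copy gets a
# ".restored" suffix so that a current version of the file is never overwritten.
restore_from_snapshot() {
    snapshot=$1
    rel_path=$2
    snap_root=${3:-/home/.snapshot}   # where the snapshots live
    home_root=${4:-/home}             # where home directories live
    user=${USER:-$(id -un)}
    src="$snap_root/$snapshot/$user/$rel_path"
    dst="$home_root/$user/$rel_path.restored"
    if [ ! -f "$src" ]; then
        echo "not found in snapshot: $src" >&2
        return 1
    fi
    # Recreate the parent directory in case it was deleted as well.
    mkdir -p "$(dirname "$dst")"
    # -p keeps the original modification time and permissions.
    cp -p "$src" "$dst"
    echo "restored $src -> $dst"
}
```

For example, ''restore_from_snapshot hourly.2016-05-17_0805 notes.txt'' would copy a (hypothetical) ''notes.txt'' from that snapshot to ''notes.txt.restored'' in your home directory.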
- 
-10 GB quotas are enforced on the home disk. Please use /marisdata to __temporarily__ store large datafiles. 
- 
-==== Data storage ==== 
- 
-Maris currently has two storage disks, mounted under /clusterdata and /marisdata respectively.
- 
-The plan is to gradually replace /clusterdata with its newer counterpart /marisdata. As of 17/05/2016 you are encouraged to use **ONLY** /marisdata to store your data because **/clusterdata is no longer maintained and all data on it will be permanently lost in case of hardware failure**. You are therefore encouraged to move all data you deem important from /clusterdata to /marisdata as soon as possible. ''/marisdata'' has 37TB available.
- 
-Please remember that /marisdata is supposed to act as a __temporary__ storage device during your calculations. You are encouraged not to use it as an archive disk. 
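Moving data off /clusterdata could be done along these lines. This is only a sketch: the per-user directory layout shown in the final comment is an assumption, so adjust both paths to where your data actually lives.

```shell
# copy_tree SRC DST: copy a directory tree, preserving timestamps and modes.
copy_tree() {
    src=$1
    dst=$2
    if [ ! -d "$src" ]; then
        echo "source directory not found: $src" >&2
        return 1
    fi
    mkdir -p "$dst"
    # "$src/." copies the contents of SRC (including dotfiles) into DST.
    cp -a "$src/." "$dst/"
}

# Hypothetical per-user layout; uncomment and adapt before use.
# copy_tree "/clusterdata/$USER" "/marisdata/$USER"
```

Comparing the two trees with ''diff -r'' before deleting anything from /clusterdata is a cheap safety check.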
- 
-==== The InfiniBand subcluster ==== 
- 
-maris078 has an EDR InfiniBand (IB) connection to an SSD storage server to achieve high I/O rates during calculations. The SSD storage server disks are mounted under ''/IBSSD'' and configured as RAID0 (no data redundancy!). The protocol used is //[[https://en.wikipedia.org/wiki/ISCSI_Extensions_for_RDMA|iSER]]//.
- 
-<code bash> 
-$ df -h /IBSSD/
-Filesystem ​     Size  Used Avail Use% Mounted on 
-/​dev/​sdc ​       5.1T   ​89M ​ 4.8T   1% /IBSSD 
- 
-</​code>​ 
-==== Compilers and libraries ==== 
-We try to minimize the differences between your Lorentz workstation and Maris so you can be as productive as possible. A variety of compilers and libraries are available, to be used in the same way as on your workstation. List the available modules by typing ''module avail''. For more information, see [[linux:compilers|this]] page and the manual page: ''man module''.
- 
-If you use MPI, remember to load the ''openmpi-slurm/4.0.0'' module, which lets you launch your MPI applications with slurm's ''srun''.
- 
-==== Running a calculation ==== 
- 
-All calculations must be submitted and executed through ''slurm''.
- 
-If this is your first time using slurm, please have a look at this overview of [[institute_lorentz:institutelorentz_maris_slurm|maris' slurm configuration]] and this [[:slurm_tutorial|short guide to slurm]] before submitting your calculations for execution.
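As a rough illustration of a batch submission, the snippet below writes a minimal job script to ''maris_job.sh''. The job name, resource values and ''./my_program'' are placeholders chosen for this sketch; the partitions and limits that actually apply are described on the slurm configuration page linked above.

```shell
# Write a minimal batch script to the current directory.
# All values are illustrative assumptions, not site defaults.
cat > maris_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=example
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=4G
#SBATCH --time=00:30:00
#SBATCH --output=example-%j.out

# Hypothetical program; replace with your own executable.
srun ./my_program
EOF

echo "wrote maris_job.sh; submit it from the head node with: sbatch maris_job.sh"
```

After submitting, ''squeue -u $USER'' lists your queued and running jobs, and ''scancel <jobid>'' cancels one. The ''%j'' in the output file name is replaced by slurm with the job ID.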
- 
- 
-:!: **NOTE that calculations executed outside slurm's control will be terminated without notice.**
- 
-==== Working with Python ==== 
-Maris has a large list of pre-installed python modules available for you to use. It is nonetheless possible to install new modules or modify existing ones, as long as they are (re)installed in a location where you have write permission. You can find an extensive guide with examples [[:working_with_python|here]].
- 
-Further reading: [[https://virtualenv.pypa.io/en/stable/|virtualenv]] and [[https://conda.io/docs/|conda]].
- 
-:!: The package manager ''​conda''​ seems not to perform efficiently on large anaconda environments. 
-===== Help ===== 
-Help with cluster issues can be requested through the [[https://helpdesk.lorentz.leidenuniv.nl|helpdesk]]. You are strongly advised to also discuss your problem with other Maris users, who might have helpful tips. The following Lorentz Institute members can be consulted for tips:
-  ​ 
-  * Thomas O'​Brien (Office 259, Telephone: 5534) 
-  * IT Support (HL 40[7-9], Telephone: 8484)
  
institute_lorentz/institutelorentz_maris.1548320851.txt.gz · Last modified: 2019/01/24 09:07 by lenocil