The Hercules Cluster in Garching
The hercules cluster consists of 184 compute nodes and a 250 TB GPFS file system. To access the cluster, please apply for an account at http://www.rzg.mpg.de/userspace/forms/onlineregistrationform or email benutzerberatung@rzg.mpg.de. You will receive an email with instructions once your request has been approved by our computer department.
From the MPIfR network, you can log on with
ssh <user-name>@hercules01.bc.rzg.mpg.de
hercules02 and hercules03 serve as standby nodes for hercules01. Note that your user name on hercules may differ from your MPIfR user name.
Using Kerberos authentication
From your MPIfR machine, get a valid Kerberos ticket by invoking a new Kerberos environment:
ramesh@pc20181 ~ $ kpagsh
ramesh@pc20181 ~ $ kinit rameshk@IPP-GARCHING.MPG.DE
rameshk@IPP-GARCHING.MPG.DE's Password:
ramesh@pc20181 ~ $
Once the above is done (with rameshk replaced by your user name), from the same terminal, you can log on to hercules without a password.
ramesh@pc20181 ~ $ ssh rameshk@hercules01.bc.rzg.mpg.de
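If the login still prompts for a password, it can help to first verify that the ticket was actually granted. The klist command below is a standard Kerberos tool and not specific to hercules; the exact output format depends on your Kerberos client.
ramesh@pc20181 ~ $ klist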
Data transfer to Hercules
Using SLOW rsync over the internet
There are multiple ways to transfer data to hercules; some use the fast link between Bonn and Garching and some do not. If you don't need a fast transfer, you can simply copy your data "the normal way":
rsync -Pav <data> <user-name>@hercules01.bc.rzg.mpg.de:<destination-dir-hercules>
NOTE: this method will prompt you for your MPCDF password!
Using the FAST 10 Gb line on hgw
If transfer speed is of the essence, use this approach to copy data to hercules. The easiest way is to copy your data directly to the hercules mounts on the gateway in Bonn (hgw, also known as herculesgw). The hercules disk is visible as /media/hercules on hgw, and your own directory is listed under /media/hercules/u/<user-name>. Please note that only users with a hercules account are allowed to log in to herculesgw (hgw), and that data can only be copied to your own home directory or to the /hercules/results/<user> directories.
ramesh@hgw ~ $ rsync -av --progress /fpra/timing/01/rktest/ /media/hercules/results/rameshk/.
sending incremental file list
rsync: chgrp "/media/hercules/results/rameshk/." failed: Operation not permitted (1)
./
2019-09-16-19:20:50.fil
     21,069,824   0%    6.58MB/s    0:35:52
The /p file system
All data in any of the directories on the /p file system are automatically migrated to the archive in Garching. To check where these data currently reside and how to recall them, see the following help page:
http://www.mpcdf.mpg.de/services/data/backup-archive/archives
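The details are on the page above. As a rough sketch, file systems managed with IBM's HSM client typically provide dsmls to show whether a file is resident or migrated, and dsmrecall to bring a migrated file back from the archive. Whether these tools are available on hercules, and the exact paths (shown here only as placeholders), should be checked against the linked documentation.
ramesh@hercules01 ~ $ dsmls /p/<project>/<file>
ramesh@hercules01 ~ $ dsmrecall /p/<project>/<file>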
Support
For help on modules/software/cluster, please email Christian (christian.guggenberger@rzg.mpg.de) or Markus (mjr@rzg.mpg.de).
VNC
To use VNC with the Hercules cluster, see:
http://www.mpcdf.mpg.de/services/network/vnc/vnc-at-the-mpcdf
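The page above describes the recommended MPCDF setup. As a generic sketch, assuming a VNC server is already running on display :1 of hercules01 (port 5901), the connection can be tunnelled through ssh; run the viewer in a second local terminal while the tunnel is open.
ramesh@pc20181 ~ $ ssh -N -L 5901:localhost:5901 rameshk@hercules01.bc.rzg.mpg.de
ramesh@pc20181 ~ $ vncviewer localhost:5901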
Installing Python libraries
To install standard Python libraries, see:
http://www.mpcdf.mpg.de/about-mpcdf/publications/bits-n-bytes?BB-View=192&BB-Document=150
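The article above describes the recommended procedure. A minimal sketch, assuming the environment-modules system provides a python or anaconda module (check with module avail, the module name here is only an assumption) and that pip is on the path, is to install into your user site-packages:
ramesh@hercules01 ~ $ module load anaconda
ramesh@hercules01 ~ $ pip install --user <package-name>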
Sample script
Here's an example snippet which can be submitted with qsub:
### shell
#$ -S /bin/bash
### join stdout and stderr
#$ -j y
### change to current work dir
#$ -cwd
### do not send email reports
#$ -m n
### request parallel env with 8 cpus
#$ -pe openmp 8
### wallclock 2 hours
#$ -l h_rt=7200
### virtual limit per job 20GB
#$ -l h_vmem=20G
date
The CPU count specified with #$ -pe openmp XYZ can be varied from 1 to 24. #$ -pe openmp can be omitted, but then one CPU is assumed. h_rt is mandatory and can be as much as 12 days (288:00:00). h_vmem is optional; if not present, 7G is set as the default.
Currently, there is no CPU binding enforced; in other words, if users use more CPUs (e.g. create more threads) than requested, they'll steal CPU time from other jobs.
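Assuming the snippet above is saved as job.sh (the file name is just an example), it can be submitted and monitored with the usual Grid Engine commands:
ramesh@hercules01 ~ $ qsub job.sh
ramesh@hercules01 ~ $ qstat -u $USER
ramesh@hercules01 ~ $ qdel <job-id>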
Docker/Singularity Tutorial
Docker and Singularity can help you run your own programs (or specific versions of them) on the Hercules cluster.
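As a rough sketch, assuming Singularity is installed on the cluster (possibly provided via an environment module), an existing Docker image can be pulled and run like this; the image name is only an illustration:
ramesh@hercules01 ~ $ singularity pull docker://ubuntu:20.04
ramesh@hercules01 ~ $ singularity exec ubuntu_20.04.sif cat /etc/os-release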