

The Hercules Cluster in Garching

The hercules cluster consists of 184 compute nodes and a 250 TB GPFS file system. To access the cluster, please apply for an account at http://www.rzg.mpg.de/userspace/forms/onlineregistrationform. You will receive an email with instructions once your request has been approved by our computing department.

From the MPIfR network, you can log on using ssh <user-name>@hercules01.bc.rzg.mpg.de. hercules02 and hercules03 serve as standbys for hercules01. Note that your user name on hercules may differ from your MPIfR user name.

Using Kerberos authentication

From your MPIfR machine, obtain a valid Kerberos ticket by starting a new Kerberos environment:

ramesh@pc20181 ~ $ kpagsh
ramesh@pc20181 ~ $ kinit rameshk@IPP-GARCHING.MPG.DE
rameshk@IPP-GARCHING.MPG.DE's Password:
ramesh@pc20181 ~ $

Once the above is done (with rameshk replaced by your user name), from the same terminal, you can log on to hercules without a password.
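To confirm that the ticket was actually issued (assuming the standard Kerberos client tools are installed), you can list your active tickets:

```shell
# list the Kerberos tickets held in the current environment;
# a valid ticket for the IPP-GARCHING.MPG.DE realm should appear
klist
```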

ramesh@pc20181 ~ $ ssh rameshk@hercules01.bc.rzg.mpg.de

Data transfer to Hercules

There are multiple ways to transfer data to hercules; some use the fast link between Bonn and Garching and some do not. If you don't need a fast transfer, you can simply copy your data "the normal way":

rsync -Pav <data> <user-name>@hercules01.bc.rzg.mpg.de:<destination-dir-hercules>

NOTE: this method will prompt you for your MPCDF password!


If transfer speed is of the essence, there are two approaches to copying data to hercules. The easiest is to copy it directly to the hercules gateway in Bonn (hgw or herculesgw), into the directory /media/hercules. Your own directory is located at /media/hercules/u/<user-name>. From there your data is automatically synced to your home directory on hercules. Please note that only users with a hercules account are allowed to log in to herculesgw, and that data can only be copied to your own home directory. Due to the limited storage space in your home directory on hercules, this method is therefore only suited to smaller data sets.
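As a sketch (paths as described above; <user-name> stands for your hercules account name), copying a data set to the gateway could look like this:

```shell
# copy data into your synced area on the hercules gateway;
# from there it is automatically synced to your hercules home directory
rsync -Pav <data> <user-name>@hgw:/media/hercules/u/<user-name>/
```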



The second method transfers data directly to hercules via the hercules gateway. To use the 10GbE line to Garching, transfer data from one of the local machines connected to the 10GbE network (e.g. miraculix/miraculix2/verleihnix/archivesrv). On any of these machines, create an ssh tunnel to hercules using:

ssh -f -N -L <local-port>:hercules01.bc.rzg.mpg.de:22 <user-name>@hgw

where <local-port> is a free (unused) port on the machine where you run this command. To check whether the port is available:

lsof | grep <local-port>

NOTE: if you get a "bind: Cannot assign requested address" error, force ssh to use IPv4 with the additional "-4" option.

Now you can transfer data through this port using:

rsync -Pav -e "ssh -p <local-port>" <data> <user-name>@localhost:<destination-dir-hercules>

To simplify this copy process add the following to your ~/.ssh/config file:

Host htun
  Hostname localhost
  HostKeyAlias htun
  User <user-name>
  Port <local-port>

This addition allows you to copy data to hercules with a much simpler command, similar to a standard data transfer:

rsync -Pav <data> <user-name>@htun:<destination-dir-hercules>

Once all data has been copied to hercules, it is advisable to close the ssh tunnel again. To do so, log into the machine on which you opened the tunnel and identify the PID (second column) of the tunnel process using:

ps -aef | grep -i ssh

Then close/kill the tunnel with:

kill -9 <PID>
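Alternatively (assuming pgrep is available, as it is on most Linux systems), you can match the tunnel process directly instead of grepping the ps output:

```shell
# list PIDs and full command lines of ssh tunnels started with -f -N -L
pgrep -af "ssh -f -N -L"
```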

NOTE: for password-less data transfer you need to add a public ssh key from the hercules gateway to hercules.
Log into the gateway and type "ssh-keygen -t rsa", choose a file name for the private and public keys,
and just hit <Enter> when you are asked for a passphrase. Then add the newly created public key to hercules
using "ssh-copy-id <user-name>@hercules01.bc.rzg.mpg.de". If successful, you should now be able to log into
hercules from the hercules gateway without giving a password. You can then also transfer data password-free
via the hercules gateway using the method given above.
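Taken together, the key-setup steps described in the note, run on the hercules gateway, amount to:

```shell
# on the hercules gateway: create an RSA key pair
# (press <Enter> at the passphrase prompt for password-less login)
ssh-keygen -t rsa

# install the public key on hercules; you will be asked for
# your MPCDF password one last time
ssh-copy-id <user-name>@hercules01.bc.rzg.mpg.de

# test: this should now log in without asking for a password
ssh <user-name>@hercules01.bc.rzg.mpg.de
```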

The /p file system

All data in any of the directories on the /p file system are automatically migrated to the archive in Garching. To check where this data resides and how to recall it, see the following help page:

http://www.mpcdf.mpg.de/services/data/backup-archive/archives

Support

For help on modules/software/cluster, please email Christian (christian.guggenberger@rzg.mpg.de) or Markus (mjr@rzg.mpg.de).

VNC

To use VNC with the Hercules cluster, see:

http://www.mpcdf.mpg.de/services/network/vnc/vnc-at-the-mpcdf

Installing Python libraries

Sample script

Here's an example snippet which can be submitted with qsub:

### shell
#$ -S /bin/bash
### join stdout and stderr
#$ -j y
### change to current work dir
#$ -cwd
### do not send email reports
#$ -m n
### request parallel env with 8 cpus
#$ -pe openmp 8
### wallclock 2 hours
#$ -l h_rt=7200
### virtual limit per job 20GB
#$ -l h_vmem=20G
date

The CPU count specified with #$ -pe openmp XYZ can be varied from 1 to 24. #$ -pe openmp can be omitted, but then one CPU is assumed.

h_rt is mandatory and can be as much as 12 days (288:00:00). h_vmem is optional; if not present, 7G is set as default.
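Since h_rt is given as plain seconds in the script above (7200) but the maximum is quoted as HH:MM:SS (288:00:00), a small helper (a local sketch, not part of the cluster tools) shows the equivalence:

```shell
# convert an HH:MM:SS wallclock string to the plain-seconds form of h_rt
to_seconds() {
  IFS=: read -r h m s <<< "$1"
  echo $(( 10#$h * 3600 + 10#$m * 60 + 10#$s ))
}
to_seconds 02:00:00    # the 2-hour request above: 7200
to_seconds 288:00:00   # the 12-day maximum: 1036800
```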

Currently, no CPU binding is enforced; in other words, if users use more CPUs (e.g. create more threads) than requested, they will steal CPU time from other jobs.
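Assuming the standard Grid Engine client tools are available on the login node (the script name and job ID below are illustrative placeholders), a job script like the one above is submitted and monitored like this:

```shell
# submit the job script to the batch system
qsub job_script.sh

# list your own pending and running jobs
qstat -u <user-name>

# delete a job if needed
qdel <job-id>
```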

Docker/Singularity Tutorial

This tutorial helps you run your own programs (or specific versions of them) on the Hercules cluster using Docker and Singularity:

docksing.pdf

 
computing/garchingcomputing.1499440743.txt.gz · Last modified: 2017/07/07 17:19 by henning