Setup NGCompute at University of Canterbury
From BeSTGRID
Create VM
Follow general rules - Vladimir:Bootstrapping a virtual machine
Setup PBS
Local PBS configuration
- Download torque-2.1.8.tar.gz from http://www.clusterresources.com/pages/products/torque-resource-manager.php
- ./configure && make && make install
- /etc/profile.d/pbs.sh:
PBS_HOME=/var/spool/torque/
export PBS_HOME
- Add to /etc/services
pbs        15000/tcp   # added by Vladimir Mencl
pbs_dis    15001/tcp   # added by Vladimir Mencl
pbs_dis    15001/udp   # added by Vladimir Mencl
pbs_mom    15002/tcp   # added by Vladimir Mencl
pbs_mom    15003/udp   # added by Vladimir Mencl
pbs_mom    15003/tcp   # added by Vladimir Mencl
pbs_sched  15004/tcp   # added by Vladimir Mencl
- Define a single cluster (non-shared) node with four CPUs in $PBS_HOME/server_priv/nodes
ngcompute np=4
- Create configuration for pbs_mom in $PBS_HOME/mom_priv/config
$pbsserver ngcompute.canterbury.ac.nz
(pbs_mom was supposed to read the server name from $PBS_HOME/server_name, but without this setting it did not report the node as online; with it, the node works.)
- Initialize db
pbs_server -t create
- start server
pbs_mom
pbs_sched
pbs_server # skip when running for the first time
- Qmgr:
set server managers=vme28@ngcompute.canterbury.ac.nz # note: must match hostname reverse lookup
# BIG NOTE: /etc/hosts has temporary entry to ensure correct reverse lookup
create queue small
active queue small
set queue small queue_type=execution
set queue small enabled=true
set server scheduling=true
set queue small started=true
set server default_queue=small
# set PBS queue restrictions
set queue small resources_max.cput=168:00:00
set queue small resources_default.cput=72:00:00
set queue small resources_default.nodes=1 # needed: otherwise, pbs_sched keeps crashing
set server submit_hosts = ngcompute.canterbury.ac.nz
set server submit_hosts += grid.canterbury.ac.nz
set server submit_hosts += ng2.canterbury.ac.nz
# does not really work (see below), but we want these machines to submit
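The /etc/services entries listed above are easy to append twice when the setup is repeated. A minimal sketch of an idempotent helper, assuming bash; the function name add_service is hypothetical (not part of TORQUE), and the file is a parameter so it can be tried on a copy first:

```bash
#!/bin/bash
# add_service FILE NAME PORT/PROTO [COMMENT]   (hypothetical helper)
# Appends "NAME<TAB>PORT/PROTO" to FILE unless the same entry is already there.
add_service() {
    local file=$1 name=$2 portproto=$3 comment=${4:-}
    # An existing entry starts the line with NAME, whitespace, then PORT/PROTO.
    if grep -Eq "^${name}[[:space:]]+${portproto}([[:space:]]|\$)" "$file" 2>/dev/null; then
        return 0
    fi
    # Append the entry, with the optional trailing comment.
    printf '%s\t%s%s\n' "$name" "$portproto" "${comment:+ # $comment}" >> "$file"
}
```

Usage would look like: add_service /etc/services pbs_sched 15004/tcp "added by Vladimir Mencl" (run as root, once per entry).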
Permit grid and ng2 to submit jobs to ngcompute
- permit the hosts to submit jobs: the submit_hosts property does not work (it appears to be ignored); instead, use /etc/hosts.equiv on ngcompute:
grid.canterbury.ac.nz
grid.giga.canterbury.ac.nz
ng2.canterbury.ac.nz
ng2.giga.canterbury.ac.nz
- permit ngcompute to return the results (stdout, stderr) to the submission hosts. This means allowing password-less logins for all users from ngcompute to grid and ng2, which takes several ssh configuration steps:
- ngcompute must know grid (to be willing to open connection)
- put /etc/ssh/ssh_host_rsa_key.pub from each machine to /etc/ssh/ssh_known_hosts on ngcompute.
grid.giga.canterbury.ac.nz ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAIEAuMCkTx4GK2s.....
grid.canterbury.ac.nz ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAIEAuMCkTx4GK2s.....
ng2.canterbury.ac.nz ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAIEAyVtstC7bL1i.....
ng2.giga.canterbury.ac.nz ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAIEAyVtstC7bL1i.....
- grid and ng2 must know host key of ngcompute to authenticate:
- put /etc/ssh/ssh_host_rsa_key.pub from ngcompute to /etc/ssh/ssh_known_hosts on each machine (grid and ng2)
ngcompute.giga.canterbury.ac.nz ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAIEA0WT6K2yh78W.....
ngcompute.canterbury.ac.nz ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAIEA0WT6K2yh78W.....
- grid and ng2 must trust ngcompute to allow logins:
- put ngcompute into the list of trusted hosts on both grid and ng2 - file /etc/ssh/shosts.equiv
+ngcompute.giga.canterbury.ac.nz
+ngcompute.canterbury.ac.nz
- enable host based authentication
- change /etc/ssh/sshd_config on both grid and ng2 to include
HostbasedAuthentication yes
- allow users to use ssh-keysign (suid-app) and client-side host-based authentication:
- change /etc/ssh/ssh_config on ngcompute to include
EnableSSHKeysign yes
HostbasedAuthentication yes
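With settings spread over three machines, it helps to check each file before testing a login. A minimal sketch, assuming bash and grep; has_directive is a hypothetical helper name, and it only looks for an uncommented "Key value" line (it ignores Match blocks, the "Key=value" form, and which occurrence sshd would actually honour):

```bash
#!/bin/bash
# has_directive FILE KEY VALUE   (hypothetical helper)
# Succeeds if FILE has an uncommented line "KEY VALUE".
# Note: -i makes both key and value case-insensitive, which is looser than sshd.
has_directive() {
    local file=$1 key=$2 value=$3
    grep -Eiq "^[[:space:]]*${key}[[:space:]]+${value}[[:space:]]*\$" "$file"
}
```

On grid and ng2 one would run has_directive /etc/ssh/sshd_config HostbasedAuthentication yes, and on ngcompute the same for EnableSSHKeysign and HostbasedAuthentication in /etc/ssh/ssh_config. An end-to-end check from ngcompute, as an ordinary user, could then be ssh -o PreferredAuthentications=hostbased grid.canterbury.ac.nz true, which should succeed without a password prompt.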
Rationale on configuration
- To properly allocate jobs to the 4 available CPUs, the host has to be set up as a cluster (non-shared) node with np=4. Initially, pbs_sched kept crashing: each job has to specify the number of nodes it needs. This is now ensured by the default resources_default.nodes=1. (An individual job can specify its resources with, for example, qsub -l nodes=1.)
- For pbs_server to return results to the submitting machine (no way to override this for a shared filesystem was found), it was necessary to permit ssh access from ngcompute to the submission hosts, grid and ng2. To permit submission from these hosts, they have to be included in /etc/hosts.equiv on ngcompute. This, however, does not permit password-less logins from these hosts to ngcompute, as long as HostbasedAuthentication is not turned on in sshd_config on ngcompute.
Install MPI
Choosing an MPI implementation
At first, I planned to use LAM/MPI because it is free and available as an RPM package in CentOS. However, LAM/MPI needs a custom startup procedure (lamboot / lamhalt) and, more importantly, because of the way lamboot starts lamd, CPU usage accounting does not work with LAM/MPI.
lamd is started as an orphaned process; consequently, the CPU usage of lamd (and of the computation processes started by lamd) is not counted in the job's total CPU time.
MPICH2 does not suffer from these problems, so it was chosen as the preferred MPI implementation (at least for ngcompute).
Installing MPICH2
Key trick - make gforker the default Process Manager:
./configure --with-pm=gforker:mpd:remshell
make
make install
To start a program, no daemon startup/shutdown is necessary - only use
mpiexec -np "`cat $PBS_NODEFILE | wc -l `" program
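The process count in the command above is simply the number of lines in $PBS_NODEFILE (one line per allocated CPU slot). A minimal sketch of the same computation as a reusable function, assuming bash; count_slots is a hypothetical name:

```bash
#!/bin/bash
# count_slots [NODEFILE]   (hypothetical helper)
# Prints the number of lines in the node file; defaults to $PBS_NODEFILE
# and aborts with a message if neither an argument nor the variable is set.
count_slots() {
    local nodefile=${1:-${PBS_NODEFILE:?PBS_NODEFILE is not set}}
    # Redirecting into wc avoids the extra cat and keeps the file name out of the output.
    wc -l < "$nodefile" | tr -d '[:space:]'
}
```

Inside a job script this would be used as mpiexec -np "$(count_slots)" program.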
Issue: PATH in PBS
MPICH2 installs into /usr/local/bin, and this directory is _not_ in the default PATH available for PBS jobs. PBS defaults (in $PBS_HOME/pbs_environment) contain only /bin:/usr/bin, and /etc/profile does not add /usr/local/bin.
The solution was to create /etc/profile.d/mpich2.sh:
#!/bin/sh
export MPICH2_HOME=/usr/local
MPICH2_BIN="$MPICH2_HOME/bin"
# Append MPICH2_BIN only if it is not already a component of PATH
if ! echo $PATH | /bin/egrep -q "(^|:)$MPICH2_BIN($|:)" ; then
    PATH=$PATH:$MPICH2_BIN
fi
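The "append only if missing" guard can also be written without spawning egrep, by wrapping PATH in colons and using a case pattern. A minimal sketch in plain sh; path_append is a hypothetical helper name, not something MPICH2 or PBS provides:

```sh
#!/bin/sh
# path_append DIR   (hypothetical helper)
# Appends DIR to PATH unless it is already one of the colon-separated entries.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;              # already present, nothing to do
        *) PATH="$PATH:$1" ;;     # not present, append at the end
    esac
}
```

In the profile script it would be called as path_append /usr/local/bin; calling it repeatedly leaves PATH unchanged after the first call.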
Old: installing LAM/MPI
http_proxy=http://gridws1:3128 yum install lam
To run MPI jobs, use:
qsub -N mrbayes2 -l nodes=1:ppn=2 run-mrbayes-mp.sh
with run-mrbayes-mp.sh containing
lamboot $PBS_NODEFILE
mpiexec -ssi rpi sysv C ~/inst/mrbayes-3.1.2-lammpi/mb mytaxa.NEX
lamclean
lamhalt
Note: SSI RPI modules available are:
- tcp - TCP communication
- crtcp - Checkpointable Restartable TCP
- sysv - SHM with semaphore blocking - for overcommitted nodes
- usysv - SHM and spinlocks - more efficient on non-overcommitted nodes
More information (and gm and lamd) can be found in lamssi_rpi(7)
The recommended configuration is sysv - shared memory is more efficient for single-node setup, and we want blocking synchronization here - otherwise, the other virtual machines would be unnecessarily starving.
Hack: changed /etc/lam/lam-conf.lamd to report time usage:
/usr/bin/time lamd $inet_topo $debug $session_prefix $session_suffix
Allow incoming mail
In /etc/mail/sendmail.cf change the following:
O DaemonPortOptions=Port=smtp,Addr=0.0.0.0, Name=MTA
# O DaemonPortOptions=Port=smtp,Addr=127.0.0.1, Name=MTA
Reason: PBS occasionally sends email back to the submitting user.
Start up PBS server automatically
/etc/rc.d/init.d/pbs-server
#!/bin/bash
case $1 in
start)
/usr/local/sbin/pbs_mom
/usr/local/sbin/pbs_sched
sleep 1
/usr/local/sbin/pbs_server
### /usr/local/bin/qmgr -c "set server scheduling = True"
;;
stop)
/usr/local/bin/qterm -t quick
sleep 3
killall pbs_server pbs_sched pbs_mom
;;
esac
Setup NFS client
To mount directories provided by grid, the following has to be turned on:
Edit /etc/fstab
grid.canterbury.ac.nz:/export/home        /home        nfs fg,retry=20,hard 0 0
grid.canterbury.ac.nz:/export/opt/shared  /opt/shared  nfs fg,retry=20,hard 0 0
Arrange for mount points (empty /home and /opt/shared) to exist and then run:
chkconfig portmap on
chkconfig netfs on
service portmap start
service netfs start
Job accounting reporting
To report grid usage to the Grid Operations Center (according to the Accounting Plan), a script has to be called daily (after midnight).
- Create:
/usr/local/sbin/send_grid_usage
#! /bin/bash
# This script emails the grid usage report for yesterday
# to the grid operations centre.
# It should be called from crontab daily
# Please note the email subject, it is in the format of:
# <cluster_name> <site_Name> <date>
# Please keep this consistent, we need it like that.
. /etc/profile.d/pbs.sh
YESTERDAY=`date --date=yesterday +%Y%m%d`
# cd /usr/spool/PBS/server_priv/accounting;
cd $PBS_HOME/server_priv/accounting;
grep Grid_ "$YESTERDAY" | mail -s "ngcompute NZ-Cant $YESTERDAY" grid_pulse@arcs.org.au
# The log message names the same address the report was mailed to
logger -t GridAccounting "Grid usage from $PBS_HOME/server_priv/accounting/$YESTERDAY emailed to grid_pulse@arcs.org.au"
- Note: When this job is run from cron, the PBS_HOME environment variable is not set; hence, it is necessary to source either /etc/profile (complete environment) or at least /etc/profile.d/pbs.sh
- Create /etc/cron.d/send_grid_usage.cron
3 1 * * * root /usr/local/sbin/send_grid_usage
- service crond restart
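If PBS wrote no accounting file for a given day, the grep in the script above fails on a missing file and an empty report is still mailed. A minimal sketch of the extraction step with that case handled, assuming bash; grid_usage_for is a hypothetical function name, and the "Grid_" pattern is the one used in the script above:

```bash
#!/bin/bash
# grid_usage_for DIR DAY   (hypothetical helper)
# Prints the accounting records containing "Grid_" from DIR/DAY (DAY as YYYYMMDD).
# Returns 1 if there is no readable accounting file for that day.
grid_usage_for() {
    local dir=$1 day=$2
    [ -r "$dir/$day" ] || return 1
    # grep exits 1 when nothing matches; a day without grid jobs is not an error.
    grep Grid_ "$dir/$day" || true
}
```

The cron script could then call grid_usage_for "$PBS_HOME/server_priv/accounting" "$YESTERDAY" and pipe the output to mail only when the function succeeds, logging a different message otherwise.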
Documentation
- PBS: http://www.clusterresources.com/wiki/doku.php?id=torque:torque_wiki
- OpenSSH host-based authentication: http://cert.uni-stuttgart.de/doc/openssh/host-based.php and http://tiger.la.asu.edu/Quick_Ref/OpenSSH_quickref.pdf
- Tricks:
set server operators += username@headnode
- Notes
- No need to configure permissions for pbs_mom - localhost and gethostname are always permitted.
- Important: start pbs_server as LAST:
pbs_mom && sleep 1 && pbs_sched && sleep 1 && pbs_server && echo "PBS started"
- To restart server: qterm -t quick (kill pbs server) and restart it
- NIS setup is documented at http://www.tldp.org/HOWTO/NIS-HOWTO/ ... but don't do it; use locally replicated users (griduser, gridadmin) instead. (On FC5, NIS is spread over three packages: ypserv-2.19-0, ypbind-1.19-1, and yp-tools-2.9-0.)
