Doc:Code-Aster-Cluster-Config

From CAELinuxWiki

To use MPI with Code-Aster in CAELinux 2011, you do not need (and should not install) MPICH2 or any other MPI implementation: everything is already there. Code-Aster 11.0 is compiled against the OpenMPI libraries, and having several MPI libraries installed on the same system may create configuration problems.

Personally, this is how I proceed, starting from two PCs with a fresh install of CAELinux 2011 (this also works in LiveDVD/LiveUSB mode). Here is a small "How To":

1) Set up the network so that the machines can reach each other: I use Network Manager to set up static IP addresses. Then set the hostnames:


on machine 1:

sudo hostname caepc1

on machine 2:

sudo hostname caepc2


2) Edit /etc/hosts on both machines to map hostnames to IP addresses:


sudo nano /etc/hosts


Add lines such as the following after the "127.0.1.1 xxxx" line:


192.168.0.1 caepc1

192.168.0.2 caepc2
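
Before going further, it can help to confirm that both names are really present in the hosts file. The following is a minimal sketch: `missing_hosts` is a hypothetical helper name, not part of CAELinux or Code-Aster, and it only inspects the file rather than testing the network.

```shell
# Hypothetical helper (not part of CAELinux): print every hostname from the
# argument list that has no entry in the given hosts file.
# Usage: missing_hosts FILE NAME...
missing_hosts() {
    hosts_file=$1
    shift
    for name in "$@"; do
        # Look for the name in any field after the address; skip blank lines
        # and comments, and stop scanning a line at an inline "#".
        if ! awk -v n="$name" '
            NF == 0 || $1 ~ /^#/ { next }
            {
                for (i = 2; i <= NF; i++) {
                    if ($i ~ /^#/) break
                    if ($i == n) found = 1
                }
            }
            END { exit (found ? 0 : 1) }' "$hosts_file"; then
            echo "$name"
        fi
    done
}
```

For example, `missing_hosts /etc/hosts caepc1 caepc2` prints nothing when both entries are in place, and prints the missing names otherwise.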


3) Edit the MPI host list directly in /opt/aster110/etc/codeaster/aster-mpihosts

for example (using OpenMPI hostfile syntax):


caepc1 slots=1

caepc2 slots=1
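
The total number of slots declared in this file is the upper bound you will want for mpi_nbcpu later on. As a rough sketch, the hypothetical helper `total_slots` below adds up the `slots=` values of a hostfile. Counting a host without `slots=` as one slot is a simplifying assumption of this sketch, not a statement about how every OpenMPI version behaves.

```shell
# Hypothetical helper: sum the slots declared in an OpenMPI-style hostfile.
# Usage: total_slots FILE
total_slots() {
    awk '
        NF == 0 || $1 ~ /^#/ { next }    # skip blank lines and comments
        {
            slots = 1    # assumption of this sketch: no slots= means one slot
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^slots=[0-9]+$/) {
                    split($i, kv, "=")
                    slots = kv[2]
                }
            }
            total += slots
        }
        END { print total + 0 }' "$1"
}
```

With the two-line example above, `total_slots /opt/aster110/etc/codeaster/aster-mpihosts` should print 2. Once the ssh setup of step 5 is done, and assuming `mpirun` is on your PATH, something like `mpirun --hostfile /opt/aster110/etc/codeaster/aster-mpihosts -np 2 hostname` can serve as a quick smoke test of the hostfile.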


4) Optional: if you have more than 8 GB of RAM per node or more than 16 cores in the cluster, also edit /opt/aster110/etc/codeaster/asrun to tune "interactif_memmax" (maximum memory per node) and "interactif_mpi_nbpmax" (number of cores in the cluster).


(Optional) Passwords: if using LiveDVD/LiveUSB mode, you need to set a password for the default user caelinux. On each node, run "passwd" in a terminal (the default password is empty) to set a new password.


5) SSH setup: you need passwordless ssh login between the two hosts. On the first node, run:

scp /home/caelinux/.ssh/id* caepc2:/home/caelinux/.ssh/

scp /home/caelinux/.ssh/authorized* caepc2:/home/caelinux/.ssh/

ssh-keyscan caepc1 >> /home/caelinux/.ssh/known_hosts

ssh-keyscan caepc2 >> /home/caelinux/.ssh/known_hosts

scp /home/caelinux/.ssh/known_hosts caepc2:/home/caelinux/.ssh/
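
To check that the logins really work without a password prompt, a small loop over the hosts is enough. This is only a sketch: `failed_ssh_hosts` is a hypothetical helper name, and the 5-second timeout is an arbitrary choice.

```shell
# Hypothetical helper: print the hosts that refuse a non-interactive login.
# Usage: failed_ssh_hosts HOST...
failed_ssh_hosts() {
    for host in "$@"; do
        # BatchMode=yes makes ssh fail instead of asking for a password,
        # so a password prompt shows up here as a failure.
        if ! ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true \
                </dev/null >/dev/null 2>&1; then
            echo "$host"
        fi
    done
}
```

Run `failed_ssh_hosts caepc1 caepc2` on each node; empty output means every host accepted the key-based login.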


6) Set up a shared temp directory with NFS on node 1:


sudo mkdir /srv/shared_tmp

sudo chmod a+rwx /srv/shared_tmp

sudo nano /etc/exports


then add the following line and save:


/srv/shared_tmp    *(rw,async)


then


sudo exportfs -a


Now create the mount point and mount the shared folder. Run this on all nodes:

sudo mkdir /mnt/shared_tmp

sudo chmod a+rwx /mnt/shared_tmp

sudo mount -t nfs -o rw,rsize=8192,wsize=8192 caepc1:/srv/shared_tmp /mnt/shared_tmp
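
A quick way to confirm that the mount is usable is to write and read back a small file from each node. The sketch below assumes nothing beyond a POSIX shell; `check_shared_dir` is a hypothetical helper name.

```shell
# Hypothetical helper: verify that a directory accepts writes from this node.
# Usage: check_shared_dir DIR
check_shared_dir() {
    dir=$1
    # Include the node name and shell PID so that nodes testing at the same
    # time do not overwrite each other's marker file.
    marker="$dir/.write_test_$(uname -n)_$$"
    if { echo ok > "$marker"; } 2>/dev/null && [ "$(cat "$marker")" = "ok" ]; then
        rm -f "$marker"
        echo "OK: $dir is writable from $(uname -n)"
    else
        echo "FAIL: cannot write to $dir from $(uname -n)" >&2
        return 1
    fi
}
```

Running `check_shared_dir /mnt/shared_tmp` on every node should report OK each time. Note that this only tests write access, not that the directory is actually the NFS share.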


7) Set up the Aster config to use this shared temp directory:

nano /opt/aster110/etc/codeaster/asrun


edit the line with "shared_tmp" as follows:


shared_tmp : /mnt/shared_tmp


then save
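
To double-check the edit without reopening the editor, you can read the value back from the file. The sketch below assumes the "key : value" layout shown above; `asrun_value` is a hypothetical helper name, not an as_run command.

```shell
# Hypothetical helper: print the value of KEY from a "key : value" style file.
# Usage: asrun_value FILE KEY
asrun_value() {
    awk -v k="$2" '
        $1 ~ /^#/ { next }                # skip comment lines
        index($0, ":") == 0 { next }      # skip lines without a separator
        {
            pos = index($0, ":")
            key = substr($0, 1, pos - 1)
            val = substr($0, pos + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", key)   # trim both sides
            gsub(/^[ \t]+|[ \t]+$/, "", val)
            if (key == k) { print val; exit }
        }' "$1"
}
```

For instance, `asrun_value /opt/aster110/etc/codeaster/asrun shared_tmp` should print /mnt/shared_tmp after the change. The same call works for "interactif_memmax" and "interactif_mpi_nbpmax" from step 4.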


8) Open ASTK, go to the server settings and refresh, then create your job.

Under Options, set:

ncpus = 1 (no OpenMP)

mpi_nbcpu = total number of cores to use (mpi_nbnoeud * cores_per_host)

mpi_nbnoeud = number of compute nodes
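
As a worked example of the relation between these options, the snippet below computes mpi_nbcpu for the two-node cluster of step 3, where each node offers one slot. The variable `cores_per_host` is only an illustrative name, not an ASTK option.

```shell
# Illustrative values: the two-node, one-slot-per-node cluster from step 3.
mpi_nbnoeud=2        # number of compute nodes
cores_per_host=1     # cores (slots) used on each node

# Total number of MPI processes to request in ASTK.
mpi_nbcpu=$((mpi_nbnoeud * cores_per_host))
echo "mpi_nbnoeud=$mpi_nbnoeud mpi_nbcpu=$mpi_nbcpu"
```

With these values the job is submitted with ncpus=1, mpi_nbcpu=2 and mpi_nbnoeud=2.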


And finally it should run on several nodes!!


The key point is that you NEED a shared tmp folder to run jobs on a cluster.