PHOENICS
As suggested, the entire PHOENICS 2011 package is installed on ANDY, and users can run the X11 version of the PHOENICS Commander display tool from ANDY's head node if they have connected using 'ssh -X andy.csi.cuny.edu', where the '-X' option ensures that X11 images are passed back to the original client. Doing this from outside the College of Staten Island campus, where the CUNY HPC Center is located, may produce poor results because the X11 traffic will have to be forwarded through the HPC Center gateway system. CUNY has also licensed a number of seats for office-local desktop installations of PHOENICS (for either Windows or Linux), so remote X11 use should rarely be necessary. Job preparation and post-processing work is generally accomplished most efficiently on the local desktop using the Windows version of PHOENICS VR, which can be run directly or from PHOENICS Commander.
A rough general outline of the PHOENICS work cycle is:
1. The user runs the VR Editor (preprocessor) on their workstation (or on ANDY), perhaps selecting a library case (e.g. 274) and making changes to it to match their specific requirements.
2. The user exits the VR Editor, at which point the input files 'q1' and 'eardat' are created. If the user is preprocessing on their desktop, these files must then be transferred to ANDY using the 'scp' command or via the 'PuTTY' utility for Windows.
3. The user runs the solver on ANDY (typically the parallel version, 'parexe') from their working directory using the SLURM batch submit script presented below. The solver reads the files 'q1' and 'eardat' (and potentially some other input files) and writes the key output files 'phi' and 'result'.
4. The user copies these output files back to their desktop (or not) and runs the VR Viewer (postprocessor), which reads the graphics output file 'phi'; alternatively, tabular results can be viewed directly in the 'result' file.
POLIS, available in Linux and Windows, has further useful information on running PHOENICS, including tutorials, documentation, and material on all PHOENICS commands and topics [1]. Graphical monitoring should be deactivated during parallel runs in ANDY's batch queues. To do this, users should place two leading spaces in front of the TSTSWP command in the 'q1' file, as illustrated below. The TSTSWP command is present in most library cases, including case 274, which is a useful test case. Graphical monitoring can be left turned on when running the sequential 'earexe' on the desktop, where it gives useful real-time information on sweeps, values, and convergence progress.
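For example, a 'q1' file might contain a line such as the following (the value shown is illustrative and varies from case to case):

TSTSWP = -1

For a batch run on ANDY, the same line would be indented by two leading spaces so that the solver skips graphical monitoring:

  TSTSWP = -1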
Details on the use of the display and non-parallel PHOENICS tools can be found at the CHAM website and in the CHAM Encyclopaedia here [2].
The process of setting up a PHOENICS working directory and running the parallel version of 'earth' (parexe) on ANDY is described below. As a first step, users would typically create a directory called 'phoenics' in their $HOME directory as follows:
cd; mkdir phoenics
Next, the default PHOENICS installation root directory (version 2011 is the current default) should be symbolically linked into this directory as the 'lp36' subdirectory:
cd phoenics
ln -s /share/apps/phoenics/default ./lp36
The user must then generate the required input files for the 'earth' module which, as mentioned above in the PHOENICS work cycle section, are the 'q1' and 'eardat' files created by the VR Editor. These can be generated on ANDY, but it is generally easier to do this from the user's desktop installation of PHOENICS.
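For example, from a Linux or Mac desktop the two files could be copied into the 'phoenics' working directory on ANDY with 'scp' as follows (the user name 'your_userid' is a placeholder; Windows users would use PuTTY's 'pscp' or a similar transfer tool):

scp q1 eardat your_userid@andy.csi.cuny.edu:phoenics/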
Because the current default version of PHOENICS, version 2011, was built with an older version of MPI that is no longer the default, users must use the 'module' command to unload the current defaults and load the previous set before submitting the PHOENICS SLURM script below. This is a fairly simple procedure:
$ module list
Currently Loaded Modulefiles:
  1) pbs/11.3.0.121723   2) cuda/5.0   3) intel/13.0.1.117   4) openmpi/1.6.3_intel
$
$ module unload intel/13.0.1.117
$ module unload openmpi/1.6.3_intel
$
$ module load intel/12.1.3.293
$
$ module load openmpi/1.5.5_intel
Note: Intel compilers will be set to version 12.1.3.293
$
$ module list
Currently Loaded Modulefiles:
  1) pbs/11.3.0.121723   2) cuda/5.0   3) intel/12.1.3.293   4) openmpi/1.5.5_intel
Once the input files have been created and transferred into the working directory and the older modules have been loaded on ANDY, the following SLURM batch script can be used to run the job on ANDY. The progress of the job can be tracked with the SLURM 'squeue' command.
#!/bin/bash
#SBATCH --partition production_qdr
#SBATCH --job-name phx_test
#SBATCH --ntasks=8
#SBATCH --mem-per-cpu=2880

# Find out the name of the master execution host (compute node)
echo -n ">>>> SLURM Master compute node is: "
hostname

# Take a look at the set of compute nodes that SLURM gave you
echo $SLURM_JOB_NODELIST

# You must explicitly change to the working directory in SLURM
cd $SLURM_SUBMIT_DIR

# Just point to the parallel executable to run
echo ">>>> Begin PHOENICS MPI Parallel Run ..."
echo ""
echo "mpirun -np 8 ./lp36/d_earth/parexe"
mpirun -np 8 ./lp36/d_earth/parexe
echo ""
echo ">>>> End PHOENICS MPI Parallel Run ..."
The job can be submitted with:
sbatch 8Proc.job
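Once submitted, the job's state can be checked with SLURM's 'squeue' command, for example:

squeue -u $USER

Unless redirected with the '--output' option, the script's standard output and error are written to a file named 'slurm-<jobid>.out' in the submission directory.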
Constructing a SLURM batch script is described in detail elsewhere in this Wiki document, but in short this script requests the QDR Infiniband production partition ('production_qdr'), which runs the job on the side of ANDY with the fastest interconnect. It asks for 8 processors (cores), each with 2880 Mbytes of memory, and allows SLURM to select those processors on a least-loaded basis. Because this is just an 8-processor job, it could be packed onto a single physical node on ANDY for better scaling (for example, by adding '--nodes=1' to the script), but this would delay its start because SLURM would have to locate a completely free node.
During the run, 'parexe' creates (N-1) directories (named Proc00#), where N is the number of processors requested (note: if the Proc00# directories do not already exist, they will be created, but an error message will appear in the SLURM error log and can be ignored). The output from process zero is written into the working directory from which the script was submitted, while the output from each of the other MPI processes is written into its associated 'Proc00#' directory. Upon successful completion, the 'result' file should show that the requested number of iterations (sweeps) was completed and print the starting and ending wall-clock times. At this point, the results (the 'phi' and 'result' files) from the SLURM parallel job can be copied back to the user's desktop for post-processing, as in the example below.
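From a Linux or Mac desktop, for instance, the two key output files could be pulled back with 'scp' (again, 'your_userid' and the remote path are placeholders; Windows users would use PuTTY's 'pscp' or an equivalent tool):

scp your_userid@andy.csi.cuny.edu:phoenics/phi .
scp your_userid@andy.csi.cuny.edu:phoenics/result .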
NOTE: A bug is present in the non-graphical, batch version of PHOENICS that is used on the CUNY HPC clusters. This problem does not occur in Windows runs. To avoid it, a workaround modification to the 'q1' input file is required. The problem occurs only in jobs that require SWEEP counts greater than 10,000 (e.g. SWEEP=20000). Users requesting larger SWEEP counts must include the following in their 'q1' input files to avoid having their jobs terminated at 10,000 SWEEPs.
USTEER=F
This addition forces a bypass of the graphical I/O monitoring capability in PHOENICS and prevents that section of code from capping the SWEEP count at 10,000 SWEEPs.
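As a sketch, a 'q1' file for a long run might then include lines like the following (assuming the sweep count is set with LSWEEP, as in many library cases; the value 20000 is illustrative, and the essential addition is the USTEER=F line):

LSWEEP=20000
USTEER=F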
Finally, PHOENICS has been licensed broadly by the CUNY HPC Center, which can provide new activation keys for any desktop copies whose annual activation keys have expired.