GARLI

At the CUNY HPC Center, GARLI is installed on ANDY. GARLI has both a serial and an MPI parallel version; both take their input from a simple text configuration file ('garli.conf') and a '.nex' sequence file ('rana.nex', for instance). Like other applications on ANDY, GARLI's path and environment variables are controlled using the modules utility. To include all required environment variables and the path to the GARLI executables, run the module load command (the modules utility is discussed in detail above):

module load garli
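
To confirm that the module loaded and that the GARLI executables are on your PATH, the standard module and shell commands can be used (the executable name 'garli_mpi' is the one used in the script below; names may vary between installations):

module list              # 'garli' should appear among the currently loaded modules
which garli_mpi          # prints the full path to the MPI executable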

Below is an example SLURM script that will run the frog ('rana.nex') test case provided with the distribution archive in /share/apps/garli/default/examples/basic. Users can copy the necessary files from this location into their working directory before submitting the job.
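
For example, the test files can be copied into a working directory before submitting the job (the directory name used here is only illustrative):

mkdir -p ~/garli_test
cd ~/garli_test
cp /share/apps/garli/default/examples/basic/* .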

#!/bin/bash
#SBATCH --partition production
#SBATCH --job-name GARLI_mpi
#SBATCH --nodes=2
#SBATCH --ntasks=2
#SBATCH --mem=2880
#SBATCH --output=GARLI_mpi.o%j
#SBATCH --error=GARLI_mpi.e%j

# Find out name of master execution host (compute node)
echo -n ">>>> SLURM Master compute node is: "
hostname

# Change to the directory the job was submitted from
cd $SLURM_SUBMIT_DIR

# Use 'mpirun' and point to the MPI parallel executable to run;
# an MPI library built with SLURM support picks up the allocated nodes automatically
echo ">>>> Begin GARLI MPI Run ..."
mpirun -np 2 garli_mpi > garli_mpi.out 2>&1
echo ">>>> End GARLI MPI Run ..."

This script can be dropped into a file (say, garli_mpi.job) and started with the command:

sbatch garli_mpi.job
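
Once submitted, the job can be monitored and, if necessary, cancelled with the standard SLURM commands:

squeue -u $USER        # show your pending and running jobs
scancel <job_id>       # cancel the job with the given ID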

Running the 'rana.nex' test case should take less than 15 minutes and will produce SLURM output and error files beginning with the job name 'GARLI_mpi'. The primary GARLI application results are written to the file named after the greater-than sign on the GARLI command line, here 'garli_mpi.out'. The expression '2>&1' combines the program's Unix standard output with its Unix standard error. Users should always explicitly specify the name of the application's output file in this way to ensure that it is written directly into the user's working directory, which has much more disk space than the SLURM spool directory on /var.
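
When the job completes, the log and result files should all be in the working directory and can be checked from there, for example (the file names assume the '--output'/'--error' settings and the redirection used in the script above):

ls -l GARLI_mpi.o* GARLI_mpi.e* garli_mpi.out
tail garli_mpi.out     # last lines of the combined GARLI stdout/stderr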

Details on the meaning of the SLURM script are covered in the SLURM section above. The most important lines here are the '#SBATCH --nodes=2', '--ntasks=2', and '--mem=2880' directives. The first two instruct SLURM to allocate two MPI tasks (one core each) spread across two compute nodes, while '--mem=2880' requests 2,880 MB of memory on each node; SLURM will place the tasks on whichever nodes in the partition have free resources. The master compute node that SLURM finally selects to run your job will be printed in the SLURM output file by the 'hostname' command.
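
For the serial version of GARLI mentioned at the top of this page, only one task is needed and no MPI launcher is required. A minimal sketch of how the resource request and launch line would change is shown below; the serial executable name 'garli' is an assumption and should be checked against the module (e.g. with 'module show garli'):

#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --mem=2880

# Launch the serial executable directly, no 'mpirun' needed
garli > garli_serial.out 2>&1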