GENOMEPOP2


The CUNY HPC Center has installed GenomePop2 (/share/apps/genomepop) on ANDY. GenomePop2 is a serial code that reads all of its input parameters from a file named 'GP2Input.txt' in the user's working directory. Setting up this file is explained in the How-To section of the GenomePop2 website [1]. The following SLURM batch script runs the third example from the How-To, which defines different ancestral SNP alleles in different populations.

NOTE: GenomePop version 1 has also been installed and can be found at '/share/apps/genomepop/1.0.6/bin/genomepop1'

#!/bin/bash
#SBATCH --partition=production
#SBATCH --job-name=GENPOP2_serial
#SBATCH --nodes=1
#SBATCH --ntasks=1

# Find out name of master execution host (compute node)
echo -n ">>>> SLURM Master compute node is: "
hostname

# Change to the directory the job was submitted from
# (SLURM starts there by default; the explicit cd is a safeguard)
cd "$SLURM_SUBMIT_DIR"

# Just point to the serial executable to run
echo ">>>> Begin GENPOP2 Serial Run ..."
echo ""
/share/apps/genomepop/default/bin/genomepop2
echo ""
echo ">>>> End   GENPOP2 Serial Run ..."

This script can be saved to a file (say genomepop2.job) and submitted with the command:

sbatch genomepop2.job
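When the submission is driven from another script, it can help to capture the job ID that SLURM reports. The sketch below assumes the usual "Submitted batch job <id>" message printed on submission; the helper name extract_job_id is hypothetical and not part of SLURM or GenomePop2.

```shell
#!/bin/bash
# Hypothetical helper: pull the numeric job ID out of the
# "Submitted batch job <id>" message (assumed message format).
extract_job_id() {
    # Print only the digits that follow the expected prefix;
    # print nothing if the message does not match.
    printf '%s\n' "$1" | sed -n 's/^Submitted batch job \([0-9][0-9]*\).*$/\1/p'
}

# Possible use on a login node (commented out so the helper can be sourced safely):
#   jobid=$(extract_job_id "$(sbatch genomepop2.job)")
#   squeue -j "$jobid"    # show the job while it is pending or running
```

As an alternative, 'sbatch --parsable genomepop2.job' prints the job ID without the surrounding text, so no parsing is needed.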

This test case should take less than a minute to run. Because the script sets no '--output' or '--error' option, SLURM writes both standard output and standard error to a single file named 'slurm-<jobid>.out' in the submission directory; the job name 'GENPOP2_serial' identifies the job in the queue listing. Details on the meaning of the SLURM script are covered above in the SLURM section. The most important lines are '#SBATCH --nodes=1' and '#SBATCH --ntasks=1'. The first instructs SLURM to allocate a single compute node. The second requests a single task, which by default receives one processor (core), all that a serial code such as GenomePop2 can use. The script makes no explicit memory request, so the job receives the site's default memory allocation (given here as 1,920 MB). The master compute node that SLURM finally selects to run the job will be printed in the SLURM output file by the 'hostname' command.
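If output and error files named after the job are wanted, the names and the memory request can be set explicitly in the script header. The fragment below is a sketch using the standard sbatch filename patterns '%x' (job name) and '%j' (job ID); the 1920M figure only mirrors the value quoted in the text and is an assumption, not a site requirement.

```shell
#SBATCH --output=%x.%j.out   # stdout goes to e.g. GENPOP2_serial.<jobid>.out
#SBATCH --error=%x.%j.err    # stderr goes to e.g. GENPOP2_serial.<jobid>.err
#SBATCH --mem=1920M          # assumed value: memory requested for the job
```

These lines would go with the other '#SBATCH' directives, before the first executable command in the script.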

While it is not visible in this SLURM script, your customized 'GP2Input.txt' file MUST be present in the working directory for the job. When the job completes, GenomePop2 will have created a subdirectory called 'GP2_Results' with the results files in it. One could easily adapt this script to run GenomePop version 1.
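Since a missing input file is an easy mistake to make, a small guard in the batch script can stop the job early. This is a minimal sketch: the helper name have_gp2_input is hypothetical, and the only facts it relies on are the file name 'GP2Input.txt' and the results directory 'GP2_Results' described above.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if GP2Input.txt exists and is readable
# in the given directory (defaults to the current directory).
have_gp2_input() {
    [ -r "${1:-.}/GP2Input.txt" ]
}

# Possible use inside the batch script, before launching genomepop2:
#   have_gp2_input "$SLURM_SUBMIT_DIR" || { echo ">>>> GP2Input.txt not found" >&2; exit 1; }
# and after the run, to confirm that results were written:
#   [ -d GP2_Results ] || echo ">>>> GP2_Results was not created" >&2
```

Placing the check before the executable line means the failure shows up in the SLURM output file with a clear message instead of whatever GenomePop2 itself reports.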