GAUSSIAN09


Because Gaussian09 is a commercially licensed application, running it at the CUNY HPC Center comes with some restrictions:


1. A Gaussian09 job can only run in parallel within a single node at CUNY.
2. Gaussian09 users must comply with the licensing and citation requirements
   of Gaussian, Inc.  The following citation must be included in publications that
   relied on Gaussian 09 computations at the CUNY HPC Center.
  

Gaussian [03,09], Revision C.02, M. J. Frisch, G. W. Trucks, H. B. Schlegel, 
G. E. Scuseria, M. A. Robb, J. R. Cheeseman, J. A. Montgomery, Jr., T. Vreven, 
K. N. Kudin, J. C. Burant, J. M. Millam, S. S. Iyengar, J. Tomasi, V. Barone, 
B. Mennucci, M. Cossi, G. Scalmani, N. Rega, G. A. Petersson, H. Nakatsuji, 
M. Hada, M. Ehara, K. Toyota, R. Fukuda, J. Hasegawa, M. Ishida, T. Nakajima, 
Y. Honda, O. Kitao, H. Nakai, M. Klene, X. Li, J. E. Knox, H. P. Hratchian, 
J. B. Cross, V. Bakken, C. Adamo, J. Jaramillo, R. Gomperts, R. E. Stratmann, 
O. Yazyev, A. J. Austin, R. Cammi, C. Pomelli, J. W. Ochterski, P. Y. Ayala, 
K. Morokuma, G. A. Voth, P. Salvador, J. J. Dannenberg, V. G. Zakrzewski, 
S. Dapprich, A. D. Daniels, M. C. Strain, O. Farkas, D. K. Malick, A.D. Rabuck, 
K. Raghavachari, J. B. Foresman, J. V. Ortiz, Q. Cui, A. G. Baboul, S. Clifford, 
J. Cioslowski, B. B. Stefanov, G. Liu, A. Liashenko, P. Piskorz, I. Komaromi, 
R. L. Martin, D. J. Fox, T. Keith, M. A. Al-Laham, C. Y. Peng, A. Nanayakkara, 
M. Challacombe, P. M. W. Gill, B. Johnson, W. Chen, M. W. Wong, C. Gonzalez, 
and J. A. Pople, Gaussian, Inc., Wallingford CT, 2004.

3. The Gaussian 09 license prohibits the publication of comparative benchmark data.
 

A SLURM script must be used to run a Gaussian09 job. The HPCC main cluster, Penzias, supports Gaussian jobs requiring up to 64GB of memory. Any job needing more than 24GB but less than 64GB must be submitted to the "partgaularge" partition and can use up to 16 cores. All other jobs must be submitted to the "production_gau" partition and can use up to 8 cores. Users should choose the partition that matches the requirements of their simulation.
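
Before submitting, the current state of these two partitions can be checked with the standard SLURM sinfo command. This is general SLURM usage shown here as a suggestion, not a site requirement:

# show node counts, states, and time limits for the two Gaussian partitions
sinfo -p production_gau,partgaularge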

Gaussian Scratch File Storage Space

If a single Gaussian job is using all the cores on a particular node (which is often the case), that node's entire local scratch space is available to the job, assuming files from previous jobs have been cleaned up. Users must not edit their SLURM scripts to place Gaussian scratch files anywhere other than the directories used in the recommended scripts. In particular, users MUST NOT place their scratch files in their home directories. The home directory is backed up to tape, and backing up large integral files to tape would unnecessarily waste backup tapes and increase backup time-to-completion.

Users are encouraged to ensure that their scratch file data is removed after each completed Gaussian run. The example SLURM scripts below for submitting Gaussian jobs include a final line to remove scratch files, but this is not always successful, so you may have to remove your scratch files manually. The example scripts print out both the node where the job was run and the unique name of each Gaussian job's scratch directory. Please police your own use of Gaussian scratch space by going to '/scratch/gaussian/g09_scr' and looking for directories that begin with your user name and the date the directory was created. As a rule of thumb, you can request 100 GB per core in the SLURM script, although this is NOT guaranteed to be enough. Gaussian users should also note that Gaussian scratch files are NOT backed up. Checkpoint files should be saved in the SLURM 'working directory', and because these directories are temporary, any checkpoint files needed for future work should be copied to '/global/u/username'; files in that location are backed up.
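
As a minimal sketch of this cleanup, using the '/scratch/gaussian/g09_scr' location mentioned above and the naming scheme produced by the recommended scripts (user name, date, and process ID), the commands below list and remove old scratch directories and copy a checkpoint file to backed-up storage. The directory name and the 'username' path component are hypothetical placeholders:

# list your own Gaussian scratch directories (names begin with your user name)
ls -ld /scratch/gaussian/g09_scr/`whoami`_*

# remove a leftover directory from a finished job (hypothetical name)
rm -rf /scratch/gaussian/g09_scr/`whoami`_10.17.22_17:33:05_12345

# copy a checkpoint file from your working directory to backed-up storage
# (replace 'username' with your own login name)
cp methane.chk /global/u/username/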

NOTE: If other users have failed to clean up after themselves, and you request the maximum amount of Gaussian scratch space, it may not be available and your job may sit in the queue.

Gaussian SLURM Job Submission

As noted, Gaussian parallel jobs are limited to the cores of a single compute node. Eight (8) is the maximum processor (core) count on small nodes, where memory is limited to 2880 MB per core. On large nodes the maximum processor (core) count is 16, with 3688 MB per core. Here we provide a simple Gaussian input file (a Hartree-Fock geometry optimization of methane) and the companion SLURM batch submit script, which allocates 4 cores on a single compute node and 400 GBytes of compute-node-local storage.
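
For reference, the SLURM memory requests in the scripts below follow directly from these per-core limits (a worked example, not an additional requirement):

4 cores  x 2880 MB/core = 11520 MB   (the --mem value in the small-node script below)
16 cores x 3688 MB/core = 59008 MB   (the large-node script below requests 58980 MB, just under this limit)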

The Gaussian 09 methane input deck is:

%chk=methane.chk
%mem=8GB
%nproc=4
# hf/6-31g

Title Card Required

0 1
 C                  0.80597015   -1.20895521    0.00000000
 H                  1.16262458   -2.21776521    0.00000000
 H                  1.16264299   -0.70455702    0.87365150
 H                  1.16264299   -0.70455702   -0.87365150
 H                -0.26402985   -1.20894202    0.00000000

END

Notice that we have explicitly requested 8 GBytes of memory with the '%mem=8GB' directive. The input file also instructs Gaussian to use 4 processors, which ensures that all of Gaussian's parallel executables (i.e. links) will run in SMP mode on 4 cores. For this simple methane geometry optimization, requesting these resources (both here and in the SLURM script) is a bit extravagant, but both the input file and the script can be adapted to more substantial molecular systems running more accurate calculations. Users can make pro-rated adjustments to the resources requested in BOTH the Gaussian input deck and the SLURM submit script to run jobs on 2, 4, or 8 cores; a sketch of such an adjustment is shown after the run command below. Here is the Gaussian SLURM script, named g09.job, which is intended to be used with G09:

#!/bin/csh
# This script runs a small-memory 4-cpu (core) Gaussian 09 job
# with the 4 cpus packed onto a single compute node
# and at most 2880 MB of memory per core
#SBATCH --partition production_gau
#SBATCH --job-name methane_opt
#SBATCH --nodes=1
#SBATCH --ntasks=4
#SBATCH --mem=11520mb
 
# print out name of master execution host (compute node)
echo ""
echo -n ">>>> SBATCH Master compute node is: "
hostname

# set the G09 root directory

setenv g09root /share/apps/gaussian/g09_E01

# set the name and location of the G09 scratch directory
# on the compute node.  This is where one needs to go
# to remove left-over scratch files.

setenv MY_SCRDIR `whoami;date '+%m.%d.%y_%H:%M:%S'`
setenv MY_SCRDIR `echo $MY_SCRDIR | sed -e 's; ;_;'`

setenv GAUSS_SCRDIR  /state/partition1/gaussian/g09_scr/${MY_SCRDIR}_$$
mkdir -p $GAUSS_SCRDIR

echo $GAUSS_SCRDIR

# run the G09 setup script

source $g09root/g09/bsd/g09.login

# users must explicitly change to their working directory with SLURM

cd $SLURM_SUBMIT_DIR

# start the G09 job

$g09root/g09/g09 methane.input

# remove the scratch directory before terminating

/bin/rm -r $GAUSS_SCRDIR

echo 'Job is done!'

To run the job, one must use the standard SLURM job submission command as follows:

sbatch g09.job
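
As an illustration of the pro-rated adjustments mentioned above, a hypothetical 8-core variant of the same small-node job would scale the Gaussian directives and the SLURM request together (a sketch only, assuming the 2880 MB/core small-node limit; the %mem value simply doubles the 4-core example):

In the Gaussian input deck:

%chk=methane.chk
%mem=16GB
%nproc=8

In the SLURM submit script:

#SBATCH --partition production_gau
#SBATCH --nodes=1
#SBATCH --ntasks=8
#SBATCH --mem=23040mb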

Users may choose to run jobs with fewer processors (cores, cpus) and smaller storage requests than this sample job, including one-processor jobs and others using a fraction of a compute node (2, 4, or 6 processors). On a busy system, these smaller jobs may start sooner than those requesting a full 8 processors packed onto a single node. Selecting the most efficient combination of processors, memory, and storage ensures that resources are not wasted and remain available for the next job submitted. For large jobs (up to 64GB), users may use the following script, named g09_large.job:

#!/bin/csh
# This script runs a large-memory 16-cpu (core) Gaussian 09 job
# with the 16 cpus packed onto a single compute node
# and at most about 3.7 GB of memory per core
#SBATCH --partition partgaularge
#SBATCH --job-name methane_opt
#SBATCH --nodes=1
#SBATCH --ntasks=16
#SBATCH --mem=58980mb
 
# print out name of master execution host (compute node)
echo ""
echo -n ">>>> SBATCH Master compute node is: "
hostname

# set the G09 root directory

setenv g09root /share/apps/gaussian/g09_E01

# set the name and location of the G09 scratch directory
# on the compute node.  This is where one needs to go
# to remove left-over scratch files.

setenv MY_SCRDIR `whoami;date '+%m.%d.%y_%H:%M:%S'`
setenv MY_SCRDIR `echo $MY_SCRDIR | sed -e 's; ;_;'`

setenv GAUSS_SCRDIR /state/partition1/gaussian/g09_scr/${MY_SCRDIR}_$$ 
mkdir -p $GAUSS_SCRDIR

echo $GAUSS_SCRDIR

# run the G09 setup script

source $g09root/g09/bsd/g09.login

# users must explicitly change to their working directory with SLURM

cd $SLURM_SUBMIT_DIR

# start the G09 job

$g09root/g09/g09 methane.input

# remove the scratch directory before terminating

/bin/rm -r $GAUSS_SCRDIR

echo 'Job is done!'

To run the job, one must use the following SLURM job submission command:

sbatch g09_large.job
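
After submission, either job can be followed with standard SLURM commands. These are general SLURM usage, not specific to Gaussian, and the job ID is a placeholder:

# show your queued and running jobs
squeue -u `whoami`

# cancel a job if needed (replace <jobid> with the ID reported by sbatch)
scancel <jobid>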