Running Jobs

Running jobs on Penzias, Appel and Karle

Jobs on these servers must be started from the shared file system called scratch. This file system is not the main file system and does not hold users' home directories. Consequently, users must prepare the set of job-related files in their /scratch/<userid> directory before submitting a job. The set includes the following (a sketch of a prepared job directory follows the list):


• Input files for the job;
• Parameter files for the job (if applicable);
• A correct job submission script.
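
For illustration, a prepared directory /scratch/<userid>/<job_name> might then contain something like the listing below; all file names here are hypothetical examples, not required names:

a.out        # the binary to run
input.dat    # input data file
params.conf  # parameter file, if the application requires one
myjob.sh     # the job submission script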

Input files

Input files can be generated locally or transferred directly to /scratch/<userid> using the file transfer node. However, HPCC recommends transferring files to the user's home directory first and then copying the needed files from the home directory to /scratch/<userid>.


Typically the process involves the following:

• Having input files within your /scratch/<userid> directory on the HPC system you wish to use.
• Setting up the environment for the job.
• Creating a job submit script that identifies the input files, the application program you wish to use, the compute resources needed to execute the job, and information on where you wish to write your output files.
• Submitting the job script.
• Saving output to the DSMS.

These steps are explained below.

Input files on /scratch

The general case is that you will have input files containing the data on which you wish to operate. To compute or work on these files, they must be stored within the /scratch/<userid> directory of the HPC system you wish to use. These files can come from any of the following sources:

• Users can create files using a text editor. Note that Microsoft Word is word-processing software, not a text editor suitable for this purpose. Users must use a Linux/Unix-based text editor such as vi/vim, pico, nano, or edit.
• Users can copy files from their local storage (e.g., the disk on a laptop) to their directory in /scratch and/or their home directory in the DSMS. HPCC recommends copying the files to the home directory first and then staging them on /scratch. For this operation users can use the file transfer node (cea) or Globus Online; a sketch follows this list. Note that file transfer via ftp is not supported.
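
As an illustration, a transfer from a local workstation and the subsequent staging to /scratch might look like the sketch below; <transfer_node> stands for the address of the file transfer node (cea), and the path names are hypothetical:

# On the local workstation: copy an input file to the user's home directory at HPCC
scp <mydatafile> <userid>@<transfer_node>:myTask/

# On the HPC system: stage the file from the home directory to /scratch
cp /global/u/<userid>/myTask/<mydatafile> /scratch/<userid>/<job_name>/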

Introduction

SLURM is an open-source scheduler and batch system implemented at HPCC. Currently SLURM is used only for job management on Penzias, but its use will be expanded to other servers in the future.

SLURM commands:

SLURM commands resemble the commands used in the Portable Batch System (PBS). The table below compares the most common SLURM and PBS Pro commands.

(Image: SLURM.png, a table comparing the most common SLURM and PBS Pro commands.)


A few examples follow:

If the files are in /global/u:

cd /scratch/<userid>
mkdir <job_name> && cd <job_name>
cp /global/u/<userid>/<myTask>/a.out ./
cp /global/u/<userid>/<myTask>/<mydatafile> ./

If the files are in SR (cunyZone):

cd /scratch/<userid>
mkdir <job_name> && cd <job_name>
iget myTask/a.out ./
iget myTask/<mydatafile> ./

Set up job environment

Users must load the proper environment before starting any job. The loaded environment will be automatically exported to the compute nodes at execution time. Users must use modules to load the environment. For example, to load the environment for the default version of GROMACS, type:

module load gromacs

The list of available modules can be seen with the command

module avail

The list of loaded modules can be seen with the command

module list

More information about modules is provided in the "Modules and available third party software" section below.
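
As a minimal sketch of the workflow, the needed module is loaded in the login session and the job is then submitted; the script name myjob.sh is only a hypothetical example:

module load gromacs    # load the environment; it is exported to the compute nodes
module list            # verify which modules are loaded
sbatch myjob.sh        # submit the job script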

Running jobs on HPC systems running SLURM scheduler

To be able to schedule your job for execution and to actually run it on one or more compute nodes, SLURM needs to be instructed about your job’s parameters. These instructions are typically stored in a “job submit script”. In this section, we describe the information that needs to be included in a job submit script. The submit script typically includes:

• job name
• queue name
• what compute resources (number of nodes, number of cores and the amount of memory, the amount of local scratch disk storage (applies to Andy, Herbert, and Penzias), and the number of GPUs) or other resources a job will need
• packing option
• actual commands that need to be executed (the binary that needs to be run, input/output redirection, etc.).


A pro forma job submit script is provided below.

#!/bin/bash
#SBATCH --partition <queue_name>
#SBATCH -J <job_name>
#SBATCH --mem <mem>

# change to the working directory
cd $SLURM_SUBMIT_DIR

echo ">>>> Begin <job_name>"

# The actual binary (with I/O redirections) and required input
# parameters are called on the next line
mpirun -np <cpus> <Program Name> <input_text_file> > <output_file_name> 2>&1



Note: the #SBATCH string must precede every SLURM parameter. A # symbol at the beginning of any other line designates a comment line, which is ignored by SLURM.

Explanation of SLURM attributes and parameters:

--partition <queue_name>  The main queue available is “production” unless otherwise instructed.
• “production” is the normal queue for processing your work on Penzias.
• “development” is used when you are testing an application. Jobs submitted to this queue cannot request more than 8 cores or use more than 1 hour of total CPU time. If the job exceeds these limits, it will be automatically killed. The “development” queue has higher priority, so jobs in this queue have shorter wait times.
• “interactive” is used for quick interactive tests. Jobs submitted to this queue run in an interactive terminal session on one of the compute nodes. They cannot use more than 4 cores or more than a total of 15 minutes of compute time.
-J <job_name>  The user must assign a name to each job they run. Names can be up to 15 alphanumeric characters in length.
--ntasks=<cpus>  The number of cpus (or cores) that the user wants to use.
• Note: SLURM refers to “cores” as “cpus”; currently the HPCC clusters map one thread per core.
--mem <mem>  This parameter is required. It specifies how much memory is needed per job.
--gres gpu:<n>  The number of graphics processing units that the job will use on a node (this parameter is only available on PENZIAS); for example, gpu:2 requests 2 GPUs. A header sketch using this option follows.
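
For example, a job header requesting a single task with 2 GPUs on Penzias might look like the following sketch (the memory value is a placeholder):

#SBATCH --partition production
#SBATCH -J <job_name>
#SBATCH --ntasks 1
#SBATCH --gres gpu:2
#SBATCH --mem <mem>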



Special note for MPI users

How parameters are defined can significantly affect the run time of a job. For example, assume you need to run a job that requires 64 cores. This can be scheduled in a number of different ways. For example,

#SBATCH --nodes 8 
#SBATCH --ntasks 64

will place the 8 job chunks (of 8 cores each) on any nodes that have 8 cpus available. While this may minimize communications overhead in your MPI job, SLURM will not schedule this job until 8 nodes each with 8 free cpus become available. Consequently, the job may wait longer in the input queue before going into execution.

#SBATCH --nodes 32
#SBATCH --ntasks 64

will place 32 chunks of 2 cores each. There will possibly be some nodes with 4 free chunks (and 8 cores) and there may be nodes with only 1 free chunk (and 2 cores). In this case, the job ends up being more sparsely distributed across the system, and hence the total averaged latency may be larger than in the case with --nodes 8, --ntasks 64.



mpirun -np <total tasks or total cpus>. This script line is only to be used for MPI jobs and defines the total number of cores required for the parallel MPI job.
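
To keep the mpirun core count consistent with the resources requested from SLURM, the standard SLURM environment variable SLURM_NTASKS (set to the value of --ntasks) can be used instead of a hard-coded number, as a sketch:

# $SLURM_NTASKS holds the value given to --ntasks
mpirun -np $SLURM_NTASKS </path/to/your_binary> > <my_output> 2>&1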

Table 2 below shows the maximum values of the various SLURM parameters by system. Request only the resources you need, as requesting maximal resources will delay your job.


Serial Jobs

For serial jobs, --nodes 1 and --ntasks 1 should be used.

#!/bin/bash
#
# Typical job script to run a serial job in the production queue
#
#SBATCH --partition production
#SBATCH -J <job_name>
#SBATCH --nodes 1
#SBATCH --ntasks 1

# Change to working directory
cd $SLURM_SUBMIT_DIR

# Run my serial job
</path/to/your_binary> > <my_output> 2>&1

OpenMP and Threaded Parallel jobs

OpenMP jobs can only run on a single virtual node. Therefore, for OpenMP jobs, --nodes 1 and --ntasks 1 should be used; the number of cores per task (the -c option) should be set to [2, 3, 4, … n], where n must be less than or equal to the number of cores on a virtual compute node.

Typically, OpenMP jobs will use the --mem <mem> parameter and may request up to all the available memory on a node.


#!/bin/bash
#SBATCH -J <job_name>
#SBATCH --partition production
#SBATCH --ntasks 1
#SBATCH --nodes 1
#SBATCH --mem=<mem>
#SBATCH -c 4

# Set OMP_NUM_THREADS to the same value as -c,
# with a fallback in case it isn't set.
# SLURM_CPUS_PER_TASK is set to the value of -c, but only if -c is explicitly set.
if [ -n "$SLURM_CPUS_PER_TASK" ]; then
    omp_threads=$SLURM_CPUS_PER_TASK
else
    omp_threads=1
fi
export OMP_NUM_THREADS=$omp_threads

# Run my OpenMP job (no mpirun is needed for a pure OpenMP binary)
</path/to/your_binary> > <my_output> 2>&1

MPI Distributed Memory Parallel Jobs

For an MPI job, --nodes and --ntasks can each be set to one or more, with the mpirun -np value matching the total number of tasks requested.

#!/bin/bash
#
# Typical job script to run a distributed-memory MPI job in the production queue, requesting 16 cores across 16 nodes.
#
#SBATCH --partition production
#SBATCH -J <job_name>
#SBATCH --ntasks 16
#SBATCH --nodes 16
#SBATCH --mem=<mem>


# Change to working directory
cd $SLURM_SUBMIT_DIR

# Run my 16-core MPI job

mpirun -np 16 </path/to/your_binary> > <my_output> 2>&1


GPU-Accelerated Data Parallel Jobs

#!/bin/bash
#
# Typical job script to run a 1 CPU, 1 GPU batch job in the production queue
# 
#SBATCH --partition production
#SBATCH -J <job_name>
#SBATCH --ntasks 1
#SBATCH --gres gpu:1
#SBATCH --mem <mem>
# Find out which compute node the job is using
hostname

# Change to working directory
cd $SLURM_SUBMIT_DIR

# Run my GPU job on a single node using 1 CPU and 1 GPU.
</path/to/your_binary> >  <my_output> 2>&1

Submitting jobs for execution

NOTE: We do not allow users to run any production job on the login node. It is acceptable to do short compiles on the login node, but all other jobs must be run by handing off the “job submit script” to SLURM running on the head node. SLURM will then allocate resources on the compute nodes for execution of the job.


The command to submit your “job submit script” (<job.script>) is:

sbatch <job.script>
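
After submission, sbatch prints the ID of the new job. The standard SLURM commands below (not site-specific) can be used to monitor or cancel it:

squeue -u <userid>     # list your queued and running jobs
scancel <jobid>        # cancel a job if needed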

Running jobs on shared memory systems

This section is in development.

Saving output files and clean-up

Normally you expect certain data in the output files as a result of a job. There are a number of things that you may want to do with these files:

• Check the content of these outputs and discard them. In that case, you can simply delete all unwanted data with the rm command.
• Move output files to your local workstation. You can use scp for small amounts of data and/or GlobusOnline for larger data transfers.
• You may also want to store the outputs on the HPCC resources. In this case you can either move your outputs to /global/u or to the SR1 storage resource, as sketched at the end of this section.

In all cases your /scratch/<userid> directory is expected to be left empty. Output files stored under /scratch/<userid>/<job_name> can be purged at any moment, except for files that are currently being used by active jobs.
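
A minimal clean-up sketch, assuming the output should be kept under /global/u or in SR (iput is the iRODS counterpart of the iget command used above; the directory names are placeholders):

# Save outputs to the home directory on /global/u ...
cp /scratch/<userid>/<job_name>/<my_output> /global/u/<userid>/<myTask>/

# ... or to the SR storage resource via iRODS
iput /scratch/<userid>/<job_name>/<my_output> <myTask>/

# Then remove the job directory from /scratch
rm -rf /scratch/<userid>/<job_name>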