Using Modules to Run Your Applications

From HPCC Wiki

Modules is a software package that provides fast and convenient management of the components of a user's environment via modulefiles. When executed by the module command, each modulefile fully configures the environment for its associated application or application group. The modules configuration language also allows application environment conflicts and dependencies to be managed. With modules, users can load (and unload and reload) an application and/or system environment that is specific to their needs, which avoids the need to set and manage a large, one-size-fits-all, generic environment for everyone at login. Modules is the default approach to managing the user applications environment. The CUNY HPC Center system BOB, currently used almost entirely for Gaussian jobs, will NOT be reconfigured with the modules software. Modules version 3.2.9 is the default on the CUNY HPC Center systems.

  • Modules, Learning by Example
    • Example 1, Basic Non-Cray System
    • Example 2, Less Basic From SALK (Cray System)

Using the modules package, users can easily set a collection of environment variables that are specific to their compilation, parallel programming, and/or application requirements on the HPC Center's systems. The modules system also makes it convenient to advance or regress compiler, parallel programming, or application versions when defaults are found to have bugs or performance issues. Whatever the task, the modules package can adjust the environment in an orderly way, altering or setting environment variables such as PATH, MANPATH, and LD_LIBRARY_PATH, and providing some basic descriptive information about the application version being loaded and the purpose of the modulefile through the module help facility.
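To make the effect on a variable such as PATH concrete, here is a minimal hand-written sketch of what loading and unloading a module amounts to for that one variable. The directory /share/apps/example/1.0/bin and the helper names path_append and path_remove are hypothetical, chosen only for illustration. The real module command also handles conflicts, dependencies, and other variables, so this is not a substitute for it.

```shell
#!/bin/bash
# Hypothetical application directory, used only for illustration.
app_bin=/share/apps/example/1.0/bin

# Roughly what "module load" does to PATH: add the directory if it is missing.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present, nothing to do
        *) PATH="$PATH:$1" ;;
    esac
}

# Roughly what "module unload" does: drop every exact match of the directory.
path_remove() {
    PATH=$(printf '%s\n' "$PATH" | tr ':' '\n' | grep -vxF -- "$1" | paste -sd: -)
}

path_append "$app_bin"
echo "$PATH" | tr ':' '\n' | tail -n 1    # print the last PATH entry
path_remove "$app_bin"
```

Keeping the add and remove steps symmetric is the point: a module can be unloaded cleanly because it knows exactly what it added.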

In addition to the application-specific modulefiles, the modules package functions through a collection of sub-commands given after the initial module command itself, as in "module list". All of these sub-commands are described in detail in the module man page ("man module"), but a list of some of the more important and commonly used ones is provided here:

Module sub-commands:

list
load
unload
switch
avail
show
help
purge
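
As a quick orientation, the sketch below annotates each of these sub-commands with a typical invocation. The module name "mathematica" is borrowed from the first example on this page. The guard at the top and the modules_found variable are additions for illustration, so that the snippet does nothing on machines where the module command is not defined.

```shell
#!/bin/bash
# Skip everything on machines where the module command is not defined.
if type module >/dev/null 2>&1; then
    module avail                 # list every modulefile that can be loaded
    module load mathematica      # load the default version of a module
    module list                  # show what is currently loaded
    module show mathematica      # print the changes the modulefile makes
    module help mathematica      # print the modulefile's help text
    module switch mathematica mathematica/8.0.4   # replace one module with another
    module unload mathematica    # undo the changes made by "load"
    module purge                 # unload every loaded module
    modules_found=yes
else
    echo "module command not found; nothing to demonstrate" >&2
    modules_found=no
fi
```

Note that the final "module purge" really does empty the environment, so on a live system the lines are better typed one at a time than run as a script.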


Modules, Learning by Example

The best way to explain how to use the modules package and its sub-commands is to consider some simple examples of typical workflows that involve modules. Here are two examples. Again, for a more complete description of the modules package please refer to "man module".

Example 1, Basic Non-Cray System

Consider the unmodified PATH variable right after login to one of the CUNY HPC Center systems.

Without any custom or local environmental path settings, it would look something like this with no compiler, parallel programming model, or application-specific information in it:

username@service0:~> echo $PATH | tr -s ':' '\n'
/scratch/username/bin
/usr/local/bin
/usr/bin
/bin
/usr/bin/X11
/usr/X11R6/bin
/usr/games
/opt/c3/bin

Note that there is no path to the application we are interested in running, which in this example is Wolfram's Mathematica. Typing "which math" at the terminal to find Mathematica ("math" is the command-line name for Mathematica) yields:

 
username@service0:~> which math
which: no math in (/scratch/username/bin:/usr/local/bin:/usr/bin:/bin:/usr/bin/X11:/usr/X11R6/bin:/usr/games:/opt/c3/bin)

The Mathematica executable "math" is not found in the default PATH variable defined by the system at login. Modules can be used to remedy this problem by adding the required path. To check which module files (if any) are already loaded into our environment, we can type the "module list" sub-command at the terminal prompt:

username@service0:~> module list
No Modulefiles Currently Loaded.
username@service0:~>

No modules are loaded, so the module file for Mathematica has not been loaded and it is no surprise that the Mathematica command-line program "math" was not found. The next question is whether the HPC Center has installed Mathematica on this system and created a module file for it. To find out, we use the "module avail" sub-command:

username@service0:~> module avail
---------------------------- /share/apps/modules/default/modulefiles_UserApplications --------------------------------------

adf/2012.01(default)         cesm/1.0.3                   hoomd/0.9.2(default)         ncar/5.2.0_NCL(default)      pgi/12.3(default)
auto3dem/4.02(default)       cesm/1.0.4(default)          intel/12.1.3.293(default)    nwchem/6.1.1(default)        phoenics/2009(default)
autodock/4.2.3(default)      cuda/5.0(default)            ls-dyna/6.0.0(default)       octopus/4.0.0(default)       r/2.14.1(default)
beagle/0.2(default)          gromacs/4.5.5_32bit          mathematica/8.0.4(default)   openmpi/1.5.5_intel(default) wrf/3.4.0(default)
best/2.2L(default)           gromacs/4.5.5_64bit(default) matlab/R2012a(default)       openmpi/1.5.5_pgi

--------------------------------- /share/apps/modules/default/modulefiles_System -------------------------------------------

module-info   modules       version/3.2.9

The listing shows all available module files on this system, both those that are user-application related and those that are more system related. As shown in the output, these two types of module files are stored in different directories. Looking through the application list, there is a module for Mathematica version 8.0.4, which also happens to be the default. On this system, the modules package has only just been installed, and therefore only one version of most applications has been adapted to the module system and that version is the default.
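
On a system with many modulefiles, the full "module avail" listing can be long. A minimal sketch for narrowing it down follows. It relies on the fact that Modules 3.2.x writes its listings to standard error, so that stream has to be redirected before it can be filtered. The helper name find_module is hypothetical.

```shell
#!/bin/bash
# find_module PATTERN: print the "module avail" lines that mention PATTERN.
# Modules 3.2.x prints its listing on stderr, hence the 2>&1 before the pipe.
find_module() {
    module avail 2>&1 | grep -i -- "$1"
}

# Typical use on a login node:
#   find_module mathematica
# "module avail mathematica" also restricts the listing to matching names.
```

The function returns a non-zero status when nothing matches, so it can also be used in a shell conditional.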

The module file that is responsible for setting up the correct environment needed to run Mathematica can now be loaded:

module load mathematica

Because there is only one version and it is the default, there is no need to include the version-specific extension to load it. To explicitly load version 8.0.4 (or any other specific and non-default version) one would use:

module load mathematica/8.0.4

After loading, the environmental PATH variable includes the path to Mathematica:

username@service0:~> echo $PATH | tr -s ':' '\n'
/scratch/username/bin
/usr/local/bin
/usr/bin
/bin
/usr/bin/X11
/usr/X11R6/bin
/usr/games
/opt/c3/bin
/share/apps/mathematica/8.0.4/Executables

This can be verified by rerunning the "which math" command:

username@service0:~> which math
/share/apps/mathematica/8.0.4/Executables/math
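
The same check can be scripted so that a job is not submitted when the executable is missing. The following is a minimal sketch. The function name require_cmd is hypothetical, and "math" is the Mathematica command used in this example.

```shell
#!/bin/bash
# require_cmd NAME: succeed if the shell can find a command called NAME,
# otherwise print a hint on stderr and return a non-zero status.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found at $(command -v "$1")"
    else
        echo "$1 not found in PATH; has its module been loaded?" >&2
        return 1
    fi
}

# Typical use before submitting a job:
#   require_cmd math || module load mathematica
```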

Once the head or login node environment variables are properly set, one can create a SLURM script to run a Mathematica job on a compute node and ensure that the head or login node environment just set is passed on to the compute nodes by using the "#SLURM -V" option inside your SLURM script:

#!/bin/bash
#SLURM -N mmat8_serial1
#SLURM -q production
#SLURM -l select=1:ncpus=1:mem=1920mb
#SLURM -l place=free
#SLURM -V

# Find out name of master execution host (compute node)
echo -n ">>>> SLURM Master compute node is: "
hostname

# You must explicitly change to the working directory in SLURM
cd $SLURM_O_WORKDIR

# Just point to the serial executable to run
echo ">>>> Begin Mathematica Serial Run ..."
echo ""
math -run <test_run.nb > output
echo ""
echo ">>>> End   Mathematica Serial Run ..."

Since the PATH variable in the login environment now includes the location of the Mathematica executable, and the "#SLURM -V" option ensures that this is passed to the compute node that the job is run on, the "math" command in the SLURM script will be executed without environment-related problems.
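
The directive lines in the script above use a "#SLURM" prefix with "-q", "-l select", and "-V" options. On a stock Slurm installation, directives are normally written with the "#SBATCH" prefix, and the submission directory is available as SLURM_SUBMIT_DIR. As a hedged sketch only, the same job might look as follows under that assumption. The mapping of "production" to a partition name and the file name mmat8_serial1.sh are assumptions, so check the local scheduler documentation before relying on it.

```shell
#!/bin/bash
# Write a stock-Slurm version of the job script to a file (file name assumed).
cat > mmat8_serial1.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=mmat8_serial1
# Assumption: "production" is the name of a partition on this system.
#SBATCH --partition=production
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --mem=1920M
# Pass the login environment (including PATH) to the job.
#SBATCH --export=ALL

echo -n ">>>> SLURM Master compute node is: "
hostname

# Slurm normally starts the job in the submission directory already;
# the explicit cd keeps the behavior of the original script.
cd "$SLURM_SUBMIT_DIR" || exit 1

echo ">>>> Begin Mathematica Serial Run ..."
math -run < test_run.nb > output
echo ">>>> End   Mathematica Serial Run ..."
EOF

# Submit with:  sbatch mmat8_serial1.sh
```

Exporting the whole environment is what carries the module-adjusted PATH to the compute node, which is the role the "-V" option plays in the script above.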

Example 2, Less Basic From SALK (Cray System)

As do all of the systems at the CUNY HPC Center, the Cray SALK has multiple compilers, parallel programming models, libraries, and applications. In addition, SALK uses a custom high-performance interconnect with its own libraries, has its own compiler suite and compiling system, and many other custom libraries. Setting up and/or tearing down a given environment that makes all this work correctly is more complicated than it is on the other systems at the HPC Center. Modules simplifies this process tremendously for the user.

Here is an example of how to swap out the default Cray compiler environment on SALK and swap in the compiler suite from the Portland Group, including all the right MPI libraries from Cray. In this case, we swap in a newer release of the Portland Group compilers, not the current default on the Cray, and the version of the NETCDF library that has been compiled with the Portland Group compilers.

Having logged into SALK, we determine what modules have been loaded by default with "module list":

user@salk:~> module list
Currently Loaded Modulefiles:
  1) modules/3.2.6.6
  2) nodestat/2.2-1.0400.31264.2.5.gem
  3) sdb/1.0-1.0400.32124.7.19.gem
  4) MySQL/5.0.64-1.0000.5053.22.1
  5) lustre-cray_gem_s/1.8.6_2.6.32.45_0.3.2_1.0400.6453.5.1-1.0400.32127.1.90
  6) udreg/2.3.1-1.0400.4264.3.1.gem
  7) ugni/2.3-1.0400.4374.4.88.gem
  8) gni-headers/2.1-1.0400.4351.3.1.gem
  9) dmapp/3.2.1-1.0400.4255.2.159.gem
 10) xpmem/0.1-2.0400.31280.3.1.gem
 11) hss-llm/6.0.0
 12) Base-opts/1.0.2-1.0400.31284.2.2.gem
 13) xtpe-network-gemini
 14) cce/8.0.7
 15) acml/5.1.0
 16) xt-libsci/11.1.00
 17) pmi/3.0.0-1.0000.8661.28.2807.gem
 18) rca/1.0.0-2.0400.31553.3.58.gem
 19) xt-asyncpe/5.13
 20) atp/1.5.1
 21) PrgEnv-cray/4.0.46
 22) xtpe-mc8
 23) cray-mpich2/5.5.3
 24) SLURM/11.3.0.121723

From the list, we see that the Cray Programming Environment ("PrgEnv-cray/4.0.46") and the Cray Compiler environment ("cce/8.0.7") are loaded by default, among other things (SLURM, MPICH, etc.). To unload these Cray modules and load the Portland Group (PGI) equivalents, we need to know the names of the PGI modules. The "module avail" command will tell us this:

user@salk:~> module avail
.
.
(several sections of output removed)
.
.
------------------------------------------------ /opt/modulefiles -----------------------------------------------------
Base-opts/1.0.2-1.0400.31284.2.2.gem(default)     gcc/4.1.2                                         SLURM/11.2.0.113417
PrgEnv-cray/3.1.61                                gcc/4.2.4                                         SLURM/11.3.0.121723(default)
PrgEnv-cray/4.0.46(default)                       gcc/4.4.2                                         petsc/3.1.08
PrgEnv-gnu/3.1.61                                 gcc/4.4.4                                         petsc/3.1.09
PrgEnv-gnu/4.0.46(default)                        gcc/4.5.1                                         petsc-complex/3.1.08
PrgEnv-intel/3.1.61                               gcc/4.5.2                                         petsc-complex/3.1.09
PrgEnv-intel/4.0.46(default)                      gcc/4.5.3                                         pgi/12.10
PrgEnv-pathscale/3.1.61                           gcc/4.6.1                                         pgi/12.3
PrgEnv-pathscale/4.0.46(default)                  gcc/4.7.1(default)                                pgi/12.6(default)
PrgEnv-pgi/3.1.61                                 hss-llm/6.0.0(default)                            pgi/12.8
PrgEnv-pgi/4.0.46(default)                        intel/12.1.1.256                                  wrf/3.3.0
acml/4.4.0                                        intel/12.1.4.319(default)                         wrf/3.4.0(default)
acml/5.1.0(default)                               intel/12.1.5.339                                  xt-asyncpe/5.01
admin-modules/1.0.2-1.0400.31284.2.2.gem(default) java/jdk1.6.0_24                                  xt-asyncpe/5.05
amber/12(default)                                 java/jdk1.7.0_03(default)                         xt-asyncpe/5.13(default)
cce/8.0.7(default)                                mazama/6.0.0(default)                             xt-libsci/11.0.00
chapel/1.4.0                                      modules/3.2.6.6(default)                          xt-libsci/11.0.04
chapel/1.5.0(default)                             mrnet/3.0.0(default)                              xt-libsci/11.1.00(default)
fftw/2.1.5.3                                      pathscale/4.0.12.1(default)                       xt-papi/4.2.0
fftw/3.2.2.1(default)                             pathscale/4.0.9                                   xt-papi/4.3.0(default)
fftw/3.3.0.1                                      SLURM/11.1.0.111761

There are several versions of the PGI compilers and two versions of the PGI Programming Environment for the Cray (SALK). We are interested in loading PGI's 12.10 release (not the default, which is "pgi/12.6") and the most current release of the PGI programming environment ("PrgEnv-pgi/4.0.46"), which is the default. The PGI programming environment for the Cray provides all the environment settings required to use the PGI compilers on the Cray, which include a number of custom libraries.

Here is a series of module commands to unload the Cray defaults, load the PGI modules mentioned, and load version 4.2.0 of NETCDF compiled with the PGI compilers.

user@salk:~> module unload PrgEnv-cray
user@salk:~> module load PrgEnv-pgi
user@salk:~> module unload pgi
user@salk:~> module load pgi/12.10
user@salk:~> 
user@salk:~> module load netcdf/4.2.0
user@salk:~>
user@salk:~> cc -V
/opt/cray/xt-asyncpe/5.13/bin/cc: INFO: Compiling with CRAYPE_COMPILE_TARGET=native.

pgcc 12.10-0 64-bit target on x86-64 Linux 
Copyright 1989-2000, The Portland Group, Inc.  All Rights Reserved.
Copyright 2000-2012, STMicroelectronics, Inc.  All Rights Reserved.

A few comments about this series of commands may be useful. The first three commands do not include version numbers and will therefore load or unload the current default versions. In the third line, we unload the default version of the PGI compiler (version 12.6), which was loaded with the rest of the PGI Programming Environment in the second line. We then load the non-default and more recent release from PGI, version 12.10, in the fourth line. Later, we load NETCDF version 4.2.0 which, because we have already loaded the PGI Programming Environment, will load the version of NETCDF 4.2.0 compiled with the PGI compilers. Finally, we check which compiler the Cray "cc" compiler wrapper actually invokes after this sequence of module commands. We see that "pgcc" version 12.10 is indeed being used.

We can confirm all this by again entering "module list". Notice that the Cray-related compiler modules have been replaced by those from PGI and that NETCDF version 4.2.0 has been loaded. We are ready to use the new environment based on the PGI compiler suite. It is left as an exercise to the reader to figure out how the series of commands listed above could have been shortened by using the "module swap" sub-command (an alias for the "switch" sub-command listed earlier).

user@salk:~> module list
Currently Loaded Modulefiles:
  1) modules/3.2.6.6
  2) nodestat/2.2-1.0400.31264.2.5.gem
  3) sdb/1.0-1.0400.32124.7.19.gem
  4) MySQL/5.0.64-1.0000.5053.22.1
  5) lustre-cray_gem_s/1.8.6_2.6.32.45_0.3.2_1.0400.6453.5.1-1.0400.32127.1.90
  6) udreg/2.3.1-1.0400.4264.3.1.gem
  7) ugni/2.3-1.0400.4374.4.88.gem
  8) gni-headers/2.1-1.0400.4351.3.1.gem
  9) dmapp/3.2.1-1.0400.4255.2.159.gem
 10) xpmem/0.1-2.0400.31280.3.1.gem
 11) hss-llm/6.0.0
 12) Base-opts/1.0.2-1.0400.31284.2.2.gem
 13) xtpe-network-gemini
 14) xtpe-mc8
 15) cray-mpich2/5.5.3
 16) SLURM/11.3.0.121723
 17) xt-libsci/11.1.00
 18) pmi/3.0.0-1.0000.8661.28.2807.gem
 19) xt-asyncpe/5.13
 20) atp/1.5.1
 21) PrgEnv-pgi/4.0.46
 22) pgi/12.10
 23) hdf5/1.8.8
 24) netcdf/4.2.0
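
One possible answer to the exercise mentioned above is sketched here. It assumes that "module swap" with two arguments unloads the first module and loads the second, which is how the sub-command is described in "man module". The function name use_pgi_env is hypothetical.

```shell
#!/bin/bash
# Replace the Cray programming environment with the PGI one, move the PGI
# compiler from its default release to 12.10, and add the PGI build of NETCDF.
use_pgi_env() {
    module swap PrgEnv-cray PrgEnv-pgi &&
    module swap pgi pgi/12.10 &&
    module load netcdf/4.2.0
}

# Typical use on SALK, followed by the same compiler check as before:
#   use_pgi_env && cc -V
```

This reduces the four load and unload commands to two swaps plus the NETCDF load, and running "module list" afterwards should show the same set of modules as the listing above.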