ULHPC Software/Modules Environment¶
The UL HPC facility provides a large variety of scientific applications to its user community, both domain-specific codes and general-purpose development tools, which enable research and innovation excellence across a wide set of computational fields (see the software list).
We use the Environment Modules / Lmod framework, which provides the module utility on compute nodes to manage nearly all software.
There are two main advantages of the module approach:
- ULHPC can provide many different versions and/or installations of a single software package on a given machine, including a default version as well as several older and newer versions.
- Users can easily switch to different versions or installations without having to explicitly specify different paths. With modules, the MANPATH and related environment variables are automatically managed.
ULHPC modules are in practice automatically generated by EasyBuild.
EasyBuild (EB for short) is a software build and installation framework that allows you to manage (scientific) software on High Performance Computing (HPC) systems in an efficient way. A large number of scientific software packages are supported (at least 2175 since the 4.3.2 release); see also What is EasyBuild?.
For several years now, EasyBuild has been used to manage the ULHPC User Software Set and to automatically generate the module files available to you on our computational resources in either the prod (default) or devel (early development/testing) environment; see ULHPC Toolchains and Software Set Versioning.
This enables users to easily extend the global Software Set with their own local software builds, performed either within their global home directory or (better) in a shared project directory through EasyBuild, which automatically generates module files compliant with the ULHPC module setup.
Environment modules and LMod¶
Environment Modules are a standard and well-established technology across HPC sites. They make it possible to develop and use complex software and libraries built with dependencies, and allow multiple versions of software stacks and combinations thereof to co-exist.
They bring the module command, which is used to manage environment variables such as PATH, LD_LIBRARY_PATH and MANPATH, enabling the easy loading and unloading of application/library profiles and their dependencies.
Why do you need [Environment] Modules?
When users log in to a Linux system, they get a login shell, and the shell uses environment variables to run commands and applications. The most common are:
- PATH: colon-separated list of directories in which your system looks for executable files;
- MANPATH: colon-separated list of directories in which man searches for the man pages;
- LD_LIBRARY_PATH: colon-separated list of directories in which your system looks for the ELF / *.so libraries needed by applications at execution time.
There are also application-specific environment variables such as CPATH, LIBRARY_PATH, JAVA_HOME, LM_LICENSE_FILE, MKLROOT etc.
A traditional way to set up these environment variables is to customize the shell initialization files, i.e. /etc/profile, .bash_profile, and .bashrc.
This proves to be very impractical on multi-user systems with various applications and multiple application versions installed, as on an HPC facility.
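To make the burden concrete, here is a minimal sketch of what such hand-written initialization looks like for a single application. The install prefix /opt/example/myapp/1.0 is hypothetical and only serves the illustration; it is not a path that exists on ULHPC systems.

```bash
# Hypothetical install prefix, used for illustration only.
prefix="/opt/example/myapp/1.0"

# Prepend the application's directories. The ${VAR:+:$VAR} form avoids a
# stray ':' when the variable was previously empty or unset.
PATH="$prefix/bin${PATH:+:$PATH}"
LD_LIBRARY_PATH="$prefix/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
export PATH LD_LIBRARY_PATH
```

Every additional application or version needs a similar block, and nothing here removes the entries again when you want to switch versions.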
To overcome the difficulty of setting and changing the environment variables, the Tcl/C Environment Modules were introduced over two decades ago. The Environment Modules package is a tool that simplifies shell initialization and lets users easily modify their environment during the session with modulefiles.
- Each modulefile contains the information needed to configure the shell for an application. Once the Modules package is initialized, the environment can be modified on a per-module basis using the module command, which interprets modulefiles. Typically, modulefiles instruct the module command to alter or set shell environment variables such as PATH, MANPATH, etc.
- Modulefiles may be shared by many users on a system (as done on the ULHPC clusters) and users may have their own collection to supplement or replace the shared modulefiles.
Modules can be loaded and unloaded dynamically and atomically, in a clean fashion. All popular shells are supported, including bash, ksh, zsh, sh, csh, tcsh, fish, as well as some scripting languages such as perl, ruby, tcl, python, cmake and R. Modules are useful in managing different versions of applications.
Modules can also be bundled into metamodules that will load an entire suite of different applications; this is precisely the way we manage the ULHPC Software Set.
Tcl/C Environment Modules (Tmod) vs. Tcl Environment Modules vs. Lmod
There exist several implementations of the module tool:
- Tcl/C Environment Modules (3.2.10 ≤ version < 4), also called Tmod: the seminal (old) implementation
- Tcl-only variant of Environment Modules (version ≥ 4), previously called Modules-Tcl
- (recommended) Lmod, a Lua-based Environment Module System
  - Lmod ("L" stands for Lua) provides all of the functionality of Tcl/C Environment Modules plus more features:
    - support for hierarchical module file structure
    - MODULEPATH is dynamically updated when modules are loaded
    - makes loaded modules inactive and active to provide a sane environment
    - support for hidden modules
    - support for optional usage tracking (implemented on ULHPC facilities)
  - In particular, Lmod enforces the following safety features that are not always guaranteed with the other tools:
    - The One Name Rule: users can only have one version of a given module active at a time
    - Users can only load one compiler or MPI stack at a time (through the family(...) directive)
The ULHPC Facility relies on Lmod, a Lua-based environment module system that easily handles the hierarchical MODULEPATH problem; the associated modulefiles are automatically generated by EasyBuild. In this context, the module command supports the following subcommands:
Command | Description |
---|---|
module avail | List all the modules which are available to be loaded |
module spider <pattern> | Search for modules matching <pattern> |
module load <mod1> [mod2...] | Load one or more modules |
module unload <module> | Unload a module |
module list | List loaded modules |
module purge | Unload all modules (purge) |
module display <module> | Display what a module does |
module use <path> | Prepend the directory to the MODULEPATH environment variable |
module unuse <path> | Remove the directory from the MODULEPATH environment variable |
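As an illustration, the subcommands above can be combined into a short session. This is only a sketch: it uses toolchain/foss, the module shown as an example on this page, and the guard and the module_demo_status variable are additions made here so that the snippet degrades gracefully where the module command is not defined.

```bash
# Sketch of a typical session; run it inside a job on a compute node.
if command -v module >/dev/null 2>&1; then
    module purge                 # start from a clean environment
    module avail toolchain/      # list available modules in one category
    module load toolchain/foss   # load the default version of the toolchain
    module list                  # confirm what is now loaded
    module_demo_status="ran"
else
    echo "module command not found: are you within a job on a compute node?" >&2
    module_demo_status="skipped"
fi
```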
What is module?
module is a shell function that modifies the user shell upon load of a modulefile.
It is defined as follows:
$ type module
module is a function
module ()
{
eval $($LMOD_CMD bash "$@") && eval $(${LMOD_SETTARG_CMD:-:} -s sh)
}
module is NOT a program
At the heart of the interaction with environment modules reside the following components:
- the MODULEPATH environment variable, which defines a colon-separated list of directories to search for modulefiles;
- a modulefile (see an example) associated with each available software.
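Since MODULEPATH is a plain colon-separated list, it can be inspected with standard tools. The helper below is a small sketch (the function name list_search_dirs is made up for this example) that prints one search directory per line, in lookup order.

```bash
# Print each directory of a colon-separated list on its own line,
# skipping empty entries.
list_search_dirs() {
    printf '%s\n' "${1:-}" | tr ':' '\n' | sed '/^$/d'
}

# Show the directories currently searched for modulefiles (prints nothing
# if MODULEPATH is unset, e.g. outside of a job).
list_search_dirs "${MODULEPATH:-}"
```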
Example of ULHPC toolchain/foss (auto-generated) Modulefile
$ module show toolchain/foss
-------------------------------------------------------------------------------
/opt/apps/resif/iris/2019b/broadwell/modules/all/toolchain/foss/2019b.lua:
-------------------------------------------------------------------------------
help([[
Description
===========
GNU Compiler Collection (GCC) based compiler toolchain, including
OpenMPI for MPI support, OpenBLAS (BLAS and LAPACK support), FFTW and ScaLAPACK.
More information
================
- Homepage: https://easybuild.readthedocs.io/en/master/Common-toolchains.html#foss-toolchain
]])
whatis("Description: GNU Compiler Collection (GCC) based compiler toolchain, including
OpenMPI for MPI support, OpenBLAS (BLAS and LAPACK support), FFTW and ScaLAPACK.")
whatis("Homepage: https://easybuild.readthedocs.io/en/master/Common-toolchains.html#foss-toolchain")
whatis("URL: https://easybuild.readthedocs.io/en/master/Common-toolchains.html#foss-toolchain")
conflict("toolchain/foss")
load("compiler/GCC/8.3.0")
load("mpi/OpenMPI/3.1.4-GCC-8.3.0")
load("numlib/OpenBLAS/0.3.7-GCC-8.3.0")
load("numlib/FFTW/3.3.8-gompi-2019b")
load("numlib/ScaLAPACK/2.0.2-gompi-2019b")
setenv("EBROOTFOSS","/opt/apps/resif/iris/2019b/broadwell/software/foss/2019b")
setenv("EBVERSIONFOSS","2019b")
setenv("EBDEVELFOSS","/opt/apps/resif/iris/2019b/broadwell/software/foss/2019b/easybuild/toolchain-foss-2019b-easybuild-devel")
(reminder): the module command is ONLY available on the compute nodes, NOT on the access front-ends.
In particular, you need to be within a job to load ULHPC or private modules.
ULHPC $MODULEPATH¶
By default, the MODULEPATH environment variable holds a single search directory containing the optimized builds prepared for you by the ULHPC Team.
The general format of this directory is as follows:
/opt/apps/resif/<cluster>/<version>/<arch>/modules/all
where:
- <cluster> depicts the name of the cluster (iris or aion). Stored as $ULHPC_CLUSTER.
- <version> corresponds to the ULHPC Software Set release (aligned with the EasyBuild toolchains release), i.e. 2019b, 2020a etc. Stored as $RESIF_VERSION_{PROD,DEVEL,LEGACY} depending on the production / development / legacy ULHPC software set version.
- <arch> is a lower-case string that categorizes the CPU architecture of the build host, and makes it easy to identify the optimized target architecture. It is stored as $RESIF_ARCH.
  - On Intel nodes: broadwell (default), skylake
  - On AMD nodes: epyc
  - On GPU nodes: gpu
Cluster | Arch. $RESIF_ARCH | $MODULEPATH Environment variable |
---|---|---|
Iris | broadwell (default) | /opt/apps/resif/iris/<version>/broadwell/modules/all |
Iris | skylake | /opt/apps/resif/iris/<version>/skylake/modules/all |
Iris | gpu | /opt/apps/resif/iris/<version>/gpu/modules/all |
Aion | epyc (default) | /opt/apps/resif/aion/<version>/{epyc}/modules/all |
- On skylake nodes, you may want to use the optimized modules for skylake
- On GPU nodes, you may want to use the CPU-optimized builds for skylake (in addition to the gpu-enabled software)
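The directory format described above can be assembled from the three variables. The sketch below assumes a helper named resif_modulepath (a name invented for this example); the fallback values iris, 2020b and broadwell are taken from the tables on this page and are only used when the corresponding variables are not set.

```bash
# Build /opt/apps/resif/<cluster>/<version>/<arch>/modules/all
resif_modulepath() {
    printf '/opt/apps/resif/%s/%s/%s/modules/all\n' "$1" "$2" "$3"
}

# On a compute node the three variables are expected to be set; the
# fallbacks below are illustrative defaults only.
resif_modulepath "${ULHPC_CLUSTER:-iris}" \
                 "${RESIF_VERSION_PROD:-2020b}" \
                 "${RESIF_ARCH:-broadwell}"
```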
ACM PEARC'21: RESIF 3.0
If you are interested in learning more about the way we set up and deploy the User Software Environment on ULHPC systems through the RESIF 3 framework, you can refer to the article below, presented during the ACM PEARC'21 conference on July 22, 2021.
ACM Reference Format | ORBilu entry | OpenAccess | ULHPC blog post | slides | Github:
Sebastien Varrette, Emmanuel Kieffer, Frederic Pinel, Ezhilmathi Krishnasamy, Sarah Peter, Hyacinthe Cartiaux, and Xavier Besseron. 2021. RESIF 3.0: Toward a Flexible & Automated Management of User Software Environment on HPC facility. In Practice and Experience in Advanced Research Computing (PEARC '21). Association for Computing Machinery (ACM), New York, NY, USA, Article 33, 1–4. https://doi.org/10.1145/3437359.3465600
Module Naming Schemes¶
What is a Module Naming Scheme?
The full software and module install paths for a particular software package are determined by the active module naming scheme along with the general software and modules install paths specified by the EasyBuild configuration.
You can list the supported module naming schemes of Easybuild using:
$ eb --avail-module-naming-schemes
List of supported module naming schemes:
EasyBuildMNS
CategorizedHMNS
MigrateFromEBToHMNS
HierarchicalMNS
CategorizedModuleNamingScheme
ULHPC systems use the CategorizedModuleNamingScheme.
Module Naming Schemes on ULHPC system
ULHPC modules are organised through the Categorized Naming Scheme.
Format: <category>/<name>/<version>-<toolchain><versionsuffix>
This means that the typical module hierarchy has a category level as prefix, taken from one of the supported software categories (module classes):
$ eb --show-default-moduleclasses
Default available module classes:
base: Default module class
astro: Astronomy, Astrophysics and Cosmology
bio: Bioinformatics, biology and biomedical
cae: Computer Aided Engineering (incl. CFD)
chem: Chemistry, Computational Chemistry and Quantum Chemistry
compiler: Compilers
data: Data management & processing tools
debugger: Debuggers
devel: Development tools
geo: Earth Sciences
ide: Integrated Development Environments (e.g. editors)
lang: Languages and programming aids
lib: General purpose libraries
math: High-level mathematical software
mpi: MPI stacks
numlib: Numerical Libraries
perf: Performance tools
quantum: Quantum Computing
phys: Physics and physical systems simulations
system: System utilities (e.g. highly depending on system OS and hardware)
toolchain: EasyBuild toolchains
tools: General purpose tools
vis: Visualization, plotting, documentation and typesetting
It follows that the ULHPC software modules are organized by category, then software name, then version and toolchain.
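To illustrate the <category>/<name>/<version>-<toolchain><versionsuffix> format, the sketch below splits a full module name into its three path levels using shell parameter expansion. The function name split_module_name is hypothetical, and the sample name is one of the modules loaded by toolchain/foss in the example modulefile shown earlier.

```bash
# Split <category>/<name>/<version>-<toolchain><versionsuffix> into levels.
# The body runs in a subshell so no variables leak into the caller.
split_module_name() (
    full="$1"
    category="${full%%/*}"   # text before the first '/'
    rest="${full#*/}"        # text after the first '/'
    name="${rest%%/*}"       # second level
    version="${rest#*/}"     # third level: version, toolchain and suffix
    printf 'category=%s\nname=%s\nversion=%s\n' "$category" "$name" "$version"
)

split_module_name "mpi/OpenMPI/3.1.4-GCC-8.3.0"
```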
ULHPC Toolchains and Software Set Versioning¶
We offer a YEARLY release of the ULHPC Software Set based on the EasyBuild release of toolchains; see Component versions (fixed per release) in the foss and intel toolchains. However, count at least 6 months of validation/import after the EB release before the ULHPC release.
An overview of the currently available component versions is depicted below:
Name | Type | 2019b (legacy) | 2020a | 2020b (prod) | 2021a | 2021b (devel) |
---|---|---|---|---|---|---|
GCCCore | compiler | 8.3.0 | 9.3.0 | 10.2.0 | 10.3.0 | 11.2.0 |
foss | toolchain | 2019b | 2020a | 2020b | 2021a | 2021b |
intel | toolchain | 2019b | 2020a | 2020b | 2021a | 2021b |
binutils | | 2.32 | 2.34 | 2.35 | 2.36 | 2.37 |
Python | | 3.7.4 (and 2.7.16) | 3.8.2 (and 2.7.18) | 3.8.6 | 3.9.2 | 3.9.6 |
LLVM | compiler | 9.0.1 | 10.0.1 | 11.0.0 | 11.1.0 | 12.0.1 |
OpenMPI | MPI | 3.1.4 | 4.0.3 | 4.0.5 | 4.1.1 | 4.1.2 |
Once on a node, the current version of the ULHPC Software Set in production is stored in $RESIF_VERSION_PROD.
You can use the variables $MODULEPATH_{LEGACY,PROD,DEVEL} to set the MODULEPATH variable to the appropriate value. We have also defined utility scripts to help you quickly reset the module environment, i.e., resif-load-swset-{legacy,prod,devel} and resif-reset-swset.
For instance, if you want to use the legacy software set, proceed as follows in your launcher scripts:
resif-load-swset-legacy # Eq. of export MODULEPATH=$MODULEPATH_LEGACY
# [...]
# Restore production settings
resif-load-swset-prod # Eq. of export MODULEPATH=$MODULEPATH_PROD
If, on the contrary, you want to test the (new) development software set, i.e., the devel version, stored in $RESIF_VERSION_DEVEL:
resif-load-swset-devel # Eq. of export MODULEPATH=$MODULEPATH_DEVEL
# [...]
# Restore production settings
resif-reset-swset # As resif-load-swset-prod
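The comments above state that each helper is equivalent to exporting MODULEPATH from the matching $MODULEPATH_{LEGACY,PROD,DEVEL} variable. The sketch below spells out that equivalence with a hypothetical function, select_swset, which is not part of the ULHPC tooling; prefer the provided resif-load-swset-* helpers in practice.

```bash
# Hypothetical helper mirroring "export MODULEPATH=$MODULEPATH_<SET>".
select_swset() {
    case "$1" in
        legacy) swset_value="${MODULEPATH_LEGACY:-}" ;;
        prod)   swset_value="${MODULEPATH_PROD:-}" ;;
        devel)  swset_value="${MODULEPATH_DEVEL:-}" ;;
        *) echo "unknown software set: $1" >&2; return 2 ;;
    esac
    if [ -z "$swset_value" ]; then
        # The variables are only expected to be defined on compute nodes.
        echo "no MODULEPATH defined for '$1' (are you within a job?)" >&2
        return 1
    fi
    export MODULEPATH="$swset_value"
}
```

For instance, select_swset devel followed later by select_swset prod mirrors the launcher excerpt above.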
(iris only) Skylake Optimized builds
Skylake-optimized builds can be loaded on regular nodes using:
resif-load-swset-skylake # Eq. of export MODULEPATH=$MODULEPATH_PROD_SKYLAKE
You must be on a Skylake node (sbatch -C skylake [...]) to benefit from it.
Note that this action is not required on GPU nodes.
GPU Optimized builds vs. CPU software set on GPU nodes
On GPU nodes, be aware that the default MODULEPATH holds two directories:
- GPU Optimized builds (i.e. typically against the {foss,intel}cuda toolchains) stored under /opt/apps/resif/<cluster>/<version>/gpu/modules/all
- CPU Optimized builds (e.g. skylake on Iris) stored under /opt/apps/resif/<cluster>/<version>/skylake/modules/all
You may want to exclude CPU builds to ensure you get the most out of the GPU accelerators. In that case, you may want to run:
# /!\ ADAPT <version> accordingly
module unuse /opt/apps/resif/${ULHPC_CLUSTER}/${RESIF_VERSION_PROD}/skylake/modules/all
Using Easybuild to Create Custom Modules¶
Just like we do, you probably want to use EasyBuild to complement the existing software set with your own modules and software builds.
See Building Custom (or missing) software documentation for more details.
Creating a Custom Module Environment¶
You can modify your environment so that certain modules are loaded whenever you log in.
Use module save [<name>] and module restore [<name>] for that purpose; see the Lmod documentation on User collections.
You can also create and install your own modules for your convenience or for sharing software among collaborators.
See the modulefile documentation for details of the required format and available commands.
These custom modulefiles can be made visible to the module command by:
module use /path/to/the/custom/modulefiles
Warning
- Make sure the UNIX file permissions grant access to all users who want to use the software.
- Do not give write permissions to your home directory to anyone else.
Note
The module use command adds new directories before other module search paths (defined as $MODULEPATH), so modules defined in a custom directory will have precedence if there are other modules with the same name in the module search paths. If you prefer to have the new directory added at the end of $MODULEPATH, use module use -a instead of module use.
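The effect of the search order can be illustrated with a deliberately simplified model: scan the colon-separated list from left to right and report the first directory that provides a given modulefile. This is a sketch of the precedence rule described in the note, not of Lmod's actual resolution logic; the function name first_provider and the tools/demo/1.0 module are made up for the example.

```bash
# Simplified model: the first directory in the list that contains
# <module>.lua wins. Runs in a subshell so IFS and 'set -f' stay local.
first_provider() (
    IFS=':'
    set -f                      # disable globbing while splitting the list
    for dir in $1; do
        if [ -e "$dir/$2.lua" ]; then
            printf '%s\n' "$dir"
            exit 0
        fi
    done
    exit 1
)

# Demonstration with two throwaway directories holding the same module name.
tmp="$(mktemp -d)"
mkdir -p "$tmp/site/tools/demo" "$tmp/custom/tools/demo"
touch "$tmp/site/tools/demo/1.0.lua" "$tmp/custom/tools/demo/1.0.lua"
first_provider "$tmp/custom:$tmp/site" "tools/demo/1.0"   # order after: module use
first_provider "$tmp/site:$tmp/custom" "tools/demo/1.0"   # order after: module use -a
rm -rf "$tmp"
```

The first call reports the custom directory and the second the site directory, which is the difference between prepending and appending to $MODULEPATH.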
Module FAQ¶
Is there an environment variable that captures loaded modules?
Yes, active modules can be retrieved via $LOADEDMODULES. This environment variable is automatically updated to reflect the currently loaded modules, as also reported by module list.
If you want to access the modulefile paths of the loaded modules, you can retrieve them via $_LMFILES_.
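Both variables are colon-separated lists, so they can be walked side by side. The sketch below assumes that the two lists are in the same order and that the modulefile paths are held in the variable spelled _LMFILES_; the function name show_loaded is invented for this example.

```bash
# Print "<module><TAB><modulefile>" for each loaded module.
# $1: colon-separated module names, $2: colon-separated modulefile paths.
show_loaded() (
    names="$1"
    files="$2"
    while [ -n "$names" ]; do
        name="${names%%:*}"     # first remaining entry of each list
        file="${files%%:*}"
        printf '%s\t%s\n' "$name" "${file:-?}"
        # Drop the entry just printed (or empty the list if it was the last).
        case "$names" in *:*) names="${names#*:}" ;; *) names="" ;; esac
        case "$files" in *:*) files="${files#*:}" ;; *) files="" ;; esac
    done
)

# Prints nothing when no module is loaded.
show_loaded "${LOADEDMODULES:-}" "${_LMFILES_:-}"
```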