
ULHPC Software/Modules Environment

The UL HPC facility provides a large variety of scientific applications to its user community, ranging from domain-specific codes to general-purpose development tools, which enable research and innovation excellence across a wide set of computational fields -- see software list.

We use the Environment Modules / Lmod framework, which provides the module utility on compute nodes, to manage nearly all software.
There are two main advantages of the module approach:

  1. ULHPC can provide many different versions and/or installations of a single software package on a given machine, including a default version as well as several older and newer versions.
  2. Users can easily switch to different versions or installations without having to explicitly specify different paths. With modules, the MANPATH and related environment variables are automatically managed.

In practice, ULHPC modules are automatically generated by EasyBuild.

EasyBuild (EB for short) is a software build and installation framework that allows you to manage (scientific) software on High Performance Computing (HPC) systems in an efficient way. A large number of scientific software packages are supported (at least 2175 since the 4.3.2 release) -- see also What is EasyBuild?.

For several years now, EasyBuild has been used to manage the ULHPC User Software Set and to automatically generate the module files available to you on our computational resources, in either the prod (default) or devel (early development/testing) environment -- see ULHPC Toolchains and Software Set Versioning. This enables users to easily extend the global Software Set with their own local software builds, performed either within their global home directory or (better) in a shared project directory through EasyBuild, which automatically generates module files compliant with the ULHPC module setup.

Environment modules and LMod

Environment Modules are a standard, well-established technology across HPC sites that permits developing and using complex software and libraries built with dependencies, allowing multiple versions of software stacks and combinations thereof to co-exist.

The framework brings the module command, which is used to manage environment variables such as PATH, LD_LIBRARY_PATH and MANPATH, enabling the easy loading and unloading of application/library profiles and their dependencies.

Why do you need [Environment] Modules?

When users log in to a Linux system, they get a login shell, and the shell uses environment variables to run commands and applications. The most common are:

  • PATH: colon-separated list of directories in which your system looks for executable files;
  • MANPATH: colon-separated list of directories in which man searches for the man pages;
  • LD_LIBRARY_PATH: colon-separated list of directories in which your system looks for ELF / *.so libraries needed by applications at execution time.

There are also application specific environment variables such as CPATH, LIBRARY_PATH, JAVA_HOME, LM_LICENSE_FILE, MKLROOT etc.
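
For instance, a quick way to inspect how these colon-separated variables are set in your current shell, one entry per line:

$ echo "$PATH" | tr ':' '\n'
$ echo "$LD_LIBRARY_PATH" | tr ':' '\n'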

A traditional way to set up these environment variables is by customizing the shell initialization files, i.e. /etc/profile, .bash_profile, and .bashrc. This proves to be very impractical on multi-user systems with various applications and multiple application versions installed, as on an HPC facility.

To overcome the difficulty of setting and changing environment variables, the Tcl/C Environment Modules were introduced over two decades ago. The Environment Modules package is a tool that simplifies shell initialization and lets users easily modify their environment during the session with modulefiles.

  • Each modulefile contains the information needed to configure the shell for an application. Once the Modules package is initialized, the environment can be modified on a per-module basis using the module command which interprets modulefiles. Typically modulefiles instruct the module command to alter or set shell environment variables such as PATH, MANPATH, etc.
  • Modulefiles may be shared by many users on a system (as done on the ULHPC clusters) and users may have their own collection to supplement or replace the shared modulefiles.

Modules can be loaded and unloaded dynamically and atomically, in a clean fashion. All popular shells are supported, including bash, ksh, zsh, sh, csh, tcsh and fish, as well as some scripting languages such as perl, ruby, tcl, python, cmake and R. Modules are useful for managing different versions of applications. Modules can also be bundled into metamodules that load an entire suite of different applications -- this is precisely the way we manage the ULHPC Software Set.
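
As a quick illustration of the metamodule concept, loading the toolchain/foss module (shown in detail later on this page) pulls in a whole suite of modules in one go:

$ module load toolchain/foss   # metamodule: also loads GCC, OpenMPI, OpenBLAS, FFTW, ScaLAPACK
$ module list                  # the whole suite now appears among the loaded modules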

Tcl/C Environment Modules (Tmod) vs. Tcl Environment Modules vs. Lmod

There exist several implementations of the module tool:

  • Tcl/C Environment Modules (3.2.10 ≤ version < 4), also called Tmod: the seminal (old) implementation
  • Tcl-only variant of Environment Modules (version ≥ 4), previously called Modules-Tcl
  • (recommended) Lmod, a Lua-based Environment Module System
    • Lmod ("L" stands for Lua) provides all of the functionality of Tcl/C Environment Modules plus additional features:
      • support for a hierarchical modulefile structure
      • MODULEPATH is dynamically updated when modules are loaded
      • deactivates and reactivates loaded modules to provide a sane environment
      • support for hidden modules
      • support for optional usage tracking (implemented on ULHPC facilities)
  • In particular, Lmod enforces the following safety features, which are not always guaranteed with the other tools (see the illustration after this list):
    1. The One Name Rule: Users can only have one version active
    2. Users can only load one compiler or MPI stack at a time (through the family(...) directive)
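
The sketch below illustrates the One Name Rule with hypothetical module versions: loading a second version of an already-loaded software triggers an automatic swap rather than a conflict (the exact Lmod message may differ):

$ module load lang/Python/3.8.6-GCCcore-10.2.0
$ module load lang/Python/2.7.18-GCCcore-10.2.0

The following have been reloaded with a version change:
  1) lang/Python/3.8.6-GCCcore-10.2.0 => lang/Python/2.7.18-GCCcore-10.2.0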

The ULHPC Facility relies on Lmod, a Lua-based environment module system that easily handles the hierarchical MODULEPATH problem; the associated modulefiles are automatically generated by EasyBuild. In this context, the module command supports the following subcommands:

| Command | Description |
|---------|-------------|
| module avail | Lists all the modules which are available to be loaded |
| module spider <pattern> | Search among available modules (Lmod only) |
| module load <mod1> [mod2...] | Load a module |
| module unload <module> | Unload a module |
| module list | List loaded modules |
| module purge | Unload all modules (purge) |
| module display <module> | Display what a module does |
| module use <path> | Prepend the directory to the MODULEPATH environment variable |
| module unuse <path> | Remove the directory from the MODULEPATH environment variable |
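
A typical session on a compute node chains these subcommands as follows (the OpenBLAS module names are illustrative, following the ULHPC naming scheme):

$ module purge                    # start from a clean environment
$ module spider OpenBLAS          # search how/where the package is available (Lmod only)
$ module load numlib/OpenBLAS     # load the default version
$ module list                     # check the active environment
$ module display numlib/OpenBLAS  # inspect what the module actually sets
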
What is module?

module is a shell function that modifies the user's shell environment upon loading a modulefile. It is defined as follows:

$ type module
module is a function
module ()
{
    eval $($LMOD_CMD bash "$@") && eval $(${LMOD_SETTARG_CMD:-:} -s sh)
}
In particular, module is NOT a program

At the heart of the interaction with environment modules reside the following components:

  • the MODULEPATH environment variable, which defines a colon-separated list of directories to search for modulefiles
  • a modulefile (see an example) associated with each available software package.
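
You can inspect the directories currently searched for modulefiles (one per line) directly from your shell:

$ echo "$MODULEPATH" | tr ':' '\n'
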
Example of ULHPC toolchain/foss (auto-generated) Modulefile
$ module show toolchain/foss
-------------------------------------------------------------------------------
   /opt/apps/resif/iris/2019b/broadwell/modules/all/toolchain/foss/2019b.lua:
-------------------------------------------------------------------------------
help([[
Description
===========
GNU Compiler Collection (GCC) based compiler toolchain, including
 OpenMPI for MPI support, OpenBLAS (BLAS and LAPACK support), FFTW and ScaLAPACK.

More information
================
 - Homepage: https://easybuild.readthedocs.io/en/master/Common-toolchains.html#foss-toolchain
]])
whatis("Description: GNU Compiler Collection (GCC) based compiler toolchain, including
 OpenMPI for MPI support, OpenBLAS (BLAS and LAPACK support), FFTW and ScaLAPACK.")
whatis("Homepage: https://easybuild.readthedocs.io/en/master/Common-toolchains.html#foss-toolchain")
whatis("URL: https://easybuild.readthedocs.io/en/master/Common-toolchains.html#foss-toolchain")
conflict("toolchain/foss")
load("compiler/GCC/8.3.0")
load("mpi/OpenMPI/3.1.4-GCC-8.3.0")
load("numlib/OpenBLAS/0.3.7-GCC-8.3.0")
load("numlib/FFTW/3.3.8-gompi-2019b")
load("numlib/ScaLAPACK/2.0.2-gompi-2019b")
setenv("EBROOTFOSS","/opt/apps/resif/iris/2019b/broadwell/software/foss/2019b")
setenv("EBVERSIONFOSS","2019b")
setenv("EBDEVELFOSS","/opt/apps/resif/iris/2019b/broadwell/software/foss/2019b/easybuild/toolchain-foss-2019b-easybuild-devel")

(reminder): the module command is ONLY available on the compute nodes, NOT on the access front-ends.
In particular, you need to be within a job to load ULHPC or private modules.

ULHPC $MODULEPATH

By default, the MODULEPATH environment variable holds a single search directory containing the optimized builds prepared for you by the ULHPC Team. The general format of this directory is as follows:

/opt/apps/resif/<cluster>/<version>/<arch>/modules/all

where:

  • <cluster> is the name of the cluster (iris or aion), stored as $ULHPC_CLUSTER.
  • <version> corresponds to the ULHPC Software Set release (aligned with the EasyBuild toolchain releases), i.e. 2019b, 2020a, etc. It is stored as $RESIF_VERSION_{PROD,DEVEL,LEGACY}, depending on the production / development / legacy ULHPC software set version.
  • <arch> is a lower-case string that categorizes the CPU architecture of the build host and permits easy identification of the optimized target architecture. It is stored as $RESIF_ARCH.
    • On Intel nodes: broadwell (default), skylake
    • On AMD nodes: epyc
    • On GPU nodes: gpu
| Cluster | Arch. ($RESIF_ARCH) | $MODULEPATH Environment variable |
|---------|---------------------|----------------------------------|
| Iris | broadwell (default) | /opt/apps/resif/iris/<version>/broadwell/modules/all |
| Iris | skylake | /opt/apps/resif/iris/<version>/skylake/modules/all |
| Iris | gpu | /opt/apps/resif/iris/<version>/gpu/modules/all |
| Aion | epyc (default) | /opt/apps/resif/aion/<version>/epyc/modules/all |
  • On skylake nodes, you may want to use the modules optimized for skylake
  • On GPU nodes, you may want to use the CPU-optimized builds for skylake (in addition to the GPU-enabled software)
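
As an illustration, you can recompose the default search directory from the environment variables described above; the sample output assumes an Iris broadwell node running the 2019b software set:

$ echo "/opt/apps/resif/${ULHPC_CLUSTER}/${RESIF_VERSION_PROD}/${RESIF_ARCH}/modules/all"
/opt/apps/resif/iris/2019b/broadwell/modules/all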

ACM PEARC'21: RESIF 3.0

If you are interested in knowing more about the way we set up and deploy the User Software Environment on ULHPC systems through the RESIF 3 framework, you can refer to the article below, presented at the ACM PEARC'21 conference on July 22, 2021.

ACM Reference Format | ORBilu entry | OpenAccess | ULHPC blog post | slides | Github:
Sebastien Varrette, Emmanuel Kieffer, Frederic Pinel, Ezhilmathi Krishnasamy, Sarah Peter, Hyacinthe Cartiaux, and Xavier Besseron. 2021. RESIF 3.0: Toward a Flexible & Automated Management of User Software Environment on HPC facility. In Practice and Experience in Advanced Research Computing (PEARC '21). Association for Computing Machinery (ACM), New York, NY, USA, Article 33, 1–4. https://doi.org/10.1145/3437359.3465600

Module Naming Schemes

What is a Module Naming Scheme?

The full software and module install paths for a particular software package are determined by the active module naming scheme along with the general software and modules install paths specified by the EasyBuild configuration.

You can list the module naming schemes supported by EasyBuild using:

$ eb --avail-module-naming-schemes
List of supported module naming schemes:
    EasyBuildMNS
    CategorizedHMNS
    MigrateFromEBToHMNS
    HierarchicalMNS
    CategorizedModuleNamingScheme
See Flat vs. Hierarchical module naming scheme for an illustrated explanation of the difference between two extreme cases: flat or 3-level hierarchical. On ULHPC systems, we selected an intermediate scheme called CategorizedModuleNamingScheme.

Module Naming Schemes on ULHPC system

ULHPC modules are organised through the Categorized Naming Scheme.
Format: <category>/<name>/<version>-<toolchain><versionsuffix>

This means that the typical module hierarchy is prefixed by a category level, taken from one of the supported software categories or module classes:

$ eb --show-default-moduleclasses
Default available module classes:

    base:      Default module class
    astro:     Astronomy, Astrophysics and Cosmology
    bio:       Bioinformatics, biology and biomedical
    cae:       Computer Aided Engineering (incl. CFD)
    chem:      Chemistry, Computational Chemistry and Quantum Chemistry
    compiler:  Compilers
    data:      Data management & processing tools
    debugger:  Debuggers
    devel:     Development tools
    geo:       Earth Sciences
    ide:       Integrated Development Environments (e.g. editors)
    lang:      Languages and programming aids
    lib:       General purpose libraries
    math:      High-level mathematical software
    mpi:       MPI stacks
    numlib:    Numerical Libraries
    perf:      Performance tools
    quantum:   Quantum Computing
    phys:      Physics and physical systems simulations
    system:    System utilities (e.g. highly depending on system OS and hardware)
    toolchain: EasyBuild toolchains
    tools:     General purpose tools
    vis:       Visualization, plotting, documentation and typesetting

It follows that the ULHPC software modules are structured according to the categorized organization outlined above.
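
For instance, the OpenBLAS module loaded as part of the foss toolchain shown earlier decomposes under this scheme as follows:

numlib     / OpenBLAS / 0.3.7-GCC-8.3.0
<category> / <name>   / <version>-<toolchain>

i.e. category numlib (Numerical Libraries), software name OpenBLAS, version 0.3.7, built against the GCC 8.3.0 toolchain, with no version suffix.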

ULHPC Toolchains and Software Set Versioning

We offer a YEARLY release of the ULHPC Software Set, based on the EasyBuild toolchain releases -- see Component versions (fixed per release) in the foss and intel toolchains. However, count on at least 6 months of validation/import after the EasyBuild release before the corresponding ULHPC release.

An overview of the currently available component versions is depicted below:

| Name | Type | 2019b (legacy) | 2020a | 2020b (prod) | 2021a | 2021b (devel) |
|------|------|----------------|-------|--------------|-------|---------------|
| GCCCore | compiler | 8.3.0 | 9.3.0 | 10.2.0 | 10.3.0 | 11.2.0 |
| foss | toolchain | 2019b | 2020a | 2020b | 2021a | 2021b |
| intel | toolchain | 2019b | 2020a | 2020b | 2021a | 2021b |
| binutils | | 2.32 | 2.34 | 2.35 | 2.36 | 2.37 |
| Python | | 3.7.4 (and 2.7.16) | 3.8.2 (and 2.7.18) | 3.8.6 | 3.9.2 | 3.9.6 |
| LLVM | compiler | 9.0.1 | 10.0.1 | 11.0.0 | 11.1.0 | 12.0.1 |
| OpenMPI | MPI | 3.1.4 | 4.0.3 | 4.0.5 | 4.1.1 | 4.1.2 |

Once on a node, the current version of the ULHPC Software Set in production is stored in $RESIF_VERSION_PROD. You can use the variables $MODULEPATH_{LEGACY,PROD,DEVEL} to set the MODULEPATH variable to the appropriate value. Yet we have defined utility scripts to facilitate a quick reset of your module environment, i.e., resif-load-swset-{legacy,prod,devel} and resif-reset-swset.

For instance, if you want to use the legacy software set, proceed as follows in your launcher scripts:

resif-load-swset-legacy   # Eq. of export MODULEPATH=$MODULEPATH_LEGACY
# [...]
# Restore production settings
resif-load-swset-prod     # Eq. of export MODULEPATH=$MODULEPATH_PROD

If on the contrary you want to test the (new) development software set, i.e., the devel version, stored in $RESIF_VERSION_DEVEL:

resif-load-swset-devel  # Eq. of export MODULEPATH=$MODULEPATH_DEVEL
# [...]
# Restore production settings
resif-reset-swset         # Eq. of resif-load-swset-prod
(iris only) Skylake Optimized builds

Skylake optimized builds can be loaded on regular nodes using:

resif-load-swset-skylake  # Eq. of export MODULEPATH=$MODULEPATH_PROD_SKYLAKE
You MUST obviously be on a Skylake node (sbatch -C skylake [...]) to benefit from it. Note that this action is not required on GPU nodes.

GPU Optimized builds vs. CPU software set on GPU nodes

On GPU nodes, be aware that the default MODULEPATH holds two directories:

  1. GPU-optimized builds (i.e. typically built against the {foss,intel}cuda toolchains), stored under /opt/apps/resif/<cluster>/<version>/gpu/modules/all
  2. CPU-optimized builds (e.g. skylake on Iris), stored under /opt/apps/resif/<cluster>/<version>/skylake/modules/all

You may want to exclude the CPU builds to ensure you take the most out of the GPU accelerators. In that case, run:

# /!\ ADAPT <version> accordingly
module unuse /opt/apps/resif/${ULHPC_CLUSTER}/${RESIF_VERSION_PROD}/skylake/modules/all

Using Easybuild to Create Custom Modules

Just like we do, you probably want to use EasyBuild to complement the existing software set with your own modules and software builds.

See Building Custom (or missing) software documentation for more details.
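
As a minimal sketch of that workflow (the easyconfig name below is a hypothetical example; adapt it to the software and toolchain you target), building a missing package as a regular user typically boils down to:

# Search for matching easyconfigs shipped with EasyBuild
$ eb --search HPL
# Build and install it, resolving missing dependencies automatically
$ eb HPL-2.3-foss-2020b.eb --robot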

Creating a Custom Module Environment

You can modify your environment so that certain modules are loaded whenever you log in. Use module save [<name>] and module restore [<name>] for that purpose -- see the Lmod documentation on User collections.
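
A minimal usage sketch (the collection name mytools is an arbitrary example):

$ module load toolchain/foss   # load the modules you want to keep
$ module save mytools          # save them as a named collection
# ... later, e.g. at the start of another session or job:
$ module restore mytools       # reload the whole collection at once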

You can also create and install your own modules for your convenience or for sharing software among collaborators. See the modulefile documentation for details of the required format and available commands. These custom modulefiles can be made visible to the module command by

module use /path/to/the/custom/modulefiles

Warning

  1. Make sure the UNIX file permissions grant access to all users who want to use the software.
  2. Do not give write permissions to your home directory to anyone else.

Note

The module use command adds new directories before other module search paths (defined as $MODULEPATH), so modules defined in a custom directory will have precedence if there are other modules with the same name in the module search paths. If you prefer to have the new directory added at the end of $MODULEPATH, use module use -a instead of module use.

Module FAQ

Is there an environment variable that captures loaded modules?

Yes, the active modules can be retrieved via $LOADEDMODULES: this environment variable is automatically updated to reflect the loaded modules reported by module list. If you want to access the modulefile paths of the loaded modules, retrieve them via $_LMFILES_.
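
For instance, since both variables are colon-separated, you can list their entries one per line:

$ echo "$LOADEDMODULES" | tr ':' '\n'
$ echo "$_LMFILES_" | tr ':' '\n'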


Last update: November 13, 2024