Data processing
Alphabetical list of available ULHPC software belonging to the 'data' category.
To load software from this category, use: module load data/<software>[/<version>]
Software | Versions | Swsets | Architectures | Clusters | Description |
---|---|---|---|---|---|
Arrow | 0.16.0 | 2019b | broadwell, skylake | iris | Apache Arrow (incl. PyArrow Python bindings), a cross-language development platform for in-memory data. |
DB_File | 1.855 | 2020b | broadwell, epyc, skylake | aion, iris | Perl5 access to Berkeley DB version 1.x. |
GDAL | 3.0.2, 3.2.1 | 2019b, 2020b | broadwell, skylake, gpu, epyc | iris, aion | GDAL is a translator library for raster geospatial data formats that is released under an X/MIT style Open Source license by the Open Source Geospatial Foundation. As a library, it presents a single abstract data model to the calling application for all supported formats. It also comes with a variety of useful command-line utilities for data translation and processing. |
HDF5 | 1.10.5, 1.10.7 | 2019b, 2020b | broadwell, skylake, gpu, epyc | iris, aion | HDF5 is a data model, library, and file format for storing and managing data. It supports an unlimited variety of datatypes, and is designed for flexible and efficient I/O and for high volume and complex data. |
HDF | 4.2.15 | 2020b | broadwell, epyc, skylake, gpu | aion, iris | HDF (also known as HDF4) is a library and multi-object file format for storing and managing data between machines. |
LAME | 3.100 | 2019b, 2020b | broadwell, skylake, gpu, epyc | iris, aion | LAME is a high-quality MPEG Audio Layer III (MP3) encoder licensed under the LGPL. |
XML-LibXML | 2.0201, 2.0206 | 2019b, 2020b | broadwell, skylake, epyc | iris, aion | Perl binding for libxml2. |
dask | 2021.2.0 | 2020b | broadwell, epyc, skylake, gpu | aion, iris | Dask natively scales Python. Dask provides advanced parallelism for analytics, enabling performance at scale for the tools you love. |
h5py | 2.10.0, 3.1.0 | 2019b, 2020b | broadwell, skylake, gpu, epyc | iris, aion | HDF5 for Python (h5py) is a general-purpose Python interface to the Hierarchical Data Format library, version 5. HDF5 is a versatile, mature scientific software library designed for the fast, flexible storage of enormous amounts of data. |
netCDF-Fortran | 4.5.2, 4.5.3 | 2019b, 2020b | broadwell, skylake, epyc | iris, aion | NetCDF (network Common Data Form) is a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data. |
netCDF | 4.7.1, 4.7.4 | 2019b, 2020b | broadwell, skylake, gpu, epyc | iris, aion | NetCDF (network Common Data Form) is a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data. |
scikit-learn | 0.23.2 | 2020b | broadwell, epyc, skylake, gpu | aion, iris | Scikit-learn integrates machine learning algorithms in the tightly-knit scientific Python world, building upon numpy, scipy, and matplotlib. As a machine-learning module, it provides versatile tools for data mining and analysis in any field of science and engineering. It strives to be simple and efficient, accessible to everybody, and reusable in various contexts. |
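
The `module load data/<software>[/<version>]` syntax can be illustrated with a minimal sketch. It assumes a POSIX shell and that the `module` command is only defined on cluster nodes. The helpers `data_module_name` and `load_data_module` are hypothetical names introduced for illustration and are not part of the ULHPC tooling. The software names and versions are taken from the table above.

```shell
# Hypothetical helper: build the module name data/<software>[/<version>].
# $1 = software name, $2 = optional version.
data_module_name() {
    if [ -n "${2:-}" ]; then
        printf 'data/%s/%s\n' "$1" "$2"
    else
        printf 'data/%s\n' "$1"
    fi
}

# Hypothetical helper: load the module where `module` exists,
# otherwise only print the command that would be run.
load_data_module() {
    name="$(data_module_name "$@")"
    if command -v module >/dev/null 2>&1; then
        module load "$name"
    else
        echo "would run: module load $name"
    fi
}

load_data_module HDF5 1.10.7   # explicit version
load_data_module netCDF 4.7.4  # explicit version
load_data_module dask          # no version given
```

Omitting the version leaves the choice of version to the module system, so pinning an explicit version is the safer option when a job needs to be reproducible.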
Last update: November 13, 2024