wiki:HPC_deploy

Version 3 (modified by Roan Kanninga, 8 years ago)

Tools are downloaded and deployed with EasyBuild, which keeps installation and deployment on the cluster consistent.

For most tools an EasyBuild file already exists. These files are called easyconfigs and are stored on GitHub.

If there is no EasyBuild file (.eb) on GitHub and none on the cluster (/apps/sources/EasyBuild/custom), we have to create one ourselves.
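
The check on the cluster side can be sketched in Python as follows. This is only an illustration: the function name find_local_easyconfigs is hypothetical, and it assumes that easyconfig file names start with the tool name followed by a hyphen, as in the NGS_DNA example on this page. The default directory is the one mentioned above.

```python
from pathlib import Path

# Directory named in the text above; other clusters will use a different path.
CUSTOM_EASYCONFIG_DIR = Path("/apps/sources/EasyBuild/custom")


def find_local_easyconfigs(tool_name, search_dir=CUSTOM_EASYCONFIG_DIR):
    """Return the .eb files below search_dir whose file name starts with '<tool_name>-'.

    Hypothetical helper, not part of EasyBuild.
    """
    search_dir = Path(search_dir)
    if not search_dir.is_dir():
        # No such directory (for example when run off the cluster): nothing found.
        return []
    prefix = tool_name + "-"
    return sorted(p for p in search_dir.rglob("*.eb") if p.name.startswith(prefix))
```

An empty result from find_local_easyconfigs('NGS_DNA') would mean there is no local eb file yet, so GitHub should be checked next before writing a new one.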

Below is an example of a custom EasyBuild file created for deploying the NGS_DNA pipeline on the cluster. Each step in the file is explained after the code.

name = 'NGS_DNA'
version = '3.1.2'
namelower = name.lower()
homepage = 'https://github.com/molgenis/molgenis-pipelines'
description = """This distribution already contains several pipelines/protocols/parameter files which you can use 'out-of-the-box' to align and impute your NGS data using MOLGENIS Compute."""

toolchain = {'name': 'dummy', 'version': 'dummy'}
easyblock = 'Tarball'

# Dependencies: the dependency name and version also form the version suffix
molname = 'Molgenis-Compute'
molversion = 'v15.04.1-Java-1.7.0_80'
versionsuffix = '-%s-%s' % (molname, molversion)
dependencies = [(molname, molversion)]

# Location and file name of the release tarball, built from name and version
source_urls = ['http://github.com/molgenis/molgenis-pipelines/releases/download/%s/' % version]
sources = ['%s-%s.tar.gz' % (name, version)]

sanity_check_paths = {
    'files': ['workflow.csv', 'parameters.csv'],
    'dirs': []
}

moduleclass = 'bio'

name and version: these are self-explanatory.
homepage and description: these are recommended, particularly with future releases of the tool in mind.
toolchain: this can be left as it is in the example. Only when multiple tools need to be installed before this one should you use a toolchain (see the online EasyBuild manual).
easyblock: this is the type of the source, for example 'Tarball' for a tar.gz archive or 'Binary' for an executable. The EasyBuild documentation lists all available easyblocks.
dependencies: if there are any dependencies (in this case Molgenis-Compute), put them in the name of your eb file through versionsuffix (the name will look like this: NGS_DNA-3.1.2-Molgenis-Compute-v15.04.1-Java-1.7.0_80).
%s: instead of typing the same string five times, use variables. Write %s where the value should appear and list the variables between () after the % sign.
sanity_check_paths: this is a required step. It checks whether the files were unpacked/installed correctly.
moduleclass: all installed eb configs are put automatically in the /apps/modules/all folder, but with moduleclass you can specify an extra module path.
N.B. On the cluster, the command module avail only displays the modules outside the all folder, so specifying a moduleclass is necessary for your module to show up in module avail.
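
The %s substitution and the idea behind sanity_check_paths can be tried out in plain Python. This is a minimal sketch under stated assumptions: the file name pattern <name>-<version><versionsuffix>.eb is inferred from the example name given above, and missing_sanity_paths is a hypothetical helper that only tests whether the listed paths exist. It is a simplified stand-in, not EasyBuild's own sanity check.

```python
import os

name = 'NGS_DNA'
version = '3.1.2'
molname = 'Molgenis-Compute'
molversion = 'v15.04.1-Java-1.7.0_80'

# Each %s is replaced, in order, by the values in the tuple after the % operator.
versionsuffix = '-%s-%s' % (molname, molversion)

# Assumption: the eb file is named <name>-<version><versionsuffix>.eb,
# matching the example name given in the explanation above.
eb_filename = '%s-%s%s.eb' % (name, version, versionsuffix)


def missing_sanity_paths(install_dir, sanity_check_paths):
    """Return the entries of sanity_check_paths that are absent below install_dir.

    Hypothetical helper: it only tests for existence of the listed paths.
    """
    missing = []
    for rel_path in sanity_check_paths.get('files', []):
        if not os.path.isfile(os.path.join(install_dir, rel_path)):
            missing.append(rel_path)
    for rel_path in sanity_check_paths.get('dirs', []):
        if not os.path.isdir(os.path.join(install_dir, rel_path)):
            missing.append(rel_path)
    return missing
```

Here eb_filename evaluates to 'NGS_DNA-3.1.2-Molgenis-Compute-v15.04.1-Java-1.7.0_80.eb'. Calling missing_sanity_paths on an unpacked tarball directory with {'files': ['workflow.csv', 'parameters.csv'], 'dirs': []} returns an empty list when both files are present.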