= SOP for central deployment of software and reference data sets =

[[TOC()]]

== Deploying (reference) data sets ==

== Deploying software ==

 * Deployment of ''system'' software (packages from the repos of the Linux distros we use) is handled by our sys admins and is beyond the scope of this SOP.
 * Deployment of bioinformatics software is handled by the bioinformaticians from the ''depad'' group. If you want to become part of the ''depad'' group, [wiki:Contact contact the helpdesk].

The depad group uses [https://hpcugent.github.io/easybuild/ EasyBuild], which uses !EasyConfigs as recipes to enforce consistent, reproducible installations. In a nutshell, an !EasyConfig deployment recipe can handle the following steps:

 * (Bootstrap an installation and satisfy dependencies)
 * Download the (source) code
 * Verify checksums of the downloads
 * Unpack the downloads
 * Configure the build
 * Compile the code
 * Run sanity checks to verify the build was OK
 * Install the (compiled) code together with its !EasyBuild log
 * Generate a module file for use with a module system to configure the environment at runtime.

The locations where we store source code, deployed apps, their accompanying module files, etc. are documented in the [wiki:HPC_storage#Software Storage SOP].

For many apps !EasyBuild !EasyConfigs are already available. These files are stored

 * On [https://github.com/hpcugent/easybuild-easyconfigs/tree/master/easybuild/easyconfigs https://github.com/hpcugent/easybuild-easyconfigs/tree/master/easybuild/easyconfigs]

If there is no !EasyBuild file (.eb) for a tool on !GitHub and there is none on the cluster either (/apps/sources/EasyBuild/custom), we have to create one ourselves.

The following example is a custom !EasyBuild file created for deploying the NGS_DNA pipeline on the cluster. The individual settings in the script are explained below the code.

{{{
name = 'NGS_DNA'
version = '3.1.2'
namelower = name.lower()

homepage = 'https://github.com/molgenis/molgenis-pipelines'
description = """This distribution already contains several pipelines/protocols/parameter files which you can use 'out-of-the-box' to align and impute your NGS data using MOLGENIS Compute."""

toolchain = {'name': 'dummy', 'version': 'dummy'}
# The release is a tar.gz archive that only needs to be unpacked.
easyblock = 'Tarball'

# dependencies
molname = 'Molgenis-Compute'
molversion = 'v15.04.1-Java-1.7.0_80'
# The dependency becomes part of the name of the installation.
versionsuffix = '-%s-%s' % (molname, molversion)
dependencies = [(molname, molversion)]

# Download location and file name of the release archive.
source_urls = ['http://github.com/molgenis/molgenis-pipelines/releases/download/%s/' % version]
sources = ['%s-%s.tar.gz' % (name, version)]

# Files that must exist after installation for the build to be considered OK.
sanity_check_paths = {
    'files': ['workflow.csv', 'parameters.csv'],
    'dirs': []
}

moduleclass = 'bio'
}}}

 * '''name''' and '''version''' are self-explanatory.
 * '''homepage''' and '''description''' are recommended; fill them in for every future release of the tool as well.
 * '''toolchain''' can be left as it is in the example. You only need a real toolchain when multiple tools need to be installed before this one (see the online !EasyBuild manual).
 * '''easyblock''' is the type of data: tar.gz = 'Tarball', executable = 'Binary'. All the different easyblocks are described [https://github.com/hpcugent/easybuild-easyblocks/tree/master/easybuild/easyblocks/generic here].
 * If there are any '''dependencies''' (in this case Molgenis-Compute), put them in the name of your eb file (the name will look like this: NGS_DNA-3.1.2-Molgenis-Compute-v15.04.1-Java-1.7.0_80).
 * '''Using variables''' instead of typing the same string 5 times is done with %s in the string, followed by % and the names of the variables between ().
 * One necessary step is to set '''sanity_check_paths'''. This is a check whether the files are unpacked/installed correctly.
 * The modules for all installed eb configs are put automatically in the /apps/modules/all folder, but with '''moduleclass''' you can specify an extra module path. '''N.B.''' The command {{{module avail}}} on the cluster only displays the non-all modules, so specifying an extra moduleclass is necessary to find your module back in {{{module avail}}}.

For installing more advanced tools like R, please read the [https://hpcugent.github.io/easybuild/ documentation] online or have a look at our custom scripts on the cluster.

=== installing a new tool ===

 * Create a YOURFILE.eb file in /apps/sources/EasyBuild/custom/
 * module load !EasyBuild
 * eb YOURFILE.eb

Before you can use the newly installed module, it needs to be synced to the storage and nodes. First start a shell as the umcg-envsync user:

{{{
sudo -u umcg-envsync bash
}}}

A new environment will be loaded. Afterwards, sync the new module by executing:

{{{
hpc-environment-sync.bash -m /
}}}

To sync new resources use "-r" instead of "-m". To see the full list of options of the sync script use:

{{{
hpc-environment-sync.bash -h
}}}

=== running an already existing .eb file ===

 * eb YOURFILE.eb

== FAQ ==

=== error: The module file is already there ===

 * Rerunning an .eb file that is already installed results in this error. To overwrite the module file, run eb with the '''''-f''''' argument.

=== I try to rerun an already installed tool, but my sources are not updated ===

 * !EasyBuild always first checks whether the source code is already present in /apps/sources. Only when it is not there will it try to download the file. Removing the source code will solve this problem.
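
=== How do the %s placeholders in the example expand? ===

The NGS_DNA example above builds several strings with %s placeholders. The sketch below repeats that formatting in plain Python, outside of !EasyBuild, so you can check what a value expands to before running eb. The variables {{{full_name}}} and {{{download_url}}} are hypothetical helpers added for illustration only; they are not !EasyBuild parameters.

```python
# Values copied from the NGS_DNA example on this page.
name = 'NGS_DNA'
version = '3.1.2'
molname = 'Molgenis-Compute'
molversion = 'v15.04.1-Java-1.7.0_80'

# Each %s is replaced, in order, by the values given after the % operator.
versionsuffix = '-%s-%s' % (molname, molversion)
source_url = 'http://github.com/molgenis/molgenis-pipelines/releases/download/%s/' % version
source = '%s-%s.tar.gz' % (name, version)

# Hypothetical helpers (not EasyBuild parameters), for illustration only:
# the name including the dependency, and the location of the release archive.
full_name = '%s-%s%s' % (name, version, versionsuffix)
download_url = source_url + source

print(full_name)
print(download_url)
```

The first printed line is the name with the dependency included, as described in the '''dependencies''' bullet; the second is the address the release archive would be fetched from.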