
=== Where to put reference data ===

Reference data sets that are available to all users (as opposed to group-specific data) can be deployed ''as-is'' in:
{{{
/apps/data/${provider}/${data_set}/${version}/
}}}
When reference data must be modified, for example because it has to be indexed or reformatted for use with a specific version of a piece of software, you must put the derived version in a subdirectory to indicate that it is not the original. When the data was modified for a specific version of an app, you could for example create additional subdirectories like this:
{{{
/apps/data/${provider}/${data_set}/${version}/${app}/${app_version}/
}}}
Always add a {{{/apps/data/${provider}/${data_set}/${version}/README}}} with at least details on:
 * What the source location of the data was.
 * When it was downloaded.
 * If a derived flavor was created: how the data was modified (link to eLabjournal and/or code in our GitHub repos) and for what purpose.

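The layout and README requirements above can be sketched as a short shell script. This is only an illustration: the provider, data set, version and source URL are made-up placeholders, and `DATA_ROOT` is a hypothetical variable that defaults to a scratch directory so the sketch can be tried without touching /apps/data/.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical values for illustration only; replace with the real ones.
# DATA_ROOT defaults to a scratch directory instead of /apps/data/.
DATA_ROOT="${DATA_ROOT:-$(mktemp -d)}"
provider='ExampleProvider'
data_set='ExampleSet'
version='1.0'
source_url='https://example.org/path/to/data'   # placeholder, not a real source

# Directory for the original, unmodified data.
target="${DATA_ROOT}/${provider}/${data_set}/${version}"
mkdir -p "${target}"

# Record the minimum details the README should contain.
cat > "${target}/README" <<EOF
Source location: ${source_url}
Downloaded on:   $(date '+%Y-%m-%d')
Derived flavors: none
EOF

echo "Created ${target}/README"
```

When a derived flavor is added later, the README would be extended with how the data was modified and for what purpose.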
{{{#!comment
TODO: add example for GRCh38

/apps/data/GRC/GRCh/38/                         * data downloaded "as is"; at most unpacked
/apps/data/GRC/GRCh/38/BWA/0.7.12-goolf-1.7.20/ * a set of relative symlinks to the reference sequences: ../../unpacked reference FASTA seqs
                                                * BWA indices for this reference
}}}

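A derived directory for a specific app version could be set up roughly as follows. This is a sketch under assumptions: the file name `reference.fa` is hypothetical, a tiny stand-in FASTA file is generated so the script runs anywhere, and `DATA_ROOT` again defaults to a scratch directory rather than /apps/data/. The indexing step is left as a comment because it requires BWA to be installed.

```shell
#!/usr/bin/env bash
set -euo pipefail

# DATA_ROOT is a hypothetical variable; it defaults to a scratch directory.
DATA_ROOT="${DATA_ROOT:-$(mktemp -d)}"
original="${DATA_ROOT}/GRC/GRCh/38"
derived="${original}/BWA/0.7.12-goolf-1.7.20"

# Stand-in for the unpacked reference; the file name is hypothetical.
mkdir -p "${original}"
if [ ! -e "${original}/reference.fa" ]; then
    printf '>chr_example\nACGT\n' > "${original}/reference.fa"
fi

mkdir -p "${derived}"
# Relative symlink: two levels up from BWA/<version>/ is the original data.
ln -sfn ../../reference.fa "${derived}/reference.fa"

# Index step, commented out because it needs BWA on the PATH:
# (cd "${derived}" && bwa index reference.fa)
```

Relative symlinks keep working when the whole tree is copied to another location, which an absolute link into /apps/data/ would not.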
=== Syncing deployed reference data to nodes === #SyncRefData

Before you can use reference data on cluster nodes, it needs to be synced to various places.
 * Switch to the ''envsync'' user:
{{{
$> sudo -u umcg-envsync bash
}}}
 * Now sync the reference data by specifying the path to the data set relative to /apps/data/ (or specify the complete absolute path if you like to type). The sync works recursively.
{{{
$> hpc-environment-sync.bash -r ReferenceData/
}}}
 or
{{{
$> hpc-environment-sync.bash -r /apps/data/ReferenceData/
}}}
 * For a full list of options, use the command-line help:
{{{
$> hpc-environment-sync.bash -h
}}}
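The relative and absolute forms shown above refer to the same data set. The helper below is a hypothetical illustration of that equivalence; it is not part of hpc-environment-sync.bash and makes no claim about how that script resolves paths internally.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper, not part of hpc-environment-sync.bash:
# prints a data set path relative to /apps/data/ for either input form.
relative_to_data_root() {
    local path="${1}"
    path="${path#/apps/data/}"   # drop the absolute prefix if present
    printf '%s\n' "${path}"
}

# Both calls print the same relative path.
relative_to_data_root 'ReferenceData/'
relative_to_data_root '/apps/data/ReferenceData/'
```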