Changes between Version 6 and Version 7 of MolgenisProcessing

Timestamp:: 2010-10-16T15:23:01+02:00
Author:: Morris Swertz
Comment:: --

MolgenisProcessing

-                      v6
+                      v7
  * Export R data annotation packages?
 == Notes ==
+== PBS best practices ==
+* We use Freemarker to define templates
+* We use shell scripts to execute jobs
+* PBS supports dependencies
+* Each step should verify that it completed and return a proper exit status, so that PBS can cancel dependent jobs on failure
+Overview:
+  * We use Freemarker to define templates of jobs
+  * We generate for each job one <job>.sh
+  * We generate one submit.sh for the whole workflow
+  * The whole workflow behaves like 'make': after a failure it can resume where it left off
+  * The workflow shares one working directory with conventions to ease inter-step variable passing
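A minimal sketch of what a generated submit.sh could look like, assuming a PBS/Torque `qsub` that prints the job id on stdout. The job script names are hypothetical, and the stub `qsub` function is only an assumption added so the sketch can be tried on a machine without PBS.

```shell
#!/bin/bash
# Sketch of a generated submit.sh; job script names are hypothetical.
set -e

# Dry-run fallback (an assumption for illustration): when qsub is not
# installed, a stub prints a fake job id derived from the script name.
if ! command -v qsub >/dev/null 2>&1; then
    qsub() {
        local last
        for last in "$@"; do :; done   # keep the last argument (the script)
        echo "dryrun.${last%.sh}"
    }
fi

# Each line puts one job in the queue and captures its job id.
align1=$(qsub FC1_L1_pairedalign.sh)
align2=$(qsub FC1_L2_pairedalign.sh)

# The merge job starts only after both alignments exit with status 0.
merge=$(qsub -W depend=afterok:"$align1":"$align2" sample1_merge.sh)

echo "submitted: $align1 $align2 $merge"
```

Because the dependency type is `afterok`, a non-zero exit status from either alignment job keeps the merge job from running.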
+Main ingredients:
+* '''The workflow works on a data blackboard'''
+  * The whole workflow uses the same working directory (= blackboard architecture pattern)
+  * We use standard file names to reduce inter-step parameter passing (= convention over configuration)
+  * Naming convention: <unit of analysis>_<name of step>.<ext>
+  * For example in NGS lane (unit) alignment (step): {{{<flowcell_lane>_<pairedalign>.bam}}}
+* '''Make style submit.sh'''
+  * Each line puts one command in the qsub queue
+  * We solve dependency ordering using the {{{-W depend=afterok:job1:job2}}} option
+  * Proper return values ensure that dependent jobs are canceled on failure
+* '''Recoverable steps job<step>.sh'''
+  * We generate a .sh file for each job including standard logging
+  * Each script first checks whether its output is already there (if so, the step is skipped)
+  * Each script then checks whether it has produced its output (otherwise it returns an error)
+  * N.B. check file existence using {{{if ! test -f FILE; then exit 1; fi}}} ({{{test -h}}} only matches symbolic links)
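The recoverable-step pattern above might be sketched as follows. This is an illustration under assumptions, not the generated MOLGENIS script: the function name `run_step`, the file names, and the use of `cp` as a stand-in for the real tool are all hypothetical.

```shell
#!/bin/bash
# Sketch of a recoverable job step; names and the 'cp' placeholder are
# hypothetical and stand in for the real generated command.

run_step() {
    local in="$1" out="$2"

    # 1. Skip the step when its output already exists (make-style recovery).
    if test -f "$out"; then
        echo "skip: $out already exists"
        return 0
    fi

    # 2. Run the tool. Writing to a temporary name first means a crash
    #    never leaves a partial file under the final output name.
    cp "$in" "$out.tmp" && mv "$out.tmp" "$out"

    # 3. Verify the output; a non-zero status lets PBS cancel jobs that
    #    depend on this one through afterok.
    if ! test -f "$out"; then
        echo "error: $out was not produced" >&2
        return 1
    fi
    echo "done: $out"
}

# Demo in a scratch directory standing in for the shared working directory.
work=$(mktemp -d)
echo "dummy alignment" > "$work/FC1_L1_pairedalign.bam"

first=$(run_step "$work/FC1_L1_pairedalign.bam" "$work/FC1_L1_sort.bam")
second=$(run_step "$work/FC1_L1_pairedalign.bam" "$work/FC1_L1_sort.bam")
run_step "$work/missing.bam" "$work/missing_sort.bam"
status=$?

echo "$first"
echo "$second"
echo "status of failed step: $status"
```

In a real job script the status of `run_step` would be passed on as the script's exit status, which is what the `afterok` dependency inspects.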