| 1 | [[TOC()]] |
| 2 | = Compute Roadmap = |
| 3 | |
| 4 | This page describes plans to make Compute even better. |
| 5 | |
| 6 | NB: the features below have not yet been put on a release schedule! |
| 7 | |
| 8 | == Finish creation of tests for all important configurations == |
| 9 | * local, pbs, grid |
| 10 | * impute, align |
| 11 | |
| 12 | == Make it easier to insert or remove a step == |
| 13 | * requires separation between 'per protocol input/output parameters' and 'workflow links' |
| 14 | * should not be extra work; could do automatic mapping? |
| 15 | * should be solved in the workflow.csv (instead of control flow, do data flow, i.e. create list of output-input edges) |
| 16 | |
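A minimal sketch of the data-flow idea above, assuming a hypothetical edge table with the columns `from_step`, `output`, `to_step` and `input` (these column names and the step `sort` are illustrative, not the current workflow.csv format). The step order is derived from the output-input edges, so inserting or removing a step only means editing edges:

```python
import csv
import io
from collections import defaultdict, deque

# Hypothetical data-flow table: one row per output-input edge.
EDGES_CSV = """from_step,output,to_step,input
align,bam,sort,bam
sort,sorted_bam,impute,bam
"""


def read_edges(text):
    """Parse the edge table into a list of dicts, one per edge."""
    return list(csv.DictReader(io.StringIO(text)))


def step_order(edges):
    """Return the steps in an order that respects every output-input edge."""
    successors = defaultdict(set)
    indegree = {}
    for edge in edges:
        src, dst = edge["from_step"], edge["to_step"]
        indegree.setdefault(src, 0)
        indegree.setdefault(dst, 0)
        if dst not in successors[src]:
            successors[src].add(dst)
            indegree[dst] += 1
    # Kahn's algorithm: repeatedly emit steps whose inputs are all available.
    ready = deque(sorted(step for step, n in indegree.items() if n == 0))
    order = []
    while ready:
        step = ready.popleft()
        order.append(step)
        for nxt in sorted(successors[step]):
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    if len(order) != len(indegree):
        raise ValueError("cycle in data flow")
    return order


print(step_order(read_edges(EDGES_CSV)))
```

With this layout, removing `sort` would mean replacing its two edges by a single `align` to `impute` edge; no control-flow ordering has to be maintained by hand.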
| 17 | == Auto generate the list of parameters that you need == |
| 18 | * could be filled automatically from templates or, if we have it, the data flow? |
| 19 | |
| 20 | == Better error reporting on input == |
| 21 | * currently the error doesn't list which variable is missing |
| 22 | * templates are missing |
| 23 | * syntax checking of CSV files |
| 24 | |
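As an illustration of reporting which variable is missing, the sketch below scans a template for placeholders and compares them with the supplied parameters. It assumes placeholders are written as `${name}`; that syntax, the function name and the example template are assumptions made for this example only:

```python
import re

# Assumption: template placeholders look like ${name}.
PLACEHOLDER = re.compile(r"\$\{(\w+)\}")


def missing_parameters(template_text, parameters):
    """Return, sorted, every placeholder name that has no value."""
    needed = set(PLACEHOLDER.findall(template_text))
    return sorted(needed - set(parameters))


template = "echo ${sample} > ${output}"
missing = missing_parameters(template, {"sample": "s1"})
if missing:
    print("Missing parameters: " + ", ".join(missing))
```

The same scan could also produce the full list of parameters a set of templates needs.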
| 25 | == Monitoring of progress == |
| 26 | * how far along is the analysis |
| 27 | * is it running successfully or failing, reported while it runs (could be done with the #end macro) |
| 28 | * add a job at the end that creates a report of results (for commandline) |
| 29 | |
| 30 | == Monitoring of success and resource usage == |
| 31 | * have a harmonized method to report 'success' or 'error', incl. message + runtime |
| 32 | * also include summary statistics such as max, min, etc. |
| 33 | |
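One possible shape for such a harmonized report, shown as a sketch only: every step returns the same record with a status, a message and a runtime. The field names and the `run_and_report` helper are hypothetical.

```python
import time


def run_and_report(step, action):
    """Run one step and return a uniform success/error record."""
    start = time.monotonic()
    try:
        action()
        status, message = "success", ""
    except Exception as error:  # report any failure in the same format
        status, message = "error", str(error)
    return {
        "step": step,
        "status": status,
        "message": message,
        "runtime_seconds": time.monotonic() - start,
    }


def failing_step():
    raise RuntimeError("input file not found")


print(run_and_report("align", lambda: None))
print(run_and_report("impute", failing_step))
```

Because every record has the same fields, statistics such as the minimum and maximum runtime can be computed over a list of records without backend-specific parsing.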
| 34 | == Make it transparent/unimportant to the user which backend is actually used == |
| 35 | * system decides where it (can) run: cluster, pbs, grid, local |
| 36 | * needs flexible file manager that can 'stage' data for any backend |
| 37 | * would like to be able to restart on another backend |
| 38 | |
| 39 | == Transparent server to stage data == |
| 40 | * to easily move pipeline to other storage. |
| 41 | |
| 42 | == Restart from specific step == |
| 43 | * remove the files produced so far |
| 44 | * reduce the problem by writing to a *tmp file and only 'mv' it if the step is successful |
| 45 | * however, this doesn't work if we want to restart |
| 46 | * could use folders per step, so you could delete the folders from that step onwards |
| 47 | * can be solved by good practice |
| 48 | |
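The two ideas above (write to a tmp file and rename only on success, and one folder per step) can be sketched as follows. The function names and the `.tmp` suffix are illustrative assumptions, not existing Compute code:

```python
import os
import shutil


def run_step(output_path, produce):
    """Write step output to '<output>.tmp' and rename it only on success."""
    tmp_path = output_path + ".tmp"
    try:
        with open(tmp_path, "w") as handle:
            produce(handle)
        # The final name only appears once the step has fully succeeded.
        os.replace(tmp_path, output_path)
    except Exception:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def clean_from(run_dir, steps, first_step):
    """Delete the per-step folders of first_step and every later step."""
    for step in steps[steps.index(first_step):]:
        shutil.rmtree(os.path.join(run_dir, step), ignore_errors=True)
```

A restart from a given step would then call `clean_from` for that step and rerun from there; a half-written output never carries the final file name, so it cannot be mistaken for a finished result.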
| 49 | == Store pilot job id in the task in the database == |
| 50 | * I need to know which pilot jobs have died, and which tasks were associated with them |
| 51 | * Then I can re-release tasks so they can be done by another pilot |
| 52 | |
| 53 | == Like to have a 'heartbeat' for jobs == |
| 54 | * so I can be sure a (pilot) job is still alive |
| 55 | * could use a 'background' process that pings back to the database |
| 56 | * could also be used for pbs jobs |
| 57 | |
| 58 | == Add putFile -force to manual == |
| 59 | |
| 60 | == Enable easy merging of workflows, merging of parameters == |
| 61 | * Easily combine protocols from multiple workflows |
| 62 | * want fewer parameter files |
| 63 | * meanwhile allow multiple worksheets |
| 64 | |
| 65 | == Get rid of parameters.csv and instead create worksheet == |
| 66 | * so parameter names on first row |
| 67 | * hasOne using naming scheme A_B, means B has one A |
| 68 | * conclusion: use multiple headers. |
| 69 | * allow -parameters and -tparameter |
| 70 | |
| 71 | == Cleanup backend specific protocols == |
| 72 | * e.g. 'touch' commands |
| 73 | |
| 74 | == Visualization framework for analysis runs == |
| 75 | |
| 76 | == Re-architect the components == |
| 77 | * one interface, multiple implementations |
| 78 | * unit tests |
| 79 | |
| 80 | == Make submit.sh use a 'lock' file so the first job only ends when everything is submitted == |
| 81 | * Problem: if the first job fails quickly, many dependent jobs are not yet submitted and get orphaned |
| 82 | * so dependent jobs can always be submitted without running into this issue (alex feature) |
| 83 | * add this in the #end macro so jobs can never finish until the complete workflow is submitted |
| 84 | |
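The lock-file idea can be sketched as below. This is an illustration in Python rather than the actual submit.sh, and the function names and polling interval are assumptions: the submitter holds a lock file while it submits, and each job waits at its end until the lock is gone.

```python
import os
import time


def submit_all(lock_path, jobs, submit):
    """Hold a lock file for as long as jobs are still being submitted."""
    open(lock_path, "w").close()
    try:
        for job in jobs:
            submit(job)
    finally:
        # Always release the lock so waiting jobs are not blocked forever.
        os.remove(lock_path)


def wait_for_submission(lock_path, poll_seconds=1.0, timeout=None):
    """Called at the end of a job: block while the lock file exists.

    Returns True once the lock is gone, False if 'timeout' seconds passed.
    """
    waited = 0.0
    while os.path.exists(lock_path):
        if timeout is not None and waited >= timeout:
            return False
        time.sleep(poll_seconds)
        waited += poll_seconds
    return True
```

Since a job cannot end before the lock is removed, a job that fails quickly still stays in the queue long enough for its dependent jobs to be submitted against it.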
| 85 | == Separate protocols from code == |
| 86 | * yes, separate GitHub repo |
| 87 | * should make it possible to combine multiple protocol folders, multiple parameter files |
| 88 | * should indicate which Compute version it works with |
| 89 | |
| 90 | == Users for runs, priorities? == |
| 91 | * needed if we want 'priority queue' for pilot jobs |
| 92 | |
| 93 | == Publish! == |
| 94 | |
| 95 | == Approach == |
| 96 | Clean start = yes |
| 97 | separate development of 5 from using 4 (bug fixes) = yes |
| 98 | database, commandline or both = both |
| 99 | release schedule -> roadmap |
| 100 | backwards compatibility = no |
| 101 | |
| 102 | |
| 103 | |