
Version 2 (modified by george, 14 years ago)


MCF Data Model

The main goal of our model is to unambiguously specify a bioinformatics workflow execution in a distributed computational environment. The model includes the specification of the entire workflow and the specifications of individual workflow tasks. Hence, we divide our model into two structural layers, or interfaces: a job interface to specify the workflow and a tool interface to specify an individual workflow task. These interfaces are used in the two main usage scenarios of the framework:

  • specifying a workflow or job for execution, and
  • adding a new tool to the framework tool repository.

Once a tool has been added to the repository, the analyses it provides can be included in workflows. Let us examine these models in more detail.

The tool model is similar to the Galaxy tool file, where a tool is described as a set of operations that it can perform. A tool operation can participate in workflow execution. Our platform aims to support the specification of any external analysis tool that can be invoked from a command line or run as an executable script (e.g. a shell or R script). In contrast with the Galaxy description, we do not have complex operation parameters in the model. Instead, we treat an operation with different parameters as a set of separate operations. As a result, the list of tool operations grows with the complexity of the operations' parameterisation, but the specification of each individual operation remains simple.
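The idea of treating each parameterisation as a separate operation can be illustrated with a minimal Python sketch. This is not the actual MCF schema: the class names `Tool` and `Operation`, the helper `expand_operations`, the executable `aligner.sh` and its options are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from itertools import product


@dataclass(frozen=True)
class Operation:
    """One fixed, parameter-free way of invoking a tool (hypothetical model)."""
    name: str
    command: str


@dataclass
class Tool:
    """A tool described as a flat set of operations (hypothetical model)."""
    name: str
    operations: list = field(default_factory=list)


def expand_operations(base_name, executable, options):
    """Turn every combination of option values into its own operation."""
    keys = list(options)
    operations = []
    for values in product(*(options[key] for key in keys)):
        suffix = "_".join(str(value) for value in values)
        flags = " ".join(f"--{key} {value}" for key, value in zip(keys, values))
        operations.append(Operation(f"{base_name}_{suffix}", f"{executable} {flags}"))
    return operations


# Assumed example: two options with two values each give four operations.
aligner = Tool(
    name="aligner",
    operations=expand_operations(
        "align",
        "aligner.sh",
        {"mode": ["local", "global"], "matrix": ["blosum62", "pam250"]},
    ),
)

for operation in aligner.operations:
    print(operation.name, "->", operation.command)
```

Under these assumptions, each operation is just a name and a fixed command line, so a workflow task only needs to reference an operation by name. The cost is that the number of operations is the product of the number of values per option.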
