ComputeStartDefault
Extra Workers are pre-started and stopped by Resident Worker. Resident Worker receives a command from Resource Manager and starts Extra Workers by submitting a script to the cluster scheduler. After being started, Extra Workers contact Job Manager and register themselves. In practice, pre-starting many Extra Workers for direct parallel execution of analysis operations can take more time than submitting scripts to the cluster scheduler to execute the same operations. Furthermore, running many Extra Workers in the system increases the network load on the Job Manager node. Still, Extra Workers can be used efficiently if the system has an advanced strategy to pre-start them, which we plan to develop in the future.

Resource Manager is required only if a computational cluster is used in the system. Its logic is also straightforward and directly depends on the policies of the cluster used. We tested our framework on the [http://www.rug.nl/cit/hpcv/faciliteiten/HPCCluster?lang=en Millipede HPC cluster], which appears in the TOP500 supercomputers list. This cluster has a policy that no cluster job may run longer than ten days, to keep cluster resources available to all users. This means that Resident Worker cannot run longer than ten days either. In our current implementation, to keep the cluster as a part of our computational cloud, Resident Worker starts a new Resident Worker node some time before it is removed by the cluster administrator, e.g. two days before the end of the ten-day period (see the sketch below). The request to start the new Resident Worker is passed to the cluster scheduler and processed after a delay that depends on the cluster load. Hence, we ensure that at least one Resident Worker is always running on the cluster.

MCF Compute Manager code base and examples can be found [http://www.molgenis.org/wiki/ComputeStartExamples here].
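The pre-start and keep-alive behaviour described above can be summarized in a short sketch. This is a minimal illustration only, not the actual MCF code: the class name, the two script names, and the use of `qsub` for a PBS-style scheduler are assumptions; only the ten-day walltime limit and the two-day resubmission margin come from the description above.

{{{#!java
import java.io.IOException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/**
 * Illustrative sketch (not the MCF implementation) of the two duties of
 * Resident Worker described above:
 *  - pre-starting Extra Workers on request by submitting a script to the
 *    cluster scheduler, and
 *  - resubmitting itself before the cluster's ten-day walltime limit so
 *    that at least one Resident Worker keeps running on the cluster.
 */
public class ResidentWorkerSketch
{
    // Policy values taken from the text: ten-day job limit, resubmit two days early.
    private static final long WALLTIME_LIMIT_DAYS = 10;
    private static final long RESUBMIT_MARGIN_DAYS = 2;

    // Hypothetical script names; the real submission scripts are cluster-specific.
    private static final String EXTRA_WORKER_SCRIPT = "start_extra_worker.sh";
    private static final String RESIDENT_WORKER_SCRIPT = "start_resident_worker.sh";

    private final ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor();

    /** Submit a job script to a PBS-style scheduler (assumed here to accept qsub). */
    private void submitToScheduler(String script) throws IOException
    {
        new ProcessBuilder("qsub", script).inheritIO().start();
    }

    /** Called when Resource Manager asks to pre-start a number of Extra Workers. */
    public void preStartExtraWorkers(int count) throws IOException
    {
        // Each Extra Worker contacts Job Manager and registers itself once it starts.
        for (int i = 0; i < count; i++)
        {
            submitToScheduler(EXTRA_WORKER_SCRIPT);
        }
    }

    /** Schedule the keep-alive resubmission two days before the walltime limit. */
    public void scheduleSelfResubmission()
    {
        long delayDays = WALLTIME_LIMIT_DAYS - RESUBMIT_MARGIN_DAYS; // 8 days after start
        timer.schedule(() -> {
            try
            {
                // The new Resident Worker waits in the scheduler queue and takes over
                // before this one is removed by the cluster administrator.
                submitToScheduler(RESIDENT_WORKER_SCRIPT);
            }
            catch (IOException e)
            {
                e.printStackTrace();
            }
        }, delayDays, TimeUnit.DAYS);
    }

    public static void main(String[] args) throws IOException
    {
        ResidentWorkerSketch worker = new ResidentWorkerSketch();
        worker.scheduleSelfResubmission();
        worker.preStartExtraWorkers(4); // example: pre-start four Extra Workers
    }
}
}}}

A production implementation would also need to verify that the resubmission succeeded and retry it if necessary, since the new Resident Worker may wait in the scheduler queue for some time depending on the cluster load.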