Job Monitoring

Job execution is monitored by reading the job's log file. A job consists of a number of steps, and each step consists of a number of operations (see the MCF Data Model). A step can start only when all operations of the previous step have finished; operations within a step can be executed in parallel. Every operation writes to the job log file, so monitoring amounts to reading that file. Monitoring is called from PipelineThread after a step is submitted for execution:

private LoggingReader monitor = new LoggingReader();

...

int numberOfSteps = pipeline.getNumberOfSteps();

for (int i = 0; i < numberOfSteps; i++)
{
    Step step = pipeline.getStep(i);

    // here, the step is submitted for execution

    ...

    monitor.setStep(step);

    // poll the job log for as long as the step is not finished
    while (monitor.isNotFinishedStep())
    {
        monitor.checkStepStatus();

        try
        {
            // wait 5 seconds between two reads of the log
            Thread.sleep(5000);
        }
        ...
    // continue to the next step
    ...
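
The loop above can be tried out without a cluster. The following is only a minimal sketch, not the PipelineThread code: the class StepPollingSketch, the method runSteps and the use of BooleanSupplier as a stand-in for "is this step finished?" are assumptions made for illustration. It shows the gating rule, namely that the next step is not touched until the current one reports that it is finished.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BooleanSupplier;

/** Hypothetical stand-in for the step loop; all names here are illustrative. */
class StepPollingSketch {

    /**
     * Processes the steps in order. Each step is polled until its check
     * returns true, so step i+1 never starts before step i has finished.
     * Returns the total number of polls performed.
     */
    static int runSteps(List<BooleanSupplier> stepFinishedChecks, long pollIntervalMillis) {
        int polls = 0;
        for (BooleanSupplier isFinished : stepFinishedChecks) {
            // here, the step would be submitted for execution
            while (true) {
                polls++;
                if (isFinished.getAsBoolean()) {
                    break; // all operations of this step are done, go to the next step
                }
                try {
                    Thread.sleep(pollIntervalMillis);
                } catch (InterruptedException e) {
                    // restore the interrupt flag and stop monitoring
                    Thread.currentThread().interrupt();
                    return polls;
                }
            }
        }
        return polls;
    }

    public static void main(String[] args) {
        // Two fake steps: the first reports "finished" on its third check,
        // the second on its first check.
        int[] checksLeft = {2};
        List<BooleanSupplier> steps = Arrays.asList(
                () -> checksLeft[0]-- <= 0,
                () -> true);
        System.out.println("polls: " + runSteps(steps, 10));
    }
}
```

In the sketch an interrupt simply ends monitoring; the original code elides its own handling of InterruptedException, so that choice is an assumption.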

The actual reading of the remote log file is done through GridGain in the following method of LoggingReader; RemoteLoggingReader simply reads the log and transfers it to the Job Manager node.

public void checkStepStatus()
{
    // read the log on the remote node and wait for its content
    Future<RemoteResult> future = exec.submit(new RemoteLoggingReader(log_location));
    RemoteResult back = null;

    try
    {
        back = future.get();
    }
    ...

    String logging = new String(back.getData());

    summary.scripts_started = 0;
    summary.scripts_finished = 0;
    summary.scripts_all = currentStep.getNumberOfScripts();

    for (int i = 0; i < summary.scripts_all; i++)
    {
        String script_id = currentStep.getScript(i).getID();

        int index_started = logging.indexOf(script_id + _STARTED);
        int index_finished = logging.indexOf(script_id + _FINISHED);

        // indexOf returns -1 when the marker is absent; 0 is a valid position
        if (index_started >= 0) summary.scripts_started++;
        if (index_finished >= 0) summary.scripts_finished++;
    }

    // the step is finished once every script has logged its finish marker
    if (summary.scripts_finished == summary.scripts_all)
        isStepFinished = true;
    ...
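
The submit-and-wait part of checkStepStatus() can be illustrated with the standard java.util.concurrent classes alone. This is a sketch under assumptions, not the GridGain-based implementation: it uses a local ExecutorService instead of a grid executor, a hypothetical LogReaderTask in place of RemoteLoggingReader, and a plain byte[] in place of RemoteResult. Returning an empty string on failure is also an assumption, since the original elides its error handling.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class LogFetchSketch {

    /** Hypothetical stand-in for RemoteLoggingReader: reads the whole log file. */
    static class LogReaderTask implements Callable<byte[]> {
        private final Path logLocation;

        LogReaderTask(Path logLocation) {
            this.logLocation = logLocation;
        }

        @Override
        public byte[] call() throws Exception {
            return Files.readAllBytes(logLocation);
        }
    }

    /** Submits the read, blocks until it completes, returns "" if it fails. */
    static String fetchLog(ExecutorService exec, Path logLocation) {
        Future<byte[]> future = exec.submit(new LogReaderTask(logLocation));
        try {
            return new String(future.get(), StandardCharsets.UTF_8);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the interrupt visible to the caller
            return "";
        } catch (ExecutionException e) {
            return ""; // the read itself failed, e.g. the log does not exist yet
        }
    }

    public static void main(String[] args) throws Exception {
        Path log = Files.createTempFile("job", ".log");
        Files.write(log, "script-1 started\n".getBytes(StandardCharsets.UTF_8));

        ExecutorService exec = Executors.newSingleThreadExecutor();
        try {
            System.out.print(fetchLog(exec, log));
        } finally {
            exec.shutdown();
            Files.deleteIfExists(log);
        }
    }
}
```

Treating a failed read as an empty log means the caller counts no scripts as started or finished and polls again on the next round.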

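The marker scan itself can be isolated into a small, testable function. The sketch below is illustrative only: the class and method names are hypothetical, and the marker suffixes " started" and " finished" are assumed values, since the real contents of _STARTED and _FINISHED are not shown on this page.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of the marker scan; names and marker strings are assumptions. */
class LogMarkerScanSketch {

    // Assumed marker suffixes, standing in for _STARTED and _FINISHED.
    static final String STARTED = " started";
    static final String FINISHED = " finished";

    /** Returns {started, finished} counts for the given script IDs. */
    static int[] countMarkers(String logging, List<String> scriptIds) {
        int started = 0;
        int finished = 0;
        for (String id : scriptIds) {
            // indexOf returns -1 when the marker is absent; a hit at position 0
            // (the very start of the log) is valid and must be counted.
            if (logging.indexOf(id + STARTED) >= 0) started++;
            if (logging.indexOf(id + FINISHED) >= 0) finished++;
        }
        return new int[] {started, finished};
    }

    /** A step is finished when every script has logged its finish marker. */
    static boolean isStepFinished(String logging, List<String> scriptIds) {
        return countMarkers(logging, scriptIds)[1] == scriptIds.size();
    }

    public static void main(String[] args) {
        List<String> ids = Arrays.asList("s1", "s2");
        String log = "s1 started\ns2 started\ns1 finished\n";

        int[] counts = countMarkers(log, ids);
        System.out.println("started=" + counts[0] + " finished=" + counts[1]
                + " stepFinished=" + isStepFinished(log, ids));
    }
}
```

Because the scan is a plain substring search, a script ID that is the tail of another ID (for example "s1" and "xs1") would match the other script's marker too, so this approach relies on IDs that cannot overlap in that way.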

Last modified on 2010-11-11T15:32:45+01:00