How to stop all workers simultaneously when an error occurs in one of the workers?

Hi guys,
I am working with a parpool with n workers. It is likely that one of the workers will return an error at some point, so I would like to catch the error with something like:
parfor i = 1:length(Data)
    try
        Simulation(i);
    catch ME
        % Pseudocode for what I would like to happen here:
        %   stop all workers (not the parpool: I want the workers to stop
        %   doing simulations, I do not want them to be closed)
        %   change something in Simulation(i)
        %   start the workers again to do Simulation(i)
        continue;
    end
end
and then make some changes and start the workers again.
Could you please let me know how to handle this?
Regards,
Vahid

Answers (2)

Edric Ellis 2015-7-27
You can do this using parfeval to send off individual tasks for execution on the workers, and then you can call cancel() on those tasks if you spot an error. Something like this:
% Initiate the work on the workers:
for i = 1:length(Data)
    f(i) = parfeval(@Simulation, 1, i);
end
% Check the results, cancel all execution if an error is spotted
completedSuccessfully = true;
for i = 1:length(f)
    try
        [idx, result] = fetchNext(f);
    catch E
        % Get here if a simulation threw an error
        cancel(f);
        completedSuccessfully = false;
        break;
    end
end
if ~completedSuccessfully
    % do stuff...
end
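
The loop above tells you that some simulation failed, but not which one. One possible variation, offered only as a sketch, is to catch the error on the worker and hand it back as data, so the index of the failing simulation arrives through fetchNext. The names runUntilFirstError and safeSimulation are hypothetical, and the code assumes Simulation(i) returns one output, as in the parfeval call above. It is meant to be saved as a function file so that the wrapper is available to the workers.

```matlab
function [results, done, failedIdx] = runUntilFirstError(Data)
% Hypothetical helper, not part of the original answer.
% Assumes Simulation(i) returns exactly one output.
numTasks = length(Data);
for i = 1:numTasks
    % The wrapper returns two outputs: a success flag and a payload
    f(i) = parfeval(@safeSimulation, 2, i);
end

results   = cell(1, numTasks);
done      = false(1, numTasks);
failedIdx = [];
for k = 1:numTasks
    [idx, ok, out] = fetchNext(f);
    if ok
        results{idx} = out;
        done(idx) = true;
    else
        failedIdx = idx;   % out holds the MException caught on the worker
        cancel(f);         % cancels queued and running futures; the pool stays open
        break;
    end
end
end

function [ok, out] = safeSimulation(i)
% Runs on the worker and reports an error as data instead of rethrowing it
try
    out = Simulation(i);
    ok  = true;
catch err
    out = err;
    ok  = false;
end
end
```

After a failure, failedIdx identifies the simulation to change, and find(~done) lists the indices that still need to be resubmitted with parfeval. Futures that had finished but had not yet been fetched when the error arrived are counted as not done in this sketch.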

Walter Roberson 2015-7-24
You can cancel() task objects. I think at one point I saw a way to determine all of the task IDs, but that is not something I have researched.
  2 Comments
Vahid Ghorbanian 2015-7-25
Walter,
Thank you for the response. How can I create the task object? I do not know whether each worker has to have its own object or not. Does the object have to be created inside the parfor or outside of it? Could you please send some sample code that does what I need?
Walter Roberson 2015-7-25
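For example, with j being a job object created by createJob,
createTask(j, @Simulation, 1, num2cell(num2cell(1:length(Data))))
(the 1 is the number of outputs per task, and the nested cell array creates one task per index)
and once you have found an error and want to restart, perhaps use recreate(j)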
At the moment I do not see a way to access the results of one task other than to know which state it is in. I have not used these facilities so I am likely overlooking something.
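
For anyone who wants to try the job and task route, here is a rough, untested sketch of how it might look end to end. It assumes Data and Simulation from the question, one output per Simulation call, and a default cluster profile that has free workers. An independent job does not run on an open parpool, so it may sit in the queue while a pool is holding all the local workers. It also assumes a task's Error property is filled in as soon as that task ends, which is worth checking on your release.

```matlab
% Sketch only, under the assumptions stated above
c = parcluster;                                    % default cluster profile
j = createJob(c);                                  % independent job
taskInputs = num2cell(num2cell(1:length(Data)));   % {{1},{2},...}: one task per index
createTask(j, @Simulation, 1, taskInputs);
submit(j);

% Poll once per second; cancel the whole job as soon as any task has an error
while ~wait(j, 'finished', 1)
    if any(arrayfun(@(t) ~isempty(t.Error), j.Tasks))
        cancel(j);
        break;
    end
end

% Collect outputs from the tasks that finished without an error
results = cell(1, numel(j.Tasks));
for k = 1:numel(j.Tasks)
    t = j.Tasks(k);
    if strcmp(t.State, 'finished') && isempty(t.Error)
        results{k} = t.OutputArguments{1};
    end
end
```

Reading OutputArguments task by task, rather than calling fetchOutputs on the job, keeps the results of the tasks that succeeded even when another task failed.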
