Debugging parfor

Joan Puig 2011-6-15
Hi,
We have been working on parallelizing our code, and we have found that when an error occurs inside a parfor loop it is hard to debug. What methods do you use to figure out what is going on (especially when the serial version works perfectly)?
More specifically, when I run this code, I would expect to get a My:Error with the stack trace pointing to that particular line of code, but instead, we get an error pointing to an internal function and the stack trace shows the "parfor" line as being the source of the problem
clear();
clc();
r = [];
try
    parfor i = 1:10
        r(i) = rand(1,1);
        if r(i)<0.9
            error('My:Error','Try again');
        end
    end
catch le
    le
    for j = 1:numel(le.stack)
        le.stack(j)
    end
    rethrow(le);
end
Output:
le =
  MException
  Properties:
    identifier: 'My:Error'
       message: 'Try again'
         cause: {0x1 cell}
         stack: [2x1 struct]
  Methods
ans =
    file: 'C:\Program Files\MATLAB\R2011a\toolbox\matlab\lang\parallel_function.m'
    name: 'parallel_function'
    line: 475
ans =
    file: 'D:\SynapticPoint\SourceTrunk\Matlab\ScratchPad\scr_error_in_parfor.m'
    name: 'scr_error_in_parfor'
    line: 7
??? Error using ==> parallel_function at 475
Try again
Error in ==> scr_error_in_parfor at 7
parfor i = 1:10
>>

Answers (3)

Dimitrij Chudinzow 2017-6-29
My approach is to replace "parfor" with "for". This way you will find the line that causes trouble, but unfortunately it will take more time, since parallel computing is disabled for that particular loop.
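A related option that avoids editing the keyword back and forth is the optional second argument of parfor, which caps the number of workers; as far as I know, a value of 0 makes MATLAB run the loop body in the client session. The sketch below assumes that behaviour. The function name run_loop and its argument maxWorkers are made up for illustration and are not part of the original script.

```matlab
function r = run_loop(maxWorkers)
% run_loop  Hypothetical wrapper around the loop from the question.
%   maxWorkers = 0 is assumed to run every iteration in the client session,
%   which makes the loop behave much like a plain for-loop while debugging.
r = zeros(1, 10);
parfor (i = 1:10, maxWorkers)
    r(i) = rand(1, 1);
end
end
```

Calling run_loop(0) while investigating and run_loop(4) for normal runs (4 being an assumed pool size) keeps a single copy of the loop. Iteration order is still not guaranteed, and whether breakpoints directly inside the loop body are honoured may depend on the MATLAB release.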

Edric Ellis 2011-6-16
Firstly, there should be few differences between running your code containing PARFOR with MATLABPOOL closed and with MATLABPOOL open, except that with MATLABPOOL closed you can set breakpoints inside functions called from the body of the loop. I.e. if you have code like:
parfor ii=1:10
    x(ii) = myFcn(ii);
end
You can set breakpoints inside myFcn().
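As a rough sketch of that workflow, assuming MATLABPOOL is closed and that a file myFcn.m is on the path (its body is not shown in this thread, so the one in the comment is purely a stand-in):

```matlab
% Assumed content of myFcn.m (hypothetical stand-in):
%     function y = myFcn(ii)
%     y = ii^2;

dbstop in myFcn          % break at the first executable line of myFcn
x = zeros(1, 10);
parfor ii = 1:10         % pool closed, so iterations run in the client session
    x(ii) = myFcn(ii);   % execution pauses inside myFcn on each call
end
dbclear in myFcn         % remove the breakpoint when finished
```

At the K>> prompt, dbstack and dbcont behave as they would in serial code.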
Secondly, if you put your code inside a function rather than a script, you should get better diagnostics. I simplified your code a little:
function pfeg
try
    parfor i = 1:10
        if rand < 0.9
            error('My:Error','Try again');
        end
    end
catch le
    getReport( le )
end
and this now gets the output:
    Error using ==> parallel_function at 598
    Error in ==> pfeg>(parfor body) at 5
    Try again
    Error in ==> pfeg at 3
    parfor i = 1:10
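If you would rather see every failing iteration than stop at the first one, a variation on the same idea is to catch inside the loop body and keep the report for each iteration. This is only a sketch: pfeg_collect and reports are hypothetical names, not something from the code above.

```matlab
function reports = pfeg_collect
% pfeg_collect  Hypothetical variant of pfeg that records errors per iteration.
n = 10;
reports = cell(1, n);            % one slot per iteration, empty if it succeeded
parfor i = 1:n
    try
        if rand < 0.9
            error('My:Error', 'Try again');
        end
    catch err
        % Store the formatted report, including the stack where the error occurred.
        reports{i} = getReport(err);
    end
end
end
```

After the call, find(~cellfun(@isempty, reports)) lists the iterations that failed, and each non-empty cell holds the text that getReport produced for that iteration.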

Joan Puig 2011-6-16
It's true that the computational parts of the code generate the same errors with or without the matlabpool open, which is a good thing.
On the other hand, we have some "configuration" problems where, for example:
- The Java class path is not set correctly on the workers.
- The state of the data cache might be different on the different workers.
- Database connections on the workers might be in a different state.
- Datafeed connections on the workers might be in a different state.
All these situations are very hard to debug if we can't even find out what line of code is causing the problem.
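For the class path case, one way to see what each worker actually has is to collect a small snapshot from inside a parfor and compare it with the client. The following is a minimal sketch under assumptions: collectWorkerState and workerState are hypothetical helper names, only the dynamic Java class path is recorded, and getCurrentTask is used to tell workers apart (it returns empty on the client).

```matlab
function info = collectWorkerState(n)
% collectWorkerState  Hypothetical helper: gather n state snapshots from workers.
%   Using more iterations than workers makes it likely every worker reports.
info = cell(1, n);
parfor i = 1:n
    info{i} = workerState();
end
end

function s = workerState()
% Snapshot of the state visible to whichever process runs this iteration.
t = getCurrentTask();            % empty when running in the client session
if isempty(t)
    s.taskID = 0;                % 0 is used here to mean "client"
else
    s.taskID = t.ID;
end
s.dynamicJavaPath = javaclasspath('-dynamic');
end
```

Comparing each info{k}.dynamicJavaPath against javaclasspath('-dynamic') on the client shows which workers are missing entries. The same pattern could record other state, such as whether a database connection is open, by adding fields in workerState.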
1 comment
Edric Ellis 2011-6-17
Hi Joan, do you *not* get the line of code in the error stack when the parfor loop is inside a function body?
And yes, we only ensure that the MATLAB path is synchronised between client and workers; you must deal with any other setup that's required.
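For setup of that kind, one possibility is pctRunOnAll, which runs a command on the client and on every worker in the open pool. A minimal sketch, assuming a pool is already open; the JAR location is a placeholder, not a real path from this thread:

```matlab
% Assumed: MATLABPOOL is open. Replace the placeholder with the real JAR location.
pctRunOnAll javaaddpath('C:\placeholder\mylib.jar')
```

Anything that has to be initialised per worker (connections, caches) would still need its own setup code run in the same way.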
