Parfor hangs during execution

How can I solve this problem, where parfor freezes during execution?
Operation terminated by user during distcomp.remoteparfor/getCompleteIntervals (line 225)
In parallel_function>distributed_execution (line 823)
[tags, out] = P.getCompleteIntervals(chunkSize);
In parallel_function (line 590)
R = distributed_execution(...

11 Comments

You'll need to provide us with the parfor code you're running. This error message only says that you intentionally pressed Ctrl+C to stop the code.
The parfor I am running is the same one that worked fine yesterday. The code is as follows:
for it = 1:something
    % some statement
    parfor index = 1:nSeedPopulation
        % Seed point passed to evolution
        seedPoint = seedPopulation.Value.Individuals(index);
        % Number of flies that will be reproduced
        nCrossover = 2*round(constant.crossoverProbability.Value*nFliesPopulation/2);
        % Evolution of seed
        [fliesPopulation(index), spamPopulation(index)] = Algorithms.TeMA.evolution(...
            seedPoint, fliesPopulation(index), collectPopulation(index),...
            nCrossover, primary, sizeOfRing, constant);
    end
end
How long does this take without parfor? Use profile to figure out the bottleneck https://www.mathworks.com/help/matlab/ref/profile.html. Debugging in parfor is hard because you can't stop the code at the slowest step, and profiler doesn't time each function inside parfor.
>>profile on
index = 1;
seedPoint = seedPopulation.Value.Individuals(index);
nCrossover=2*round(constant.crossoverProbability.Value*nFliesPopulation/2);
[fliesPopulation(index), spamPopulation(index)] = Algorithms.TeMA.evolution(...
seedPoint, fliesPopulation(index), collectPopulation(index),...
nCrossover, primary, sizeOfRing, constant);
>>profview
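Since the profiler does not break down time spent inside parfor, one alternative is to time each iteration with tic/toc and return the timings as a sliced output. The following is only a minimal sketch under that assumption: the workload inside the loop is a placeholder, not the poster's Algorithms.TeMA.evolution, and nIter is a hypothetical loop size.

```matlab
% Minimal sketch: record the wall time of each parfor iteration.
% The workload below is a placeholder, not the original function.
nIter   = 8;                       % hypothetical number of iterations
elapsed = zeros(1, nIter);         % sliced output: seconds per iteration
result  = zeros(1, nIter);         % sliced output: placeholder results

parfor index = 1:nIter
    t = tic;                                   % timer local to this iteration
    result(index)  = sum(sqrt(1:1e5*index));   % placeholder workload
    elapsed(index) = toc(t);                   % time spent in this iteration
end

[worst, where] = max(elapsed);
fprintf('Slowest iteration: %d (%.4f s)\n', where, worst);
```

If one iteration index is consistently much slower than the others, that iteration can then be run on its own in the client under the profiler.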
I have some more information: the first ~7/10 iterations of the outer for loop work correctly and very fast; after that it becomes very slow, like a normal for loop. Each iteration of the for loop takes 3/4 secs; after the freeze, each iteration runs for more than 1 min.
This is the output of profview during parfor. I didn't realize that you wanted only the execution of the function.
This is the output of a single run of the function.
Hm... This seems to have happened to multiple people. Is any function generating or passing on large data sets (> 2GB)?
Seemed like breaking up the data into smaller chunks helped one person.
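The thread does not say how that person split the data, so the following is only a sketch of one possible approach: cut a large array into a cell array of chunks so that each parfor iteration receives just its own slice. The data, chunkSize, and the per-chunk sum are all hypothetical placeholders.

```matlab
% Minimal sketch: process a large array in smaller chunks inside parfor.
data      = rand(1, 1e6);          % stand-in for a large data set
chunkSize = 1e5;                   % hypothetical chunk size
nChunks   = ceil(numel(data) / chunkSize);

% Split the data into a cell array so that it can be sliced by the loop index.
chunks = cell(1, nChunks);
for c = 1:nChunks
    first = (c - 1) * chunkSize + 1;
    last  = min(c * chunkSize, numel(data));
    chunks{c} = data(first:last);
end

partial = zeros(1, nChunks);
parfor c = 1:nChunks
    partial(c) = sum(chunks{c});   % each iteration only needs its own chunk
end
total = sum(partial);
```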
But why does the code sometimes work correctly and sometimes not?
That's hard to say. Do you have a random number generator in the software? Perhaps memory usage differs, depending on what other software is running at the same time. There's also a job timing issue going on in the background, which always changes between runs depending on the jobs the CPU currently has to run. Also, you could get a deadlock situation randomly, if multiple workers are trying to read/write to the same file at the same time. Are you doing some sort of read/write operation?
https://en.wikipedia.org/wiki/Deadlock Deadlocks are an issue that arises in concurrent and parallel computing.
I know what a deadlock is, but there aren't any. My code works fine for the first 10 iterations and then hangs. Two days ago everything worked fine; yesterday it did not. Now I'm checking what the situation is today.
I'm trying to install a previous version of MATLAB, maybe 2017a/b. I am really disappointed by MATLAB's behavior.


Answers (6)

I am having this same issue for the last couple of days.
Gergely Papp 2019-7-2

1 vote

I have encountered the same issue, with both R2017b and R2019a. The calculations launch in the parallel workers, but along the way the workers hang one by one. They would hang for days if I didn't kill them manually.
Andrea Stevanato 2018-7-16

0 votes

Is it possible to restart all workers without delete(gcp)?
Jayaram Theegala 2018-7-16
Edited: Jayaram Theegala 2018-7-16

0 votes

Hello Andrea,
To understand the issue better, can you try reducing the number of iterations of your first loop by setting the "it" variable used in your first "for" loop to 1? It may also be helpful to start your pool with just 1 worker and see whether the issue persists.
If the issue still happens after the above changes, please provide the following information:
1) Brief information about the function used within the parfor loop: Algorithms.TeMA.evolution
2) Does the issue happen without the above function?
3) Finally, it is generally advisable for the outermost loop to be the parfor loop and for the inner loops to be for loops. https://in.mathworks.com/help/distcomp/nested-parfor-loops-and-for-loops.html
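As a complement to the single-worker test suggested above, parfor also accepts a maximum number of workers as a second argument, and a value of 0 runs the loop body in the client without a pool. The sketch below uses a trivial placeholder body, not the poster's code, and maxWorkers is a name chosen here for illustration; it may help separate a bug in the loop body from a problem with the workers.

```matlab
% Minimal sketch: run a parfor body in the client by allowing zero workers.
nIter      = 5;                    % hypothetical number of iterations
squares    = zeros(1, nIter);      % sliced output variable
maxWorkers = 0;                    % 0 = run in the client; use 1 for a single worker

parfor (index = 1:nIter, maxWorkers)
    squares(index) = index^2;      % placeholder for the real loop body
end
disp(squares)
```

If the loop completes reliably with maxWorkers = 0 but hangs with a pool, the loop body itself is less likely to be the cause.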
I am having a similar problem. A simple demo code is shown below, which enters an infinite loop within the parfor. This is on a 2-core MacBook Pro with a 3.3 GHz Intel i7 processor, running MATLAB R2017b, but the same code seems to cause problems on a machine running R2019a.
k = 1e10;
tic
for i = 1:k
    tan(i);
end
toc
parpool(2);
tic
parfor i = 1:k
    tan(i);
end
toc

2 Comments

Are there any updates? I am encountering the same issue with both MATLAB 2019a and MATLAB 2020b. I am using particle swarm with the parallel setting; the optimization hangs partway through and won't continue until I restart the system, which is very annoying.
Same for me: the particle swarm function running in parallel on MATLAB 2020b, on a MacBook with 6 cores.


I was stuck on this problem for a couple of weeks, and fixed it by removing a "continue" statement from an if block inside a for loop.
for CondA
    ...
    if CondB
        continue; % Avoid using "continue"
    end
    ...
end
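One way to drop the "continue" without changing what the loop computes is to wrap the rest of the body in the negated condition. This is only a sketch: the condition and the loop body are hypothetical placeholders, and the thread does not establish why removing "continue" helped in this poster's case.

```matlab
% Minimal sketch: skip iterations without "continue" by negating the condition.
nIter = 10;                        % hypothetical number of iterations
kept  = zeros(1, nIter);           % sliced output; skipped entries stay 0

parfor k = 1:nIter
    skip = mod(k, 2) == 0;         % hypothetical stand-in for CondB
    if ~skip
        kept(k) = k^2;             % work that previously followed "continue"
    end
end
disp(kept)
```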
