Parfor with Iterations Having Unequal Execution Times
I've run into an interesting problem with parfor...
I'm trying to use parfor to evaluate a bunch of different input cases of a program simultaneously. I know beforehand that the cases are not going to take the same amount of time to evaluate, and I can tell you with reasonable accuracy which ones are going to take the longest. The number of cases that each worker is running is very low (about 40 total cases being distributed to 12 workers).
Some of these cases will take as much as three or four times longer than the faster ones, and so a lot of user time could be saved if I could predict what order the parfor loop was going to execute in and try to distribute the jobs intelligently.
The simple example would be this: say I have four jobs to run on two workers, three of which will take 10 seconds and one of which will take 30 seconds. Obviously the ideal solution is for one worker to evaluate the long job and the other worker to evaluate the other three in the same amount of time. What tends to happen, though, is that both workers will evaluate one of the short jobs first, and then one of them will wind up with 40 seconds of working time while the other worker sits idle for half that time.
Even more annoying, I think that sometimes there will be an available job sitting in the "queue" with a worker available, but it won't run because that job is supposed to go to a worker that is busy. The example here would be two 10-second jobs and two 30-second jobs, and one worker winds up evaluating both 30-second jobs.
Anyone have any clever ideas of how to work around this problem?
Accepted Answer
Titus Edelhofer
2011-12-7
Hi,
If your cases differ only in their inputs, you might try the (slightly more elaborate) approach of distributing the work as one job with 40 tasks. That helps at least with the distribution, because tasks are handed out to workers one by one. You might still end up with the longest run being the last in the row to be computed.
One step further would be to create more than one job, distributing your tasks among them more cleverly.
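A minimal sketch of the task-based approach (using the current Parallel Computing Toolbox job API; evaluateCase is a stand-in for your own case function, and a cluster profile is assumed to be configured):

```matlab
% Distribute 40 independent cases as 40 tasks in a single job.
% Tasks are handed to workers one at a time, so a worker that
% finishes early automatically picks up the next pending task.
% Creating the tasks in descending order of expected runtime
% avoids the longest case starting last.
c = parcluster();                       % default cluster profile
job = createJob(c);
for k = 1:40
    createTask(job, @evaluateCase, 1, {k});   % evaluateCase: your function
end
submit(job);
wait(job);
results = fetchOutputs(job);            % 40x1 cell array of outputs
delete(job);
```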
Titus
More Answers (4)
Jaap
2011-12-7
Is it possible to make an inner loop that goes through the faster cases?
Or: if you know everything beforehand, use two parfor loops. In the first, run all the short/fast cases (hoping that the number of fast cases divides evenly over the 12 workers, i.e. rem(numel(fastIdx),12)==0). In the second loop, run the rest in the same fashion.
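A sketch of that two-loop idea (hypothetical names: caseTime is your per-case runtime estimate, evaluateCase is your own function; the temporary cell arrays are needed because parfor only allows slicing by the loop variable):

```matlab
% Partition cases by predicted cost, then run each group in its own parfor.
isFast  = caseTime < median(caseTime);
fastIdx = find(isFast);
slowIdx = find(~isFast);
results = cell(numel(caseTime), 1);

tmpSlow = cell(numel(slowIdx), 1);
parfor k = 1:numel(slowIdx)         % long cases in their own loop
    tmpSlow{k} = evaluateCase(slowIdx(k));
end
results(slowIdx) = tmpSlow;

tmpFast = cell(numel(fastIdx), 1);
parfor k = 1:numel(fastIdx)         % short cases fill in the remainder
    tmpFast{k} = evaluateCase(fastIdx(k));
end
results(fastIdx) = tmpFast;
```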
Titus Edelhofer
2011-12-7
Hi,
here is a rather ugly way that gives you full control over which worker processes which index:
function myparfortest
% this is a usual parfor loop
x = zeros(200, 1);
parfor i = 1:200
    x(i) = rank(magic(i));
end
% use spmd to do the same, with explicit control over the assignment
spmd
    y = zeros(200, 1);
    % suppose we have 3 workers:
    %   worker 1 processes i = 1, 4, 7, ...
    %   worker 2 processes i = 2, 5, 8, ...
    %   worker 3 processes i = 3, 6, 9, ...
    for i = labindex:numlabs:200
        y(i) = rank(magic(i));
    end
end
% now collect the results: after spmd, y is a Composite, so y{i} is
% worker i's local copy and length(y) is the number of workers
Y = zeros(200, 1);
for i = 1:length(y)
    yi = y{i};
    Y(i:length(y):end) = yi(i:length(y):end);
end
% let's see if we did it right:
isequal(x, Y)
Titus
Mikheil Azatov
2016-12-16
A bit late to the party, but have you figured this out since? I'm having the same issue with R2016b.
Martin Ryba
2021-4-7
Hi, I've noticed similar behavior all the way up to R2020a. I have a big parfor loop where each iteration takes on the order of 10 minutes, and I need to do 5000+ of them with 100-150 workers. The loop gets submitted to an LSF cluster, so if other jobs share a node, that node slows down. It appears the work allocation is frozen at the start of the loop, and you end up waiting for the slowest worker to finish its list of iterations before the loop completes. Any good workarounds to get a more dynamic queue, since the cost of managing the queue is small compared to the duration of each task?
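One approach that gives dynamic dispatch is parfeval, which queues each case individually and hands work to whichever worker frees up next, unlike parfor's up-front slicing. A minimal sketch (evaluateCase is a stand-in for your own function; an open parallel pool is assumed):

```matlab
% Queue every case as its own future; workers pull from the queue
% as they become free, so slow nodes simply take fewer cases.
pool = gcp();
N = 5000;
f(1:N) = parallel.FevalFuture;          % preallocate the future array
for k = 1:N
    f(k) = parfeval(pool, @evaluateCase, 1, k);
end
results = cell(N, 1);
for k = 1:N
    [idx, out] = fetchNext(f);          % returns as soon as any future finishes
    results{idx} = out;
end
```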