Problem using parfor for reading variable sized chunks of data into a larger pre-allocated container

Hi,
I have a problem: I have pre-allocated a large matrix or vector, a, and I read data blocks from a large number of files, inserting each block at the right indices in a. The files, and the resulting blocks, typically have different sizes.
A simple example:
a = zeros(15,1); % pre-allocated vector
b = [1,10;11,15]; % each row contains the start to stop index for each block
parfor i = 1:size(b,1)
a(b(i,1):b(i,2),:) = i*ones((b(i,2)-b(i,1))+1,1);
end
With 'for' instead of 'parfor' it works as intended.
Are there any tips or solutions that don't sacrifice the performance I am trying to gain by using parfor in the first place?
Thanks,
Oyvind

Accepted Answer

Edric Ellis 2020-10-21
Edited: Edric Ellis 2020-10-21
There's no simple way to do this without at least some duplication of data. With some duplication of data, you could do something simple like this:
aCell = cell(1, size(b,1));
parfor i = 1:size(b,1)
aCell{i} = <stuff>; % return each block in its entirety
end
a = vertcat(aCell{:}); % concatenate all cell entries into the final result
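For reference, here is a minimal runnable sketch of that cell-array pattern using the example from the question. parfor cannot slice a by a loop-dependent range such as b(i,1):b(i,2), which is why each iteration returns its whole block instead. The block contents (i*ones(...)) are just the placeholder from the question standing in for `<stuff>`, and the final vertcat assumes the rows of b describe contiguous blocks listed in order.

```matlab
b = [1,10; 11,15];            % each row: start and stop index of one block
nBlocks = size(b,1);
aCell = cell(nBlocks,1);      % one cell per block; blocks may differ in size

parfor i = 1:nBlocks
    n = b(i,2) - b(i,1) + 1;  % length of block i
    aCell{i} = i*ones(n,1);   % placeholder for the data read for block i
end

% Assumes the blocks are contiguous and in order, so plain concatenation
% reproduces the layout of the pre-allocated vector.
a = vertcat(aCell{:});
```

If the blocks are not contiguous or not in order, a short serial loop over aCell that assigns a(b(i,1):b(i,2)) = aCell{i} after the parfor would be needed instead of vertcat.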
If that is not sufficiently performant, you could consider using parfeval to give you a little more control, but this is more difficult to code, and may not actually save you much. Here's an untested sketch though:
a = zeros(15,1);
for i = 1:size(b,1)
fut(i) = parfeval(@doStuff, 1, b(i,1), b(i,2)); % invoke doStuff(b(i,1),b(i,2))
end
for i = 1:size(b,1)
[idx, result] = fetchNext(fut); % collect the next result
% (note that 'idx' tells you the index into 'fut' that just
% completed)
a(b(idx,1):b(idx,2),:) = result; % push the result into 'a'
end
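A self-contained version of that sketch, again using the example data from the question, might look like the following. makeBlock is a hypothetical stand-in for doStuff (here an anonymous function that returns a block of constants), and the pre-allocation of fut follows the pattern used in the comment below. Treat it as a sketch under those assumptions rather than a tuned implementation.

```matlab
b = [1,10; 11,15];                 % each row: start and stop index of one block
nBlocks = size(b,1);
a = zeros(15,1);                   % pre-allocated result

% Hypothetical stand-in for doStuff: returns n copies of the block number k.
makeBlock = @(k, n) k*ones(n,1);

fut(1:nBlocks) = parallel.FevalFuture;   % pre-allocate the array of futures
for i = 1:nBlocks
    n = b(i,2) - b(i,1) + 1;
    fut(i) = parfeval(makeBlock, 1, i, n);   % request one output per call
end

for i = 1:nBlocks
    [idx, result] = fetchNext(fut);     % idx: which future just completed
    a(b(idx,1):b(idx,2),:) = result;    % insert the block at its own indices
end
```

Because the insertion happens on the client using idx, the blocks can complete in any order and do not need to be contiguous.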
4 Comments
Oyvind Heg 2020-10-23
Thank you for the answer.
I've done a quick profiler comparison:
parfor:
1 result_cell = cell(N,1);
2 parfor i = 1:N
3 result_cell{i} = readData(...);
4 end
parfeval:
5 fut(1:N) = parallel.FevalFuture;
6 for i = 1:N
7 fut(i) = parfeval(@readData,...)
8 end
9 for i = 1:N
10 [idx,result] = fetchNext(fut);
11 end
My observations when running the profiler are:
The loops starting on line 2 (parfor) and line 9 (for) take about the same amount of time (about 12 seconds in my example).
However, line 7 alone takes as long as 8 seconds. Is that to be expected? It seems like a lot of overhead when the actual work takes 12 seconds.
Thanks,
Oyvind

Release

R2020a
