Avoid sending a large array to all workers in a parfor loop

Hello --
I'm processing some fairly large point clouds, stored as 8-column tables with billions of rows (as datastores/tall arrays, but that's probably not important here).
In my workflow I load a reasonable chunk of the point cloud (~3.5 GB) into memory and ultimately generate an image from the data. The image is 37x44 pixels, which means I'm indexing into this point cloud 37 times row-wise and 44 times column-wise. This is a very parallelizable task, and I am running the outer loop with parfor. However, I frequently error out because workers abort. The workers seem to abort when my memory use hits my limit (32 GB).
I think my problem is obvious when you look at my code below, but I'm not sure of the best way to fix it. Note below where pcl_line is subset from pcl. I'm assuming that pcl (my 3.5 GB variable) is still being sent to every worker, which seems bad. How can I avoid this, though? Is this a job for C = parallel.pool.Constant(pcl)? Seems promising, but my knowledge here is a bit shaky. If not, other thoughts? -- Thanks much, Mike
%set up holders for image outputs
tot_skew = NaN(n_num_px, e_num_px); %just two example output images of many
tot_kurt = NaN(n_num_px, e_num_px);
parfor ii = 1:length(n_chunk_bounds)-1
    %temporary vars by row
    temp_skew = NaN(1,e_num_px);
    temp_kurt = NaN(1,e_num_px);
    %slice pcl by line (1/37th size of pcl since there are 37 rows in output image)
    sub_idx_n = find(pcl.n<n_chunk_bounds(ii) & pcl.n>=n_chunk_bounds(ii+1));
    pcl_line = pcl(sub_idx_n,:); %<--guessing this is the problem since pcl is still inside the parfor loop?
    %run code for each output pixel for a given image line
    for j = 1:length(e_chunk_bounds)-1
        sub_idx = find(pcl_line.e>=e_chunk_bounds(j) & pcl_line.e<e_chunk_bounds(j+1));
        temp_skew(j) = skewness(pcl_line.h(sub_idx));
        temp_kurt(j) = kurtosis(pcl_line.h(sub_idx));
    end
    %final assignment
    tot_skew(ii,:) = temp_skew;
    tot_kurt(ii,:) = temp_kurt;
end

4 Comments

MATLAB does not examine the values in n_chunk_bounds and pcl.n to work out ahead of time which subsection of pcl any given worker will need, so it has to send all of pcl to every worker.
Is it feasible to replace that code by indexing that is a linear calculation based upon ii?
If not, then extract the chunks outside the parfor loop into a cell array, and index the cell array inside the loop.
Hi -- I'm not sure I quite understand the suggestion: "Is it feasible to replace that code by indexing that is linear calculation based upon ii?" Could you explain this a bit more?
With regard to the cell array, does this imply that if I pre-split pcl into its chunks in a cell array that can be indexed by ii, only the ii'th chunk will be sent to a worker rather than the full pcl_cell?
You are currently breaking up the area based on values stored in the vector n_chunk_bounds, indexed at ii. MATLAB does not look back and analyze how those bounds were created, but you can. For example, it could hypothetically be the case that n_chunk_bounds(ii) = some_minimum + some_integer_stride * ii, a linear equation. If so, then even though computing n_chunk_bounds ahead of time would seem more efficient, MATLAB would find it easier to analyze which portions of pcl are needed if n_chunk_bounds(ii) were replaced with some_minimum + some_integer_stride * ii inside the parfor. With proper formulation, it could then easily figure out that it should send a some_integer_stride-wide portion to each worker.
This can only work if the chunks to be extracted are of consistent size.
Otherwise you should break them up ahead of time into cell arrays, as parfor does know to send only the memory associated with the content of the indexed cell to the worker.
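For the consistent-size case, one arrangement parfor can slice is a matrix with one chunk per column, indexed directly by the loop variable. A rough sketch, assuming the points are already sorted by n and every image row holds exactly rows_per_chunk of them (which point clouds binned by coordinate usually will not). The names h, h_by_chunk, rows_per_chunk and row_mean are illustrative only, and mean stands in for the per-row statistics.

```matlab
% Hypothetical data: every chunk holds exactly rows_per_chunk points.
rng(0);
rows_per_chunk = 1000;
n_chunks = 37;
h = randn(rows_per_chunk * n_chunks, 1);   % stand-in for pcl.h, assumed sorted by n

% One chunk per column, so the loop variable appears directly as a subscript.
h_by_chunk = reshape(h, rows_per_chunk, n_chunks);

row_mean = NaN(n_chunks, 1);
parfor ii = 1:n_chunks
    h_line = h_by_chunk(:, ii);   % sliced: iteration ii only needs column ii
    row_mean(ii) = mean(h_line);  % placeholder for the real per-row work
end
```

If the chunks differ in length, this reshape is not possible and the cell-array route is the one to use.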
Ok, I get that and think that's likely possible. For the moment, I went ahead and tried the cell suggestion. This seems to be working well. I'd consider this answered. Thanks for the ideas.
Mike

Accepted Answer

[...] extract the chunks outside the parfor loop, into a cell array, and index the cell array inside the loop.
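A minimal sketch of the cell-array approach, using stand-in data. The random pcl table, the bounds 37:-1:0 and 0:44, and the names pcl_lines, n_rows, n_cols, in_row and in_px are made up for illustration and are not from the original post. skewness needs the Statistics and Machine Learning Toolbox, as in the question.

```matlab
% Hypothetical stand-in for the real point cloud (columns n, e, h only).
rng(0);
num_pts = 1e5;
pcl = table(rand(num_pts,1)*37, rand(num_pts,1)*44, randn(num_pts,1), ...
    'VariableNames', {'n','e','h'});
n_chunk_bounds = 37:-1:0;   % descending, matching the comparisons in the question
e_chunk_bounds = 0:44;      % ascending
n_rows = numel(n_chunk_bounds) - 1;
n_cols = numel(e_chunk_bounds) - 1;

% Split pcl by image row once, on the client, before the parfor loop.
pcl_lines = cell(n_rows, 1);
for ii = 1:n_rows
    in_row = pcl.n < n_chunk_bounds(ii) & pcl.n >= n_chunk_bounds(ii+1);
    pcl_lines{ii} = pcl(in_row, {'e','h'});   % keep only the columns the loop uses
end

tot_skew = NaN(n_rows, n_cols);
parfor ii = 1:n_rows
    pcl_line = pcl_lines{ii};   % sliced: iteration ii only needs cell ii
    temp_skew = NaN(1, n_cols);
    for j = 1:n_cols
        in_px = pcl_line.e >= e_chunk_bounds(j) & pcl_line.e < e_chunk_bounds(j+1);
        temp_skew(j) = skewness(pcl_line.h(in_px));
    end
    tot_skew(ii,:) = temp_skew;
end
```

Because pcl_lines appears in the loop only as pcl_lines{ii}, parfor should treat it as a sliced variable, so each worker receives just the cells for its own iterations. The full pcl table is never referenced inside the loop and so is not broadcast.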

Release

R2018a
