parfor variable classification issue revisited

I have a million (literally) text files that I need to read a number from. I currently do this in a nested loop as such:
len_A = 5;
len_B = 6;
len_C = 7;
len_D = 8;
len_E = 9;
output = zeros(prod([len_A, len_B, len_C, len_D, len_E]), 6);
for ind_A = 1 : len_A
    for ind_B = 1 : len_B
        for ind_C = 1 : len_C
            for ind_D = 1 : len_D
                for ind_E = 1 : len_E
                    line_num = sub2ind([len_E, len_D, len_C, len_B, len_A], ind_E, ind_D, ind_C, ind_B, ind_A);
                    % Real Script
                    % open a file from the disk, read in a number
                    % output_temp(count, :) = [line_num, ind_A, ind_B, ind_C, ind_D, ind_E, the number from line above];
                    % Example Script
                    output(line_num, 1:6) = [line_num, ind_A, ind_B, ind_C, ind_D, ind_E];
                end
            end
        end
    end
end
This is time intensive. Since my disk and processor are not maxed out, I wanted to do this in parallel and speed it up. Based on: https://www.mathworks.com/matlabcentral/answers/838625-parfor-variable-classification-issue, I tried:
output = zeros(prod([5, 6, 7, 8, 9]), 6);
% output = zeros(1, 7);
parfor ind_A = 1 : 5
    output_temp = zeros(prod([6, 7, 8, 9]), 6);
    count = 0;
    for ind_B = 1 : 6
        for ind_C = 1 : 7
            for ind_D = 1 : 8
                for ind_E = 1 : 9
                    count = count + 1;
                    line_num = sub2ind([9, 8, 7, 6, 5], ind_E, ind_D, ind_C, ind_B, ind_A);
                    % Real Script
                    % open a file from the disk, read in a number
                    % output_temp(count, :) = [line_num, ind_A, ind_B, ind_C, ind_D, ind_E, the number from line above];
                    % Example Script
                    output_temp(count, 1:6) = [line_num, ind_A, ind_B, ind_C, ind_D, ind_E];
                end
            end
        end
    end
    max_line_num = sub2ind([9, 8, 7, 6, 5], 9, 8, 7, 6, ind_A);
    min_line_num = max_line_num - prod([9, 8, 7, 6, 1]) + 1;
    output(min_line_num : max_line_num, :) = output_temp;
end
I am unable to figure out how to make this work. I would truly appreciate any help you could provide.

Accepted Answer

Walter Roberson 2023-8-11
Create a multidimensional array, and run the parfor along one of its dimensions, preferably the last.
Within the parfor loop, use nested for loops and multidimensional indexing to assign values to a temporary array that is the right size except for having length 1 along the dimension you are running the parfor over. After you have assigned all the values to the temporary array, copy it into the output with a single statement:
output(:,:,:,:,INDEX, :) = output_temp;
If you need to, then after the parfor loop, reshape() to collapse those other dimensions.
It is important that there is only one place where you write into the output variable, and that each index used there is one of ":", an expression that is constant throughout the parfor, or a linear transform of the parfor variable. Using a computed range, as you are doing, is not permitted.
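As a minimal sketch of the pattern described above: the sizes, the variable n_vals, and the three-value payload are illustrative assumptions, not taken from the question. As far as I understand the documented rule for sliced variables, the parfor variable may appear in exactly one index, either on its own or offset by a constant, so the sketch uses it on its own in the last position.

```matlab
% Illustrative sizes (assumptions, smaller than the question's five loops)
len_A = 5;
len_B = 6;
len_C = 7;
n_vals = 3;   % number of values stored per index combination

% The dimension the parfor runs over (len_A) is placed last
output = zeros(len_B, len_C, n_vals, len_A);
parfor ind_A = 1 : len_A
    % Temporary array: same shape as output, minus the parfor dimension
    output_temp = zeros(len_B, len_C, n_vals);
    for ind_B = 1 : len_B
        for ind_C = 1 : len_C
            output_temp(ind_B, ind_C, :) = [ind_A, ind_B, ind_C];
        end
    end
    % The only write to output: every index is ":" or the parfor variable
    output(:, :, :, ind_A) = output_temp;
end

% Move the values dimension last, then collapse the others into rows
output = reshape(permute(output, [1, 2, 4, 3]), [], n_vals);
```

After the reshape, each row holds one [ind_A, ind_B, ind_C] combination, with ind_B varying fastest.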
2 Comments
Craig 2023-8-18
Edited: Craig 2023-8-18
By following Walter's suggestions, and after some work such as changing the parfor from Walter's recommendation of the last index to the first, this is what I finally got to work for me:
len_A = 5;
len_B = 6;
len_C = 7;
len_D = 8;
len_E = 9;
output = zeros(len_A, len_B, len_C, len_D, len_E, 6);
parfor ind_A = 1 : len_A
    output_temp = zeros(len_B, len_C, len_D, len_E, 6);
    for ind_B = 1 : len_B
        for ind_C = 1 : len_C
            for ind_D = 1 : len_D
                for ind_E = 1 : len_E
                    line_num = sub2ind([len_E, len_D, len_C, len_B, len_A], ind_E, ind_D, ind_C, ind_B, ind_A);
                    % Real Script
                    % open a file from the disk, read in a number
                    % output_temp(count, :) = [line_num, ind_A, ind_B, ind_C, ind_D, ind_E, the number from line above];
                    % Example Script
                    output_temp(ind_B, ind_C, ind_D, ind_E, 1:6) = [line_num, ind_A, ind_B, ind_C, ind_D, ind_E];
                end
            end
        end
    end
    output(ind_A, :, :, :, :, :) = output_temp;
end
output = reshape(output, prod([len_A, len_B, len_C, len_D, len_E]), 6);
output = sortrows(output, 1);
Walter Roberson 2023-8-18
The reason I suggested running the parfor over the last dimension instead of the first is the way multidimensional arrays are stored: any leading ":" dimensions are stored in consecutive memory. So if you had A(:,:,idx), then A(1:end,1:end,idx) would be stored in consecutive memory. But if you had A(idx,:,:), then the pieces of data would be size(A,1) elements apart from each other in memory, which is not as efficient to transfer as consecutive memory.
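The storage argument can be checked with linear indices. This is only a small illustration; the array size and the names sz, lin_last, and lin_first are assumptions made for the example.

```matlab
sz = [3, 4, 5];   % size of a hypothetical array A
idx = 2;          % the slice being selected

% Linear indices of A(:,:,idx), listed in storage order
[r, c] = ndgrid(1:sz(1), 1:sz(2));
lin_last = sub2ind(sz, r(:), c(:), repmat(idx, numel(r), 1));

% Linear indices of A(idx,:,:), listed in storage order
[c2, p2] = ndgrid(1:sz(2), 1:sz(3));
lin_first = sub2ind(sz, repmat(idx, numel(c2), 1), c2(:), p2(:));

% Spacing between neighbouring elements of each slice
disp(unique(diff(lin_last)).');    % consecutive elements: spacing of 1
disp(unique(diff(lin_first)).');   % spacing of size(A,1), here 3
```

The slice along the last dimension is one contiguous block, while the slice along the first dimension is scattered through the whole array at a stride of size(A,1).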


More Answers (1)

Jeff Miller 2023-8-16
Edited: Jeff Miller 2023-8-18
Maybe something like this would be helpful, using the wonderful allcomb.
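idx = allcomb(1:5,1:6,1:7,1:8,1:9);
nrows = size(idx,1);
output = zeros(nrows,6);
parfor ind_row = 1:nrows
    idx_A = idx(ind_row,1);
    idx_B = idx(ind_row,2);
    idx_C = idx(ind_row,3);
    idx_D = idx(ind_row,4);
    idx_E = idx(ind_row,5);
    % yourActualFn stands in for the real per-file work and returns one number
    result = yourActualFn(idx_A,idx_B,idx_C,idx_D,idx_E);
    % Store this row's five indices (not idx(1:5), which reads down column 1)
    output(ind_row,:) = [idx_A, idx_B, idx_C, idx_D, idx_E, result];
end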
2 Comments
Craig 2023-8-18
Thanks for the reply Jeff. This might allow the calculation of the "line_num", but I don't see how it would allow me to do all the other work in the real script.
Jeff Miller 2023-8-18
@Craig, Glad you got the problem solved.
Just for future reference, I edited the script to make it clearer what I thought you might do. Could be that I don't understand what other work you want to do in the real script, though.
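A variation on the same single-loop idea, not posted in the thread, uses the built-in ind2sub in place of allcomb to recover the five indices from the row number. It is a sketch only: the name dims is an assumption, and the per-file work is left as a comment because the real script is not shown.

```matlab
dims = [9, 8, 7, 6, 5];   % [len_E, len_D, len_C, len_B, len_A], as in the question
nrows = prod(dims);
output = zeros(nrows, 6);

parfor line_num = 1 : nrows
    % Invert the sub2ind call from the question to get the five loop indices
    [ind_E, ind_D, ind_C, ind_B, ind_A] = ind2sub(dims, line_num);

    % Real script (not shown in the thread): open the file that belongs to
    % these indices and read the number from it here.

    % The first index is the parfor variable itself, the second is ":"
    output(line_num, :) = [line_num, ind_A, ind_B, ind_C, ind_D, ind_E];
end
```

Because the row number is the loop variable, the rows come out already ordered by line_num and no sortrows call is needed afterwards.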

Release

R2023a