Data Limit for Parallelisation and detailed questions about parallel scheduling
Dear all,
I've run into a strange problem with a parallelisation task and have not found a solution yet. I use the spmd environment and codistribute a large array within it. I don't know beforehand how large the solution array will be, so I over-estimate its size. I also create the solution array inside the spmd block so that I can use the results as a Composite after spmd has finished.
Here is what I don't understand: I can pass the data to the workers, codistribute the input array and create the solution array within each worker. No problems there. But as soon as I introduce the code that calculates the values for the solution array, MATLAB throws an "error in distcompserialize, Error during serialisation". I find this strange because the amount of data passed into the workers does not change. I thought for some time that the 2GB limit might also apply to data created within the worker, but that wouldn't explain why I can pass the data to the workers and create the solution array.
Code that works (spmd-block only):
spmd
disp('Starting Lab...');
d_dist = codistributed(EoraCoDistCalc, codistributor1d(1));
d_local = getLocalPart(d_dist);
disp(['local size is ' num2str(size(d_local))]);
transferG = zeros(100000000,5); % This array should hold the results
line = 1;
% Calculation part is commented out.
%for i=1:size(d_local,1) %Loop over S1 elements
%
% % store Eora row and col locally to avoid multiple look ups of those
% % coordinates in the pre and post conc matrices
%
% EoraRow = d_local(i,2); EoraCol = d_local(i,3);
%
% % find destination coordinates in CREEA for current elements in Eora
%
% CREEArowsubs = find(PreConc(EoraRow,:)); % find those values that are non-zero in that row (i.e. the ones that the element corresponds to)
% CREEAcolsubs = find(PostConc(EoraCol,:));
%
% CREEArowvals = PreConc(EoraRow,CREEArowsubs);
% CREEAcolvals = PostConc(EoraCol,CREEAcolsubs);
%
% Nrow = length(CREEArowsubs); Ncol = length(CREEAcolsubs);
%
% NValues = Nrow*Ncol;
%
% transferG(line:line+NValues-1,:) = [repmat(d_local(i,1),NValues,1) repmat(d_local(i,4),NValues,1) repmat(CREEArowsubs',Ncol,1) reshape(repmat(CREEAcolsubs, Nrow, 1),NValues,1) reshape(CREEArowvals'*CREEAcolvals,NValues,1)];
% line = line+NValues;
%
% end
%
% transferG = transferG(1:line-1,:); % line points to the next free row
end
Code that does not work:
spmd
disp('Starting Lab...');
d_dist = codistributed(EoraCoDistCalc, codistributor1d(1));
d_local = getLocalPart(d_dist);
disp(['local size is ' num2str(size(d_local))]);
transferG = zeros(100000000,5);
line = 1;
for i=1:size(d_local,1) %Loop over S1 elements
% store Eora row and col locally to avoid multiple look ups of those
% coordinates in the pre and post conc matrices
EoraRow = d_local(i,2); EoraCol = d_local(i,3);
% find destination coordinates in CREEA for current elements in Eora
CREEArowsubs = find(PreConc(EoraRow,:)); % find those values that are non-zero in that row (i.e. the ones that the element corresponds to)
CREEAcolsubs = find(PostConc(EoraCol,:));
CREEArowvals = PreConc(EoraRow,CREEArowsubs);
CREEAcolvals = PostConc(EoraCol,CREEAcolsubs);
Nrow = length(CREEArowsubs); Ncol = length(CREEAcolsubs);
NValues = Nrow*Ncol;
transferG(line:line+NValues-1,:) = [repmat(d_local(i,1),NValues,1) repmat(d_local(i,4),NValues,1) repmat(CREEArowsubs',Ncol,1) reshape(repmat(CREEAcolsubs, Nrow, 1),NValues,1) reshape(CREEArowvals'*CREEAcolvals,NValues,1)];
line = line+NValues;
end
transferG = transferG(1:line-1,:); % line points to the next free row, so the last filled row is line-1
end
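For scale, the preallocated result array alone is large: zeros(100000000,5) holds 5e8 doubles at 8 bytes each, which is 4e9 bytes (about 4 GB) on every worker. A minimal sketch of that arithmetic follows; numWorkers is a hypothetical pool size, not a value taken from the post.

```matlab
% Memory needed for the preallocated result array inside the spmd block.
nRows = 100000000;          % rows in zeros(100000000,5)
nCols = 5;
bytesPerDouble = 8;         % zeros() returns double by default
bytesPerWorker = nRows * nCols * bytesPerDouble;

numWorkers = 4;             % hypothetical pool size, adjust to your setup
bytesTotal = bytesPerWorker * numWorkers;

fprintf('Per worker: %.1f GB, across %d workers: %.1f GB\n', ...
    bytesPerWorker/1e9, numWorkers, bytesTotal/1e9);
```

If the workers run on one machine, all of these copies have to fit into that machine's memory at the same time.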
I have asked a few people who have worked with parallel MATLAB environments before, and they said they don't know what's going on. My feeling is that there are details of memory usage in parallel code sections that I am not aware of. It would be great if anybody could point me in the right direction.
Thank you,
Arne
1 Comment
Edric Ellis
2014-2-5
It would be very helpful if you could reduce your problem to a simple, correct, self-contained example so that we can run it and see exactly what the problem is.
Also, what version of MATLAB/PCT are you using? What OS are you using?
Note that the client/worker transfer limit was increased beyond 2GB in R2013a.
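One difference between the two versions that may be worth checking: with the calculation commented out, PreConc and PostConc appear only in comments, so the spmd body never references them. In the failing version it does, and variables referenced inside an spmd block are sent from the client to the workers. A rough way to see how large those variables are on the client is sketched below. The data here are small hypothetical stand-ins (replace them with the real variables), and the 2 GB threshold is the pre-R2013a limit mentioned above.

```matlab
% Hypothetical stand-ins for the client-side variables used in the spmd block.
EoraCoDistCalc = rand(1000, 4);
PreConc  = sprand(50, 50, 0.1);
PostConc = sprand(50, 50, 0.1);

limitBytes = 2^31;   % 2 GB client/worker transfer limit before R2013a

% whos reports the in-memory size of each named variable in bytes.
info = whos('EoraCoDistCalc', 'PreConc', 'PostConc');
for k = 1:numel(info)
    fprintf('%-16s %12d bytes   at or over limit: %d\n', ...
        info(k).name, info(k).bytes, info(k).bytes >= limitBytes);
end
totalBytes = sum([info.bytes]);
fprintf('Total referenced data: %d bytes\n', totalBytes);
```

If any single variable turns out to be near or above the limit on an older release, that would be consistent with a serialisation error appearing only once the calculation code references it. This is a guess to test, not a confirmed diagnosis.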
Answers (0)