Why does code using parallel processing take longer to run than the serial version?
Hi, I wrote this simple code using parallel processing to see the effect of using two cores, so that I can later use parallel processing with more complex code. But I found that the same code runs faster without parallel processing. I thought that with two cores, each core would process a part of the dz matrix at the same time, but it seems they do not. Why?
matlabpool open local 2
tic
x=magic(100);
y=magic(100);
x1 = x(1:5,:)+y(1:5,:);
y1 = x(6:10,:)+y(6:10,:);
dx = distributed(x1);
dy = distributed(y1);
dz = [dx ; dy];
spmd
dz
end
toc
This is the code without parallel processing:
tic
x=magic(100);
y=magic(100);
x1 = x(1:5,:)+y(1:5,:);
y1 = x(6:10,:)+y(6:10,:);
dx = distributed(x1);
dy = distributed(y1);
dz = [dx ; dy];
dz
toc;
Thanks in advance
Answers (3)
Thomas
2013-5-13
Huda,
You are running a small job, where the overhead of parallelizing can outweigh the gain because the number of iterations or the job size is small. To achieve a performance boost, a large number of iterations must be performed or the dataset must be large.
These links show some performance considerations for MATLAB: http://vtchl.illinois.edu/sites/hydrolab.dev.engr.illinois.edu/files/MATLAB_Report.pdf and http://vtchl.illinois.edu/node/537
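The point about iteration count can be checked with a rough timing sketch like the one below. It assumes a pool is already open (for example via matlabpool as in the question, or parpool in newer releases), and the workload (eigenvalues of small random matrices) and the two values of N are arbitrary choices for illustration, not taken from the original post.

```matlab
% Hypothetical workload: sum of eigenvalue magnitudes of small random matrices.
% Assumes a worker pool is already open.
for N = [10 2000]                  % small vs. large iteration count (arbitrary)
    r = zeros(1, N);

    tic
    for k = 1:N                    % serial loop in the client MATLAB
        r(k) = sum(abs(eig(rand(50))));
    end
    tSerial = toc;

    tic
    parfor k = 1:N                 % same loop body, iterations split across workers
        r(k) = sum(abs(eig(rand(50))));
    end
    tParallel = toc;

    fprintf('N = %5d: serial %.3f s, parfor %.3f s\n', N, tSerial, tParallel);
end
```

The actual numbers depend on the machine and pool size, so the sketch only prints both timings and leaves the comparison to the reader.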
David Sanchez
2013-5-14
Try your code in this way:
matlabpool open local 2
tic
spmd
x=magic(100);
y=magic(100);
x1 = x(1:5,:)+y(1:5,:);
y1 = x(6:10,:)+y(6:10,:);
dx = distributed(x1);
dy = distributed(y1);
dz = [dx ; dy];
dz
end
toc
Jill Reese
2013-5-15
Edited: Jill Reese
2013-5-15
There is some confusion here regarding distributed arrays, both in the initial post and in the answer by David Sanchez. Distributed arrays are intended for users who want to work with data distributed across multiple workers, but do not need fine-grained control over exactly how the data is distributed. If fine-grained control over the distribution scheme is required then the spmd construct can be used with codistributed arrays. The spmd block is also useful to examine the data that is actually stored on each worker (see example below).
The initial post was the closest to properly using distributed arrays, but I have tweaked that first bit of code here:
matlabpool open local 2
tic
x=magic(100);
y=magic(100);
x1 = x(1:5,:)+y(1:5,:); % indexing is happening in the client MATLAB
y1 = x(6:10,:)+y(6:10,:);
dx = distributed(x1); % distribute these matrices across two workers
dy = distributed(y1);
dz = [dx ; dy]; % this concatenation is happening in parallel on the workers
toc
% Use the spmd construct to see what data is actually stored on each worker
spmd
getLocalPart(dz)
end
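For the fine-grained control mentioned earlier, a minimal sketch with codistributed arrays might look like the following. It assumes a pool with two workers is already open; the choice to partition by rows and the variable names are illustrative only.

```matlab
% Assumes a pool with 2 workers is already open.
spmd
    % Partition the array by rows (dimension 1) using an explicit
    % 1-D distribution scheme instead of the default.
    codist = codistributor1d(1);
    D = codistributed(magic(100), codist);

    % Each worker looks only at the block of rows it owns.
    localRows = getLocalPart(D);
    % labindex identifies the worker (newer releases also offer spmdIndex).
    fprintf('Worker %d holds a %d-by-%d block\n', labindex, ...
            size(localRows, 1), size(localRows, 2));
end
```

After the spmd block, D is available on the client as a distributed array, and gather(D) brings the full matrix back.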
If you want to accurately compare timings with serial code, then it would be best to completely remove the distributed array construct from your code and compare to this:
tic
x=magic(100);
y=magic(100);
x1 = x(1:5,:)+y(1:5,:);
y1 = x(6:10,:)+y(6:10,:);
dz = [x1 ; y1];
dz
toc;
If you actually do time these code snippets, you will find that the serial code will be faster because the data you are working with is small. Furthermore the work that is happening in parallel (the concatenation) is not enough to offset the communication cost of distributing the arrays.
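To see how the balance shifts with problem size, one could time a heavier operation on larger data, as in this sketch. It assumes a pool is already open; the size n and the choice of matrix multiplication are arbitrary, and the distributed timing deliberately includes the transfer to and from the workers, since that is the communication cost in question.

```matlab
% Assumes a worker pool is already open; n is an arbitrary illustrative size.
n = 4000;
A = rand(n);
B = rand(n);

% Serial: everything stays in the client MATLAB.
tic
C = A * B;
tSerial = toc;

% Distributed: time the distribution, the multiplication on the workers,
% and the gather back to the client together.
tic
dA = distributed(A);
dB = distributed(B);
dC = dA * dB;
Cd = gather(dC);
tDistributed = toc;

fprintf('serial %.3f s, distributed %.3f s\n', tSerial, tDistributed);
```

Whether the distributed version comes out ahead depends on the hardware, the number of workers, and how much of the time goes into moving data, so no particular result is implied here.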
Distributed arrays really provide a benefit whenever there is not enough memory in a single instance of MATLAB to store the array you want to work with and the result(s) of the computation.
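A quick size estimate shows why memory is not a concern for the arrays in the question. The second size below is a hypothetical example chosen to illustrate an array that would strain a single MATLAB instance.

```matlab
bytesPerDouble = 8;                        % a MATLAB double occupies 8 bytes

n = 100;                                   % size used in the question: magic(100)
smallBytes = n * n * bytesPerDouble;       % 80,000 bytes

nLarge = 50000;                            % hypothetical large problem size
largeBytes = nLarge^2 * bytesPerDouble;    % 2e10 bytes

fprintf('%d-by-%d double: %.1f KB\n', n, n, smallBytes / 1024);
fprintf('%d-by-%d double: %.1f GB\n', nLarge, nLarge, largeBytes / 1e9);
```

The first array needs about 78 KB, while the second needs about 20 GB before counting any results, which is the regime where spreading the data over several workers starts to make sense.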