Slow Execution of Parfor Loops due to Communication Overhead: Load static data into worker workspace memory?
For my research, I require near-real-time execution of a large number (>1000) of matrix-vector multiplications of the form A*x, where A is a medium-scale matrix (e.g. 150x150). These matrices are constructed in an extremely expensive operation (it takes hours to complete) and saved in a static data structure (MatSet in the example below). This static data structure is used by all workers and is not modified after creation.
When I run my code, which is equivalent to the demo below, I find that the PARFOR loop is more than 10 times slower than the FOR loop in MATLAB R2010b. This is caused by a constant transfer of data (MatSet in this case) between workers. In my case, however, this data transfer is completely unnecessary, as MatSet is a read-only dataset!
My question is whether there is some way of loading a STATIC dataset into the workspace of the workers so as to prevent unnecessary communication overhead between workers? Is it possible to do this without having to load data from disk?
Here is the demo code:
matlabpool(2); % start a pool of 2 workers
Msize = 150; Nloop = 1000;
c1 = zeros(Msize, Nloop); c2 = zeros(Msize, Nloop);
% parallel initialization loop
MatSet = cell(Nloop, 1);
parfor i=1:Nloop
    MatSet{i} = rand(Msize); % simulates expensive code operation
end
% real-time parallel loop (SLOW!)
tic;
parfor i=1:Nloop
    c1(:,i) = MatSet{i} * rand(Msize, 1);
end
time1 = toc;
% real-time serial loop (for comparison)
tic;
for i=1:Nloop
    c2(:,i) = MatSet{i} * rand(Msize, 1);
end
time2 = toc;
fprintf('Parallel time: %2.4f ms, Serial Time: %2.4f ms\n', 1000*time1,1000*time2);
matlabpool close;
Any comments are appreciated,
Coen
Accepted Answer
Edric Ellis
2011-11-17
You might be able to take advantage of my Worker Object Wrapper, which is designed to help set up this sort of static data for use on the workers.
2 Comments
Edric Ellis
2011-11-21
It's hard to say exactly why it's still slower. Your syntax is fine. Some points to note:
1. generateMatrix is evaluated once per worker.
2. The result "w.Value" is stored separately on each worker.
Either of those two factors could be important. Also, it's worth bearing in mind that some PARFOR loops do not experience speedup due to the overhead of going into a PARFOR loop, and the data transfer involved.
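For readers on newer releases, here is a minimal sketch of the same idea using the built-in parallel.pool.Constant, which serves a similar purpose to the Worker Object Wrapper. It is not the code discussed in this thread: it assumes Parallel Computing Toolbox in R2015b or later (so it will not run in R2010b), and the names C and M are chosen here for illustration only. The data is copied to each worker once, when the constant is created, and later PARFOR loops reuse the worker-side copy.

```matlab
% Sketch only: assumes Parallel Computing Toolbox, R2015b or later.
Msize = 150; Nloop = 1000;

% Stand-in for the expensive construction step from the question.
MatSet = cell(Nloop, 1);
for i = 1:Nloop
    MatSet{i} = rand(Msize);
end

% Copy MatSet to every worker once. A pool is started automatically
% if none is open (default preference setting).
C = parallel.pool.Constant(MatSet);

c1 = zeros(Msize, Nloop);
parfor i = 1:Nloop
    M = C.Value;                      % worker-side copy, not re-sent per loop
    c1(:, i) = M{i} * rand(Msize, 1);
end
```

In this demo MatSet holds 1000 x 150 x 150 doubles, i.e. about 180 MB, so the one-time copy is itself not free. The benefit only shows up when the same constant is reused across many PARFOR loops, and the per-loop overhead of PARFOR noted above still applies.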