Pcg and Parallel Computing Toolbox

Hi MATLAB community, I know that the function pcg is supported in the Parallel Computing Toolbox for data-parallel computations with distributed arrays. I am using an HPC architecture made of 8 nodes; each blade consists of 2 quad-core processors sharing memory, for a total of 8 cores per node and 64 cores overall. I ran pcg on 1 core and pcg with distributed arrays on 32 cores.
tic
[y]=pcg(A,b,[],100); %first case
toc
A=distributed(A);
b=distributed(b);
tic
[x,flagCG_1,iter] = pcg(@(x)gather(A*x),b,[],100); %second case on 32 cores
toc
I obtained:
Elapsed time is 0.001279 seconds. % first case
Elapsed time is 0.316632 seconds. % second case on 32 cores
Why is the time in the second case greater than in the first? What am I doing wrong? I tried larger matrices, but the second case is always slower than the first. I am probably not using pcg correctly with distributed arrays. Thanks for your help.

Answers (2)

Gaurav Garg 2020-9-16
Hey Rosalba,
Using Parallel Computing Toolbox for very small problems, or with a very large number of threads, may bring no benefit, for two reasons:
  1. For very small operations, such as A = B + C (with B and C both 2x2), MATLAB has to set up the parallel code, run it on separate cores, and then join the threads. This overhead can exceed the cost of serial execution.
  2. With a very large number of threads, the OS is kept busy switching between threads, saving the state of each thread and then loading it again. This can waste a lot of time.
In your problem, the former seems to be the case.
As a workaround, you can either run the code without distributed arrays or try a parfor loop (code snippet below):
parfor i = 1:iter
    x(i) = f(k(i), l(i)); % f, k and l are placeholders; index the output so each result is kept
end
Note that the function f must not have any data dependency between iterations.
For more information on parfor, see the parfor documentation.
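One case where the independence requirement above is met is solving several unrelated systems with the same matrix. The following is only a sketch under stated assumptions: the test matrix (a 2-D Laplacian from delsq and numgrid), the number of right-hand sides, the tolerance and the iteration limit are all illustrative choices, not taken from the question. It does not parallelize a single pcg solve; it only runs independent solves side by side.

```matlab
% Sketch: solve several independent systems A*x = B(:,j) with parfor.
% All sizes and tolerances below are illustrative assumptions.
n    = 50;                       % grid size (assumed)
A    = delsq(numgrid('S', n));   % sparse SPD 2-D Laplacian, (n-2)^2 unknowns
m    = size(A, 1);
nRhs = 8;                        % number of independent right-hand sides
B    = rand(m, nRhs);
X     = zeros(m, nRhs);
flags = zeros(1, nRhs);

parfor j = 1:nRhs
    % Each iteration uses only its own column of B, so there is no
    % data dependency between iterations.
    [xj, fj] = pcg(A, B(:, j), 1e-6, 500);
    X(:, j)  = xj;
    flags(j) = fj;               % 0 means pcg converged for this column
end
```

If only one system has to be solved, this pattern does not apply, because each CG iteration depends on the previous one.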

Oli Tissot 2020-9-11
You can simply do:
dA = distributed(A);
db = distributed(b);
[dx, flag, iter] = pcg(dA, db, [], 100); % dx is a distributed array
x = gather(dx);
But you should be aware that distributed arrays are not designed to be faster than in-memory arrays; they are designed to process arrays that would not fit in your local memory. In practice, operations on distributed arrays are usually slower because of the extra cost of communication, but if the matrix is large enough these operations could not be performed at all otherwise! If you are interested in performance, you may try gpuArray; pcg is supported for gpuArray.
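The gpuArray suggestion could look roughly like the sketch below. It assumes that Parallel Computing Toolbox is installed, that a supported GPU is present, and that the release in use accepts sparse gpuArray inputs in pcg (this may require a newer release than the one listed for this question). The test matrix, tolerance and iteration limit are illustrative assumptions.

```matlab
% Sketch: run pcg on the GPU when one is available, otherwise on the CPU.
A = delsq(numgrid('S', 200));    % sparse SPD test matrix (assumed problem)
b = ones(size(A, 1), 1);

if gpuDeviceCount > 0
    gA = gpuArray(A);            % copy the sparse matrix to the GPU
    gb = gpuArray(b);
    [gx, flag] = pcg(gA, gb, 1e-6, 2000);
    x = gather(gx);              % bring the solution back to host memory
else
    [x, flag] = pcg(A, b, 1e-6, 2000);   % CPU fallback
end
```

As with distributed arrays, the transfer to and from the device has a cost, so a benefit is only plausible for sufficiently large systems.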
  4 comments
ROSALBA CACCIAPUOTI
Edited: ROSALBA CACCIAPUOTI 2020-9-14
I understand, thanks. But I need parallel code for the PCG method, or a way to use the MATLAB pcg function in parallel. I will probably have to write the code myself; could you suggest some code to take inspiration from?
Steven Lord 2020-9-14
How large a problem are you trying to solve? If you're trying to solve small problems, using parallel code may not save you any time, as you've seen, because the overhead of setting up the problem in parallel may outweigh any savings you get from running in parallel.
Picture you and three young children going to the grocery store (before COVID-19, of course). If you split the four items on your shopping list among your group, are you really saving time? Maybe, but if you do save time, you probably don't save a lot. And depending on how mature the children are, you may very well lose time. How about if you split the 100 items on your list into four segments? Then you may save more time than in the four-item case.
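One way to see where the overhead stops dominating is to time both variants over a range of problem sizes. This is a rough sketch, assuming a parallel pool is available for distributed arrays; the test matrices, sizes, tolerance and iteration limit are illustrative assumptions, and the measured times depend entirely on the machine and the pool.

```matlab
% Sketch: compare serial and distributed pcg for growing problem sizes.
sizes = [50 200 400];            % grid sizes (assumed)
for n = sizes
    A = delsq(numgrid('S', n));  % sparse SPD test matrix, (n-2)^2 unknowns
    b = ones(size(A, 1), 1);

    tic;
    [x, flag] = pcg(A, b, 1e-6, 5000);        % in-memory solve
    tSerial = toc;

    dA = distributed(A);         % spread the data over the pool workers
    db = distributed(b);
    tic;
    [dx, dflag] = pcg(dA, db, 1e-6, 5000);    % distributed solve
    tDist = toc;

    fprintf('unknowns = %7d  serial = %.4f s  distributed = %.4f s\n', ...
            size(A, 1), tSerial, tDist);
end
```

The conversion to distributed arrays is kept outside the timed region so that only the solve itself is compared.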

Release

R2013a