Building sparse matrix inside parfor

I'm building a large sparse matrix in smaller pieces. Unfortunately, the pieces overlap a bit. At the moment I build each piece at the final size and sum the pieces together after the loop. I have tried the following approaches:
1) Summing the sparse matrices inside parfor. Bad idea: it produces a full matrix.
2) Building index vectors for each piece and combining them inside parfor, then calling sparse once after the loop to build the final matrix. This, unfortunately, is rather slow. The reason might be the repeated entries that the sparse command needs to sum up.
3) Building a sparse matrix for each piece and storing them in a cell array inside parfor, then summing the sparse matrices in a regular for loop. This is the best so far: fast and reliable. (See the pseudocode below.)
4) This is the problematic case: building a sparse vector out of each piece and storing them in a cell array, then summing the sparse vectors in a regular for loop and reshaping to a matrix. Unfortunately, for larger systems it crashes with
Error using parallel_function (line 598)
Error during serialization
Error stack:
remoteParallelFunction.m at 31
As a for loop it runs just fine.
Below is some pseudocode to shed light on what I'm doing:
First, option 3), which always works:
Aset = cell(1,Nsets) ;
parfor S=1:Nsets
% Do lots of stuff to get iind, jind, Aval
Aset{S} = sparse( iind, jind, Aval, Ndof, Ndof );
end
A = Aset{1};
for S=2:Nsets
A = A + Aset{S} ;
end
Option 4), which gives the error:
Aset = cell(1,Nsets) ;
parfor S=1:Nsets
% Do lots of stuff to get iind, jind, Aval
matind = iind +Ndof*( jind-1 );
Aset{S} = sparse( matind, ones(size(matind)), Aval, Ndof*Ndof, 1 ) ;
end
A = Aset{1};
for S=2:Nsets
A = A + Aset{S} ;
end
A = reshape(A,Ndof,Ndof) ;
Any ideas why option 4 crashes? How should I do this to gain speed?
The size of the final matrix, i.e. Ndof, is a few million. The number of matrix pieces, i.e. Nsets, is 10 to 30. For option 3 it takes roughly 30 seconds to sum the matrices of size Ndof=4000000.

Accepted Answer

Sean de Wolski 2011-12-29
NEW:
I think I have a possible workaround. Instead of using sparse on each iteration, build up the matind and aval vectors in the parfor loop and call sparse once at the end:
Nsets = 12;
Ndof = 1e6;
matind = cell(Nsets,1);
Aval = cell(Nsets,1);
parfor S=1:Nsets
% Do lots of stuff to get iind, jind, Aval
iind = ceil(Ndof*rand(Ndof,1)) ;
jind = ceil(Ndof*rand(Ndof,1)) ;
Aval{S} = 100*randn(Ndof,1);
matind{S} = iind +Ndof*( jind-1 );
end
matind = vertcat(matind{:});
Aval = vertcat(Aval{:});
A = sparse(matind,ones(numel(matind),1),Aval,Ndof*Ndof,1);
A = reshape(A,Ndof,Ndof);
spy(A)
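The single call at the end works because sparse adds the values of repeated subscripts, so overlapping pieces are summed automatically. Below is a minimal sketch with made-up 3-by-3 data (the index and value vectors are illustrative, not taken from the original problem). It also checks the linear-index-plus-reshape construction against calling sparse with the (i,j) subscripts directly:

```matlab
% Illustrative data only: two "pieces" that both touch entry (2,2).
Ndof = 3;
iind = [1; 2; 2; 3];
jind = [1; 2; 2; 3];
Aval = [10; 1; 4; 7];

% Route used in the answer: linear index into a long sparse column, then reshape.
matind = iind + Ndof*(jind - 1);
A1 = reshape(sparse(matind, ones(numel(matind),1), Aval, Ndof*Ndof, 1), Ndof, Ndof);

% Direct route: pass row and column subscripts to sparse.
A2 = sparse(iind, jind, Aval, Ndof, Ndof);

isequal(A1, A2)     % both routes give the same matrix
full(A2(2,2))       % repeated subscripts are summed: 1 + 4 = 5
```

Whether the direct form is any faster for Ndof in the millions is not tested here; it avoids the reshape, but the cost of sorting and summing the duplicates remains.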
1 Comment
Mika 2011-12-30
Thank you, Sean. This does the trick for me, although it is still slower than I had hoped. With parfor I can build the pieces of the matrix very quickly. Unfortunately, having the values and indices is still a long way from having the actual matrix: the final sparse call takes a while to execute. I guess I need to think of something completely different to make it considerably faster.
Also, as you said, this is a workaround. The original problem still exists and might be worth investigating further; someone else might hit it too.


More Answers (2)

fvff 2014-11-25
Edited: fvff 2014-11-25
Three years too late! There is a workaround for parfor expanding the sparse matrix to full: do the addition through a function handle. See the code below for an example.
m = 1e5;
n = 1e5;
A = sparse(m,n);
fcn = @plus;
parfor k = 1:100
i = randi(m,10);
j = randi(n,10);
s = randn(10);
A = fcn(A, sparse(i,j,s,m,n));
end
Hope it helps!
3 Comments
SE 2018-8-3
Just logged in to tell you that you're a lifesaver! What strange behaviour... I suppose the symbolic addition is what wants inputs to be full rather than sparse? Thanks again!
Fintan Healy 2025-1-12
m = 1e5;
n = 1e5;
A = sparse(m,n);
parfor k = 1:100
i = randi(m,10);
j = randi(n,10);
s = randn(10);
A = A + sparse(i,j,s,m,n);
end
As of R2024b the "fcn" wrapper is no longer required, and this was the fastest option for me.
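A quick way to see which behaviour a given release shows is to test the result of the reduction with issparse. This is a sketch on small, made-up sizes; the variable names r, c and the unit values are illustrative. Without Parallel Computing Toolbox, parfor runs as an ordinary loop, so the check only exercises a parallel reduction when a pool is available:

```matlab
m = 1000;
n = 1000;
fcn = @plus;              % function-handle form from the answer above
A = sparse(m, n);
parfor k = 1:20
    r = randi(m, 10, 1);  % row subscripts
    c = randi(n, 10, 1);  % column subscripts
    s = ones(10, 1);      % unit values so the total is easy to check
    A = fcn(A, sparse(r, c, s, m, n));
end
issparse(A)               % true if the reduction kept sparse storage
full(sum(A(:)))           % 20 iterations * 10 unit entries = 200
```

Replacing the reduction line with A = A + sparse(r, c, s, m, n) gives the plain form from the comment above, so the two can be compared on the release at hand.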



Mika 2011-12-29
Sean,
Here's a sample that crashes on my machines (an iMac and Ubuntu Linux). I tried both R2011a and R2011b.
matlabpool local 4
Nsets = 12;
Ndof = 1e6 ;
Aset = cell(1,Nsets) ;
parfor S=1:Nsets
% Do lots of stuff to get iind, jind, Aval
iind = ceil(Ndof*rand(Ndof,1)) ;
jind = ceil(Ndof*rand(Ndof,1)) ;
Aval = 100*randn(Ndof,1) ;
matind = iind +Ndof*( jind-1 );
Aset{S} = sparse( matind, ones(size(matind)), Aval, Ndof*Ndof, 1 ) ;
end
A = Aset{1};
for S=2:Nsets
A = A + Aset{S} ;
end
A = reshape(A,Ndof,Ndof);
spy(A)
matlabpool close
3 Comments
Friedrich 2011-12-29
This seems like a bug to me. There isn't any reason why it shouldn't work. There is a limit of 2 GB on 64-bit and 600 MB on 32-bit on the amount of data that can be transferred from MATLAB to the workers and back. You are far away from that limit. Since it works with small values, it should work with bigger ones too.
Mika 2011-12-29
A memory-related problem seems likely. The funny thing is that it runs fine in serial. I mean, if you change 'parfor' to 'for', the results that I get seem right.
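One way to see how close a single piece is to the transfer limit mentioned in this thread is to measure its storage with whos. This is a sketch using the sizes from the crashing sample, run in serial; the variable names piece and info are illustrative, and it neither reproduces nor explains the serialization error:

```matlab
Ndof = 1e6;
iind = ceil(Ndof*rand(Ndof,1));
jind = ceil(Ndof*rand(Ndof,1));
Aval = 100*randn(Ndof,1);
matind = iind + Ndof*(jind - 1);

% One piece as built in option 4: a very long sparse column vector.
piece = sparse(matind, ones(size(matind)), Aval, Ndof*Ndof, 1);

% Report the dimensions, the number of nonzeros and the storage in MB.
info = whos('piece');
fprintf('size %d-by-%d, nnz %d, %.1f MB\n', ...
    size(piece,1), size(piece,2), nnz(piece), info.bytes/2^20);
```

Comparing info.bytes with the 2 GB figure shows how much headroom one piece has in memory; the serialized size sent between client and workers may differ.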

