How can I do a memory efficient sparse matrix multiplication?

I have a sparse matrix A (dimension 4000000 x 1000000) and I want to calculate the matrix product:
B = A * A'
This results in: "Out of memory. Type HELP MEMORY for your options."

2 个评论

What is the density or nnz? What is its size in memory (whos), and are you able to evaluate
B = A.' ;
without generating this out of memory error?
16891004 non-zero elements in a 4172241 x 1032306 matrix.
density: 3.921729424555354E-6
Yes, I can calculate B = A.' without generating this error.

请先登录,再进行评论。

回答(1 个)

If you only plan to use A*A' in matrix multiplication, you might be able to use my ProdCascade class
>> A=sprand(4e6, 1e6,1e-5); c=rand(4e6,1);
>> tic; B=A*A'; B*c; toc;
Out of memory. Type HELP MEMORY for your options.
>> tic; B=ProdCascade({A,A.'}); B*c; toc;
Elapsed time is 7.531522 seconds.

8 个评论

I am not sure but it seems that the ProdCascade only saves the matrices?
I need to check the quality of my matrix factorization. I decomposed my sparse matrix Y into two low rank matrices C and D (C * D ≈ Y)
Now I need to calculate C*D*D.'*C.' - Y * Y.'
I can do ProdCascade ({C,D,D.',C.'}) and ProdCascade ({Y * Y.'}) but the minus operator is not applicable to ProdCascade.
True, but why calculate C*D*D.'*C.' - Y * Y.' instead of just C*D-Y ?
Okay, I could try C * D - Y but there's still the problem that I can't calculate C*D.
I have a sparse matrix Y and a dense low rank matrix C: Y ≈ C * D
I calculate D via C_inv = pinv(C); D = C_inv D
So Y is what you used to call A?
What are the dimensions of Y, C, and D.
Yes, Y is what I used to call A. Sorry for the confusion.
Y (4000000 x 13000000) C (4000000 x 5) D (5 x 13000000)
Well I guess I see the problem now: C * D won't be a sparse matrix...
I also have a matrix X (4000000 x 1000000) and want to get a matrix E:
X ≈ C * E
My approach: I get C from svds([Y X], 5)
Calculate D and E from
Y ≈ C * D
X ≈ C * E
Calculate the frobniusnorm
norm(C * D * D.' * C - Y * Y.', 'fro') + norm(C * E * E.' * C - X * X.', 'fro')
and do a gradient descent to get a new C. Calculate a new D and E and so on...
Maybe you have another idea how to tackle the problem?
and do a gradient descent to get a new C. Calculate a new D and E and so on...
No, it should be possible to calculate D and E just by doing
D=C\Y;
E=C\X;
D=C\Y;
E=C\X;
doesn't work because of memory restrictions. I use the pseudoinverse of C to calculate D and E. I wanted to do the gradient descend to improve the C matrix.
If C is full rank, then pinv(C)*Y is equivalent to D=C\Y so iterative minimization shouldn't improve anything.
Since C has only 5 columns, I can't imagine memory restrictions being a problem.

请先登录,再进行评论。

类别

帮助中心File Exchange 中查找有关 Creating and Concatenating Matrices 的更多信息

提问:

2013-11-5

编辑:

2013-11-6

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by