Least squares problem with large matrix
7 次查看(过去 30 天)
显示 更早的评论
Hi, I have a simple matrix problem which I believe can be solved in two steps,
Ultimately I am looking for a least squares solution for
Ax = b
So I might say that x = inverse(A)*b. Of course these are large matrices (~1000x1000) so I would never compute an inverse, but I downloaded a Factorize package which factorizes A and never directly computes the inverse. But here is my first problem, I do not know what A is, I have a function which computes Ax for a given x, so my idea was as follows: I make up an x, and find the matrix form for A via A = b(:)*inverse(x(:)), so now A is a matrix of size (~1,000,000 x 1500), because x is size (5x300). Now if I could do this and had A, then to find x it would be a simple inverse problem, but (again utilizing the matrix factorization) but MATLAB runs out of memory. Thus I was wondering if anyone had a an alternative solution to my problem or or any help. Thanks very much!
4 个评论
Daniel Shub
2011-9-28
I am pretty sure 1000x1000 is not really that large. Unless I am missing something inv(randn(1000, 1000)); goes pretty quick.
回答(2 个)
Teja Muppirala
2011-10-1
Ok. I have an idea.
So you basically have some linear function A(x) -> b that takes the vector space of [5x300] matrices and maps them to the vector space of [1000x1000] matrices.
Since any linear function on a matrix can be expressed as matrix multiplication if you write out the elements into a vector, you are expressing the problem as
A*x(:) = b(:) where A is [1000000x1500], x(:) = [1500x1], and b(:) is [1000000x1].
You want to find the least squares solution to this problem (what x best gets mapped to b), but you don't have the memory to store the matrix A.
In this case, your solution is x = (A'*A)\(A'*b), and to calculate this, you do not need to store the whole A matrix.
All you need is 1500x1500 storage for the symmetric matrix A'*A.
You have a function (I'll give it the name "calculateAx") that gives you Ax for a given x. You can use that function to generate the 1500x1500 matrix because you can evaluate all the individual columns of A one by one.
First column of A = calculateAx([1;0;0;0;0;...])
Second column of A = calculateAx([0;1;0;0;0;...])
Third column of A = calculateAx([0;0;1;0;0;...])
The MATLAB code would look something like this:
AprimeA = zeros(1500,1500);
Aprimeb = zeros(1500,1);
for m = 1:1500
Aprimeb(m) = calculateA([. . 1 . . .])'*b; %There is a 1 in the m-th position
for n = m:1500
AprimeA(m,n) = calculateA([. . . 1. . .])' * calculateA([. . . 1 . . .])
AprimeA(n,m) = AprimeA(m,n);
end
end
x =AprimeA\Aprimeb is now a 1500x1500 sized linear system.
This is a straightforward, absolutely certain way to solve your problem.
That said, calculation time might be a problem. You have to evaluate over a million dot products of 1 million elements in addition to calling your calculateAx function over a million times, and that might take along time. I see 3 solutions to that problem.
First: Use parallel processing if you have the Parallel Computing Toolbox functionality available. You can change one of those loops to a parfor loop and obtain a quite generous speedup.
Second: If you know somehow that the columns of A are more or less independent, and you suspect that AprimeA will be strongly diagonally dominant, then instead of caculating the whole covariance matrix (A'*A is also known as the covariance matrix), just calculate the diagonal entries only, and you will still have a very good estimate for x.
Third: If the x and b in question are not just some random data, but have some sort of structure (just a guess, but from their sizes, are x and b possibly image data?), then maybe you can use a change of basis to approximate and reduce the size of the problem. For example you could use a Fourier basis and express x = [some sines and cosines]*y, and then solve for a reduced orded y instead. You would be throwing out a lot of high frequency components, but if you never needed them in the first place, it's a very reasonable thing to do.
另请参阅
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!