How to get only the linearly independent rows of a matrix, or to remove linear dependency between rows of a matrix?

Say I have a matrix A = [1,1,1;1,2,3;4,4,4]; and I want only the linearly independent rows in my new matrix. The answer might be A_new = [1,1,1;1,2,3] or A_new = [1,2,3;4,4,4].
Since I have a very large matrix, I need to decompose it into a smaller, full-rank matrix of linearly independent rows. Can someone please help?

2 Comments

What is the meaning of "linearly independent" in your case? A_new is linearly independent and A is not?
Linear dependence in the matrix makes it singular. In this case, row 3 of A can be obtained as 4*row 1. This means that the third row is redundant, making the matrix singular (det(A) = 0). I expect there is some built-in function in MATLAB that could give me A_new, or that someone has already written code for such a problem.
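For readers outside MATLAB, the claim above is easy to verify numerically; here is a quick NumPy check on the same matrix:

```python
import numpy as np

# The example matrix from the question
A = np.array([[1, 1, 1],
              [1, 2, 3],
              [4, 4, 4]], dtype=float)

print(np.allclose(A[2], 4 * A[0]))        # True: row 3 is 4 * row 1
print(np.isclose(np.linalg.det(A), 0.0))  # True: hence singular
print(np.linalg.matrix_rank(A))           # 2, not 3
```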


Accepted Answer

This extracts linearly independent columns, but you can just pre-transpose the matrix to effectively work on the rows.
function [Xsub,idx] = licols(X,tol)
%Extract a linearly independent set of columns of a given matrix X
%
% [Xsub,idx] = licols(X,tol)
%
%in:
%
%  X:   The given input matrix
%  tol: A rank estimation tolerance. Default = 1e-10
%
%out:
%
%  Xsub: The extracted columns of X
%  idx:  The indices (into X) of the extracted columns

    if ~nnz(X) %X has no non-zeros and hence no independent columns
        Xsub = []; idx = [];
        return
    end

    if nargin < 2, tol = 1e-10; end

    [~,R,E] = qr(X,0); %column-pivoted, economy-size QR decomposition

    if ~isvector(R)
        diagr = abs(diag(R));
    else
        diagr = abs(R(1)); %abs() so the threshold below also works for a negative R(1)
    end

    %Rank estimation
    r = find(diagr >= tol*diagr(1), 1, 'last');

    idx  = sort(E(1:r));
    Xsub = X(:,idx);
end
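For non-MATLAB readers, here is a rough NumPy sketch of the same task. It uses a greedy rank test rather than the pivoted-QR trick above, and the function name merely mirrors licols (it is hypothetical, not the original code):

```python
import numpy as np

def licols_greedy(X, tol=1e-10):
    """Keep each column that raises the numerical rank of the set kept
    so far. Slower than pivoted QR, but the logic is easy to follow."""
    idx = []
    for j in range(X.shape[1]):
        trial = X[:, idx + [j]]
        if np.linalg.matrix_rank(trial, tol=tol) > len(idx):
            idx.append(j)
    return X[:, idx], idx

# The question's matrix, transposed so its rows become columns
A = np.array([[1, 1, 1], [1, 2, 3], [4, 4, 4]], dtype=float).T
Xsub, idx = licols_greedy(A)
print(idx)   # [0, 1] -- the third column (4 * the first) is dropped
```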

21 Comments

Hey Matt,
Your algorithm seems to break for larger matrices although it worked fine with smaller systems. Do you have some other algorithm?
Thanks Puneet
This is outstanding! Thank you! As to the comments on matrix size, I applied it to a 149689 x 876 matrix, and it worked fine (and very quickly). Truly outstanding.
Can someone please explain to me why this works? I can't figure it out myself somehow.
It remains a bit magical to me, but the key thing to notice is that qr(X,0) called with three outputs performs column pivoting: it not only sorts the diagonal of R in descending magnitude, but also makes every R(i,i) have the maximum magnitude over all elements of the lower-right sub-matrix R(i:end,i:end). You can see this in examples like the one below:
>> X=rand(5); X(:,3:5)=1e-2;
>> [Q, R, e] = qr(X,0);R
R =
-1.6382 -1.1724 -0.0218 -0.0218 -0.0218
0 0.4995 0.0032 0.0032 0.0032
0 0 -0.0037 -0.0037 -0.0037
0 0 0 -0.0000 -0.0000
0 0 0 0 0.0000
Thus, if I consider .0037 an insignificant magnitude, then the R above is equivalent after thresholding to,
R =
-1.6382 -1.1724 -0.0218 -0.0218 -0.0218
0 0.4995 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
which has rank 2. Thus, I can determine the thresholded rank of X just by thresholding the diagonal elements of R. Similarly, the linearly independent columns of R (after thresholding) are its first two columns. Accordingly, the first two columns of X(:,e) will be linearly independent.
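The same observation can be reproduced outside MATLAB. This sketch uses SciPy's pivoted QR (scipy.linalg.qr with pivoting=True, the analogue of the three-output qr(X,0)), on a matrix built to have rank 2:

```python
import numpy as np
from scipy.linalg import qr

rng = np.random.default_rng(0)
a, b = rng.standard_normal(5), rng.standard_normal(5)
# Rank-2 matrix: the last two columns are combinations of the first two
X = np.column_stack([a, b, a + b, 2 * a])

Q, R, P = qr(X, mode='economic', pivoting=True)
diagr = np.abs(np.diag(R))
tol = 1e-10
r = np.sum(diagr >= tol * diagr[0])  # thresholded rank, as described above
print(r)                             # 2
# P[:r] indexes a linearly independent set of columns of X
print(np.linalg.matrix_rank(X[:, P[:r]]))  # 2
```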
Hi Matt J,
Can you tell me what kind of modification I have to do in order to start removing from lower rows, not upper rows? For example, when I applied the licols function on the matrix [1,1,1;1,2,3;4,4,4]', it removed the 1st row. However, in my application, each row has a physical meaning and the removal must be from the lower rows.
Thanks and best Ahmad Gad
Hi Ahmad,
I'm afraid I don't see any way to guarantee that ordering. You could try calling qr without column permutation,
[Q,R] = qr(A,0);
E = 1:size(A,2);
and if you are lucky, abs(diag(R)) will already be sorted as discussed above. You would have to check, though, and if it isn't appropriately sorted, I wouldn't know what can be done.
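An alternative that does guarantee the ordering Ahmad asks for (keep upper rows, drop lower ones) is a greedy scan from the top, at the cost of one rank test per row. A hypothetical NumPy sketch, not part of the original licols:

```python
import numpy as np

def independent_rows_top_down(A, tol=1e-10):
    """Scan rows from the top; keep a row only if it raises the rank of
    the rows kept so far, so lower rows are the ones that get dropped."""
    keep = []
    for i in range(A.shape[0]):
        trial = A[keep + [i], :]
        if np.linalg.matrix_rank(trial, tol=tol) > len(keep):
            keep.append(i)
    return A[keep, :], keep

A = np.array([[1, 1, 1], [1, 2, 3], [4, 4, 4]], dtype=float)
_, keep = independent_rows_top_down(A)
print(keep)   # [0, 1] -- row 3 (= 4 * row 1) is removed, not row 1
```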
Thank you Matt J, that function has helped me insanely much! I spent days using rank([A(i,:);A(j,:)]) for different i,j to try to figure out the linearly independent rows, but it gave me wrong results. Your script works perfectly.
Hi Matt,
I have been using your licols.m function to search within a matrix AA of size, e.g., (100x1000) for a subset of 100 columns that will form a square matrix A(100,100) with the smallest condition number. I've done this by reducing the tol variable until length(idx) hits 100. I noticed that when I add more columns to my original AA matrix [say, AA2=(100,2000)], sometimes I'd get a subset matrix with a larger condition number than the one I found from AA, although all the columns that were in AA are still in AA2. Does this make sense to you?
Thanks so much for your help,
Guy
@Gy Ralon: Entirely possible, as shown in this simple example:
>> A=[10 0 0; 0 1 0]'
A =
10 0
0 1
0 0
>> cond(A)
ans =
10
Now add a column:
>> A(:,end+1) = [0 0 200]
A =
10 0 0
0 1 0
0 0 200
>> cond(A)
ans =
200
The condition number of a 3 x 2 submatrix increases:
>> cond(A(:,2:3))
ans =
200
>> cond(A(:,[1 3]))
ans =
20
What is impossible is for the smallest singular value to increase when adding a column.
Thanks Bruno, but that's not what I was asking. I am looking for a fixed-size (square) subset matrix of the original matrix (AA) with smallest condition number. I'd think that if I found it to have a condition number C1, then if I add more columns to AA, the resulting smallest-conditioned subset matrix will have a smaller condition than C1. Perhaps your last comment on smallest singular value corresponds to this intuition.
It seems that licols.m is simply not suitable to the task of optimizing condition number of a subset matrix, which is really what I'm looking for, thanks!
"I'd think that if I found it to have a condition number C1, then if I add more columns to AA, the resulting smallest-conditioned subset matrix will have a smaller condition than C1"
That makes the observation even more trivial: when you add a column, the number of candidate square submatrices increases, so the smallest achievable condition number can only decrease (or stay the same).
I don't see anything surprising.
Or I don't understand your concern at all.
sometimes I'd get a subset matrix with larger condition number than the one I found from AA, although all the columns that were in AA are still in AA2. Does this make sense to you?
Hi Guy,
Yes, I don't see why that couldn't happen. Here's a simple example.
>> A=eye(3,4); A(:,end)=1
A =
1 0 0 1
0 1 0 1
0 0 1 1
>> licols(A)
ans =
0 0 1
1 0 1
0 1 1
So, yes, it would arguably be better if the first 3 columns of A were extracted, since those form a maximally well-conditioned sub-matrix. However, I don't know if it's possible to ensure that without doing a brute force combinatoric search.
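For small cases, the brute-force combinatoric search Matt mentions is easy to write down. A NumPy/itertools sketch (the helper name is hypothetical), applied to his eye(3,4) example:

```python
import itertools
import numpy as np

def best_conditioned_subset(A, k):
    """Exhaustively try every k-column subset of A and keep the one with
    the smallest condition number. Combinatorial cost, so small A only."""
    best_cols, best_cond = None, np.inf
    for cols in itertools.combinations(range(A.shape[1]), k):
        c = np.linalg.cond(A[:, cols])
        if c < best_cond:
            best_cols, best_cond = cols, c
    return best_cols, best_cond

# Matt's example: eye(3,4) with the last column set to ones
A = np.hstack([np.eye(3), np.ones((3, 1))])
cols, c = best_conditioned_subset(A, 3)
print(cols, c)   # (0, 1, 2) 1.0 -- the identity block wins
```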
Bruno: Sorry, I meant to say I get a larger condition number even though I start with more columns.
Matt: As I mentioned, I am using the tolerance argument, starting with a large tolerance, say 0.01 (to find maximally distinct columns), and gradually reducing it until I have enough columns to form a square matrix. The resulting matrix, so I hoped, would have the lowest condition number within the original AA matrix. This approach, however, does not seem to work consistently; e.g., even for your simple example there is no tolerance value that will result in the extraction of the (more favorable) first three columns. Unfortunately, a brute-force search will be prohibitive for the matrix size I need. Any idea is greatly appreciated!
MICHAEL MONT-ETON's comment moved here.
Matt J.:
Thanks for the code. I'd like to provide a reference for your work in my upcoming paper that would benefit from removal of dependent equations in the system I am studying.
Please indicate how you would like that to read.
Very Respectfully,
Mike Mont-Eton


More Answers (4)

A = [1,1,1;1,2,3;4,4,4];
[R,basiccol] = rref(A);
B = A(:,basiccol);
The columns of B are a basis for the range of A. B has the same rank as A.

3 Comments

Thanks for looking at the problem, but it gives B =
1 1
1 2
4 4
I do not wish to eliminate columns. Also, B is still linearly dependent: look at row 3, which is four times row 1.
I have been warned not to trust rref for this kind of thing; that was my reason for coding the QR-based method in my answer.
As Matt advises, just transpose to work on rows:
B = A';
[R,basiccol] = rref(B);
B = B(:,basiccol)'
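For reference outside MATLAB, the rref idea can be sketched in NumPy with a minimal Gaussian elimination that returns the pivot (basic) columns, mimicking the second output of MATLAB's rref (hypothetical helper name):

```python
import numpy as np

def rref_pivot_columns(M, tol=1e-10):
    """Gauss-Jordan elimination with partial pivoting; returns the
    indices of the pivot (basic) columns of M."""
    A = M.astype(float).copy()
    pivots, row = [], 0
    for col in range(A.shape[1]):
        if row >= A.shape[0]:
            break
        p = row + np.argmax(np.abs(A[row:, col]))  # partial pivoting
        if abs(A[p, col]) < tol:
            continue                               # no pivot in this column
        A[[row, p]] = A[[p, row]]
        A[row] /= A[row, col]
        for r in range(A.shape[0]):
            if r != row:
                A[r] -= A[r, col] * A[row]
        pivots.append(col)
        row += 1
    return pivots

# The transposed example: columns of B are the rows of the original A
B = np.array([[1, 1, 1], [1, 2, 3], [4, 4, 4]], dtype=float).T
basiccol = rref_pivot_columns(B)
print(basiccol)          # [0, 1]
print(B[:, basiccol].T)  # the two independent rows of the original A
```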


Hello,
I want to ask: instead of rank estimation, can we not just use the minpoly function, get the largest non-zero degree (r) from there, and use r instead?

1 Comment

There's no way to avoid estimating rank. minpoly sounds like an alternative way to do so, but requires the Symbolic Toolbox. It also appears to be a lot slower than a QR approach, even for rather small matrices:
>> A=rand(10);
>> tic;minpoly(A);toc
Elapsed time is 0.875673 seconds.
>> tic;qr(A);toc
Elapsed time is 0.000063 seconds.


I wrote a few functions to handle this. They do basically the same as Matt J's function above, with some added bells and whistles. (Namely, it includes an option for ignoring columns that are shifted by a constant; for example, if col2 = 10 - col1. It also returns indices to clusters of originally linearly dependent columns ). Again, you'd have to add a transpose to operate on rows instead of columns. Hope it's useful to someone.
On your example above:
A = [1,1,1;1,2,3;4,4,4]'
[Abasis, Abasisi, Asub]= getLinearIndependent(A)
Result:
Input:
A =
1 1 4
1 2 4
1 3 4
Output:
Abasis =
1 1
1 2
1 3
Abasisi =
1 2
Asub =
1×2 cell array
[1,3] [2]
Here, Asub{1} contains the indices of all columns linearly dependent with the first basis vector, and Asub{2} contains that for the second.
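The clustering idea can be sketched as follows (a hypothetical NumPy helper, not the author's getLinearIndependent; note this sketch only clusters scalar-multiple dependence, where the rank of a column pair is 1):

```python
import numpy as np

def cluster_parallel_columns(X, tol=1e-10):
    """Pick a basis greedily; attach each remaining column to the first
    basis column it is a scalar multiple of (the pair has rank 1)."""
    basis, clusters = [], []
    for j in range(X.shape[1]):
        for k, b in enumerate(basis):
            if np.linalg.matrix_rank(X[:, [b, j]], tol=tol) == 1:
                clusters[k].append(j)   # dependent on basis column k
                break
        else:
            basis.append(j)             # new independent direction
            clusters.append([j])
    return basis, clusters

A = np.array([[1, 1, 1], [1, 2, 3], [4, 4, 4]], dtype=float).T
basis, clusters = cluster_parallel_columns(A)
print(basis)     # [0, 1]
print(clusters)  # [[0, 2], [1]] -- column 3 is 4 * column 1
```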
Use svd, and look at the number of non-zero singular values.

8 Comments

A partially satisfying answer. SVD gives the rank but is unable to identify a set of independent rows, since everything is mixed together in U and V. QR is the right method, as answered above.
Looking at the columns of the right singular vectors associated with the ZERO singular values (numerically, the singular values beyond a gap), the non-zero entries of these columns can tell you which columns are linearly dependent.
In addition, SVD is better suited to low-rank examples.
Please show us which rows of A are independent by looking only at U, D, and V:
>> A=[1 2 3; 1 2 3; 2 3 4; 3 5 7]
A =
1 2 3
1 2 3
2 3 4
3 5 7
>> [U,D,V]=svd(A)
U =
-0.3157 0.5480 0.7631 0.1331
-0.3157 0.5480 -0.4095 -0.6575
-0.4548 -0.6270 0.3536 -0.5244
-0.7706 -0.0790 -0.3536 0.5244
D =
11.8231 0 0
0 0.4633 0
0 0 0.0000
0 0 0
V =
-0.3259 -0.8527 0.4082
-0.5481 -0.1814 -0.8165
-0.7703 0.4898 0.4082
>>
Of which matrix? U and V are orthonormal, so their columns are not correlated. Or, if you like, the correlation is actually 0.
I stop here because so far your claim about using SVD doesn't seem to stand on solid ground.
I hope you agree that the 3 columns of A are linearly dependent. After svd you see that the rank of A is 2, thus the matrix is rank-deficient by 1. Now look at the last column of V = [0.4 -0.8 0.4]'; this relates to the single zero-valued singular value. These 3 non-zero values indicate that all 3 columns of A take part in the dependence. If, for example, the values in that column were [0.4 0 0.2]', that would have implied that columns 1 and 3 are linearly dependent.
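For what it's worth, the argument sketched here can be coded up directly; whether it generalizes to picking an independent subset is the point under debate. A NumPy sketch on Bruno's example matrix:

```python
import numpy as np

# Bruno's example matrix (rank 2)
A = np.array([[1, 2, 3],
              [1, 2, 3],
              [2, 3, 4],
              [3, 5, 7]], dtype=float)

U, s, Vt = np.linalg.svd(A)
r = np.sum(s > 1e-10 * s[0])   # numerical rank
v = Vt[r]                      # right singular vector of the zero singular value
print(r)                       # 2
print(np.allclose(A @ v, 0))   # True: v encodes col1 - 2*col2 + col3 = 0
involved = np.flatnonzero(np.abs(v) > 1e-8)
print(involved)                # [0 1 2] -- all three columns take part
```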
"If for example the values in that column were [0.4 0 0.2]"
But that's not the case, so you can't conclude anything from the above example.
If you want to convince, write down your algorithm to detect independent columns using SVD, then we can speak.
The algorithm written above by Matt works pretty well. I have tested that algorithm on quite a large matrix (1M+ x 1M+). I also tried the svd route when I posted this question; as said above, I was not able to get the results I was looking for.
Thanks, Puneet

