How to quickly find the column index of the last non-zero element in all rows in a sparse matrix?

Question

Benson Gou 2020-3-19

0
链接

此问题的直接链接

https://ww2.mathworks.cn/matlabcentral/answers/511800-how-to-quickly-find-the-column-index-of-the-last-non-zero-element-in-all-rows-in-a-sparse-matrix

编辑： Ameer Hamza 2020-3-22

Hi, All,

I have a big sparse matrix A. I want to find out the column index of the Last non-zero element in all rows from the end. Here is my code:

reorderRow = [];

for jRow = 1 : length(A(:,1))

eee = find(A(jRow,:),1,'last');

reorderRow = [reorderRow eee];

end

For example, I have matrix A = [2 0 0;0 5 0;0 0 1;0 3 0;-1 0 -4]. My code gives the following result:

reorderRow = [1 2 3 2 3];

I am wondering if it is possible to obtain reorderRow without iterations. Thanks a lot.

Benson

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

请先登录，再进行评论。

请先登录，再回答此问题。

Answer 1

BobH 2020-3-19

1
链接

此回答的直接链接

https://ww2.mathworks.cn/matlabcentral/answers/511800-how-to-quickly-find-the-column-index-of-the-last-non-zero-element-in-all-rows-in-a-sparse-matrix#answer_420980

编辑：BobH 2020-3-19

在 MATLAB Online 中打开

reorderRow = arrayfun(@(X)  find(A(X,:),1,'last'), 1:size(A,1))
reorderRow =
     1     2     3     2     3

4 个评论
显示 2更早的评论隐藏 2更早的评论

Ameer Hamza 2020-3-19

Benson, I think you accepted the wrong answer. John's answer is under this answer.

BobH 2020-3-19

Yes, Benson, please re-select John's answer

请先登录，再进行评论。

Answer 2

John D'Errico 2020-3-19

4
链接

此回答的直接链接

https://ww2.mathworks.cn/matlabcentral/answers/511800-how-to-quickly-find-the-column-index-of-the-last-non-zero-element-in-all-rows-in-a-sparse-matrix#answer_420988

编辑：John D'Errico 2020-3-19

在 MATLAB Online 中打开

It is important that if your matrix is sparse, that we use sparse matrix examples to test anything out.

Regardless, I would always do a comparison of other methods to a direct find. For example...

A = sprand(10000,10000,.0005);

So, on average, 5 elements per row. If some rows have no non-zeros, then this will return zero as the index of the last element.

tic
[ir,ic] = find(A);
maxcolind = accumarray(ir,ic,[10000,1],@max);
toc
Elapsed time is 0.026327 seconds.

This will not fail, even if some rows are empty of non-zeros.

sum(maxcolind == 0)
ans =
    59

The random example I created had 59 rows that were entirely zero.

If you want to find the array reorderRow as you asked for, then just do this:

r = find(maxcolind);
reorderRow = [r,maxcolind(r)];

1 个评论
显示 -1更早的评论隐藏 -1更早的评论

Benson Gou 2020-3-19

Dear John,

Thanks a lot. Your code took 0.002112 s for a sparse matrix 974 x 632. It works well.

Benson

请先登录，再进行评论。

Answer 3

James Tursa 2020-3-19

3
链接

此回答的直接链接

https://ww2.mathworks.cn/matlabcentral/answers/511800-how-to-quickly-find-the-column-index-of-the-last-non-zero-element-in-all-rows-in-a-sparse-matrix#answer_420993

编辑：James Tursa 2020-3-19

在 MATLAB Online 中打开

You can use a mex routine for this and avoid all temporary data copies. E.g., a direct approach:

/*   File:  find_last_nz.c
 *
 *   C = find_last_nz(S)
 *
 *   Finds column number of last non-zero element in each row of sparse matrix
 *   S = sparse matrix (double or logical)
 *   C = full column vector containing the columns numbers for each row
 *
 *   Programmer: James Tursa
 */
/* Includes ----------------------------------------------------------- */
#include "mex.h"
/* Gateway ------------------------------------------------------------ */
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    double *Cpr;
    mwIndex *Ir, *Jc;
    size_t i, j, imax, m, n;
    
    if( nlhs > 1 ) {
        mexErrMsgTxt("Too many outputs.");
    }
    if( nrhs != 1 || !mxIsSparse(prhs[0]) ) {
        mexErrMsgTxt("Need exactly one sparse input.");
    }
    Ir = mxGetIr(prhs[0]);
    Jc = mxGetJc(prhs[0]);
    m = mxGetM(prhs[0]);
    n = mxGetN(prhs[0]);
    plhs[0] = mxCreateDoubleMatrix(m,1,mxREAL);
    Cpr = (double *) mxGetData(plhs[0]);
    for( j=0; j<n; j++ ) {
        imax = Jc[j+1] - Jc[j];
        for( i=0; i<imax; i++ ) {
            Cpr[*Ir++] = j+1;
        }
    }
}

And a sample run:

>> mex find_last_nz.c
Building with 'Microsoft Windows SDK 7.1 (C)'.
MEX completed successfully.
>> A = rand(10).*(rand(10)<.2)
A =
    0.6443         0    0.3111         0         0         0    0.0377         0         0         0
         0         0         0         0         0         0         0         0         0         0
    0.8116         0         0    0.6028    0.8010         0         0    0.4942         0         0
    0.5328         0         0         0         0         0         0         0         0         0
         0         0    0.9049    0.2217         0         0    0.0987         0         0         0
    0.9390         0         0    0.1174         0         0         0    0.9037         0    0.1679
         0         0         0         0         0    0.6791         0         0         0         0
         0         0         0    0.3188         0    0.3955         0         0         0         0
         0    0.2277         0         0         0         0    0.1366         0         0    0.5005
    0.5870    0.4357         0         0         0    0.9880         0         0    0.5767    0.4711
>> find_last_nz(sparse(A))
ans =
     7
     0
     8
     1
     7
    10
     6
     6
    10
    10

What is presented above is a one-pass approach that only looks at the internal sparse indexing arrays.

It is possible that it might be faster to transpose the matrix first so that I could pick off the column numbers directly from the internal arrays, but that would require a data copy and I haven't written and tested this approach. That is, picking off the last non-zero row number for each column would be an extremely fast operation in a mex routine, but for your case the matrix would have to be transposed first which would probably dominate the run-time so I haven't bothered to write and test it.

EDIT

A timing comparison with the m-code method posted by John:

>> A = sprand(10000,10000,.001);
>> 
>> tic;[ir,ic] = find(A);maxcolind = accumarray(ir,ic,[10000,1],@max);toc
Elapsed time is 0.029285 seconds.
>> tic;[ir,ic] = find(A);maxcolind = accumarray(ir,ic,[10000,1],@max);toc
Elapsed time is 0.030831 seconds.
>> tic;[ir,ic] = find(A);maxcolind = accumarray(ir,ic,[10000,1],@max);toc
Elapsed time is 0.028965 seconds.
>> 
>> tic;C=find_last_nz(A);toc
Elapsed time is 0.000260 seconds.
>> tic;C=find_last_nz(A);toc
Elapsed time is 0.000253 seconds.
>> tic;C=find_last_nz(A);toc
Elapsed time is 0.000278 seconds.
>> 
>> isequal(maxcolind,C)
ans =
  logical
   1

So, the mex routine is about 100x faster for this particular example.

6 个评论
显示 4更早的评论隐藏 4更早的评论

Matt J 2020-3-20

Benson, you just need to copy the Matt's code and paste it in a file named find_last_nz.c. Then run

James's code, I think you mean.

Benson Gou 2020-3-22

Dear Matt,

Good morning!

Thanks a lot for your great help. You have a good weekend!

Benson

请先登录，再进行评论。

Answer 4

Ameer Hamza 2020-3-19

1
链接

此回答的直接链接

https://ww2.mathworks.cn/matlabcentral/answers/511800-how-to-quickly-find-the-column-index-of-the-last-non-zero-element-in-all-rows-in-a-sparse-matrix#answer_420981

编辑：Ameer Hamza 2020-3-19

在 MATLAB Online 中打开

For starters, do preallocation, It can speed up the execution if the matrix A has many rows. The following code compares the execution time for your code, BobH's code, and the code with pre-allocation. I intentionally made the A matrix with 10000 rows to show the difference

A = randi([1 5], 10000, 3);
% your code
tic
reorderRow = [];
for jRow = 1 : length(A(:,1))
    eee = find(A(jRow,:),1,'last');
    reorderRow = [reorderRow eee];
end
toc
% arrayfun method
tic
reorderRow = arrayfun(@(X)  find(A(X,:),1,'last'), 1:length(A));
toc
% pre-allocating the array
tic
reorderRow = zeros(1, size(A,1));
for jRow = 1 : length(A(:,1))
    eee = find(A(jRow,:),1,'last');
    reorderRow(jRow) = eee;
end
toc

Result

Elapsed time is 0.068404 seconds. % your code (slowest)
Elapsed time is 0.048778 seconds. % array fun (medium)
Elapsed time is 0.015521 seconds. % pre-allocation (fastest)

Pre-allocation is about 3.5 times faster than the original code.

5 个评论
显示 3更早的评论隐藏 3更早的评论

Ameer Hamza 2020-3-19

Yes, I saw your solution. That's indeed brilliant.

Benson Gou 2020-3-19

Dear Ameer,

Thanks a lot for your great help.

Benson

请先登录，再进行评论。

Answer 5

Matt J 2020-3-19

0
链接

此回答的直接链接

https://ww2.mathworks.cn/matlabcentral/answers/511800-how-to-quickly-find-the-column-index-of-the-last-non-zero-element-in-all-rows-in-a-sparse-matrix#answer_420999

编辑：Matt J 2020-3-19

在 MATLAB Online 中打开

[~,idx]=max(fliplr(logical(A)),[],2);
result=size(A,2)+1-idx;

6 个评论
显示 4更早的评论隐藏 4更早的评论

Matt J 2020-3-20

编辑：Matt J 2020-3-20

You need to use flipud() so that "first" becomes "last", similar to how fliplr() did this in my solution above for row-wise search.

Benson Gou 2020-3-22

Thanks, Matt. It is very helpful. I will try your idea.

Best regards,

Benson

请先登录，再进行评论。

How to quickly find the column index of the last non-zero element in all rows in a sparse matrix?

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

采纳的回答

4 个评论
显示 2更早的评论隐藏 2更早的评论

更多回答（4 个）

1 个评论
显示 -1更早的评论隐藏 -1更早的评论

6 个评论
显示 4更早的评论隐藏 4更早的评论

5 个评论
显示 3更早的评论隐藏 3更早的评论

6 个评论
显示 4更早的评论隐藏 4更早的评论

另请参阅

类别

标签

Community Treasure Hunt

How to quickly find the column index of the last non-zero element in all rows in a sparse matrix?

0 个评论 显示 -2更早的评论隐藏 -2更早的评论

采纳的回答

4 个评论 显示 2更早的评论隐藏 2更早的评论

更多回答（4 个）

1 个评论 显示 -1更早的评论隐藏 -1更早的评论

6 个评论 显示 4更早的评论隐藏 4更早的评论

5 个评论 显示 3更早的评论隐藏 3更早的评论

6 个评论 显示 4更早的评论隐藏 4更早的评论

另请参阅

类别

标签

Community Treasure Hunt

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

4 个评论
显示 2更早的评论隐藏 2更早的评论

1 个评论
显示 -1更早的评论隐藏 -1更早的评论

6 个评论
显示 4更早的评论隐藏 4更早的评论

5 个评论
显示 3更早的评论隐藏 3更早的评论

6 个评论
显示 4更早的评论隐藏 4更早的评论