Fast method for findings which rows of a matrix are contained in another

Question

Alexander Holmes 2020-8-26

0
链接

此问题的直接链接

https://ww2.mathworks.cn/matlabcentral/answers/584897-fast-method-for-findings-which-rows-of-a-matrix-are-contained-in-another

编辑： Bruno Luong 2020-8-27

I have two matrices, A and B. The rows of B are a subset of the rows of A, and I want to find the row indices of the rows in A that correpsond to rows in B. I can do this using the intersect or ismember functions, but these are both much too slow for my purposes. I need to perform this calculation hundreds of thousands of times.

A is typically varies from a

matrix, and can go up to having several hundred rows (still with 8 columns). B also has 8 columns, and varies from 1 row up to the number of rows in A. Are there any other methods I can use to find the rows of A that are also in B?

Alternatively, I perform a set of operations to the matrix A to come up with the matrix B. Is there a way I can use this to my advantage to find the rows of the final matrix (which would be B) relative to the original matrix A.

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

请先登录，再进行评论。

请先登录，再回答此问题。

Answer 1

Stephen23 2020-8-27

1
链接

此回答的直接链接

https://ww2.mathworks.cn/matlabcentral/answers/584897-fast-method-for-findings-which-rows-of-a-matrix-are-contained-in-another#answer_485651

编辑：Stephen23 2020-8-27

在 MATLAB Online 中打开

>> A = [1,2,3,4;5,6,7,8;9,10,11,12]
A =
     1     2     3     4
     5     6     7     8
     9    10    11    12
>> B = [5,6,7,8;9,10,11,12]
B =
     5     6     7     8
     9    10    11    12

Using permute and test for equality (for versions >=R2016b bsxfun is not required):

>> X = any(all(bsxfun(@eq,A,permute(B,[3,2,1])),2),3)
X =
     0
     1
     1
>> find(X) % optional
ans =
     2
     3

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

请先登录，再进行评论。

Answer 2

Bruno Luong 2020-8-27

1
链接

此回答的直接链接

https://ww2.mathworks.cn/matlabcentral/answers/584897-fast-method-for-findings-which-rows-of-a-matrix-are-contained-in-another#answer_485678

编辑：Bruno Luong 2020-8-27

在 MATLAB Online 中打开

Not know how much my code is faster than ISMEMBER (which IMO pretty fast) (EDIT: it's indeed > 3 time faster)

% Generate random A, B such that rows of B are subset of that of A
A = randi(10,1000,8);
B = randi(10,1000,8);
A = [A;B];
A = A(randperm(end),:);
tic
% iA contains rows in A that matches with B
[~,is] = sortrows([A;B]);
isB = is>size(A,1);
ismatch = ~isB(1:end-1)&isB(2:end);
iA = is(ismatch); % row index of A that match B
%iB = is([false;ismatch])-size(A,1); % matching index
%isequal(A(iA,:),B(iB,:))         % meaning isequal(A(iA,:),B(iB,:)) is TRUE
toc %  Elapsed time is 0.000656 seconds.

Do you have something unchanged during the loop? Is there any special properties of A and B (sorted already)? If yes it might exist faster methods using those charracteristics.

2 个评论
显示无隐藏无

Alexander Holmes 2020-8-27

编辑：Alexander Holmes 2020-8-27

Ismember probably is quite fast, it just wasn't fast enough for me the way I had written my code.

That said, I was being silly and could apply the same operations I used to get B to an array of indices without then having to use ismember at all. Thank you for your answer though!

Bruno Luong 2020-8-27

编辑：Bruno Luong 2020-8-27

在 MATLAB Online 中打开

My code is about 3.8 times faster than ismember for 2000/1000 rows

% Generate random A, B such that rows of B are subset of that of A
A = randi(10,1000,8);
B = randi(10,1000,8);
A = [A;B];
A = A(randperm(end),:);
ntries = 1000;
tic
for k=1:ntries
    % iA contains rows in A that matches with B
    [~,is] = sortrows([A;B]);
    isB = is>size(A,1);
    ismatch = ~isB(1:end-1)&isB(2:end);
    iA = is(ismatch); % row index of A that match B
    %iB = is([false;ismatch])-size(A,1); % matching index
    %isequal(A(iA,:),B(iB,:))         % meaning isequal(A(iA,:),B(iB,:)) is TRUE
end
tsort=toc; %  Elapsed time is 0.000656 seconds.
tic
for k=1:ntries
    tf = ismember(A, B, 'rows');
    iA = find(tf);
end
tismember=toc;
tic
for k=1:ntries
    X = any(all(bsxfun(@eq,A,permute(B,[3,2,1])),2),3);
end
teq=toc;
fprintf('tsort              = %f [s]\n', tsort);
fprintf('tismember          = %f [s]\n', tismember);
fprintf('teq                = %f [s]\n', teq);
% fprintf('tsort/tismember	= %f\n', tsort/tismember);
% fprintf('tismember/tsort	= %f\n', tismember/tsort);

Result for 2000/1000 rows

tsort              = 0.252900 [s]
tismember          = 0.934710 [s]
teq                = 12.506212 [s]

Result of 12/6 rows

%A = randi(6,10,8);
%B = randi(6,10,8);
%A = [A;B];
%A = A(randperm(end),:);
tsort              = 0.009770 [s]
tismember          = 0.126905 [s]
teq                = 0.006746 [s]

请先登录，再进行评论。

Fast method for findings which rows of a matrix are contained in another

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

采纳的回答

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

更多回答（1 个）

2 个评论
显示无隐藏无

另请参阅

类别

标签

产品

Community Treasure Hunt

Fast method for findings which rows of a matrix are contained in another

0 个评论 显示 -2更早的评论隐藏 -2更早的评论

采纳的回答

0 个评论 显示 -2更早的评论隐藏 -2更早的评论

更多回答（1 个）

2 个评论 显示 无隐藏 无

另请参阅

类别

标签

产品

Community Treasure Hunt

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

2 个评论
显示无隐藏无