Removing duplicate rows (not "unique")
显示 更早的评论
I have a matrix with many (1e5+) rows and I want to remove both copies of all duplicate rows. Is there a fast way to do this? (This function needs to be run many times.)
4 个评论
Azzi Abdelmalek
2016-5-4
Do you mean faster way then using unique function?
Michael Siebold
2016-5-4
jgg
2016-5-4
You can use the other calling methods to get replicate counts.
a = [1 2; 1 2; 2 3; 2 4; 2 5; 4 2; 4 2; 1 3; 1 3; 4 5];
[C,ia,ic] = unique(a,'rows');
[count key] = hist(ic,unique(ic));
Then you can just select the keys with non-unit counts and drop them.
Michael Siebold
2016-5-4
采纳的回答
更多回答(2 个)
Azzi Abdelmalek
2016-5-4
编辑:Azzi Abdelmalek
2016-5-4
A=randi(5,10^5,3);
tic
A=unique(A,'rows');
toc
The result
Elapsed time is 0.171778 seconds.
3 个评论
Michael Siebold
2016-5-4
Azzi Abdelmalek
2016-5-4
编辑:Azzi Abdelmalek
2016-5-4
You said that unique function will leave a copy of duplicate rows. With this example, I show you that there is no duplicates rows stored! And also it doesn't take much time
Mitsu
2021-8-3
I reckon your answer does not address OP's question because running the following:
A=[1 1 1;1 1 1;1 1 0];
tic
A=unique(A,'rows');
toc
Will yield:
A = 1 1 0
1 1 1
Therefore, A still contains one instance of each row that was duplicate. I believe Michael wanted all instances of each row that appears multiple times be removed.
GeeTwo
2022-8-16
0 个投票
%Here's a much cleaner way to do it with 2019a or later!
[B,BG]=groupcounts(A);
A_reduced=BG(B==1); % or just A if you want the results in the same variable.
类别
在 帮助中心 和 File Exchange 中查找有关 MATLAB 的更多信息
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!