Finding Duplicate Values per Column
Greetings, suppose Column A has these values - 7 18 27 42 65 49 54 65 78 82 87 98
Is there a way to compare the values (row by row) and search for duplicates? (I'm using MATLAB R2010b.) I don't want the duplicated values to be removed.
Thanks.
Accepted Answer
Jan
2011-10-22
A = [7 18 27 42 65 49 54 65 78 82 87 98];
[n, bin] = histc(A, unique(A));
multiple = find(n > 1);
index = find(ismember(bin, multiple));
Now A(index) contains the values that appear multiple times.
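For comparison, here is a minimal sketch of the same idea using the third output of unique together with accumarray instead of histc. The variable names (group, counts, isDuplicate, duplicateIndex) are illustrative and not part of the answer above; the `~` output placeholder assumes R2009b or later.

```matlab
% Sketch: flag every element of A whose value occurs more than once.
A = [7 18 27 42 65 49 54 65 78 82 87 98];

% group(k) is the position of A(k) within the sorted unique values.
[~, ~, group] = unique(A);

% Count how many elements fall into each group.
counts = accumarray(group(:), 1);

% Look the count up for every element and keep the original shape of A.
isDuplicate = reshape(counts(group(:)) > 1, size(A));

duplicateIndex = find(isDuplicate)    % positions 5 and 8
duplicateValues = A(duplicateIndex)   % both are 65
```

Nothing is removed from A; the logical mask isDuplicate only marks the repeated entries.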
More Answers (4)
the cyclist
2011-10-22
Here's a slightly different way:
X = [1 2 3 4 5 5 5 1];
uniqueX = unique(X);
countOfX = hist(X,uniqueX);
indexToRepeatedValue = (countOfX~=1);
repeatedValues = uniqueX(indexToRepeatedValue)
numberOfAppearancesOfRepeatedValues = countOfX(indexToRepeatedValue)
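If the positions of the repeated entries are needed rather than just the values, one possible extension is to map the repeated values back onto X with ismember. This is only a sketch; isRepeated and indexOfRepeated are names introduced here for illustration.

```matlab
X = [1 2 3 4 5 5 5 1];
uniqueX = unique(X);
countOfX = hist(X, uniqueX);              % one bin per unique value

% Values that occur more than once.
repeatedValues = uniqueX(countOfX > 1);

% Logical mask over X, then the positions of all repeated entries.
isRepeated = ismember(X, repeatedValues);
indexOfRepeated = find(isRepeated)        % 1 5 6 7 8
```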
Wesley Allen
2018-2-9
Edited: Wesley Allen
2018-2-9
Duplicate Finding with Tolerance
To find duplicates within a tolerance (e.g., for non-integer values), I use the following. Note that uniquetol requires R2015a or later:
A = [1.313;2.4;2.400000001;1.31299999999;2.25;2.25;2.25000000001;3.7];
TOL = 1e-5;
uniqueA = uniquetol(A,TOL);
duplicateBool = abs(repmat(A,size(uniqueA.'))-repmat(uniqueA.',size(A))) < max(abs(uniqueA))*TOL;
duplicateCount = sum(duplicateBool).';
Just like with the cyclist's answer, if you want to isolate only the values that have more than one instance:
iDuplicate = (duplicateCount ~= 1);
repeatedValues = uniqueA(iDuplicate);
numberOfAppearancesOfRepeatedValues = duplicateCount(iDuplicate);
repeatedBool = duplicateBool(:,iDuplicate);
Using the Results
The unique values are in uniqueA:
>> uniqueA
uniqueA =
1.3130
2.2500
2.4000
3.7000
The quantity of each unique value is in duplicateCount:
>> duplicateCount
duplicateCount =
2
3
2
1
To get the indices of A corresponding to the n-th unique value, uniqueA(n):
>> n = 2;
>> uniqueA(n)
ans =
2.2500
>> duplicateIndex = find(duplicateBool(:,n))
duplicateIndex =
5
6
7
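As a possible shortcut, the counts can also be obtained from the third output of uniquetol, which assigns each element of A to one of the unique values. This is a sketch under the same data and tolerance as above; group and isDuplicate are names introduced here, and the grouping follows uniquetol's default relative tolerance rather than the explicit comparison used earlier.

```matlab
A = [1.313;2.4;2.400000001;1.31299999999;2.25;2.25;2.25000000001;3.7];
TOL = 1e-5;

% group(k) is the index into uniqueA of the value that A(k) matches.
[uniqueA, ~, group] = uniquetol(A, TOL);

% Number of elements of A assigned to each unique value.
duplicateCount = accumarray(group(:), 1);

% Mask over A: true where the element has at least one near-duplicate.
isDuplicate = duplicateCount(group(:)) > 1;
```

For this data, duplicateCount should match the counts shown above (2, 3, 2, 1), and isDuplicate is true for every element except the last one, 3.7.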
Fernando Meo
2018-8-13
Here is another answer (a one liner)
If AA is a 2D matrix and you wish to find the rows that contain duplicate values across their columns,
RowsWhichHaveDuplicates = find(arrayfun(@(i) (~isequal(length(unique(AA(i,:))),size(AA,2))), [1:size(AA,1)]));
Example
AA = [6 7 11 6; 7 11 4 8; 11 15 1 10; 15 4 14 12;
18 13 18 8; 12 13 18 1; 3 14 6 18];
>> RowsWhichHaveDuplicates = find(arrayfun(@(i) (~isequal(length(unique(AA(i,:))),size(AA,2))), [1:size(AA,1)]))
RowsWhichHaveDuplicates =
1 5
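The same check can be sketched without arrayfun by sorting each row and looking for equal neighbours. This is an alternative formulation, not the method used in the one-liner; sortedRows and hasDuplicate are illustrative names.

```matlab
AA = [6 7 11 6; 7 11 4 8; 11 15 1 10; 15 4 14 12;
      18 13 18 8; 12 13 18 1; 3 14 6 18];

% After sorting along each row, equal values sit next to each other.
sortedRows = sort(AA, 2);

% A zero difference between neighbours means the row has a duplicate.
hasDuplicate = any(diff(sortedRows, 1, 2) == 0, 2);

RowsWhichHaveDuplicates = find(hasDuplicate).'   % 1 5
```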
If your values are real (non-integer), a tolerance can be set by rounding them with the MATLAB "round" function to the number of decimal places you wish to use. The two-argument form round(x,n) requires R2014b or later.
AA = round(rand(10)*10,1); % First decimal place
AA =
6.0000 2.0000 0.4000 6.7000 9.4000 0.6000 8.3000 3.1000 1.0000 3.0000
9.1000 7.5000 0.6000 6.0000 0.7000 3.1000 0.3000 4.2000 9.0000 3.7000
2.5000 8.9000 5.0000 3.4000 7.2000 6.6000 8.4000 9.3000 9.0000 7.6000
8.6000 1.0000 4.1000 4.0000 8.3000 4.6000 2.6000 0.6000 0.8000 3.1000
7.6000 5.2000 2.2000 3.9000 7.3000 0.2000 6.6000 8.2000 5.2000 9.6000
2.2000 6.0000 4.3000 7.0000 5.1000 6.9000 6.7000 6.4000 2.8000 2.1000
4.2000 9.8000 9.5000 1.4000 5.2000 4.1000 2.6000 8.2000 8.8000 7.3000
1.3000 6.7000 2.0000 3.8000 7.6000 5.7000 3.3000 3.3000 6.7000 2.5000
9.2000 8.5000 7.1000 2.2000 6.3000 9.9000 2.5000 9.5000 1.2000 8.9000
2.9000 1.7000 7.8000 4.1000 0.7000 8.6000 7.1000 9.1000 3.7000 7.1000
RowsWhichHaveDuplicates = find(arrayfun(@(i) (~isequal(length(unique(AA(i,:))),size(AA,2))), [1:size(AA,1)]))
RowsWhichHaveDuplicates =
5 8 10
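One caveat with rounding as a tolerance: two values that are very close can still land on different sides of a rounding boundary. The sketch below uses made-up data (x is hypothetical, not from the example above) and the scaling form round(x*10)/10, which also works on releases that lack the two-argument round.

```matlab
x = [0.149999 0.150001 0.3];        % first two differ by only 2e-6

% Round to one decimal place without the two-argument round.
rounded = round(x*10)/10;           % 0.1 0.2 0.3

% Rounding does not report the first two values as duplicates.
hasDupAfterRounding = numel(unique(rounded)) < numel(rounded);

% Comparing sorted neighbours against an absolute tolerance does.
closePairs = abs(diff(sort(x))) < 1e-3;
```

Here hasDupAfterRounding is false while the first entry of closePairs is true, so a difference-based comparison may be the safer choice when values cluster near a rounding boundary.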
Hope this helps