Identify Duplicate values in an array and replace with Nan

14 次查看(过去 30 天)
Hello: I have a array as below of dimension 40,000x3. Third columns often contain duplicate values. I need to identify duplicate values and replace with Nan. Kindly help.
2.39300000000000 0 6.16800000000000
2.38720000000000 0 6.16800000000000
2.38480000000000 0 6.16800000000000
2.37380000000000 0 6.16800000000000
2.37410000000000 0 6.16800000000000
2.37020000000000 0 6.16800000000000
2.36880000000000 0 6.16800000000000
2.36350000000000 0 6.16800000000000

采纳的回答

Mrutyunjaya Hiremath
If you want to replace only the duplicates with 'NaN' and keep one occurrence of each value intact, here is the code:
% Sample array (Replace this with your 40,000x3 array)
array = [
2.3930, 0, 6.1680;
2.3872, 0, 6.1680;
2.3848, 0, 6.1780;
2.3738, 0, 6.1680;
2.3741, 0, 6.1690;
2.3702, 0, 6.1780;
2.3688, 0, 6.1690;
2.3635, 0, 6.1780;
];
% Extract the third column
third_col = array(:, 3);
% Find unique values and their first occurrence index
[unique_vals, ~, ic] = unique(third_col);
% Count the occurrence of each unique value
counts = accumarray(ic, 1);
% Identify values that occur more than once (duplicates)
duplicate_vals = unique_vals(counts > 1);
% Replace only duplicates with NaN, keep one occurrence of each value
for val = duplicate_vals'
idx = find(third_col == val);
third_col(idx(2:end)) = NaN; % Keep the first occurrence, replace the rest with NaN
end
% Update the third column in the original array
array(:, 3) = third_col;
% Display the updated array
disp(array);
2.3930 0 6.1680 2.3872 0 NaN 2.3848 0 6.1780 2.3738 0 NaN 2.3741 0 6.1690 2.3702 0 NaN 2.3688 0 NaN 2.3635 0 NaN
In this code, for each duplicate value, find its indices in the third column using 'find'. Then, keep the first occurrence (index idx(1)) and replace the rest (idx(2:end)) with NaN. This will leave one instance of each value in the third column and replace only the duplicates with 'NaN'.

更多回答(1 个)

Dyuman Joshi
Dyuman Joshi 2023-9-7
编辑:Dyuman Joshi 2023-9-7
Here's a much faster and simpler approach -
array = [2.3930, 0, 6.1680;
2.3872, 0, 6.1680;
2.3848, 0, 6.1780;
2.3738, 0, 6.1680;
2.3741, 0, 6.1690;
2.3702, 0, 6.1780;
2.3688, 0, 6.1690;
2.3635, 0, 6.1780];
%Get the unique values and the indices corresponding to their 1st occurence
%in order they appear in the array
[val,first_idx] = unique(array(:,3),'stable')
val = 3×1
6.1680 6.1780 6.1690
first_idx = 3×1
1 3 5
%Convert all the values of column 3 to NaN
array(:,3) = NaN;
%Re-assign the values according to the indices
array(first_idx,3) = val
array = 8×3
2.3930 0 6.1680 2.3872 0 NaN 2.3848 0 6.1780 2.3738 0 NaN 2.3741 0 6.1690 2.3702 0 NaN 2.3688 0 NaN 2.3635 0 NaN

类别

Help CenterFile Exchange 中查找有关 Matrices and Arrays 的更多信息

标签

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by