delete rows depend on repeated value from 1 col

1 次查看(过去 30 天)
Hi I have a 20000x60 cell matrix I want to delete rows depend on repeated values from the first columns.
s = {
1 '2013-10-01' '05:35:00' '01:05:00'
2 '2013-10-01' '10:20:00' '00:00:00'
2 '2013-10-01' '10:20:00' '00:00:00'
3 '2013-10-01' '11:00:00' '00:40:00'
3 '2013-10-01' '11:00:00' '00:40:00'
3 '2013-10-01' '11:00:00' '00:40:00'
3 '2013-10-01' '11:00:00' '00:40:00'
5 '2013-10-01' '14:50:00' '00:50:00'
5 '2013-10-01' '14:50:00' '00:50:00'
6 '2013-10-01' '15:15:00' '01:15:00'
7 '2013-10-01' '15:55:00' '00:05:00'
7 '2013-10-01' '15:55:00' '00:05:00'
7 '2013-10-01' '15:55:00' '00:05:00'
7 '2013-10-01' '15:55:00' '00:05:00'}
the result should like:
s1 =
1 '2013-10-01' '05:35:00' '01:05:00'
2 '2013-10-01' '10:20:00' '00:00:00'
3 '2013-10-01' '11:00:00' '00:40:00'
5 '2013-10-01' '14:50:00' '00:50:00'
6 '2013-10-01' '15:15:00' '01:15:00'
7 '2013-10-01' '15:55:00' '00:05:00'
8 '2013-10-01' '19:00:00' '00:05:00'
9 '2013-10-01' '19:10:00' '00:10:00'
I use unique for it but because the different type of data there is Error using cell/unique (line 85) Input A must be a cell array of strings."

采纳的回答

Star Strider
Star Strider 2016-2-16
If the number in the first column is the same for all repeated values in the rest of the row, just use it.
Using your example data:
s = { 1 '2013-10-01' '05:35:00' '01:05:00'
2 '2013-10-01' '10:20:00' '00:00:00'
2 '2013-10-01' '10:20:00' '00:00:00'
3 '2013-10-01' '11:00:00' '00:40:00'
3 '2013-10-01' '11:00:00' '00:40:00'
3 '2013-10-01' '11:00:00' '00:40:00'
3 '2013-10-01' '11:00:00' '00:40:00'
5 '2013-10-01' '14:50:00' '00:50:00'
5 '2013-10-01' '14:50:00' '00:50:00'
6 '2013-10-01' '15:15:00' '01:15:00'
7 '2013-10-01' '15:55:00' '00:05:00'
7 '2013-10-01' '15:55:00' '00:05:00'
7 '2013-10-01' '15:55:00' '00:05:00'
7 '2013-10-01' '15:55:00' '00:05:00'};
sc1 = cellfun(@(x)num2str(x, '%.0f'), s(:,1));
[s1u, ia] = unique(sc1);
s1 = s(ia,:)
s1 =
[1.0000e+000] '2013-10-01' '05:35:00' '01:05:00'
[2.0000e+000] '2013-10-01' '10:20:00' '00:00:00'
[3.0000e+000] '2013-10-01' '11:00:00' '00:40:00'
[5.0000e+000] '2013-10-01' '14:50:00' '00:50:00'
[6.0000e+000] '2013-10-01' '15:15:00' '01:15:00'
[7.0000e+000] '2013-10-01' '15:55:00' '00:05:00'
You may have to modify this slightly if your actual cell array is different, but it works with the array you posted, and as I interpreted it.
  8 个评论
Star Strider
Star Strider 2016-2-17
The '%.0f' is one of a number of format descriptors (see the link Stephen provided for a full discussion of all of them), this one telling MATLAB to produce a string representing a floating-point number with no digits to the right of the decimal. So using that format descriptor, pi=3.1415... would print as 3 with no trailing decimal point. When you read the documentation, I leave it for you to determine the reason I chose this format rather than, for example, '%d'.

请先登录,再进行评论。

更多回答(1 个)

MHN
MHN 2016-2-16
编辑:MHN 2016-2-16
Consider this example (it is not the fastest and easiest way, but it solves your problem)
s = {3 '2013-10-01' '11:00:00' '00:40:00'; 3 '2013-10-01' '11:00:00' '00:40:00'; 3 '2013-10-01' '11:00:00' '00:40:00'; 5 '2013-10-01' '14:50:00' '00:50:00'; 5 '2013-10-01' '14:50:00' '00:50:00'; 6 '2013-10-01' '15:15:00' '01:15:00'; 7 '2013-10-01' '15:55:00' '00:05:00'};
col1 = cell2mat(s(:,1));
uniqcol1 = union(col1,col1);
% change all the repeated rows to '0' (or some number which is unused in the first column) except the first one and then remove them.
for i=1:length(uniqcol1)
for j = 1:size(s,1)
if uniqcol1(i)==s{j,1}
for k = j+1: size(s,1)
if uniqcol1(i)==s{k,1}
s{k,1} = 0;
end
end
end
end
end
col1 = cell2mat(s(:,1));
s(col1==0,:)=[];
  1 个评论
bero
bero 2016-2-17
It is so slowly for my my big matrix, I need more efficient and faster process...

请先登录,再进行评论。

类别

Help CenterFile Exchange 中查找有关 Logical 的更多信息

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by