How to subset data based on time range

3 次查看(过去 30 天)
Hi,
I am wondering how to subset data based on time range ? For example I would like to subset the maximum number of rows that fall within an 18 hours time range in the table below:
data = {"2017-01-12 09:50:46" "2017-01-15 13:50:46" "2017-01-15 14:50:50" "2017-01-15 17:52:06" "2017-01-15 18:52:22" "2017-01-16 11:52:22" "2017-01-19 09:52:22"}';
t = cell2table(data);
t.data = datetime(t.data, 'InputFormat', 'yyyy-MM-dd HH:mm:ss')
The thing is, I don’t know what is the start point or the end point. All I know is that the correct answer is the most inclusive one (i.e the one with the largest number of rows). In the example here, the correct answer is t.data(2:5) because all the points fall within 18 hours of each other and because it is the answer with the largest number of rows (4). The data is set up in a way where there can only be one correct answer (only one combination has the largest number of rows). I hope I am being clear.
How would I do that ?
Thank you,

采纳的回答

Star Strider
Star Strider 2020-3-5
编辑:Star Strider 2020-3-5
I am not certain what result you want.
This computes all values of ‘t.data’ that are within 18 hours of a particular row value:
data = {"2017-01-12 09:50:46" "2017-01-15 13:50:46" "2017-01-15 14:50:50" "2017-01-15 17:52:06" "2017-01-15 18:52:22" "2017-01-19 09:52:22"}';
t = cell2table(data);
t.data = datetime(t.data, 'InputFormat', 'yyyy-MM-dd HH:mm:ss');
h18 = t.data + hours(18);
for k = 1:size(t.data,1)
within18{k,:} = t.data((t.data >= t.data(k)) & (t.data <= h18(k)));
end
For example:
Within18_3 = [within18{3}]
contains:
Within18_3 =
3×1 datetime array
15-Jan-2017 14:50:50
15-Jan-2017 17:52:06
15-Jan-2017 18:52:22
The first row in ‘Within18_3’ are the third row value and the next two rows are those within 18 hours of it.
EDIT — (5 Mar 2020 at 17:36)
If you want ±9 hours instead, this works:
h18 = [t.data + hours(-9), t.data + hours(9)];
for k = 1:size(t.data,1)
within18{k,:} = t.data((t.data >= h18(k,1)) & (t.data <= h18(k,2)));
end
For example:
Within18_3 = [within18{3}]
now contains:
Within18_3 =
4×1 datetime array
15-Jan-2017 13:50:46
15-Jan-2017 14:50:50
15-Jan-2017 17:52:06
15-Jan-2017 18:52:22
  2 个评论
012786534
012786534 2020-3-5
Hi Star Strider,
Indeed, I should have been clearer. My apologies. I have clarified my question above. Thank you.
Star Strider
Star Strider 2020-3-5
To get the set with the greatest number of rows:
[~,idx] = max(cellfun(@(x)size(x,1), within18))
Out = [within18{idx}]
That works however you want to define it, however it will return only the first set if there are several with the same maximum number of rows.
To return all that have the maximum number of rows:
rowsizes = cellfun(@(x)size(x,1), within18);
maxrows = max(rowsizes)
idx = find(rowsizes == maxrows)
Out = [within18{idx}]

请先登录,再进行评论。

更多回答(1 个)

Guillaume
Guillaume 2020-3-5
It's not too clear what you mean by subsetting.If you want to bin the dates in ranges of 18 hours then use discretize:
discretize(t.data, 'hours(18)') %split into 18 hours bins. Returns the bin indices

类别

Help CenterFile Exchange 中查找有关 Line Plots 的更多信息

标签

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by