Grouping times by end to start time

Hey everyone, I have to group event times together and can’t think of the best way to do it. I have two arrays of values: the first holds start times and the second holds end times. I need to group a collection of start and end times together if the gap between a group’s end time and the next event’s start time is within some given value.

For example, with a max gap time of 5, start times of [5, 7, 17, 21, 35, 37], and end times of [12, 9, 22, 23, 38, 41], I need to group the start times as {[5, 7, 17, 21], [35, 37]} and the end times as {[12, 9, 22, 23], [38, 41]}.

It works like this: event 1 has the earliest start time, so it is stored first, giving {[5]} and {[12]}. Then the other events’ start times are checked to see if they are within the gap time of the first event’s end time. This leads to start times {[5, 7, 17]} and end times {[12, 9, 22]}. Then the other events are checked again to see if their start times are within the gap time of the group’s maximum end time (in this case 22). Accordingly, the new start times are {[5, 7, 17, 21]} and the end times are {[12, 9, 22, 23]}. The group’s end time is now 23. No other events have start times within the gap time of it, so the process repeats with the remaining events. Thank you everyone!

Answers (4)

Chunru 2023-4-6
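t1 = [5, 7, 17, 21, 35, 37]; % start times (assumed sorted in ascending order)
t2 = [12, 9, 22, 23, 38, 41]; % end times
% Running maximum of the end times: the latest end seen so far.
t2_c = cummax(t2)
t2_c = 1×6
12 12 22 23 38 41
% Gap between each start time and the latest end time before it.
gap = [0 t1(2:end) - t2_c(1:end-1)]
gap = 1×6
0 -5 5 -1 12 -1
% Flag the events that start more than 5 (the maximum gap) after the latest end.
gap = gap > 5
gap = 1×6 logical array
0 0 0 0 1 0
% Every flagged event begins a new group, and so does the first event.
idx = find([true gap(2:end)])
idx = 1×2
1 5
% Append a sentinel so that the last group ends at the final event.
idx = [idx length(t1)+1];
% Slice the start and end times into one cell per group.
for i=1:length(idx)-1
    t1out{i} = t1(idx(i):idx(i+1)-1);
    t2out{i} = t2(idx(i):idx(i+1)-1);
end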
t1out, t2out
t1out = 1×2 cell array
{[5 7 17 21]} {[35 37]}
t2out = 1×2 cell array
{[12 9 22 23]} {[38 41]}
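
The same idea can be sketched without a loop, and with the events sorted first so that unsorted input is handled. This is only an illustration: the names `maxGap`, `order`, `runningEnd`, `isNewGroup`, and `groupId` are made up here, the input is assumed to be non-empty, and the gap test is assumed to be inclusive (a gap of exactly 5 stays in the group, as in the question's example).

```matlab
startTimes = [5, 7, 17, 21, 35, 37];
endTimes   = [12, 9, 22, 23, 38, 41];
maxGap     = 5;   % assumed inclusive: a gap equal to maxGap does not split a group

% Sort by start time so the running maximum of end times is meaningful.
[s, order] = sort(startTimes);
e = endTimes(order);

% Latest end time among each event and all events before it.
runningEnd = cummax(e);

% A new group starts at the first event and wherever the gap exceeds maxGap.
isNewGroup = [true, (s(2:end) - runningEnd(1:end-1)) > maxGap];

% Cumulative sum of the flags gives a group number for every event.
groupId = cumsum(isNewGroup);

% Collect each group's times into a cell. groupId is non-decreasing,
% so the order of events inside each group is preserved.
startGroups = accumarray(groupId(:), s(:), [], @(x) {x.'}).';
endGroups   = accumarray(groupId(:), e(:), [], @(x) {x.'}).';
```

Here `groupId` also gives a per-event label, which can be handy if the events need to stay in one array rather than being split into cells.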

Image Analyst 2023-4-6
If you have the Statistics and Machine Learning Toolbox, I'd use dbscan:
% Optional initialization steps
clc; % Clear the command window.
close all; % Close all figures (except those of imtool.)
clear; % Erase all existing variables. Or clearvars if you want.
workspace; % Make sure the workspace panel is showing.
format long g;
format compact;
fontSize = 14;
markerSize = 30;
startTimes = [5, 7, 17, 21, 35, 37];
endTimes = [12, 9, 22, 23, 38, 41];
subplot(2, 1, 1);
plot(startTimes, endTimes, 'b.', 'MarkerSize', markerSize);
grid on
xlabel('StartTimes', 'FontSize',fontSize)
ylabel('EndTimes', 'FontSize',fontSize)
title('Raw, unclassified points', 'FontSize',fontSize)
%--------------------------------------------------------------------------------------------------------------------
% Measure the distance between points.
xy = [startTimes(:), endTimes(:)]
xy = 6×2
5 12
7 9
17 22
21 23
35 38
37 41
distances = pdist2(xy, xy) % Just to see the distance between points.
distances = 6×6
0 3.60555127546399 15.6204993518133 19.4164878389476 39.6988664825584 43.1856457633784
3.60555127546399 0 16.4012194668567 19.7989898732233 40.3112887414927 43.8634243989226
15.6204993518133 16.4012194668567 0 4.12310562561766 24.0831891575846 27.5862284482674
19.4164878389476 19.7989898732233 4.12310562561766 0 20.5182845286832 24.0831891575846
39.6988664825584 40.3112887414927 24.0831891575846 20.5182845286832 0 3.60555127546399
43.1856457633784 43.8634243989226 27.5862284482674 24.0831891575846 3.60555127546399 0
%--------------------------------------------------------------------------------------------------------------------
% Do clustering with the "dbscan" algorithm.
% [classNumbers, corepts] = dbscan(distances, searchRadius, minPointsPerCluster, 'Distance','precomputed')
searchRadius = 5; % A point joins a cluster if it is within this distance of another point in it.
minPointsPerCluster = 2; % We need to have at least this many points to be considered a valid cluster.
[classNumbers, isACorePoint] = dbscan(xy, searchRadius, minPointsPerCluster)
classNumbers = 6×1
1 1 2 2 3 3
isACorePoint = 6×1 logical array
1 1 1 1 1 1
%--------------------------------------------------------------------------------------------------------------------
% Plot the clusters in unique colors.
subplot(2, 1, 2);
numClusters = max(classNumbers);
cMap = turbo(numClusters);
for k = 1 : numClusters
    thisClustersIndexes = classNumbers == k;
    plot(startTimes(thisClustersIndexes), endTimes(thisClustersIndexes), '.-', ...
        'MarkerSize', markerSize, 'LineWidth', 3, 'Color', cMap(k, :))
    hold on;
end
grid on
xlabel('StartTimes', 'FontSize',fontSize)
ylabel('EndTimes', 'FontSize',fontSize)
title('Now classified into groups', 'FontSize',fontSize)
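
The run above clusters on Euclidean distance in the (start, end) plane and returns three classes, whereas the question asks for two groups. A possible variation, sketched here under assumptions, is to pass dbscan a precomputed matrix of interval gaps instead. The names `maxGap` and `gapMatrix` are made up for this sketch, implicit expansion needs R2016b or later, and `minpts` is set to 1 so that no event is labeled as noise. Events 1 and 3 are exactly `maxGap` apart, so the result on this data depends on whether dbscan treats the radius as inclusive, which is worth confirming in the documentation.

```matlab
startTimes = [5, 7, 17, 21, 35, 37];
endTimes   = [12, 9, 22, 23, 38, 41];
maxGap = 5;

s = startTimes(:);
e = endTimes(:);

% Gap between every pair of events: the later start minus the earlier end,
% floored at zero so that overlapping events count as touching.
gapMatrix = max(max(s, s.') - min(e, e.'), 0);

% Cluster on the precomputed gaps (requires Statistics and Machine Learning Toolbox).
% minpts = 1 makes every event a core point, so clusters are chains of
% events whose pairwise gap is within maxGap.
groupId = dbscan(gapMatrix, maxGap, 1, 'Distance', 'precomputed');
```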

Image Analyst 2023-4-6
Edited: Image Analyst 2023-4-6
Not sure how you got your results but they don't seem to follow your definition of the gap time being the time between the start time and the end time: "the gap time between each group’s end time and start time". This is what I get:
startTimes = [5, 7, 17, 21, 35, 37];
endTimes = [12, 9, 22, 23, 38, 41];
% Measure "gaps" defined as difference between end times and start times.
gapTimes = endTimes - startTimes
gapTimes = 1×6
7 2 5 2 3 4
gapThreshold = 5;
% Find elements with a gap less than or equal to the threshold.
indexes = gapTimes <= gapThreshold;
startShort = startTimes(indexes)
startShort = 1×5
7 17 21 35 37
endShort = endTimes(indexes)
endShort = 1×5
9 22 23 38 41
% Find elements with a gap more than the threshold.
indexes = gapTimes > gapThreshold;
startLong = startTimes(indexes)
startLong = 5
endLong = endTimes(indexes)
endLong = 12
Or did you really mean "the gap time is the time between one group’s end time and the start time of the next group"?
2 Comments
Ryan 2023-4-6
You are correct, I explained it poorly. The gap time is the time between one group’s end time and the start time of another group.
Image Analyst 2023-4-6
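But some of your events overlap, so the gap to the next event comes out negative. Below, each event's end time minus the next event's start time is computed, so a positive value means the two events overlap:
startTimes = [ 5, 7, 17, 21, 35, 37];
endTimes = [12, 9, 22, 23, 38, 41];
% End time of each event minus the start time of the next one (positive = overlap).
gapTimes = endTimes(1:end-1) - startTimes(2:end)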
gapTimes = 1×5
5 -8 1 -12 1
What do you want to do in the case that the events overlap?
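
One possible reading, consistent with the grouping described in the question but not something the thread settles, is to measure each gap against the latest end time seen so far and to treat an overlap as a gap of zero. The sketch below compares that with the gap to the immediately preceding event. The variable names are made up for illustration, and the start times are assumed to be sorted.

```matlab
startTimes = [5, 7, 17, 21, 35, 37];
endTimes   = [12, 9, 22, 23, 38, 41];

% Gap to the immediately preceding event only (negative means overlap).
gapToPrevious = startTimes(2:end) - endTimes(1:end-1);

% Gap to the latest end time among all earlier events.
gapToLatestEnd = startTimes(2:end) - cummax(endTimes(1:end-1));

% Assumption: an overlap counts as "no gap" rather than a negative gap.
gapClamped = max(gapToLatestEnd, 0);
```

With a maximum gap of 5, the first version would split before the event starting at 17 (17 - 9 = 8), because event 2 ends earlier than event 1. The second version keeps that event in the group (17 - 12 = 5), which matches the grouping given in the question.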

Peter Perkins 2023-4-6
This is a lot like hierarchical clustering, if you happen to have the Statistics Toolbox.
