How can I identify a pattern of occurrences over multiple days?
Hello all,
I am attempting to write a script that will look for a pattern of event occurrences over multiple days of data. Seems like it should be simple enough, yet I am scratching my head.
I want to identify spans of at least five days in which the event occurred on at least 5 out of 7 days. For example, if the event occurred on all five weekdays, did not occur over the weekend, and then occurred again on the next 3 weekdays, I would want to return the indices of all 10 of those days. If, a week later (perhaps after some random occurrences in between), the event occurred 3 days in a row, skipped a day, and then occurred on the 5th day, I would want a separate set of indices for that pattern.
The input: an array containing the date of each event as a whole-number datenum, e.g.:
dates= [734841 734842 734843 734844 734845 734848 734849 734850 734859 734860 734861 734863]
The output: a structure containing the indices of the members of each separate pattern, e.g.:
patternStructure(1).index = [1 2 3 4 5 6 7 8 9 10]
patternStructure(2).index = [20 21 22 23 24]
Thanks,
Peter
Answers (1)
Geoff
2012-4-13
Well, you could say that an event belongs to a pattern if the date of the current event minus the date of the event four places earlier is less than 7 days. In other words, five events fall within a seven-day window. That is:
in = (dates(5:end) - dates(1:end-4)) < 7;
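To make the test concrete, here is a minimal sketch that applies it to the sample dates from the question. The intermediate variable `span` is my own name for illustration, and the sketch assumes the dates are sorted in ascending order.

```matlab
% Sample data from the question (whole-number datenums, sorted ascending).
dates = [734841 734842 734843 734844 734845 734848 734849 734850 ...
         734859 734860 734861 734863];

% Days between each event and the event four places later.
% span is a hypothetical helper name, not part of the original answer.
span = dates(5:end) - dates(1:end-4);

% True where five consecutive events fit inside a seven-day window.
in = span < 7;

disp(span)   % 4 6 6 6 14 12 12 13
disp(in)     % 1 1 1 1 0 0 0 0
```

Note that `in` has four fewer elements than `dates`, because `in(k)` describes the five events `dates(k)` through `dates(k+4)`.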
Now, I offered a handy way to find sequences in another solution: http://www.mathworks.com.au/matlabcentral/answers/34481-interpolate-nans-only-if-less-than-4-consecutive-nans
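Here, too, you can exploit regexp to find the start and end indices of each sequence. Converting the logical vector to a string of '0' and '1' characters lets the pattern '1+' match each run of consecutive true values: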
[s,e] = regexp( char(in+'0'), '1+', 'start', 'end' );
Since in(k) compares dates(k) with dates(k+4), the end of each run is 4 values short of the last event it covers. Accounting for that, you can construct an array of indices:
patternStructure = arrayfun( @(n) struct('index', s(n):e(n)+4), 1:numel(s) );
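If you would rather avoid the string conversion, the same run boundaries can be found with `diff` on a zero-padded copy of `in`. This is an alternative sketch, not the regexp approach above, and it assumes `dates` is a sorted row vector. The data includes the extra test date 734864 that I add further down, so that two patterns show up.

```matlab
% Sample dates from the question plus the extra test date 734864.
dates = [734841 734842 734843 734844 734845 734848 734849 734850 ...
         734859 734860 734861 734863 734864];

% True where five consecutive events fit inside a seven-day window.
in = (dates(5:end) - dates(1:end-4)) < 7;

% Pad with zeros so runs touching either end of the vector are detected.
d = diff([0 in 0]);
s = find(d == 1);        % index of the first true value in each run
e = find(d == -1) - 1;   % index of the last true value in each run

% Shift each end by 4 because in(k) covers dates(k) through dates(k+4).
patternStructure = arrayfun(@(n) struct('index', s(n):e(n)+4), 1:numel(s));

disp(patternStructure(1).index)   % 1 2 3 4 5 6 7 8
disp(patternStructure(2).index)   % 9 10 11 12 13
```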
But now, these are indices into dates, and not actual date ranges. Your question is a little strange, given your data and your result.
See, the indices for the first detected pattern in your dates array are 1:8, not 1:10, although dates(8)-dates(1)+1 is indeed 10. This is the only range in your supplied data that fits the requirement. For testing, I added:
dates(end+1) = 734864;
That gave a second pattern, 5 events in 6 days, at indices 9:13.
Anyway, this code will detect your patterns, and it's up to you what you want to do with the indices after that =)
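As one possible follow-up, here is a rough sketch that turns each set of indices back into a date range. The names `firstDay`, `lastDay` and `spanDays` are hypothetical, and `patternStructure` is written out by hand with the two index sets detected above (1:8 and 9:13) so the snippet runs on its own.

```matlab
% Sample dates from the question plus the extra test date 734864.
dates = [734841 734842 734843 734844 734845 734848 734849 734850 ...
         734859 734860 734861 734863 734864];

% Index sets found by the detection code for these dates.
patternStructure = struct('index', {1:8, 9:13});

nPatterns = numel(patternStructure);
firstDay = zeros(1, nPatterns);
lastDay  = zeros(1, nPatterns);
spanDays = zeros(1, nPatterns);

for k = 1:nPatterns
    idx = patternStructure(k).index;
    firstDay(k) = dates(idx(1));                 % datenum of the first event
    lastDay(k)  = dates(idx(end));               % datenum of the last event
    spanDays(k) = lastDay(k) - firstDay(k) + 1;  % calendar days covered
    fprintf('Pattern %d: %s to %s, %d events over %d days\n', ...
            k, datestr(firstDay(k)), datestr(lastDay(k)), ...
            numel(idx), spanDays(k));
end
```

For the first pattern this gives 8 events over 10 days, which may be where the 1:10 in the question came from.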