Averaging timeseries with different number of data points in each time interval

3 次查看(过去 30 天)
I have an excel file with a high frequency time series of sea level depth data and want to average either over 1 minute or 5 minutes to create a smooth tidal curve. The data alternates between having 5,4 and 3 readings per second and as such has a different number of data points per minute or five minute interval. (I am also having some difficulty translating the time correctly into matlab despite trying a few different methods.)
How do I go about averaging this data to the desired interval?
...a sample of the data below, though it extends to 19:00:00
Date Time dpth01
23-Aug-13 06:00:00 2.5870821
23-Aug-13 06:00:00 2.5886297
23-Aug-13 06:00:00 2.589405
23-Aug-13 06:00:00 2.589495
23-Aug-13 06:00:00 2.5885118
23-Aug-13 06:00:01 2.5824207
23-Aug-13 06:00:01 2.5806436
23-Aug-13 06:00:01 2.5826254
23-Aug-13 06:00:02 2.5822377
23-Aug-13 06:00:02 2.5875535
23-Aug-13 06:00:02 2.5879877
23-Aug-13 06:00:02 2.5863346
23-Aug-13 06:00:02 2.5842102
23-Aug-13 06:00:03 2.5792418
23-Aug-13 06:00:03 2.5762738
23-Aug-13 06:00:03 2.5781688
23-Aug-13 06:00:03 2.5772973

回答(2 个)

Peter Perkins
Peter Perkins 2016-11-2
编辑:Peter Perkins 2016-11-9
If you happen to have access to R2016b (the latest MATLAB release), you could try using timetables. Assuming you start with a file that more or less looks like what you've shown, readtable (in R2016b) will give you something like this:
>> t = readtable('tmp5.csv');
t =
Date Time dpth01
___________ __________ ______
23-Aug-2013 '06:00:00' 2.5871
23-Aug-2013 '06:00:00' 2.5886
23-Aug-2013 '06:00:00' 2.5894
23-Aug-2013 '06:00:00' 2.5895
23-Aug-2013 '06:00:00' 2.5885
23-Aug-2013 '06:00:01' 2.5824
23-Aug-2013 '06:00:01' 2.5806
23-Aug-2013 '06:00:01' 2.5826
23-Aug-2013 '06:00:02' 2.5822
23-Aug-2013 '06:00:02' 2.5876
23-Aug-2013 '06:00:02' 2.588
23-Aug-2013 '06:00:02' 2.5863
23-Aug-2013 '06:00:02' 2.5842
23-Aug-2013 '06:00:03' 2.5792
23-Aug-2013 '06:00:03' 2.5763
23-Aug-2013 '06:00:03' 2.5782
23-Aug-2013 '06:00:03' 2.5773
Then massage the table into a timetable:
>> d = t.Date + timeofday(datetime(t.Time,'InputFormat','HH:mm:ss'));
>> d.Format = 'dd-MMM-yyyy HH:mm:ss';
>> tt = table2timetable(t(:,'dpth01'),'RowTimes',d)
tt =
Time dpth01
____________________ ______
23-Aug-2013 06:00:00 2.5871
23-Aug-2013 06:00:00 2.5886
23-Aug-2013 06:00:00 2.5894
23-Aug-2013 06:00:00 2.5895
23-Aug-2013 06:00:00 2.5885
23-Aug-2013 06:00:01 2.5824
23-Aug-2013 06:00:01 2.5806
23-Aug-2013 06:00:01 2.5826
23-Aug-2013 06:00:02 2.5822
23-Aug-2013 06:00:02 2.5876
23-Aug-2013 06:00:02 2.588
23-Aug-2013 06:00:02 2.5863
23-Aug-2013 06:00:02 2.5842
23-Aug-2013 06:00:03 2.5792
23-Aug-2013 06:00:03 2.5763
23-Aug-2013 06:00:03 2.5782
23-Aug-2013 06:00:03 2.5773
retime is the way to compute means over specified time periods:
>> retime(tt,'secondly','mean') % every second
ans =
Time dpth01
____________________ ______
23-Aug-2013 06:00:00 2.5886
23-Aug-2013 06:00:01 2.5819
23-Aug-2013 06:00:02 2.5857
23-Aug-2013 06:00:03 2.5777
>> retime(tt,datetime(2013,8,23,6,0,0:5:15),'mean') % every 5 seconds
ans =
4×1 timetable array
Time dpth01
____________________ ______
23-Aug-2013 06:00:00 2.584
23-Aug-2013 06:00:05 NaN
23-Aug-2013 06:00:10 NaN
23-Aug-2013 06:00:15 NaN
In the last case, there's not enough data for anything interesting, but you get the idea. There's all kinds of other interpolation and aggregation options in retime as well.
Hope this helps.
  3 个评论
Peter Perkins
Peter Perkins 2016-11-9
I apologize, that line was a red herring. I have deleted that line, it was because I was initially trying to demonstrate what you'd do prior to R2016b, when the first column of the file would have come in as text.
However, the fact that you got that error leads me to think that you are using something prior to R2016b, or something went wrong with reading the file. What release are you using? If prior to R2016b, you can still more or less do what I've illustrated, just not with timetables.
Kelly Hiersche
Kelly Hiersche 2020-4-1
Thank you Mr. Perkins!! I had a similar issue and this worked PERFECTLY! Greatly appreciated!
Best
Kelly

请先登录,再进行评论。


Walter Roberson
Walter Roberson 2016-10-31
There are two basic approaches:
1) decide upon a common time step and starting time, and use interp1() to interpolate the values at that common time step. For example you might decide to interpolate at every 1/2 second no matter what time base the inputs were; Or,
2) merge all of the sample times into one vector. Then use interp1() to interpolate at all of the sample times. The interval between adjacent time samples could come out uneven but every input value would appear in the output.
Either way, once you have the data interpolated at the same time points, you can take sums or differences meaningfully.
  2 个评论
Dom Smith
Dom Smith 2016-11-1
I am not too concerned about interpolating the data though, due to how high frequency it is, I really want to just average what I have over 5 minute time slots. Is there no way to select data related to particular times... ie. select depths from 06:00:00 to 06:04:59 and average these, perhaps looping this to work through the period in question?
Walter Roberson
Walter Roberson 2016-11-2
Resample to a common time base with points closer together than twice the least time delta (so that every point has an influence on the output.) reshape() so each column includes the amount of time you need. mean()

请先登录,再进行评论。

类别

Help CenterFile Exchange 中查找有关 Data Type Identification 的更多信息

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by