Summing elements in an array between uneven date intervals
1 次查看(过去 30 天)
显示 更早的评论
I have a dataset of daily rainfall. I also have a list of 28 days on which environmental sampling was carried out. I'm trying to figure out how to sum the total precipitation that fell between each environmental sampling (represented by "sample_dates")
The environmental samplings are mostly about 30 days apart, but not exactly. I already know how to find the monthly precipitation, but what I want is the exact amount of rainfall that fell between each sampling.
Any advice on what functions to use would be so appreciated. I've read documentation for cumsum, ismember, movsum, and a bunch of previously asked questions but I can't find anything that's helped me come up with a solution.
T = readtable("precip_test.csv", "VariableNamingRule","preserve");
A = table2array(T);
%sequential dates from Sept 2019 to Dec 2022
rain_dates = A(:,1);
%daily precipitation in mm for each day
rain_mm = A(:,2);
%dates that environmental samples were taken
sample_dates = A(:,3);
sample_dates(any(isnan(sample_dates), 2), :) = [];
采纳的回答
Star Strider
2023-10-6
编辑:Star Strider
2023-10-6
I am not certain what you want. If I am readiing the file correctly and converting the dates correctly (both 'excel' and 'posixtime' give reasonable results for the dates, however the ‘Date Time’ and ‘Sample Dates’ do not make sense with 'excel') the ‘Sample Dates’ seem to give appropriate results with ‘Precip (mm)’ so it would seem that with that we are finished and nothing further need be done.
T = readtable("precip_test.csv", "VariableNamingRule","preserve")
DateTime = datetime(T.Date, 'ConvertFrom','excel');
SampleDates = datetime(T.('Sample dates'), 'ConvertFrom','excel');
T2 = table(DateTime, T.('Precip (mm)'), SampleDates, 'VariableNames',{'Date Time','Precip (mm)','Sample Dates'})
T2 = rmmissing(T2); % Remove 'NaT' Rows
LastLine = T2(end,:)
VN = T2.Properties.VariableNames;
figure
stairs(T2.('Date Time'), T2.('Precip (mm)'))
grid
xlabel(VN{3})
ylabel(VN{2})
return % Stop Here
A = table2array(T);
%sequential dates from Sept 2019 to Dec 2022
rain_dates = A(:,1);
%daily precipitation in mm for each day
rain_mm = A(:,2);
%dates that environmental samples were taken
sample_dates = A(:,3);
sample_dates(any(isnan(sample_dates), 2), :) = [];
What else do you want to do with these resullts?
.
15 个评论
Emily
2023-10-6
Dear Star Strider,
Thank you for your reply. I should have been clearer, the dates are in excel datenumber format.
The end result that I am after is the sum of precipitation (measured daily – that's what the long list of dates is for) that fell between each sampling date (the shorter list of dates, they are all contained on the long list of dates).
in other words, in case that's still not clear:
every day we measure how much it rains. every approximately (not exactly) ~30 days we take a sample. how much rain falls in total between the samples? the sample interval is uneven, so that's why I'm having issues figuring out how to approach this.
thanks again!
Star Strider
2023-10-6
编辑:Star Strider
2023-10-6
O.K. I changed that in my earlier code, so they now convert from 'excel' however now the 'Sample Dates' do not make sense in the context of 'Date Time'.
What do we do with the date information?
T = readtable("precip_test.csv", "VariableNamingRule","preserve")
T = 1338×3 table
Date Precip (mm) Sample dates
_____ ___________ ____________
43709 6.7 43773
43710 0 43802
43711 0 43838
43712 0.4 43865
43713 15.9 43899
43714 5 43927
43715 4.8 43959
43716 2 43984
43717 0.1 43984
43718 0 44018
43719 0 44075
43720 0 44102
43721 0 44144
43722 0.4 44166
43723 4.6 44204
43724 12.8 44235
DateTime = datetime(T.Date, 'ConvertFrom','excel');
SampleDates = datetime(T.('Sample dates'), 'ConvertFrom','excel');
T2 = table(DateTime, T.('Precip (mm)'), SampleDates, 'VariableNames',{'Date Time','Precip (mm)','Sample Dates'})
T2 = 1338×3 table
Date Time Precip (mm) Sample Dates
___________ ___________ ____________
01-Sep-2019 6.7 04-Nov-2019
02-Sep-2019 0 03-Dec-2019
03-Sep-2019 0 08-Jan-2020
04-Sep-2019 0.4 04-Feb-2020
05-Sep-2019 15.9 09-Mar-2020
06-Sep-2019 5 06-Apr-2020
07-Sep-2019 4.8 08-May-2020
08-Sep-2019 2 02-Jun-2020
09-Sep-2019 0.1 02-Jun-2020
10-Sep-2019 0 06-Jul-2020
11-Sep-2019 0 01-Sep-2020
12-Sep-2019 0 28-Sep-2020
13-Sep-2019 0 09-Nov-2020
14-Sep-2019 0.4 01-Dec-2020
15-Sep-2019 4.6 08-Jan-2021
16-Sep-2019 12.8 08-Feb-2021
T2 = rmmissing(T2); % Remove 'NaT' Rows
LastLine = T2(end,:)
LastLine = 1×3 table
Date Time Precip (mm) Sample Dates
___________ ___________ ____________
28-Sep-2019 0 07-Jun-2022
VN = T2.Properties.VariableNames;
figure
stairs(T2.('Date Time'), T2.('Precip (mm)'))
grid
xlabel(VN{3})
ylabel(VN{2})
TT = table2timetable(T2(:,[1 2]))
TT = 28×1 timetable
Date Time Precip (mm)
___________ ___________
01-Sep-2019 6.7
02-Sep-2019 0
03-Sep-2019 0
04-Sep-2019 0.4
05-Sep-2019 15.9
06-Sep-2019 5
07-Sep-2019 4.8
08-Sep-2019 2
09-Sep-2019 0.1
10-Sep-2019 0
11-Sep-2019 0
12-Sep-2019 0
13-Sep-2019 0
14-Sep-2019 0.4
15-Sep-2019 4.6
16-Sep-2019 12.8
TTrainfall = retime(TT, 'monthly','sum')
TTrainfall = timetable
Date Time Precip (mm)
___________ ___________
01-Sep-2019 80
% return % Stop Here
%
% A = table2array(T);
%
% %sequential dates from Sept 2019 to Dec 2022
% rain_dates = A(:,1);
%
% %daily precipitation in mm for each day
% rain_mm = A(:,2);
%
% %dates that environmental samples were taken
% sample_dates = A(:,3);
% sample_dates(any(isnan(sample_dates), 2), :) = [];
Since there is only one month, there is only one result, showing that for the data available, September 2019 had 80 mm of liquid precipitation.
.
Emily
2023-10-6
编辑:Emily
2023-10-6
I tried your code and everything is loaded correctely.
So, now I want to sum up precipitation from between the first two 'Sample Dates' in the list, which are:
'04-Nov-2019'
'03-Dec-2019'
'DateTime' has dates for every single day, this date range from the 'Sample Dates' is represented by
DateTime(65,1) through DateTime(94,1)
My goal is to sum up the precipitation corresponding to the date range.
The precip corresponding to the these ranges is represented as:
T.('Precip (mm)')(65,1) through T.('Precip (mm)')(94,1)
Then I want to repeat this process for all of the dates in the 'Sample Dates' list
Does this make sense?
Thank you for your continued help.
Emily
2023-10-6
I cannot use the retime function to resample at the 1 month interval, because what I am trying to calculate is not exactly the same as monthly precipitation. I need to calculate the precipitation between each sampling date which is never exactly 1 month, it varies each time. So I need to sum over a custom interval using the sampling dates as inputs.
Star Strider
2023-10-6
I do not understand the sampling dates. They make no sense with respect to the first column dates.
How are they supposed to be used?
Emily
2023-10-6
These are the sample dates:
'04-Nov-2019'
'03-Dec-2019'
'08-Jan-2020'
'04-Feb-2020'
'09-Mar-2020'
'06-Apr-2020'
'08-May-2020'
'02-Jun-2020'
'02-Jun-2020'
'06-Jul-2020'
'01-Sep-2020'
'28-Sep-2020'
'09-Nov-2020'
'01-Dec-2020'
'08-Jan-2021'
'08-Feb-2021'
'08-Mar-2021'
'09-Apr-2021'
'05-Jun-2021'
'08-Jul-2021'
'06-Sep-2021'
'01-Nov-2021'
'08-Dec-2021'
'06-Jan-2022'
'04-Mar-2022'
'01-Apr-2022'
'04-May-2022'
'07-Jun-2022'
They are all within the DateTime dates (daily dates from Sept. 2019 - Nov 2023)
The sample dates are needed to create the intervals over which I need to sum precipitation (measured daily).
For example I want to sum up daily precip from:
'04-Nov-2019' to '03-Dec-2019'
'03-Dec-2019' to '08-Jan-2020'
'08-Jan-2020' to '04-Feb-2020'
...
'04-May-2022' to '07-Jun-2022'
The reason I can't just use monthly precip is because the amount of time between these intervals is slightly different.
Hopefully this provides some clarity.
Emily
2023-10-6
In case its not clear, the daily precipitation corresponds to the DateTime dates. The Sampling Dates is the list of intervals over which to sum.
Star Strider
2023-10-6
The file contains only data for September 2019, and the first sampling interval is betweeen 4 November and 3 December 2019.
That makes absolutely no sense.
Emily
2023-10-6
I see the issue. Your T2 has truncated the dates. Just use this and you should be able to see that the file contains data from Sept 2019 - Nov 2023.
T = readtable("precip_test.csv", "VariableNamingRule","preserve")
DateTime = datetime(T.Date, 'ConvertFrom','excel');
SampleDates = datetime(T.('Sample dates'), 'ConvertFrom','excel');
Star Strider
2023-10-6
编辑:Star Strider
2023-10-7
The dates are truncated because several of the 'Sample Dates' column were listed as NaT.
Separating out 'Sample Dates' to a separate table permits logical indexing based on its elements.
Try this —
T = readtable("precip_test.csv", "VariableNamingRule","preserve")
T = 1338×3 table
Date Precip (mm) Sample dates
_____ ___________ ____________
43709 6.7 43773
43710 0 43802
43711 0 43838
43712 0.4 43865
43713 15.9 43899
43714 5 43927
43715 4.8 43959
43716 2 43984
43717 0.1 43984
43718 0 44018
43719 0 44075
43720 0 44102
43721 0 44144
43722 0.4 44166
43723 4.6 44204
43724 12.8 44235
DateTime = datetime(T.Date, 'ConvertFrom','excel');
SampleDates = datetime(T.('Sample dates'), 'ConvertFrom','excel');
T2 = table(DateTime, T.('Precip (mm)'), SampleDates, 'VariableNames',{'Date Time','Precip (mm)','Sample Dates'})
T2 = 1338×3 table
Date Time Precip (mm) Sample Dates
___________ ___________ ____________
01-Sep-2019 6.7 04-Nov-2019
02-Sep-2019 0 03-Dec-2019
03-Sep-2019 0 08-Jan-2020
04-Sep-2019 0.4 04-Feb-2020
05-Sep-2019 15.9 09-Mar-2020
06-Sep-2019 5 06-Apr-2020
07-Sep-2019 4.8 08-May-2020
08-Sep-2019 2 02-Jun-2020
09-Sep-2019 0.1 02-Jun-2020
10-Sep-2019 0 06-Jul-2020
11-Sep-2019 0 01-Sep-2020
12-Sep-2019 0 28-Sep-2020
13-Sep-2019 0 09-Nov-2020
14-Sep-2019 0.4 01-Dec-2020
15-Sep-2019 4.6 08-Jan-2021
16-Sep-2019 12.8 08-Feb-2021
VN = T2.Properties.VariableNames;
T3 = T2(:,3); % Separate 'Sample Dates'
T3 = rmmissing(T3)
T3 = 28×1 table
Sample Dates
____________
04-Nov-2019
03-Dec-2019
08-Jan-2020
04-Feb-2020
09-Mar-2020
06-Apr-2020
08-May-2020
02-Jun-2020
02-Jun-2020
06-Jul-2020
01-Sep-2020
28-Sep-2020
09-Nov-2020
01-Dec-2020
08-Jan-2021
08-Feb-2021
T2 = T2(:,[1 2])
T2 = 1338×2 table
Date Time Precip (mm)
___________ ___________
01-Sep-2019 6.7
02-Sep-2019 0
03-Sep-2019 0
04-Sep-2019 0.4
05-Sep-2019 15.9
06-Sep-2019 5
07-Sep-2019 4.8
08-Sep-2019 2
09-Sep-2019 0.1
10-Sep-2019 0
11-Sep-2019 0
12-Sep-2019 0
13-Sep-2019 0
14-Sep-2019 0.4
15-Sep-2019 4.6
16-Sep-2019 12.8
LastLine = T2(end,:)
LastLine = 1×2 table
Date Time Precip (mm)
___________ ___________
30-Nov-2023 0
PrecipStats = table('Size',[size(T3,1)-1,3], 'VariableTypes',{'datetime','datetime','double'});
for k = 1:size(T3,1)-1
Lv = T2{:,1} >= T3{k,1} & T2{:,1} < T3{k+1,1};
PrecipStats(k,:) = {T3{k,1}, T3{k+1,1}, sum(T2{Lv,2})};
end
PrecipStats.Properties.VariableNames = {'Start Date','End Date','Precip Total'}
PrecipStats = 27×3 table
Start Date End Date Precip Total
___________ ___________ ____________
04-Nov-2019 03-Dec-2019 48.6
03-Dec-2019 08-Jan-2020 103.6
08-Jan-2020 04-Feb-2020 119
04-Feb-2020 09-Mar-2020 47.9
09-Mar-2020 06-Apr-2020 44
06-Apr-2020 08-May-2020 71.1
08-May-2020 02-Jun-2020 56.8
02-Jun-2020 02-Jun-2020 0
02-Jun-2020 06-Jul-2020 99
06-Jul-2020 01-Sep-2020 115
01-Sep-2020 28-Sep-2020 73.1
28-Sep-2020 09-Nov-2020 97.1
09-Nov-2020 01-Dec-2020 66.6
01-Dec-2020 08-Jan-2021 155
08-Jan-2021 08-Feb-2021 68.2
08-Feb-2021 08-Mar-2021 65.4
LastLine = PrecipStats(end,:)
LastLine = 1×3 table
Start Date End Date Precip Total
___________ ___________ ____________
04-May-2022 07-Jun-2022 61.9
figure
stairs(T2.('Date Time'), T2.('Precip (mm)'), 'DisplayName',VN{2})
hold on
stairs(PrecipStats.('Start Date'), PrecipStats.('Precip Total'), 'DisplayName','Interim Totals (mm)')
hold off
grid
xlabel(VN{3})
ylabel(VN{2})
legend('Location','NW')
.
Emily
2023-10-7
Yes! this is it!
Can I ask a clarifying question so i better understand your approach ––
What is the purpose of this line of code?
Lv = T2{:,1} >= T3{k,1} & T2{:,1} < T3{k+1,1}
The for loop approach intuitively makes a lot of sense to me but I want to understand the nuances clearly so I could do it on my own
Thank you for your patience and help. much appreciated.
Star Strider
2023-10-7
My pleasure!
That line creates the logical vector, ‘Lv’. (I could have used find, to determine the numerical indices, and while that would work and in some situations is necessary, it is unnecessary here and inefficient to do the same operation, that being to find the rows in ‘T2’ that meet that specific condition.)
A somewhat simpler and more compact illustration —
T9 = array2table([1:10; randn(1,10)]')
T9 = 10×2 table
Var1 Var2
____ ________
1 -1.3207
2 -0.4068
3 -0.93179
4 -0.89652
5 -0.45122
6 0.54392
7 -0.61332
8 1.8858
9 1.3137
10 0.55674
Lv = T9{:,1} >= 3 & T9{:,1} < 7
Lv = 10×1 logical array
0
0
1
1
1
1
0
0
0
0
Out = T9(Lv,:)
Out = 4×2 table
Var1 Var2
____ ________
3 -0.93179
4 -0.89652
5 -0.45122
6 0.54392
That’s essentially all there is to it.
.
Emily
2023-10-7
thank you! this makes a lot of sense and after reading through the documentation and looking at the examples I feel like i actually understand it, which is awesome. much appreciated :)
更多回答(0 个)
另请参阅
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!发生错误
由于页面发生更改,无法完成操作。请重新加载页面以查看其更新后的状态。
您也可以从以下列表中选择网站:
如何获得最佳网站性能
选择中国网站(中文或英文)以获得最佳网站性能。其他 MathWorks 国家/地区网站并未针对您所在位置的访问进行优化。
美洲
- América Latina (Español)
- Canada (English)
- United States (English)
欧洲
- Belgium (English)
- Denmark (English)
- Deutschland (Deutsch)
- España (Español)
- Finland (English)
- France (Français)
- Ireland (English)
- Italia (Italiano)
- Luxembourg (English)
- Netherlands (English)
- Norway (English)
- Österreich (Deutsch)
- Portugal (English)
- Sweden (English)
- Switzerland
- United Kingdom (English)
亚太
- Australia (English)
- India (English)
- New Zealand (English)
- 中国
- 日本Japanese (日本語)
- 한국Korean (한국어)