Get and interpolate missed daily data

3 次查看(过去 30 天)
Hi all,
I am working with daily climate data for future projections on a basin catchment (see file attached). I just noticed that the daily precipitation data are reported considering all months of 30 days. So, I need to add rows for month of 31 days (January, March, May, July, August, October,December) and delete row for the month of February (in this case also considering the 29 days for 2032, 2036, 2040, 2044, 2048, 2052, 2056, 2060). Is there any way to automatize the process?
Also how can I interpolate the missing data?
Thanks

采纳的回答

Peter Perkins
Peter Perkins 2022-3-14
编辑:Peter Perkins 2022-3-14
Tis is a one-liner with timetables and retime:
>> tt = readtimetable("missed data.xlsx")
Warning: Column headers from the file were modified to make them valid MATLAB identifiers before creating variable names for the table. The original column headers are saved in the VariableDescriptions property.
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names.
tt =
11158×3 timetable
Date Lat Lon Prec_mm_day_
___________ ______ ______ ____________
01-Jan-2030 45.341 8.7889 0
02-Jan-2030 45.341 8.7889 0
03-Jan-2030 45.341 8.7889 3.0156
[snip]
28-Dec-2060 45.341 8.7889 0
29-Dec-2060 45.341 8.7889 0
30-Dec-2060 45.341 8.7889 0
((In older releases of MATLAB, use readtable and table2timetable.) There are a bunch of NaT's in the times, where the data is "at" Feb 29 in a non-leap year and Feb 30 in any year. TGhose are easy to delete.
>> tt(415:420,:)
ans =
11×3 timetable
Date Lat Lon Prec_mm_day_
___________ ______ ______ ____________
27-Feb-2031 45.341 8.7889 0.30469
28-Feb-2031 45.341 8.7889 0
NaT 45.341 8.7889 0.0078125
NaT 45.341 8.7889 0
01-Mar-2031 45.341 8.7889 0.0078125
02-Mar-2031 45.341 8.7889 0
>> tt(ismissing(tt.Date),:) = [];
Now interpolate the missing days.
>> tt2 = retime(tt,"daily","linear")
tt2 =
11322×3 timetable
Date Lat Lon Prec_mm_day_
___________ ______ ______ ____________
01-Jan-2030 45.341 8.7889 0
02-Jan-2030 45.341 8.7889 0
03-Jan-2030 45.341 8.7889 3.0156
[snip]
29-Dec-2060 45.341 8.7889 0
30-Dec-2060 45.341 8.7889 0
>> tt(28:33,:)
ans =
7×3 timetable
Date Lat Lon Prec_mm_day_
___________ ______ ______ ____________
28-Jan-2030 45.341 8.7889 16.398
29-Jan-2030 45.341 8.7889 20.563
30-Jan-2030 45.341 8.7889 2.6094
01-Feb-2030 45.341 8.7889 4.5938
02-Feb-2030 45.341 8.7889 0.14844
03-Feb-2030 45.341 8.7889 0.0078125
>> tt2(28:34,:)
ans =
7×3 timetable
Date Lat Lon Prec_mm_day_
___________ ______ ______ ____________
28-Jan-2030 45.341 8.7889 16.398
29-Jan-2030 45.341 8.7889 20.563
30-Jan-2030 45.341 8.7889 2.6094
31-Jan-2030 45.341 8.7889 3.6016
01-Feb-2030 45.341 8.7889 4.5938
02-Feb-2030 45.341 8.7889 0.14844
03-Feb-2030 45.341 8.7889 0.0078125
The Lat/Lon seem to be constant. I don't know if they should be stored separately, or if this is only art of your data and you have not told us about needing to do this calculation at all your locations.
  3 个评论
Stefania Avvedimento
The values "0" should be not interpolated since they indicate no rainy days. I only need to interpolate the values of missing day.
Peter Perkins
Peter Perkins 2022-3-14
My response already shows that the value on the 31st of each long month is interpolated.

请先登录,再进行评论。

更多回答(1 个)

Arif Hoq
Arif Hoq 2022-3-14
A=(datetime(2030,1,1):datetime(2060,12,31))'; % generating normal date
A1=datenum(A);
C=readtable('missed data.xlsx');
D1=datenum(table2cell(C(:,1)));
E=ismember(A1,D1);
idx=find(E==0); % index of missing date
F=datetime(A1(idx),'ConvertFrom','datenum') % missing date
  2 个评论
Peter Perkins
Peter Perkins 2022-3-14
编辑:Peter Perkins 2022-3-14
DON'T use datenum. It has been old for many years. Can't say this strongly enough. Use datetime.
Stefania Avvedimento
Hi Arif,
when I run your code I get this error (figure attached).

请先登录,再进行评论。

类别

Help CenterFile Exchange 中查找有关 Tables 的更多信息

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by