Make unequally spaced data equally spaced

Hello all,
I have an hourly temperature history covering a long period (about 100k data points) for several locations. For easier data manipulation, I would like exactly 24 measurements for each day. However, the data sometimes has 2-4 measurements within the same hour, and conversely some hours have no measurement at all.
The timestamps have the format 200001010000 (YEARMODAHRMN). Can you suggest an approach, or do you have a script, that interpolates between adjacent data points so that I end up with equally spaced data?
Thank you in advance.
3 comments
Star Strider 2015-4-28
Are the timestamps imported as integer (numeric) or string variables?
Konstantinos Belivanis
This is what the data looks like (see attached). The problem is that not all of the temperatures correspond to a time on the hour (0800, 0900, 1000, etc.), and some of them have asterisks. What I would like to achieve is interpolation between the two closest values.


Accepted Answer

Star Strider 2015-4-30
Now that we have the data file, this is one option:
[d,s,r] = xlsread('Austin-Temperatures.xlsx');
d(any(isnan(d),2),:) = []; % Remove NaN Rows
ds = num2str(d(:,1), '%11d'); % Convert To Strings
dn = datenum(ds, 'yyyymmddHHMM'); % Date Numbers
dvck = datevec(dn); % Check Conversion
dn_intrp = datenum([dvck(1,1:4) 0 0]):(1/24):datenum([dvck(end,1:4) 0 0]);
T = interp1(dn, d(:,2), dn_intrp', 'linear','extrap');
figure(1)
plot(dn, d(:,2), 'gp', 'MarkerSize',10)
hold on
plot(dn_intrp, T, 'bp', 'MarkerFaceColor','c')
hold off
grid
datetick('x', 'HH:MM')
legend('Original Data', 'Hourly Interpolated Data','Location','N')
The ‘dn_intrp’ assignment creates a vector of hourly ‘date number’ values from the first hour to the last hour in the data. The ‘T’ assignment then interpolates the temperatures at those times. Because I set ‘interp1’ to extrapolate linearly, it also creates a temperature at 3:00, before the first measurement. If you do not want that value, delete the first element from both the ‘dn_intrp’ and ‘T’ vectors.
The plot is simply to illustrate the data the routine produces. It is not necessary for the code.
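For readers without the spreadsheet, the same steps can be tried on a few made-up readings. This is only a sketch: the `stamps` and `temps` values below are hypothetical, chosen so that one hour has two readings and the 03:00 hour has none, and the variable names are not from the attached file.

```matlab
% Hypothetical sample: timestamps as numbers in YYYYMMDDHHMM form.
% The 01:00 hour has two readings; the 02:00 and 03:00 hours have none.
stamps = [200001010000; 200001010053; 200001010120; 200001010145; 200001010400];
temps  = [10; 11; 12; 13; 20];

% Convert the numeric timestamps to strings, then to serial date numbers (days).
dn = datenum(num2str(stamps, '%12d'), 'yyyymmddHHMM');

% Hourly grid from the first whole hour to the last whole hour in the data.
dv  = datevec(dn);
dnh = (datenum([dv(1,1:4) 0 0]) : 1/24 : datenum([dv(end,1:4) 0 0]))';

% Linear interpolation onto the hourly grid (extrapolating at the edges,
% as in the answer above).
Th = interp1(dn, temps, dnh, 'linear', 'extrap');
```

Here `dnh` holds one value per hour from 00:00 to 04:00, and `Th` holds the matching interpolated temperatures.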
2 comments
Konstantinos Belivanis
Thanks a lot, Strider! It runs EXTREMELY fast! The only thing needed was to convert the dates back to strings, but I found the datestr() function!
Thanks again! It was awesome!
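The conversion back to strings mentioned here might look like the following sketch. The hourly grid `dn_hourly` is a hypothetical stand-in for the interpolated date numbers, and the names `stamps_str` and `stamps_num` are chosen only for illustration.

```matlab
% Hypothetical hourly grid: 00:00 to 03:00 on 1 January 2000.
dn_hourly = datenum(2000, 1, 1, 0:3, 0, 0)';

% Back to the original YYYYMMDDHHMM layout, one row per timestamp.
stamps_str = datestr(dn_hourly, 'yyyymmddHHMM');   % 4-by-12 char array

% Numeric form, if the timestamps are needed as numbers again.
stamps_num = str2double(cellstr(stamps_str));
```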


More Answers (1)

pfb 2015-4-28
Edited: pfb 2015-4-28
Perhaps this is obvious, but
datenum(date,'yyyymmddHHMM');
where "date" is a char variable containing your timestamp, converts the date into a number. E.g.
datenum('200001010000','yyyymmddHHMM')
gives
730486
and
dd=['200001010000';'200101010000']
datenum(dd,'yyyymmddHHMM')
gives
730486
730852
You can go through your timestamps individually, in a loop, or form a Nx12 matrix of chars and feed it to datenum. Either way you end with a Nx1 vector of numbers representing the timestamps, and you'll have a similar vector containing the corresponding temperatures.
At this point, you can form an equally spaced grid using linspace, and use "interp1" to interpolate your data. You'll have to be a bit careful in selecting the correct number of gridpoints, but that should not be hard.
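As a rough sketch of the gridpoint count, assuming the first and last grid times fall exactly on whole hours: the number of points is the number of whole hours between them plus one. The sample values in `t` and `v`, and the names `t_first`, `t_last` and `tgrid`, are hypothetical.

```matlab
% Hypothetical irregular samples spanning exactly one day.
t_first = datenum('200001010000', 'yyyymmddHHMM');
t_last  = datenum('200001020000', 'yyyymmddHHMM');
t = [t_first; t_first + 0.3; t_last];   % sample times, in days
v = [5; 8; 5];                          % sample temperatures

% Date numbers are in days, so the span times 24 is the number of hours.
n_hours = round((t_last - t_first) * 24);

% n_hours intervals need n_hours + 1 points, both endpoints included.
tgrid = linspace(t_first, t_last, n_hours + 1);

% Interpolate the samples onto the equally spaced grid.
vi = interp1(t, v, tgrid);
```

Forgetting the extra point would give a spacing slightly larger than one hour, which is the off-by-one to watch for when choosing the number of gridpoints.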
Type "help function" or "doc function" to summon the documentation for the builtin functions.
