interpolating missing data
Hi all,
I'm trying to estimate model parameters in MATLAB using data I collected in the lab, but I didn't measure all of the variables every day (so for some days I only have data for one variable). The data look like this (time; variable 1; variable 2; variable 3):
1 2330000 5.92275000000000e-06 36.2000000000000
2 52900000 2.79773000000000e-07 35.2000000000000
3 357000000 6.69468000000000e-08 26.1000000000000
4 389000000 1.19846000000000e-07 3.38000000000000
5 668000000 7.43263000000000e-08 0.350000000000000
6 1100000000.00000 4.52455000000000e-08 0.230000000000000
7 1530000000.00000 3.24575000000000e-08 0.340000000000000
8 1250000000.00000 3.96000000000000e-08 0.500000000000000
9 1490000000.00000 3.33154000000000e-08 0.360000000000000
10 1850000000.00000 NaN NaN
12 2050000000.00000 2.42585000000000e-08 0.270000000000000
14 2290000000.00000 NaN NaN
17 2120000000.00000 NaN NaN
19 5090000000.00000 9.79568000000000e-09 0.140000000000000
I've found a way to deal with this by replacing the NaNs with 0s, but I really don't want to do that in this case since it would screw up the estimation. I read something about interpolating the missing data using interp1, but I haven't been able to get that to work. Any help would be much appreciated. Thank you!
Accepted Answer
Sven
2011-12-1
Let's start with your data.
data = [1 2330000 5.92275000000000e-06 36.2000000000000
2 52900000 2.79773000000000e-07 35.2000000000000
3 357000000 6.69468000000000e-08 26.1000000000000
4 389000000 1.19846000000000e-07 3.38000000000000
5 668000000 7.43263000000000e-08 0.350000000000000
6 1100000000.00000 4.52455000000000e-08 0.230000000000000
7 1530000000.00000 3.24575000000000e-08 0.340000000000000
8 1250000000.00000 3.96000000000000e-08 0.500000000000000
9 1490000000.00000 3.33154000000000e-08 0.360000000000000
10 1850000000.00000 NaN NaN
12 2050000000.00000 2.42585000000000e-08 0.270000000000000
14 2290000000.00000 NaN NaN
17 2120000000.00000 NaN NaN
19 5090000000.00000 NaN 0.140000000000000]
Now here's how you can use interp1, looped over each column. I've updated it to handle NaN values at the ends, which pure interpolation can't fill:
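fullData = data;
for c = 2:size(data,2)
    % Pass 1: linearly interpolate NaNs that have known values on both sides
    nanRows = isnan(data(:,c));
    fullData(nanRows,c) = interp1(data(~nanRows,1), data(~nanRows,c), data(nanRows,1));
    % Pass 2: fill NaNs still left at the ends with the nearest known value
    nanRows = isnan(fullData(:,c));
    fullData(nanRows,c) = interp1(fullData(~nanRows,1), fullData(~nanRows,c), fullData(nanRows,1), 'nearest','extrap');
end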
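As a side note, not part of the original answer: on MATLAB R2016b or later, fillmissing should be able to do the same linear-inside, nearest-at-the-ends fill in a single call. This is only a minimal sketch on made-up numbers; the names t, vals and filled are hypothetical, with t playing the role of the time column and vals the measurement columns.

```matlab
% Made-up example data: t is the (unevenly spaced) time column,
% vals holds two measurement columns with NaNs inside and at the ends.
t = [1; 2; 4; 7; 9];
vals = [ 10 NaN
        NaN   5
         30 NaN
         60  11
        NaN NaN];

% 'linear' fills interior gaps using t as the sample points;
% 'EndValues','nearest' fills leading/trailing NaNs with the closest known value.
filled = fillmissing(vals, 'linear', 'SamplePoints', t, 'EndValues', 'nearest');

disp(filled)
```

With a matrix laid out like the one in the question, the equivalent call would pass the first column as 'SamplePoints' and the remaining columns as the data to fill.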
2 Comments
Sven
2011-12-2
Yes, it is a small annoyance I have with interp1. Note the difference between _interpolation_ and _extrapolation_. For the former, you need a known value above *and* below your query point. I assume that what you really want to do is:
1. Interpolate *linearly* for any _internal_ NaNs.
2. Set those NaN values on the outside to their nearest non-NaN neighbour's value.
My two most-used modes for *interp1* are 'linear' or 'nearest'. There's also an 'extrap' option to extrapolate. But since the above points one and two use different _forms_ of interpolation/extrapolation, you can't do this in one line.
What I do is run two interp1 commands: one to linearly interpolate, and one to 'nearestly' extrapolate. I've updated the answer accordingly.
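To make the two passes concrete, here is a small sketch on a single toy vector. The numbers and the names x, y, filled and missing are made up for illustration; only the interp1 calls follow the approach described above.

```matlab
% Toy data (made up): NaNs at both ends and one in the middle.
x = [1 2 3 4 5];
y = [NaN 10 NaN 30 NaN];

filled = y;

% Pass 1: linear interpolation. The query points x = 1 and x = 5 lie
% outside the known range, so interp1 returns NaN for them.
missing = isnan(filled);
filled(missing) = interp1(x(~missing), y(~missing), x(missing), 'linear');

% Pass 2: nearest-neighbour with 'extrap' fills whatever is still NaN
% at the ends with the closest known value.
missing = isnan(filled);
filled(missing) = interp1(x(~missing), filled(~missing), x(missing), 'nearest', 'extrap');

disp(filled)
```

After pass 1 only the middle gap is filled (with 20); pass 2 then copies 10 and 30 outward to the two ends, so the result is 10 10 20 30 30.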