How to deal with missing values in a time series?
I have a few time series that are to be used in a regression. For some of them, the first or last few values are missing. How should I deal with this so that MATLAB does not give an error?
Answers (2)
Fangjun Jiang
2011-11-17
It depends on your needs. You could fill the gaps with zeros or NaNs, or you could fill them with values interpolated from the known data.
Richard Willey
2011-11-17
Hi Yoshiko
The treatment of missing data is a fairly complicated topic. The choice of techniques to handle a missing data problem very much depends on how you plan to use the resulting data.
If you plan to generate a regression model from your data then your best course of action is to code the missing data points as NaNs. The regress command in Statistics Toolbox will then ignore any row that contains a missing data point.
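A minimal sketch of that behaviour, assuming made-up data (the names `x`, `y`, `X`, and `b` are illustrative and not taken from the question). The rows where `x` is known follow y = 1 + 2x exactly, so the fit should recover those coefficients from the rows that remain:

```matlab
% Hypothetical series: the first two values and the last value of x are missing.
x = [NaN; NaN; 3; 4; 5; 6; 7; NaN];
y = [3; 5; 7; 9; 11; 13; 15; 17];   % y = 1 + 2*x on the rows where x is known

% regress does not add an intercept term, so include a column of ones.
X = [ones(size(x)) x];

% Rows with a NaN in X or y are left out of the fit.
b = regress(y, X);   % b(1) is the intercept, b(2) is the slope
```

Only the five complete rows contribute to `b`; no manual row deletion is needed.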
I would strongly advise against using interpolation to substitute new values for the missing data points.
- This will impact any future analysis you do with this data and potentially bias metrics like R^2
- Using an interpolation technique for extrapolation can produce very inaccurate results
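To illustrate the extrapolation point, here is a rough sketch on an assumed quadratic series (the data and the names `t`, `vTrue`, `known`, `vFill`, and `err` are hypothetical). It fills missing leading and trailing values with `interp1` in linear mode with extrapolation enabled, then compares against the true values:

```matlab
% Hypothetical series v = t.^2, assumed known only for t = 3..7.
t = (1:8)';
vTrue = t.^2;
known = 3:7;

% Linear interpolation with extrapolation switched on fills t = 1, 2 and 8
% by extending the nearest end segment as a straight line.
vFill = interp1(t(known), vTrue(known), t, 'linear', 'extrap');

% Error of the filled values: zero inside the known range, nonzero outside it.
err = vFill - vTrue;
```

The filled values at the ends of the series are where the missing data sit in the original question, and that is exactly where this kind of fill is least reliable.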
In a similar vein, coding these missing data points with any kind of numeric value (say 0 or -9999) can cause significant problems, because the regression algorithm will treat that value as a valid number.
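A small comparison, again on assumed data (the names `x`, `y`, `xZero`, `bNaN`, and `bZero` are illustrative), shows the effect. The complete rows follow y = 1 + 2x, and the only difference between the two fits is how the missing entries are coded:

```matlab
% Hypothetical data: y = 1 + 2*x wherever x is known.
x = [NaN; NaN; 3; 4; 5; 6; 7; NaN];
y = [3; 5; 7; 9; 11; 13; 15; 17];

% Same series with the missing entries coded as 0 instead of NaN.
xZero = x;
xZero(isnan(xZero)) = 0;

% NaN coding: the incomplete rows are dropped from the fit.
bNaN = regress(y, [ones(size(x)) x]);

% Zero coding: the zeros are treated as real observations and distort the fit.
bZero = regress(y, [ones(size(xZero)) xZero]);
```

In this sketch `bNaN(2)` recovers the slope of 2, while `bZero(2)` is pulled well away from it. A sentinel such as -9999 would distort the fit even more.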