Why do original vectors and their interp1 interpolations give different correlation coefficients?
I am curious why I get a different correlation coefficient (R) for a pair of original random vectors than for their interpolated versions. When I inspect the interpolated vectors, their values differ from the originals, yet they plot the same. The R value computed from the two original vectors and the R value computed from the two interpolated vectors are not similar. Any ideas why?
2 Comments
Jan Orwat
2016-1-13
Could you describe what kind of data you are computing this correlation between? Is it vectors x and y, and then x_interpolated and y_interpolated? You wrote that the vectors are random and are then interpolated. Interpolation may increase the strength of the relationship between the data and thus change the correlation, even if the interpolated data look similar. In other words, interpolation decreases randomness.
Answers (1)
John D'Errico
2016-1-14
Edited: John D'Errico
2016-1-14
There is NO expectation that a new set of interpolated points will have the same correlation coefficient as that for the base set. For example, suppose we have a set of points that follow a nonlinear relationship.
x = (0:1:10)';
y = exp(x);
We can compute the correlation coefficient.
corrcoef([x,y])
ans =
1 0.691404156500157
0.691404156500157 1
If I then interpolate the points, to get a NEW set of values, i.e.,
xi = linspace(0,10,50)';
yi = interp1(x,y,xi,'spline');
corrcoef([xi,yi])
ans =
1 0.687541374878435
0.687541374878435 1
As you see, the correlation coefficient is not the same for the new set as for the old one. That is as expected.
It would not matter had I used spline, cubic, or linear interpolation in interp1. The correlation coefficient will generally be close to the original set, but it need not be the same at all.
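To check that claim yourself, a minimal sketch along these lines should do. It reuses the exponential example from above. Note that I have swapped in 'pchip' where I wrote "cubic", since that is the name current interp1 documentation uses for shape-preserving cubic interpolation; the variable names interpMethods, R0, and R are only for illustration.

```matlab
% Same base data as in the exponential example above
x = (0:1:10)';
y = exp(x);
xi = linspace(0,10,50)';

% Correlation coefficient of the original points
R0 = corrcoef(x,y);
fprintf('original: R = %.6f\n', R0(1,2));

% Interpolate with several interp1 methods and recompute R each time.
% 'pchip' is used here in place of 'cubic' (an assumption about which
% cubic method is wanted).
interpMethods = {'linear','pchip','spline'};
for k = 1:numel(interpMethods)
    yi = interp1(x,y,xi,interpMethods{k});
    R = corrcoef(xi,yi);
    fprintf('%-8s R = %.6f\n', interpMethods{k}, R(1,2));
end
```

Each method produces a slightly different set of interpolated values, so each should be expected to give its own R.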
This is a nonlinear relationship. There is NO expectation that the correlation be the same for interpolated points as for the original. The correlation coefficient is NOT something that is maintained by interpolation. In fact, if I get creative in how I choose my data and the new set of points, I can trivially come up with an example where the correlation coefficient changes sign.
x = (0:1:10)';
y = sin(x);
corrcoef([x,y])
ans =
1 -0.116741765087288
-0.116741765087288 1
So a moderately small negative correlation.
xi = (0:.1:1)';
yi = interp1(x,y,xi,'spline');
corrcoef([xi,yi])
ans =
1 0.994300310123026
0.994300310123026 1
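So the correlation coefficient is near 1 on a set interpolated from the original data, even though the original set had a negative correlation. The new points all lie in the interval [0,1], where the interpolated curve rises almost linearly.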
In all cases I have shown, the interpolated values will plot neatly on top of the curve from the original set.
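If you want to see that for the sine example, something like the following sketch would overlay the three sets. The dense grid xf and its values yf are names I am introducing here only to draw the spline as a smooth curve; they are not part of the example above.

```matlab
% Original data and the interpolated subset from the sine example
x = (0:1:10)';
y = sin(x);
xi = (0:0.1:1)';
yi = interp1(x,y,xi,'spline');

% A dense grid (hypothetical, for plotting only) to draw the spline itself
xf = linspace(0,10,200)';
yf = interp1(x,y,xf,'spline');

% Overlay: spline curve, original points, interpolated points on [0,1]
plot(xf,yf,'-',x,y,'o',xi,yi,'.');
legend('spline through original data','original points', ...
       'interpolated subset','Location','best');
xlabel('x'); ylabel('y');
```

The interpolated subset sits on the same curve as the original points, yet it samples only the rising part near the left end, which is why its correlation coefficient is so different.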