Different correlation results using matrix with NaN values
Hello, I get different results when calculating the correlation coefficient in two different ways.
The first way is to remove every pair that contains a NaN value before correlating:
ign_nan = isfinite(M1) & isfinite(M2);
M1=M1(ign_nan);
M2=M2(ign_nan);
[RHO1,P] = corrcoef(M1',M2');
The second way is to keep the original matrices and pass 'rows','pairwise' so that corrcoef ignores NaN values:
M1=reshape(M1,[size(M1,1)*size(M1,2) 1]);
M2=reshape(M2,[size(M2,1)*size(M2,2) 1]);
[RHO2,P] = corrcoef(M1',M2','rows','pairwise');
Can someone tell me why RHO1 is different from RHO2?
Thank you! Magui
Answers (1)
Kirby Fears
2016-4-18
Falco,
The documentation for corrcoef indicates that 'complete' is the 'rows' value corresponding to your first calculation.
Try using the following for RHO2 and compare it to RHO1.
[RHO2,P] = corrcoef(M1',M2','rows','complete');
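One way to check this suggestion is to run both approaches on a small example and compare the results. The sketch below is only an illustration: the sample matrices and the names valid, R_manual, R_complete, and maxDiff are made up here, not taken from the original data. It masks with isnan rather than isfinite, because the 'rows' options only deal with NaN; Inf entries would need separate handling.

```matlab
% Hypothetical sample matrices with NaN in different positions.
M1 = [1 2; NaN 4; 5 7];
M2 = [2 1; 3 NaN; 9 8];

% Manual approach: keep only positions where both values are present.
valid = ~isnan(M1) & ~isnan(M2);
R_manual = corrcoef(M1(valid), M2(valid));

% Built-in approach: let corrcoef drop the incomplete rows itself.
R_complete = corrcoef(M1(:), M2(:), 'rows', 'complete');

% Largest absolute difference between the two 2-by-2 results.
maxDiff = max(abs(R_manual(:) - R_complete(:)));
disp(maxDiff)
```

If maxDiff is at the level of floating-point round-off, the two approaches agree for that data. Note that both calls start from the untouched M1 and M2; overwriting them with the filtered vectors first would change what the second call sees.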
Documentation for the 'rows' setting is below.
'rows' — Use of NaN option 'all' (default) | 'complete' | 'pairwise'
Use of NaN option, specified as one of these values:
'all' — Include all NaN values in the input before computing the correlation coefficients.
'complete' — Omit any rows of the input containing NaN values before computing the correlation coefficients. This option always returns a positive definite matrix.
'pairwise' — Omit any rows containing NaN only on a pairwise basis for each two-column correlation coefficient calculation. This option can return a matrix that is not positive definite.
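The difference between the three options is easiest to see with more than two columns. The following is a minimal sketch on invented data (the matrix X and the variable names are assumptions for illustration only), where NaN values fall in different rows for different columns:

```matlab
% Hypothetical data: three variables (columns), five observations (rows).
X = [1    2    5;
     2    NaN  3;
     3    6    NaN;
     4    8    1;
     5    9    4];

% Default 'rows','all': NaN values are kept, so affected coefficients are NaN.
R_all = corrcoef(X);

% 'complete': only rows without any NaN (rows 1, 4, 5) are used for every pair.
R_complete = corrcoef(X, 'rows', 'complete');

% 'pairwise': each pair of columns uses the rows where both columns are valid,
% so columns 1 and 2 are correlated over rows 1, 3, 4, 5.
R_pairwise = corrcoef(X, 'rows', 'pairwise');

disp(R_all)
disp(R_complete)
disp(R_pairwise)
```

Here the coefficient between columns 1 and 2 differs between R_complete and R_pairwise, because row 3 is dropped in the first case (column 3 is NaN there) but kept in the second.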