Different correlation results using matrix with NaN values

Hello, I get different results when calculating the correlation coefficient in two different ways.
The first way is to eliminate all pairs containing NaN values before correlating:
ign_nan = isfinite(M1) & isfinite(M2);
M1=M1(ign_nan);
M2=M2(ign_nan);
[RHO1,P] = corrcoef(M1',M2');
The second way is to leave the original matrices unfiltered, reshape them into column vectors, and add 'rows','pairwise' to ignore NaN values:
M1=reshape(M1,[size(M1,1)*size(M1,2) 1]);
M2=reshape(M2,[size(M2,1)*size(M2,2) 1]);
[RHO2,P] = corrcoef(M1',M2','rows','pairwise');
Can someone tell me why RHO1 is different from RHO2?
Thank you! Magui

Answers (1)

Kirby Fears 2016-4-18
Falco,
The documentation for corrcoef indicates that 'complete' is the 'rows' value corresponding to your first calculation.
Try using the following for RHO2 and compare it to RHO1.
[RHO2,P] = corrcoef(M1',M2','rows','complete');
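One way to check this on a small case is sketched below. The matrices are made-up sample data, not the data from the question, and the variable names keep, R1, and R2 are only illustrative. The sketch assumes the inputs contain no Inf values: isfinite also rejects Inf, whereas the 'rows' option only deals with NaN, so the two approaches could diverge if Inf is present.

```matlab
% Hypothetical sample data (not from the original question)
M1 = [1 2 NaN; 4 5 6];
M2 = [2 1 3; NaN 7 5];

% Manual approach: keep only positions where both values are finite
keep = isfinite(M1) & isfinite(M2);
R1 = corrcoef(M1(keep), M2(keep));

% Built-in approach: flatten to column vectors and let corrcoef
% drop every row that contains a NaN
R2 = corrcoef(M1(:), M2(:), 'rows', 'complete');

% The largest absolute difference should be zero up to rounding
disp(max(abs(R1(:) - R2(:))));
```

If the displayed difference is not negligible on your own data, it may be worth checking M1 and M2 for Inf values with any(isinf(M1(:))) and any(isinf(M2(:))).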
Documentation for the 'rows' setting is below.
'rows' — Use of NaN option 'all' (default) | 'complete' | 'pairwise'
Use of NaN option, specified as one of these values:
'all' — Include all NaN values in the input before computing the correlation coefficients.
'complete' — Omit any rows of the input containing NaN values before computing the correlation coefficients. This option always returns a positive definite matrix.
'pairwise' — Omit any rows containing NaN only on a pairwise basis for each two-column correlation coefficient calculation. This option can return a matrix that is not positive definite.
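The difference between 'complete' and 'pairwise' described above can be illustrated with a minimal sketch. The matrix A is hypothetical data chosen for illustration. Three columns are used because, with only two columns, there is a single pair and the two options would be expected to select the same rows.

```matlab
% Hypothetical 3-column data with a single NaN in the third column
A = [1 2 3;
     2 1 5;
     3 4 NaN;
     4 3 2;
     5 6 8];

% 'complete': row 3 is dropped for every pair of columns
Rc = corrcoef(A, 'rows', 'complete');

% 'pairwise': row 3 is dropped only for pairs that involve column 3
Rp = corrcoef(A, 'rows', 'pairwise');

% Correlation of columns 1 and 2 under each option
disp(Rc(1,2));   % computed from the 4 rows without NaN
disp(Rp(1,2));   % computed from all 5 rows
```

Here the (1,2) entries differ because 'complete' discards row 3 even though columns 1 and 2 have valid values in that row.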
