partialcorr output differs greatly for single- and double-precision formats
Hi,
I'm using partialcorr to compute the partial rank correlation coefficient between y and x controlling for z.
When the data are in single-precision format, the correlation coefficients are very different from those I obtain from the same tests in double-precision format. This seems to happen only for the partialcorr function with type 'Spearman'. Using Pearson does not produce this problem, nor does using the partialcorri function with Spearman. Rounding the decimal places has almost no effect, suggesting that the issue is not related to numeric precision. Here is a reproducible example:
% generate data
rng(1);
y = randn(10000,1);
x = y.*randn(10000,1);
z = x + randn(10000,1);
partialcorr(single(y),single(x),single(z),'Type','Spearman') % rho output = 0.2227
partialcorr(y,x,z,'Type','Spearman') % rho output = 0.0155
partialcorr(single(y),single(x),single(z),'Type','Pearson') % rho output = 0.0255
partialcorr(y,x,z,'Type','Pearson') % rho output = 0.0255
What could be the issue that leads to obtaining very different coefficients when using Spearman?
Thanks for your help!
Answers (1)
Diya Tulshan
2023-7-12
Edited: Diya Tulshan
2023-7-12
Hi Marco Ciapparelli,
I understand you want to know why 'partialcorr' gives different results for single- and double-precision data with 'Spearman' but not with 'Pearson'.
The difference in output that you are observing when using 'partialcorr' with the 'Spearman' type between single-precision and double-precision formats could be due to the algorithm used for computing the partial rank correlation coefficient.
The 'Spearman' type in 'partialcorr' computes the partial rank correlation coefficient using Spearman's rank correlation formula. The algorithm for computing rank correlations involves sorting the data, assigning ranks, and then calculating the correlation based on the ranks. So, when you use single-precision data, there can be differences in the sorting and ranking process due to the limited precision of single-precision numbers.
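The ranking step described above can be inspected directly. The following is a minimal sketch, assuming the Statistics and Machine Learning Toolbox function `tiedrank` is available (it ships with the same toolbox as `partialcorr`). The toy vector `a` is a made-up illustration of how casting to single can merge two nearby values into a tie; the second part counts how many ranks actually change for `y` from the question, which gives a sense of how much of the discrepancy the ranking step could account for.

```matlab
% Toy example: two values that are distinct in double but identical in single.
a = [1; 1 + 1e-10; 2];
r_double = tiedrank(a);          % ranks 1, 2, 3
r_single = tiedrank(single(a));  % ranks 1.5, 1.5, 3 (first two values tie)
disp([r_double(:) r_single(:)]);

% Same check on the data from the question: count ranks that differ
% between the double and single versions of y.
rng(1);
y = randn(10000,1);
n_changed = nnz(tiedrank(y) ~= tiedrank(single(y)));
fprintf('ranks changed by casting y to single: %d of %d\n', n_changed, numel(y));
```

If `n_changed` is small, the ranks themselves are nearly unchanged, and the ranking step alone is unlikely to be the whole story.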
The 'Pearson' type in 'partialcorr' calculates the partial correlation coefficient using Pearson's correlation formula. Pearson's correlation is based on the covariance and standard deviations of the data, which in this example are not noticeably affected by the precision of the numbers. Thus, you observe consistent results.
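For reference, the Pearson partial correlation can be reproduced by hand, which gives an independent value to compare `partialcorr` against. This is a sketch of the textbook residual formulation (regress `y` and `x` on `z` with an intercept, then correlate the residuals), not a description of how `partialcorr` is implemented internally. The helper name `resid` is introduced here for illustration only.

```matlab
% Data from the question.
rng(1);
y = randn(10000,1);
x = y.*randn(10000,1);
z = x + randn(10000,1);

% Design matrix with an intercept column, and a helper that returns the
% residual of a vector after a least-squares fit on that design matrix.
Z = [ones(size(z,1),1) z];
resid = @(v) v - Z*(Z\v);

% Correlation of the residuals is the partial correlation of y and x given z.
c = corrcoef(resid(y), resid(x));
rho_manual = c(1,2);

rho_builtin = partialcorr(y, x, z, 'Type', 'Pearson');
fprintf('manual: %.4f, partialcorr: %.4f\n', rho_manual, rho_builtin);
```

The two values should agree up to floating-point rounding.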
To obtain accurate results for partial rank correlation using the 'Spearman' type, it is recommended to use double-precision data instead of single-precision. The 'Spearman' type relies on the ranks of the data, and the limited precision of single-precision numbers can introduce inconsistencies in the ranking process.
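If the data must stay in single precision, one possible cross-check is to rank the data explicitly, cast the ranks to double, and then ask `partialcorr` for a 'Pearson' coefficient on those ranks. This is a sketch under the assumption that the partial rank correlation is defined as the Pearson partial correlation of the ranks; it is not confirmed here that `partialcorr` computes 'Spearman' exactly this way. The variable names `ys`, `xs`, `zs`, `ry`, `rx`, `rz` are illustrative.

```matlab
% Data from the question, stored in single precision.
rng(1);
y = randn(10000,1);
x = y.*randn(10000,1);
z = x + randn(10000,1);
ys = single(y); xs = single(x); zs = single(z);

% Rank each variable explicitly and keep the ranks in double precision.
ry = double(tiedrank(ys));
rx = double(tiedrank(xs));
rz = double(tiedrank(zs));

% Pearson partial correlation computed on the ranks.
rho_ranks = partialcorr(ry, rx, rz, 'Type', 'Pearson');
fprintf('partial correlation on explicit ranks: %.4f\n', rho_ranks);
```

Comparing this value with the double-precision 'Spearman' result from the question helps separate the effect of the ranking step from the effect of the arithmetic that follows it.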
In practice, this means converting your data to double precision before calling 'partialcorr', as shown below:
% generate data
rng(1);
y = randn(10000,1);
x = y.*randn(10000,1);
z = x + randn(10000,1);
partialcorr(double(y),double(x),double(z),'Type','Spearman') % rho output = 0.0155 (y, x, z are already double here, so this matches the next line)
partialcorr(y,x,z,'Type','Spearman') % rho output = 0.0155
partialcorr(single(y),single(x),single(z),'Type','Pearson') % rho output = 0.0255
partialcorr(y,x,z,'Type','Pearson') % rho output = 0.0255
Alternatively, if you want to keep using single-precision data, 'partialcorri' with 'Spearman' would be a better choice.
% generate data
rng(1);
y = randn(10000,1);
x = y.*randn(10000,1);
z = x + randn(10000,1);
partialcorri(single(y),single(x),single(z),'Type','Spearman') % per the question, no single/double discrepancy with partialcorri
partialcorr(y,x,z,'Type','Spearman') % rho output = 0.0155
partialcorr(single(y),single(x),single(z),'Type','Pearson') % rho output = 0.0255
partialcorr(y,x,z,'Type','Pearson') % rho output = 0.0255
Also kindly refer to the links mentioned below for better understanding:-
Hope this helps!