How to calculate correlation p-value?
I have a correlation matrix and have performed some filtering on it. Now I want to calculate the p-values of the filtered correlation matrix. Can anyone help me with the code? [R,P]=corrcoef(A) returns both the correlation matrix and the p-value matrix, but I already have the correlation matrix and just want to calculate the p-value matrix.
Thank you in advance. Ritankar.
Answers (3)
Gregory Pelletier
2024-1-24
Here is how to calculate the p-values the same way MATLAB does in corrcoef when you know only the correlation coefficient matrix R and the number of samples N. Compare p_check, the manual calculation, with p from corrcoef in the output below:
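load hospital                                   % sample dataset from the Statistics and Machine Learning Toolbox
X = [hospital.Weight hospital.BloodPressure];   % N-by-3: weight, systolic, diastolic
[R, p] = corrcoef(X)                            % reference correlation matrix and p-values
N = size(X,1);                                  % number of observations
t = sqrt(N-2).*R./sqrt(1-R.^2);                 % t-statistic for each coefficient
s = tcdf(t,N-2);                                % Student's t CDF with N-2 degrees of freedom
p_check = 2 * min(s,1-s)                        % two-sided p-value; the diagonal is 0 (R = 1 gives t = Inf), where corrcoef reports 1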
% R =
% 1.0000e+00 1.5579e-01 2.2269e-01
% 1.5579e-01 1.0000e+00 5.1184e-01
% 2.2269e-01 5.1184e-01 1.0000e+00
% p =
% 1.0000e+00 1.2168e-01 2.5953e-02
% 1.2168e-01 1.0000e+00 5.2460e-08
% 2.5953e-02 5.2460e-08 1.0000e+00
% p_check =
% 0 1.2168e-01 2.5953e-02
% 1.2168e-01 0 5.2460e-08
% 2.5953e-02 5.2460e-08 0
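Written out as a formula, the calculation above is the usual two-sided t-test for a Pearson correlation. This is only a sketch of what the code does; the symbols r_ij (an entry of R) and F_{N-2} (the CDF of Student's t distribution with N-2 degrees of freedom) are notation introduced here for illustration.

```latex
t_{ij} = r_{ij}\,\sqrt{\frac{N-2}{1-r_{ij}^{2}}},
\qquad
p_{ij} = 2\,\min\bigl(F_{N-2}(t_{ij}),\; 1 - F_{N-2}(t_{ij})\bigr)
```

On the diagonal r_ii = 1, so t_ii is infinite and the formula returns 0, which is why p_check shows 0 there while corrcoef shows 1.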
the cyclist
2016-5-26
You cannot calculate a P-value from only a correlation matrix. You need the underlying data. The reason why is pretty easy to understand ... The correlation matrix could have come from a dataset with maybe N=10 measurements, or perhaps N=100000 measurements. These will (almost certainly) have different P-values.
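The dependence on N can be seen with a small sketch. It holds a hypothetical correlation coefficient fixed (r = 0.3, an arbitrary value chosen only for illustration) and computes the two-sided p-value for the two sample sizes mentioned above. It assumes the standard t-test for a Pearson correlation and needs tcdf from the Statistics and Machine Learning Toolbox.

```matlab
r  = 0.3;                % hypothetical correlation coefficient (illustration only)
Ns = [10 100000];        % the two sample sizes mentioned in the text

t = r .* sqrt((Ns - 2) ./ (1 - r^2));   % t-statistic for each sample size
p = 2 * tcdf(-abs(t), Ns - 2);          % two-sided p-value for each sample size

pSmall = p(1);           % p-value for N = 10
pLarge = p(2);           % p-value for N = 100000

% Print one line per sample size (fprintf cycles through the columns)
fprintf('N = %6d: p = %.3g\n', [Ns; p]);
```

The same r is far from significant at N = 10 and overwhelmingly significant at N = 100000, so the sample size has to be supplied alongside the correlation matrix.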
Anil Kamat
2021-5-30
Edited: Anil Kamat
2021-5-30
Let's say
N --> number of observations (data points)
r --> assumed correlation coefficient
t = r*sqrt((N-2)/(1-r^2)); % t-statistic
p1 = 1 - tcdf(t,(N-2)) % p-value from Student's t cumulative distribution function (one-tailed, upper-tail test)
https://www.mathworks.com/help/stats/tcdf.html
1 Comment
the cyclist
2021-6-1
Edited: the cyclist
2021-6-1
Can you help me understand whether your formula is correct? Here is a correlation coefficient calculated from a random dataset, followed by your calculation. (I modified your formula only to use meaningful variable names.)
rng default
N = 1000;
x = randn(N,2);
[correlationMatrix,pMatrix] = corrcoef(x);
pValueFromOriginalData = pMatrix(1,2);
correlationCoefficient = correlationMatrix(1,2);
t = correlationCoefficient*sqrt((N-2)/(1-correlationCoefficient^2)); % find t-statistics
p_from_anil = 1 - tcdf(t,(N-2)); % find pvalue using Student's t cumulative distribution function for one sample test
sprintf('p-value from original data = %7.4f',pValueFromOriginalData)
sprintf('p-value from Anil = %7.4f',p_from_anil)
They give different results.
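One likely reason for the mismatch, offered as a sketch rather than a definitive diagnosis: 1 - tcdf(t,N-2) is a one-sided (upper-tail) probability, whereas the p-value from corrcoef is two-sided, as the 2 * min(s,1-s) calculation elsewhere in this thread suggests. Doubling the tail probability beyond |t| should reproduce the corrcoef value. The variable names below are illustrative, and tcdf requires the Statistics and Machine Learning Toolbox.

```matlab
rng default
N = 1000;
x = randn(N,2);

[R, P] = corrcoef(x);
r = R(1,2);                 % correlation coefficient of the two columns
pFromData = P(1,2);         % two-sided p-value reported by corrcoef

t = r * sqrt((N-2) / (1 - r^2));      % same t-statistic as in the formula above
pOneSided = 1 - tcdf(t, N-2);         % upper-tail probability only
pTwoSided = 2 * tcdf(-abs(t), N-2);   % probability in both tails

fprintf('corrcoef:  %.6f\none-sided: %.6f\ntwo-sided: %.6f\n', ...
    pFromData, pOneSided, pTwoSided);
```

Using -abs(t) keeps the result independent of the sign of r, which matters here because a negative r pushes the one-sided upper-tail value above 0.5.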