Determining variables that contribute to principal components

Question

Eric 2012-9-26

Hi,

I am trying to do a PCA analysis on a (24x3333) matrix where 24 is the number of observations and 3333 is the number of variables. I am using:

[coeff,score,eigval] = princomp(zscore(aggregate));

23 PCs are needed to explain 95% of the variance in the data. My question is how do I know which variables are contributing to each component. I believe I need to make a variable spreadsheet naming all 3333 variables. However, it is not clear how I would be able to identify the variables contributing to each component.

I am also computing a variable, percent variation explained (PVE), meaning the variation in an original variable that is explained by a principal component, because ultimately I want to quantify how much a variable contributes to its respective principal component.

for i = 1:3333
    pve(:,i) = 100*coeff(i,i)*sqrt(var(score(:,i)))/(var(aggregate(:,i)));
end

Any insight would be a big help. I've been trying to figure this out for weeks with no luck.

Thanks,

Eric

Answer 1

Ilya 2012-9-26

The first paragraph in the doc description for princomp says "COEFF is a p-by-p matrix, each column containing coefficients for one principal component." For example, to project your data onto the 1st principal axis, do zscore(aggregate)*coeff(:,1). Why not measure the contribution of a variable to a component by the size of the respective coefficient? Especially since you have standardized your data by zscore.

Since you have 23 components, the columns in score past 23 are filled with zeros. If you need to get the principal component variance, take the 3rd output from princomp.
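A minimal sketch of ranking variables by coefficient size, as suggested above. It uses the hald example data rather than the 24x3333 matrix from the question, and the names k, order, and relWeight are illustrative choices, not from the thread. It keeps princomp to match the thread; in newer releases pca should return the same first three outputs.

```matlab
% Rank variables by the magnitude of their coefficient in one component.
load hald                               % example data from the Statistics Toolbox
X = zscore(ingredients);                % standardize, as in the question
[coeff,score,latent] = princomp(X);     % each column of coeff is one principal axis

k = 1;                                  % component of interest (illustrative choice)
[~,order] = sort(abs(coeff(:,k)),'descend');

% order(1) is the index of the variable with the largest |coefficient| in component k
for r = 1:numel(order)
    fprintf('variable %d: coefficient %+.3f\n', order(r), coeff(order(r),k));
end

% Each column of coeff has unit length, so the squared coefficients sum to 1.
% They are relative weights in the linear combination, not shares of variance.
relWeight = coeff(:,k).^2;
```

With a table of variable names, order can be used to index into it and list the names from the most to the least heavily weighted variable for that component.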

Comments (5 in total; the 3 earlier ones are not shown)

Ilya 2012-9-28

Edited: Ilya 2012-9-28

I could understand what you mean if you wanted to go in the opposite direction, that is, explain variance in an original variable by a specific principal component. Since the covariance matrix is diagonal in the PCA space, we can separate contributions of the principal components to the variance of a variable. I do not see how to separate variable contributions to the variance of a principal component since the variables are not independent (and if they were, you would not need PCA in the first place). Here is what you could do:

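% Load data and perform PCA
load hald
[coeff,~,latent] = princomp(ingredients);
% Covariance of the original variables; entry (i,i) is the variance of variable i
cov(ingredients)
% Variance in variable I explained by principal component J
i = 2;
j = 1;
% Total variance of variable I, summed over all components
% (matches the (i,i) entry of cov(ingredients))
varI = coeff(i,:)*(latent.*coeff(i,:)')
% Part of that variance carried by component J
varIfromJ = coeff(i,j)*latent(j)*coeff(i,j)
% Fraction (not percent) of the variance of variable I explained by component J
percVarIfromJ = varIfromJ/varI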
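The quantities varI, varIfromJ, and percVarIfromJ follow from the eigendecomposition of the covariance matrix. A short derivation, where the symbols are notation introduced here and not from the thread: $C$ stands for cov(ingredients), $V$ for coeff, and $\lambda_j$ for latent(j).

```latex
\begin{align}
C &= V \Lambda V^{\top}, \qquad \Lambda = \operatorname{diag}(\lambda_1, \dots, \lambda_p) \\
\operatorname{Var}(x_i) &= C_{ii} = \sum_{j=1}^{p} V_{ij}^{2}\, \lambda_j \\
\frac{\text{variance of } x_i \text{ from component } j}{\operatorname{Var}(x_i)}
  &= \frac{V_{ij}^{2}\, \lambda_j}{\sum_{k=1}^{p} V_{ik}^{2}\, \lambda_k}
\end{align}
```

If the data are standardized with zscore first, as in the question, each $C_{ii}$ equals 1 (assuming no variable is constant), so the fraction reduces to $V_{ij}^{2}\lambda_j$, that is, coeff(i,j)^2*latent(j).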

Alternatively, you could, of course, ask the authors of those gait studies what exactly they did.

Eric 2012-9-30

Thank you. Your way of finding percVarIfromJ is what I was attempting to do with my original for loop.
