I have a 45x484 matrix, but when I compute coeff with the pca function I get a 484x44 matrix, which causes errors in biplot. Why do the rows and columns switch places?

ack 2021-8-31

Direct link to this question:

https://ww2.mathworks.cn/matlabcentral/answers/1443989-i-have-45x484-matrix-but-when-i-calculate-coeff-pca-function-i-am-getting-coeff-with-484x44-matrix

Commented: ack 2021-9-4

Accepted Answer: Ive J

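% Convert the struct to a numeric matrix (createDataMatrix is my own helper function)
T1 = createDataMatrix(REC);

% Drop every column (variable) that contains at least one missing value
x = ismissing(T1);
y = any(x, 1);
a = T1(:, ~y);

% Standardise each column to zero mean and unit variance (z-scores)
ZM = zscore(a);
ZM = ZM - mean(ZM); % redundant: zscore has already centred each column

% PCA using the MATLAB built-in function
[coeff, score, latent, ~, explained, ~] = pca(ZM);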

Accepted Answer

Ive J 2021-8-31

Direct link to this answer:

https://ww2.mathworks.cn/matlabcentral/answers/1443989-i-have-45x484-matrix-but-when-i-calculate-coeff-pca-function-i-am-getting-coeff-with-484x44-matrix#answer_778109

The answer lies in your data, not in MATLAB's pca function. You have 45 observations of 484 variables, so after centering (which you have already done) there are 44 degrees of freedom, and that is the maximum number of PCs with nonzero variance. coeff therefore has one row per variable (484) and one column per PC (44); nothing has switched places. You need to look at the total variance explained and keep the PCs that explain most of it (let's say 90%). I highly doubt that the number of PCs needed to explain that much variance exceeds half the number of variables in a real-world case (though I admit it depends on the nature of the problem).

Bottom line: the pca function works just fine.
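To make the shapes concrete, here is a minimal sketch on synthetic data of the same size as in the question. The random matrix X, the seed and the 90% threshold are illustrative assumptions, not the poster's data. The last line is only a guess at the biplot error: biplot expects a coefficient matrix with two or three columns, so passing all 44 columns of coeff would fail.

```matlab
rng(0);                               % fixed seed so the sketch is repeatable
X = randn(45, 484);                   % synthetic stand-in: 45 observations, 484 variables
[coeff, score, latent, ~, explained] = pca(zscore(X));

disp(size(coeff));                    % 484 44: one row per variable, one column per PC
disp(size(score));                    % 45 44: one row per observation, one column per PC

% Smallest number of PCs whose cumulative explained variance reaches 90%
k = find(cumsum(explained) >= 90, 1);

% Plot only the first two PCs; biplot takes a 2- or 3-column coefficient matrix
biplot(coeff(:, 1:2), 'Scores', score(:, 1:2));
```

If more than three components are of interest, they can be plotted two or three at a time by changing the column indices.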

Ive J 2021-9-2

Edited: Ive J 2021-9-2

To remove only the samples with missing values, you can apply something like this to your dataset:

A = randn(487, 45); % raw data with 487 observations and 45 variables
A(randi(numel(A), 20, 1)) = NaN; % add some missing values to the raw data
fprintf('original matrix size: %d observations and %d features\n', size(A, 1), size(A, 2));
% output: original matrix size: 487 observations and 45 features
nanObsIdx = any(isnan(A), 2); % samples with at least one missing value in any feature (column)
A(nanObsIdx, :) = []; % delete those rows
fprintf('pruned matrix size: %d observations and %d features\n', size(A, 1), size(A, 2));
% output: pruned matrix size: 467 observations and 45 features (exact count depends on the random draw)

This removes only samples (rows); your features (columns) are left untouched.
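As an alternative sketch, the built-in rmmissing function does the same pruning in one call and makes the row/column choice explicit. The synthetic matrix and seed below are assumptions for illustration. Note that the code in the question drops columns (variables), whereas the indexing above drops rows (observations).

```matlab
rng(1);                               % fixed seed so the sketch is repeatable
A = randn(487, 45);                   % synthetic data: 487 observations, 45 variables
A(randi(numel(A), 20, 1)) = NaN;      % scatter some missing values

rowsKept = rmmissing(A);              % drop observations (rows) that contain NaN
colsKept = rmmissing(A, 2);           % drop variables (columns) that contain NaN

% Same result as the logical-indexing approach
manual = A(~any(isnan(A), 2), :);
fprintf('rows kept: %d, columns kept: %d\n', size(rowsKept, 1), size(colsKept, 2));
```

With only a few missing values scattered over many variables, dropping rows usually keeps far more of the data than dropping columns.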

ack 2021-9-4

Hi Ive, sorry for the late reply. My code is sort of working now. Thank you so much for your input!

Categories: AI and Statistics > Statistics and Machine Learning Toolbox > Dimensionality Reduction and Feature Extraction

Products: MATLAB

Release: R2021a
