I have 45x484 matrix but when I calculate coeff pca function, I am getting coeff with 484x44 matrix which causes errors in biplot. Why is the rows and columns switch places?
3 次查看(过去 30 天)
显示 更早的评论
%from struct to matrix using function
T1 = createDataMatrix(REC);
x=ismissing(T1);
y=any(x,1);
z=T1(:,~y);
a=z;
% scaling data for each column using standardised Z
ZM=zscore(a);
ZM=ZM-mean(ZM);
%PCA using Matlab built-in function
[coeff,score,latent,~,explained,~]=pca(ZM);
0 个评论
采纳的回答
Ive J
2021-8-31
Well you should find the answer in your problem not MATLAB pca function. You have 45 observations with 484 variables, so degree of freedom (you already centered your variables) in your case would be 44 and that's the max number of PCs with a non zero variance. You need to look at the total variance explained and pick those PCs explaining much of the variance (let's say 90%); I highly doubt the number of PCs explaining that much of variance even exceeds half of variables in a real case situation (though I admit depends on the nature of the problem).
Bottom line: pca function works just fine.
13 个评论
Ive J
2021-9-2
编辑:Ive J
2021-9-2
To remove only samples with missingness, you can apply something this to your dataset:
A = randn(487, 45); % raw data wish 487 observations and 45 variables
A(randi([1, numel(A(:))], 20, 1)) = nan; % add some missing values to raw data
fprintf('original matrix size: %d observations and %d features\n', size(A, 1), size(A, 2))
nanObsIdx = any(isnan(A), 2); % samples having at least one missing value in either of features (columns)
A(nanObsIdx, :) = [];
fprintf('pruned matrix size: %d observations and %d features\n', size(A, 1), size(A, 2))
Obviously this doesn't affect your features but only samples.
更多回答(0 个)
另请参阅
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!