Canoncorr Coefficients for large data

Question

Matt Soldano 2022-3-10


https://ww2.mathworks.cn/matlabcentral/answers/1668119-canoncorr-coefficients-for-large-data

Answered: Ayush Modi 2024-1-18

[A, B, r, U, V, stats] = canoncorr(XTrain,yTrain);

I am working with a large dataset, and our problem stems from the fact that we have over 50,000 observations and only about 93 samples. We understand that the matrix will not be full rank when we run canoncorr. Nonetheless, we are curious how canoncorr picks specific observations for the coefficients (A and B) for each sample. To explain further, we created a leave-one-out cross-validation (LOOCV) script to see which observations were picked for the coefficient matrices A and B. We found that a handful of observations were used on the training data for all of the samples, and that only ~350 of the 50,000 observations were used for the training data.

Ultimate question: Does canoncorr randomly set observations to 0 in the coefficient matrices when there are more observations than samples? Or is there a system for picking the best/most representative observations for the coefficient matrices?


Answer 1

Ayush Modi 2024-1-18


https://ww2.mathworks.cn/matlabcentral/answers/1668119-canoncorr-coefficients-for-large-data#answer_1392561

Hi Matt,

As I understand it, you would like to know how the "canoncorr" function selects which observations to set to 0 when the input matrix is not full rank. I am assuming that:

Observations are the individual data points or samples in the dataset, i.e., 50,000 observations.
Variables are the different measurements or features recorded for each observation, i.e., 93 variables.

The "canoncorr" function does not randomly set observations to zero in the coefficient matrices. It computes the canonical coefficients systematically: each row of A and B corresponds to a variable (a column of the input matrix), not to an observation, and any all-zero rows in the coefficient matrices correspond to variables that are linearly dependent on the others.

Please refer to the following MathWorks documentation for more information on the "canoncorr" function:

https://www.mathworks.com/help/stats/canoncorr.html#mw_4abce453-6f5d-4217-be8d-df41ff90115e:~:text=single%20%7C%20double-,Output%20Arguments,-collapse%20all
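
The zero-row behavior for linearly dependent variables can be checked on synthetic data. The following is only a minimal sketch, not code from the original question: the sizes, the random seed, and the names n, X, Y, and zeroRows are illustrative assumptions, and it assumes Statistics and Machine Learning Toolbox is installed. One column of X is built as an exact sum of two others, so X is rank deficient and "canoncorr" should warn and return one all-zero row in A.

```matlab
% Sketch: a linearly dependent column in X yields an all-zero row in A.
% All names and sizes here are illustrative assumptions.
rng(0);                           % reproducible random data
n = 200;                          % number of observations (rows)
X = randn(n, 3);                  % three independent variables (columns)
X(:, 4) = X(:, 1) + X(:, 2);      % column 4 is the sum of columns 1 and 2
Y = randn(n, 2);                  % two variables in the second set

[A, B, r] = canoncorr(X, Y);      % expect a "not full rank" warning for X

disp(size(A));                    % one row of A per column (variable) of X
zeroRows = find(all(A == 0, 2));  % indices of rows of A that are entirely zero
disp(zeroRows);                   % index of the variable treated as dependent
```

Note that the zeroed row is not necessarily row 4. Columns 1, 2, and 4 are mutually dependent, and which of them is treated as redundant depends on how the function orders the columns internally, so the zero row can fall on any one of the three. The choice is deterministic for a given input, not random.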

Hope this helps!
