PLSR in MATLAB vs Eigenvector Research software

I have noticed differences in the predicted values between PLSR in MATLAB and the Eigenvector Research software. In MATLAB, I use the steps below and calculate the predicted values with leave-one-out cross-validation.
C=cvpartition(50,'leaveout')
[Xl,Yl,Xs,Ys,beta,pctVar,mse,stats] = plsregress(X,y,8,'CV',C);
yfit = [ones(size(X,1),1) X]*beta;
On the other hand, I also have the Eigenvector toolbox and use its cross-validation function to calculate predicted values, see below:
[press,cumpress,rmsecv,rmsec,cvpred,misclassed] = crossval(X,y,'sim','loo',8);
Even though I get exactly the same regression coefficients from both MATLAB and the Eigenvector software, yfit is not equal to cvpred. The two behave similarly, but they differ in scale. I am sure that I am using mean-centering as preprocessing. I would like to learn how cvpred is calculated; I could not find any information on how this value is computed.

Answers (2)

Shubham 2024-1-19
Hi Nasire,
The discrepancies you're observing between the predicted values obtained from MATLAB's plsregress function and the Eigenvector Research software's crossval function could arise from several factors, even if the regression coefficients are the same. Here are a few points to consider that might explain the differences:
  1. Preprocessing: Even though you mentioned that you are using mean center preprocessing, ensure that the preprocessing steps are identical in both environments. Any slight difference in how the data is preprocessed before it is fed into the model can result in different predictions.
  2. Cross-validation implementation: The leave-one-out cross-validation (LOOCV) method should theoretically give the same results in both MATLAB and Eigenvector Research software. However, implementation details can vary. For example, there could be differences in how the data is partitioned, how missing values are handled, or how the model is updated for each left-out sample.
  3. Numerical precision: Different software may use different levels of numerical precision or different algorithms that can lead to slight variations in results.
  4. Scaling: If the predicted values are on different scales, it could be that one software automatically scales the predictions while the other does not. Check to see if there is a scaling option or post-processing step that is applied in one software and not the other.
  5. Algorithm Differences: The underlying algorithms used by MATLAB and Eigenvector Research software may have subtle differences. For example, they might use different criteria for convergence in iterative calculations, or they might handle collinear variables differently.
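To help rule out points 1 and 4 on the MATLAB side, note that plsregress mean-centers X and y internally and folds the offset into the first element of beta. The sketch below uses synthetic data (an assumption, not your dataset) to show that the intercept form from the question and an explicitly centered form give the same fitted values. If another tool reported predictions in centered units, the two would differ by a constant offset of mean(y); whether the Eigenvector function does so is something to confirm in its documentation.

```matlab
% Synthetic data for illustration only (sizes chosen to mirror the question)
rng(0);
n = 50; p = 20; ncomp = 8;
X = randn(n,p);
y = X*randn(p,1) + 0.1*randn(n,1) + 5;

% plsregress centers X and y internally; beta(1) is the intercept
[~,~,~,~,beta] = plsregress(X,y,ncomp);

% Fitted values using the intercept form, as in the question
yfit = [ones(n,1) X]*beta;

% Same fitted values written with explicit mean-centering
yfitCentered = (X - mean(X,1))*beta(2:end) + mean(y);

% The two forms should agree to rounding error
maxDiff = max(abs(yfit - yfitCentered));
disp(maxDiff)
```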
Regarding the calculation of cvpred in the Eigenvector Research software, without specific documentation or access to the source code, it's challenging to know exactly how the predictions are computed. However, in a typical LOOCV for PLS regression, the process is as follows:
  • For each sample in the dataset:
      • Temporarily remove the sample from the dataset.
      • Build the PLS model on the remaining data.
      • Predict the response for the removed sample using the model.
  • Aggregate the predictions for all samples to form the cvpred vector.
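The steps above can be sketched in MATLAB as a manual leave-one-out loop. This is a minimal sketch on synthetic data, not the Eigenvector implementation; the name cvpredManual is hypothetical. It also highlights one likely source of the mismatch: the beta returned by plsregress is fitted on all samples (the 'CV' option only affects the returned mse), so [ones(size(X,1),1) X]*beta gives calibration fits rather than cross-validated predictions.

```matlab
% Synthetic data for illustration only (sizes chosen to mirror the question)
rng(0);
n = 50; p = 20; ncomp = 8;
X = randn(n,p);
y = X*randn(p,1) + 0.1*randn(n,1);

% Manual leave-one-out: refit the model without sample i, then predict sample i
cvpredManual = zeros(n,1);
for i = 1:n
    train = true(n,1);
    train(i) = false;
    [~,~,~,~,b] = plsregress(X(train,:), y(train), ncomp);
    cvpredManual(i) = [1 X(i,:)]*b;
end

% Calibration fit: a single model built on all samples
[~,~,~,~,betaFull] = plsregress(X,y,ncomp);
yfit = [ones(n,1) X]*betaFull;

% Root-mean-square errors of calibration and of cross-validation
rmsec  = sqrt(mean((y - yfit).^2));
rmsecv = sqrt(mean((y - cvpredManual).^2));
fprintf('RMSEC = %.4f, RMSECV = %.4f\n', rmsec, rmsecv);
```

Comparing cvpredManual against the cvpred returned by the other toolbox, rather than against yfit, would be the like-for-like check.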

Prakash 2024-4-7
[T,P,U,Q,B]=pls(X,Y,tol);
Unrecognized function or variable 'pls'.
