How to do cross-validation with PLS feature extraction before SVM?
Hi,
I would like to know the best way to do cross-validation with a pipeline where PLS feature extraction is done before fitting an SVM. Here is my current code:
% Cross validation (train: 80%, test: 20%)
rng default;
cv = cvpartition(size(X,1),'HoldOut',0.2); % 'HoldOut' takes the fraction held out for testing
idx = cv.test;
% Separate to training and test data
XTrain = X(~idx,:);
YTrain = Y(~idx, :);
XTest = X(idx,:);
YTest = Y(idx, :);
n_components = 10; % We should optimize this
[~,~,XS,~,~,~,~,stats] = plsregress(XTrain,YTrain,n_components); % only the X scores and stats are used
W = stats.W;
SVMModel = fitcsvm(XS,YTrain,'Standardize',false,'KernelFunction','rbf',...
'KernelScale','auto'); % I would like to have parameter optimization here
% PLS does centering of the data, X0 = X - mean(X)
% XS = X0 * W
XS_test = (XTest - mean(XTrain)) * W;
YPred = predict(SVMModel, XS_test);
accuracy = sum(YPred == YTest)/length(YPred)
Using fitcsvm(..., 'OptimizeHyperparameters', 'all') isn't suitable here because it leaks information between the k folds: plsregress is fitted on the whole of XTrain to obtain XS, so every validation fold has already influenced the PLS scores. Is there a hyperparameter optimization function in MATLAB that can take the whole PLS+SVM pipeline as the fitting function?
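To make the leakage concern concrete, here is a minimal sketch of a leak-free evaluation in which PLS is refitted on each training fold and the held-out fold is only projected. The synthetic X and Y, the 5 folds, and the candidate values in ncompGrid are assumptions for illustration, not part of my real data or a recommended setting.

```matlab
% Sketch only: k-fold CV with PLS refitted inside every fold (no leakage).
% Synthetic data stands in for the real X and Y; grid values are arbitrary.
rng default;
X = randn(200, 30);
Y = double(X(:,1) + 0.5*X(:,2) + 0.3*randn(200,1) > 0);   % binary labels 0/1

cvp = cvpartition(Y, 'KFold', 5);     % stratified folds
ncompGrid = [2 5 10];                 % candidate numbers of PLS components
cvAcc = zeros(size(ncompGrid));

for g = 1:numel(ncompGrid)
    nCorrect = 0;
    for k = 1:cvp.NumTestSets
        tr = training(cvp, k);
        te = test(cvp, k);
        % Fit PLS on the training fold only
        [~,~,XS,~,~,~,~,stats] = plsregress(X(tr,:), Y(tr), ncompGrid(g));
        mdl = fitcsvm(XS, Y(tr), 'Standardize', false, ...
            'KernelFunction', 'rbf', 'KernelScale', 'auto');
        % Project the held-out fold using the training-fold mean and weights
        XSte = (X(te,:) - mean(X(tr,:), 1)) * stats.W;
        nCorrect = nCorrect + sum(predict(mdl, XSte) == Y(te));
    end
    cvAcc(g) = nCorrect / numel(Y);   % pooled accuracy over all folds
end

[bestAcc, iBest] = max(cvAcc);
bestNcomp = ncompGrid(iBest);
```

The same loop could be extended with further grid dimensions such as BoxConstraint or KernelScale, at the cost of more fits. If the selected setting is then reported as a performance estimate, it should be checked on data that was not used in this loop, such as the hold-out set above.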
1 Comment
Rishik Ramena
2021-8-30
Yes, your analysis is correct. Using fitcsvm's built-in optimization isn't suitable here because of the information leakage between the k folds. There is no built-in hyperparameter optimization function in MATLAB that can be used for the whole PLS+SVM setup.
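One possible workaround, which is a sketch rather than something confirmed in this thread, is to write the PLS+SVM cross-validation loss as a custom objective and pass it to the general-purpose optimizer bayesopt from Statistics and Machine Learning Toolbox. The helper name plsSvmCvLoss, the variable names ncomp, box and scale, their search ranges, and the synthetic data are all assumptions made for illustration.

```matlab
% Sketch only: Bayesian optimization over a leak-free PLS+SVM CV loss.
% Synthetic data stands in for the real X and Y; ranges are arbitrary.
rng default;
X = randn(200, 30);
Y = double(X(:,1) + 0.5*X(:,2) + 0.3*randn(200,1) > 0);   % binary labels 0/1
cvp = cvpartition(Y, 'KFold', 5);     % fixed folds shared by all evaluations

vars = [optimizableVariable('ncomp', [1 15], 'Type', 'integer')
        optimizableVariable('box',   [1e-2 1e2], 'Transform', 'log')
        optimizableVariable('scale', [1e-1 1e2], 'Transform', 'log')];

objFcn = @(p) plsSvmCvLoss(p, X, Y, cvp);   % hypothetical helper, defined below
results = bayesopt(objFcn, vars, 'MaxObjectiveEvaluations', 20, ...
    'IsObjectiveDeterministic', true, 'Verbose', 0, 'PlotFcn', []);
best = results.XAtMinObjective;       % one-row table: ncomp, box, scale

function loss = plsSvmCvLoss(p, X, Y, cvp)
% Misclassification rate of PLS+SVM, with PLS refitted in every fold.
% p is a one-row table supplied by bayesopt.
nWrong = 0;
for k = 1:cvp.NumTestSets
    tr = training(cvp, k);
    te = test(cvp, k);
    [~,~,XS,~,~,~,~,stats] = plsregress(X(tr,:), Y(tr), p.ncomp);
    mdl = fitcsvm(XS, Y(tr), 'Standardize', false, ...
        'KernelFunction', 'rbf', 'BoxConstraint', p.box, ...
        'KernelScale', p.scale);
    % Project the held-out fold using the training-fold mean and weights
    XSte = (X(te,:) - mean(X(tr,:), 1)) * stats.W;
    nWrong = nWrong + sum(predict(mdl, XSte) ~= Y(te));
end
loss = nWrong / numel(Y);
end
```

Because the partition cvp is created once and reused, every candidate setting is scored on the same folds. The local function has to sit at the end of the script file (or in its own file) in releases that require local functions to come last.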
Answers (0)