hyperparameter tuning with fitclinear

Hello MATLAB community,
I would like to run an SVM classification on my high-dimensional data, and I decided to use fitclinear to do so. I would like to tune lambda.
What I don't understand is the cross-validation that takes place through the HyperparameterOptimizationOptions argument.
By default, the 'MaxObjectiveEvaluations' field is set to 30 and 'Kfold' is set to 5. In my script, I choose to tune lambda, and the result is a ranked list of 30 lambda values. I do not understand where exactly the cross-validation happens.
Here is a simplified example of my code:
% 1. load data
x = data.data;
y = labels;
% 2. CV partition
CV = cvpartition(data.sex, 'KFold', 5);
for i = 1:5
    x_train = x(CV.training(i), :);
    y_train = y(CV.training(i));
    x_test = x(CV.test(i), :);
    y_test = y(CV.test(i));
    % 3. normalization (test data uses the training center and scale)
    [x_train_norm, C, S] = normalize(x_train);
    x_test_norm = normalize(x_test, 'center', C, 'scale', S);
    % 4. Hyperparameter (lambda) tuning
    VariableDescriptions = hyperparameters('fitclinear', x_train_norm, y_train);
    [mdl, ~, HyperparameterOptimizationResults] = fitclinear(x_train_norm', y_train, ...
        'ObservationsIn', 'columns', 'OptimizeHyperparameters', VariableDescriptions(1,1), ...
        'HyperparameterOptimizationOptions', struct('Optimizer', 'randomsearch', ...
        'AcquisitionFunctionName', 'expected-improvement-plus', 'Verbose', 0));
    % I am choosing 'OptimizeHyperparameters', VariableDescriptions(1,1)
    % here because I only want to tune Lambda
    % 5. Find best lambda out of the 30 MaxObjectiveEvaluations
    idx = find(HyperparameterOptimizationResults.Rank == 1);
    lambda = HyperparameterOptimizationResults.Lambda(idx);
    % 6. Train final SVM model
    finalModel = fitclinear(x_train_norm', y_train, 'ObservationsIn', 'columns', ...
        'Lambda', lambda);
    % 7. Predict labels for test data
    [predictionsY, scores] = predict(finalModel, x_test_norm);
end
In this example, when the hyperparameter tuning happens in Step 4, is x_train_norm further split into 5 training/test groups? Is each of the 30 lambdas then evaluated using these 5 training/test groups of x_train_norm? Is this process equivalent to nested cross-validation?
I appreciate the help!
Best,
Nasia

Answers (1)

Drew 2023-8-23
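The short answer is yes. Your outer loop holds out one of five folds for testing, and within each iteration the hyperparameter optimization inside fitclinear runs its own cross-validation on x_train_norm. By default, HyperparameterOptimizationOptions uses 'Kfold', 5, so each of the 30 candidate lambda values is scored by its 5-fold cross-validation loss on the outer training data. An outer cross-validation loop for performance estimation wrapped around an inner cross-validation loop for tuning is "nested cross-validation". The defaults are documented at https://www.mathworks.com/help/stats/fitclinear.html.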
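To make the inner split visible rather than relying on the default, one option is to build the inner partition yourself and pass it through the 'CVPartition' field of HyperparameterOptimizationOptions. The following is a minimal sketch, not your exact pipeline: the data is synthetic, and the names outerCV, innerCV, opts, and outerLoss are placeholders I made up for illustration. It assumes the Statistics and Machine Learning Toolbox is installed.

```matlab
% Sketch of nested cross-validation with an explicit inner partition.
% Synthetic data stands in for the real x and y (assumption).
rng(0);                                    % reproducible folds and random search
n = 200; p = 50;
X = randn(n, p);
y = double(X(:,1) + 0.5*randn(n,1) > 0);   % binary labels, 0 or 1

outerCV = cvpartition(y, 'KFold', 5);      % outer loop: performance estimate
outerLoss = zeros(outerCV.NumTestSets, 1);

for i = 1:outerCV.NumTestSets
    Xtr = X(training(outerCV, i), :);  ytr = y(training(outerCV, i));
    Xte = X(test(outerCV, i), :);      yte = y(test(outerCV, i));

    % Normalize the test fold with the training center and scale
    [XtrN, C, S] = normalize(Xtr);
    XteN = normalize(Xte, 'center', C, 'scale', S);

    % Inner loop: explicit 5-fold partition of the outer training data.
    % This replaces the default 'Kfold', 5 used by the optimizer.
    innerCV = cvpartition(ytr, 'KFold', 5);
    opts = struct('Optimizer', 'randomsearch', ...
                  'MaxObjectiveEvaluations', 30, ...
                  'CVPartition', innerCV, ...
                  'ShowPlots', false, 'Verbose', 0);

    % Each of the 30 candidate lambdas is scored on innerCV; the returned
    % model is then fit on all of XtrN using the best-ranked lambda.
    [mdl, ~, results] = fitclinear(XtrN, ytr, ...
        'Learner', 'svm', ...
        'OptimizeHyperparameters', {'Lambda'}, ...
        'HyperparameterOptimizationOptions', opts);

    % Outer test fold is touched only here (default loss: classification error)
    outerLoss(i) = loss(mdl, XteN, yte);
end

fprintf('Nested-CV classification error: %.3f\n', mean(outerLoss));
```

In this sketch the model returned by the tuning call is used directly for prediction, so a separate refit with the selected lambda is not needed; whether that suits your workflow is up to you. Passing {'Lambda'} by name is an alternative to indexing into the output of hyperparameters.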
If this answer is helpful for you, please remember to accept the answer.

Release

R2023a
