Changing optimization technique for Gaussian process regression model

By default, the GPR model uses the 'bayesopt' optimizer to optimize the hyperparameters. I wish to use particle swarm optimization instead
to optimize the hyperparameters, i.e., to minimize the loss function or the MSE. Please help.
clear;clc;close all
load('data001.mat')
x = data001(:,1);
y = data001(:,2);
rng default
gprMdl = fitrgp(x,y,'KernelFunction','squaredexponential',...
'OptimizeHyperparameters','auto','HyperparameterOptimizationOptions',...
struct('AcquisitionFunctionName','expected-improvement-plus'));
ypred = resubPredict(gprMdl);
figure();
plot(x,y,'r.');
hold on
plot(x,ypred,'k','LineWidth',2);
xlabel('x');
ylabel('y');
hold off

Accepted Answer

Alan Weiss 2022-7-6
Alan Weiss
MATLAB mathematical toolbox documentation
7 Comments
Alan Weiss 2022-7-11
There are several errors in your setup, but the most important one is this: you are attempting to use particleswarm to choose an integer or categorical variable. But particleswarm is for continuous variables only. It cannot reliably choose a kernel function.
I am truly baffled as to why you are trying so hard to not use the built-in optimizer that is tailor-made for this purpose. But that is your choice.
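For reference, here is a minimal sketch of what the built-in route could look like when both the kernel function and Sigma are searched. The synthetic data is an assumption standing in for data001.mat, and the evaluation budget of 20 is an arbitrary illustrative choice, not a recommendation.

```matlab
% Sketch only: synthetic data stands in for data001.mat (assumption).
rng default                                   % reproducible search
X = linspace(0,10,60)';
Y = sin(X) + 0.1*randn(size(X));

% Options for the built-in hyperparameter search (budget chosen arbitrarily).
opts = struct('Optimizer','bayesopt', ...
    'MaxObjectiveEvaluations',20, ...
    'Kfold',5, ...
    'ShowPlots',false, ...
    'Verbose',0);

% Let fitrgp search over the kernel family and the noise level together.
mdl = fitrgp(X,Y, ...
    'OptimizeHyperparameters',{'KernelFunction','Sigma'}, ...
    'HyperparameterOptimizationOptions',opts);

disp(mdl.KernelFunction)   % kernel selected by the search
disp(mdl.Sigma)            % noise standard deviation of the final model
```

Here the categorical kernel choice and the continuous Sigma are handled in one call, which is the case particleswarm cannot cover on its own.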
If you really want to use another optimizer (and I think it is a mistake to do so), you have to use one that allows integer variables. Most likely ga. So you have 5 kernel functions to choose from. Let x(1) be an integer from 1 through 5. You need to map the x(1) variable to the appropriate name, like this:
krnl = KernelFunction{x(1)};
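As a small illustration of that mapping, the sketch below indexes a cell array of kernel names with the integer variable. The example vector x and the round call are assumptions added for illustration; with ga and an integer constraint, x(1) should already be integer-valued.

```matlab
% Candidate kernels, in the order the integer variable refers to them.
KernelFunction = {'exponential','squaredexponential','matern32', ...
    'matern52','rationalquadratic'};

x = [3, 0.5];            % hypothetical point: x(1) = kernel index, x(2) = Sigma
idx = round(x(1));       % guard against tiny floating-point drift (assumption)
krnl = KernelFunction{idx};

disp(krnl)               % name that would be passed to fitrgp
```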
You seem to want to optimize the kernel scale also. According to the documentation of fitrgp, "fitrgp uses the KernelParameters argument to specify the value of the kernel scale parameter, which is held constant during fitting." However, according to the KernelParameters argument documentation, this argument must be "Initial values for the kernel parameters, specified as a vector." These are just initial values; they are not held constant. So I don't think that you can do what you want to do using ga. Instead, let's optimize the Sigma argument for values between 1e-3 and 10.
Now, your data has continuous values for the variables, because you are doing regression, not classification. So you have to change the cross-validation partition to use no stratification.
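A minimal sketch of two ways to get a nonstratified 5-fold partition, using a synthetic response as a stand-in (an assumption). Passing only the number of observations to cvpartition also gives a random partition with no stratification.

```matlab
rng default
Y = randn(50,1);                                  % synthetic continuous response (assumption)

% Option 1: pass the response but switch stratification off.
c1 = cvpartition(Y,'KFold',5,'Stratify',false);

% Option 2: pass only the number of observations.
c2 = cvpartition(numel(Y),'KFold',5);

disp(c1.TestSize)                                 % observations per test fold
disp(c2.TestSize)
```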
Your final code is as follows:
load('data001.mat')
X = data001(:,1); % predictor
Y = data001(:,2); % response
KernelFunction = {'exponential','squaredexponential','matern32','matern52','rationalquadratic'}; % candidate kernels, indexed by x(1)
c = cvpartition(Y,'KFold',5,'Stratify',false); % 5-fold partition without stratification
fobj = @(x)kfoldLoss(fitrgp(X,Y,'CVPartition',c,'KernelFunction',KernelFunction{x(1)},'Sigma',x(2))); % cross-validated MSE for kernel x(1) and Sigma x(2)
intcon = 1; % intcon is the indicator of integer variables
lb = [1,1e-3]; % set lower bounds
ub = [5,1e1]; % set upper bounds
[sol,fval] = ga(fobj,2,[],[],[],[],lb,ub,[],intcon); % mixed-integer search over kernel index and Sigma
FinalModel = fitrgp(X,Y,'KernelFunction',KernelFunction{sol(1)},'Sigma',sol(2)); % refit on all data with the selected values
I do not understand why you would use this code. I see no advantage over the built-in optimization capabilities of fitrgp. But suit yourself.
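If particleswarm is still wanted, one way to stay within its continuous-only limits is to fix the kernel in advance and search Sigma alone. The following is a sketch under several assumptions: synthetic data in place of data001.mat, a fixed squaredexponential kernel, a search over log10(Sigma), a small swarm and iteration budget chosen only to keep the run short, and ConstantSigma set to true because, as I read the fitrgp documentation, Sigma is otherwise treated as an initial value.

```matlab
% Sketch only: synthetic data stands in for data001.mat (assumption).
rng default
X = linspace(0,10,60)';
Y = sin(X) + 0.1*randn(size(X));
c = cvpartition(numel(Y),'KFold',5);   % same folds for every candidate

% p = log10(Sigma), so the bounds -3..1 correspond to Sigma in 1e-3..10.
% ConstantSigma=true keeps the candidate value fixed during fitting (assumption).
fobj = @(p) kfoldLoss(fitrgp(X,Y,'CVPartition',c, ...
    'KernelFunction','squaredexponential', ...
    'Sigma',10^p,'ConstantSigma',true));

% Small budget so the sketch finishes quickly; not tuned.
opts = optimoptions('particleswarm','SwarmSize',10, ...
    'MaxIterations',10,'Display','off');
[pBest,fval] = particleswarm(fobj,1,-3,1,opts);

% Refit on all data with the selected noise level.
finalModel = fitrgp(X,Y,'KernelFunction','squaredexponential', ...
    'Sigma',10^pBest,'ConstantSigma',true);
```

The kernel choice itself stays outside the swarm, so comparing kernels this way would mean repeating the search once per kernel and keeping the lowest fval.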
Alan Weiss
MATLAB mathematical toolbox documentation
Josh 2022-7-14
Thank you for being so patient with my questions and for elaborating with additional information.
I appreciate your response. Thank you.
