How to use both mean and standard deviation/variance of each data to build a surrogate model?

5 次查看(过去 30 天)
I have the following data coming from a stochastic experiment. I ran the experiment for 10 times for each x and tabulated the mean and standard deviations. How can I use both the mean and standard deviation to train a surrogate model? I tried using fitrgp but it does not take standard deviations. I could use sigma, but it takes a scaler value, not a vector.
x = [5; 7; 9; 11; 13];
mean_output= [103.78; 108.84; 117.68; 109.57; 72.26];
std_output = [4.20; 3.44; 4.25; 10.09; 10.71];
gprModels = fitrgp(x, mean_output, ...
'KernelFunction', 'squaredexponential', ...
'BasisFunction', 'constant', ...
'FitMethod', 'exact', ...
'PredictMethod', 'exact')

采纳的回答

Nithin
Nithin 2025-4-25
The "Sigma" parameter in the "fitrgp" function is designed for homoscedastic noise, meaning it cannot directly take a vector of output standard deviations (std_output) that varies across data points. To incorporate both the mean and standard deviation when training a surrogate model with "fitrgp", you can use two separate Gaussian Process (GP) models: one to model the mean and another to model the variance. After fitting these models, you can define a custom GP that integrates the variance model, and then fit a final GP using this custom kernel. This methodology is based on the research paper "Improved Most Likely Heteroscedastic Gaussian Process Regression via Bayesian Residual Moment Estimator," which can be found in the following link: https://ieeexplore.ieee.org/document/9103623
Refer to the following steps for a detailed understanding of the approach:
  • Fit a GP surrogate model to your mean values as usual:
gprMean = fitrgp(x, mean_output, 'KernelFunction', 'squaredexponential');
  • Fit a separate GP to the sample variances (i.e., squared standard deviations)
variance_output = std_output.^2;
gprVar = fitrgp(x, variance_output, 'KernelFunction', 'squaredexponential');
  • Create a kernel function that, for the training points, adds the predicted variance from gprVar to the diagonal of the kernel matrix:
kfcn = @(XN, XM, theta) myHeteroKernel(XN, XM, theta, gprVar);
function K = myHeteroKernel(XN, XM, theta, gprVar)
sigmaL = exp(theta(1));
sigmaF = exp(theta(2));
D2 = pdist2(XN, XM, 'squaredeuclidean');
K = (sigmaF^2) * exp(-0.5 * D2 / sigmaL^2);
if isequal(XN, XM) % Only add variance for training data
varXN = predict(gprVar, XN);
varXN = max(varXN, 0);
K = K + diag(varXN);
end
end
  • Now, fit the GP to your mean data using the custom kernel:
gprHetero = fitrgp(x, mean_output, ...
'KernelFunction', kfcn, ...
'KernelParameters', [0; 0], ...
'Sigma', 0); % Set Sigma=0, as noise is handled by the kernel
  • Make predictions for the input values using:
xq = (5:0.1:13)';
[yhat, ysd, yci] = predict(gprHetero, xq);
As the number of input data points is only 5, the variance GP may be rough. Implementing the above script results the below image:
Refer to the following documentations to know more about "fitrgp" function: https://www.mathworks.com/help/stats/fitrgp.html
  1 个评论
Rounak Saha Niloy
While running the code, I am getting this error-
Error using classreg.learning.modelparams.GPParams.make (line 378)
'Sigma' must be a positive, numeric, real scalar with no NaN or Inf values.

请先登录,再进行评论。

更多回答(1 个)

UDAYA PEDDIRAJU
UDAYA PEDDIRAJU 2025-4-22
Hi Rounak,
To incorporate both the mean and standard deviation of your stochastic experiment data into a Gaussian Process surrogate model in MATLAB, you can use the "Sigma" parameter of the "fitrgp" function. This parameter does not accept a vector of observation noise standard deviations, but you can use a representative scalar noise level, for example, the mean or median of your standard deviations:
x = [5; 7; 9; 11; 13];
mean_output = [103.78; 108.84; 117.68; 109.57; 72.26];
std_output = [4.20; 3.44; 4.25; 10.09; 10.71];
sigma_scalar = mean(std_output); % Representative noise level
gprModel = fitrgp(x, mean_output, ...
'KernelFunction', 'squaredexponential', ...
'BasisFunction', 'constant', ...
'Sigma', sigma_scalar, ...
'FitMethod', 'exact', ...
'PredictMethod', 'exact');
Alternatively, omit the Sigma parameter and let fitrgp estimate a single noise level automatically by enabling hyperparameter optimization:
% Data (same as above)
x = [5; 7; 9; 11; 13];
mean_output = [103.78; 108.84; 117.68; 109.57; 72.26];
% Fit Gaussian Process Regression model with automatic noise estimation
gprModel_opt = fitrgp(x, mean_output, ...
'KernelFunction', 'squaredexponential', ...
'BasisFunction', 'constant', ...
'OptimizeHyperparameters', 'Sigma', ... % Optimize noise level
'HyperparameterOptimizationOptions', struct('ShowPlots', true, 'Verbose', 1));

类别

Help CenterFile Exchange 中查找有关 Gaussian Process Regression 的更多信息

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by