Obtaining posterior covariance of GPR using fitrgp and predictExactWithCov

9 次查看(过去 30 天)
Hello, thank you for reading this question. I have a question about using matlab's (maybe undocumented) function predictExactWithCov.
I have found in this forum that if I have a RegressionGP gprmdl, set of test points xTest, then by using
[postMeanGPR,postCovGPR,confInt] = predictExactWithCov(gprmdl.Impl,xTest,0.05);
Unable to resolve the name 'gprmdl.Impl'.
I can obtain an posterior mean of gprmdl at xTest (postMeanGPR), posterior covariance matrix at xTest (postCovGPR), and 95% confidence interval based on covariance matrix (confInt).
On the other hand, I can calculate posterior mean and variance using
(*)
where zero prior is assumed, kis a kernel, and so on.
However, using the kernel information from gprmdl like
sig_f = gprmdl.KernelInformation.KernelParameters(12);
sig_l = gprmdl.KernelInformation.KernelParameters(1:11)';
lambda = gprmdl.Sigma;
and calculating posterior mean and covariance using equation (*), I can obtain a same mean as postMeanGPR but posterior variance was quite different with postCovGPR's diagonal elements.
Can anyone explain why does such a discrepency occur?
Thanks.
PS: the overall code is attached below
clear;
clc;
wineTable = readtable('winequality-red.csv'); % wine data from cortez et al 2009., file attached
% 1st-11th columns are inputs and 12th column is output
wineData = table2array(wineTable);
len_tot = length(wineData);
len_train = 1000;
len_test = len_tot - len_train;
DataTrain = zeros(len_train,12);
DataTest = zeros(len_test,12);
NN = 1;
rng(NN);
%% STEP 1. Load data and separate it into training and test sets
p = sort(randperm(len_tot,len_test));
jj = 1; kk = 1;
for ii = 1:len_tot
if ismember(ii,p','rows')
DataTest(jj,:) = wineData(ii,:);
jj = jj + 1;
else
DataTrain(kk,:) = wineData(ii,:);
kk = kk + 1;
end
end
%% STEP 2. Calculate GP posterior mean and covariance using predictExactWithCov
xTrain = DataTrain(:,1:11);
yTrain = DataTrain(:,12);
xTest = DataTest(:,1:11);
yTest = DataTest(:,12);
gprmdl = fitrgp(xTrain,yTrain,'KernelFunction','ardsquaredexponential','BasisFunction','none');
[postMeanGPR,postCovGPR,confInt] = predictExactWithCov(gprmdl.Impl,xTest,0.05);
%% STEP 3. Calculate GP posterior mean and variance using equation (*)
sig_f = gprmdl.KernelInformation.KernelParameters(12);
sig_l = gprmdl.KernelInformation.KernelParameters(1:11)';
lambda = gprmdl.Sigma;
Gram = Kmat(xTrain,sig_f,sig_l);
KinvY = (Gram + lambda^2*eye(len_train))\yTrain;
postMeanGPR2 = zeros(len_test,1);
postVarGPR2 = postMeanGPR2;
for ii = 1:len_test
postMeanGPR2(ii) = kCol(xTest(ii,:),xTrain,sig_f,sig_l)'*KinvY;
postVarGPR2(ii) = SE(xTest(ii,:),xTest(ii,:),sig_f,sig_l) - kCol(xTest(ii,:),xTrain,sig_f,sig_l)'*((Gram + lambda^2*eye(len_train))\kCol(xTest(ii,:),xTrain,sig_f,sig_l));
end
function val = SE(x1, x2, sig_f, sig_l)
% x1, x2 in R^d, row vectors
% hyperparameter sig_l: length scale, length(sig_l) = d, sig_f > 0
Lambda = diag(sig_l);
val = sig_f^2 * exp(-0.5*( (x1 - x2) /Lambda^2 * (x1 - x2)' ) );
end
function k = kCol(xTest, xTrain, sig_f, sig_l)
% xTest: Test point, row vector
% xTrain: Training points, each row corresponds to one training point
N = length(xTrain);
k = zeros(N,1);
for ii = 1:N
k(ii) = SE(xTest, xTrain(ii,:), sig_f, sig_l);
end
end
function K = Kmat(X, sig_f, sig_l)
% each row of X corresponds to one data point of Gram matrix K
n = length(X);
K = zeros(n,n);
for ii = 1:n
for jj = 1:n
K(ii,jj) = SE(X(ii,:),X(jj,:), sig_f, sig_l);
end
end
end

回答(1 个)

UDAYA PEDDIRAJU
UDAYA PEDDIRAJU 2024-7-26
Hi Heein,
Dcoumentation for "predictExactWithCov" can be found by right clicking on it and selecting help, you can also see the implementational details of the function by opening it. The discrepancy in posterior covariance calculations between "predictExactWithCov" and your manual implementation likely arises from differences in implementation.
To resolve this:
  1. Check Kernel Function Implementation: Accurately replicate the kernel function used in "fitrgp". Minor differences can significantly impact results.
  2. Inspect Hyperparameters: Verify that hyperparameters extracted from "gprmdl.KernelInformation.KernelParameters" match your manual calculations.
  3. Debug Matrix Operations: Carefully review matrix operations in your manual code for errors.
  4. Simplify and Compare: Start with a smaller dataset and simpler kernel to isolate the issue. Compare intermediate results step-by-step.

类别

Help CenterFile Exchange 中查找有关 Gaussian Process Regression 的更多信息

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by