Understanding the Jeffrey prior penalty term in the glmfit function in matlab R2024a

11 次查看(过去 30 天)
Hi,
Consider the example separable dataset
x_i = 800:5:900;
y_i = [0 0 0 0 0, ...
0 0 0 0 0, ...
1 1 1 1 1, ...
1 1 1 1 1, ...
1];
In the R2024a version of matlab it is now possible to fit a probit curve to these data using Jeffrey's prior as a penalty function.
%Standard way - singular coefficients
b_standard = glmfit(x_i', y_i', 'binomial','link', 'probit');
Warning: Iteration limit reached.
Warning: The estimated coefficients perfectly separate failures from successes. This means the theoretical best estimates are not finite. For the fitted linear combination XB of the predictors, the sample proportions P of Y=N in the data satisfy:
XB<-0.0197012: P=0
XB>-0.0197012: P=1
To get proper estimates, fit with 'LikelihoodPenalty' set to 'jeffreys-prior'.
%Jeffrey prior - nonsingular coefficients
b_jeff = glmfit(x_i', y_i', 'binomial','link', 'probit',LikelihoodPenalty="jeffreys-prior");
%Plot the standard and Jeffrey-prior coefficients
figure(1)
hold on
plot(b_standard(1),b_standard(2),'r*')
plot(b_jeff(1),b_jeff(2),'k*')
This is good because instead of obtaining divergent coefficients we obtain finite coefficients which are shifted towards the origin.
My questions are related to what is going on under the hood of the glmfit function.
  • Is there a simple way to plot the Jeffrey prior that is being used (as a function of possible coefficient values), and if so how?
  • Is it also possible to plot the resulting likelihood function (as a function of possible coefficient values) when we used the Jeffrey prior?
In the case where the Jeffrey prior is not being used, I can compute the likelihood function as
%Plot the likelihood
%I here transformed from b_0 and b_1 to the mean and slope parameters
%because the plots often look nicer in the latter coordinates
mu = linspace(840,860,201);
sigma = linspace(-1,5,101);
LikelihoodList = [];
for i = 1:length(mu)
for j = 1:length(sigma)
LikelihoodList = [LikelihoodList; mu(i), sigma(j), LikelihoodFunction(-mu(i)./sigma(j),1./sigma(j),x_i,y_i)];
end
end
X = LikelihoodList(:,1); Y = LikelihoodList(:,2); [x,y]=meshgrid(X,Y); Z = LikelihoodList(:,3);
figure(2)
plot3(X,Y,Z)
function likelihood = LikelihoodFunction(b_0,b_1,x_i,y_i)
%Computes the likelihood
%Take in a single value of b0 and b1 and a list of experimental data x_i and y_i
p_i = normcdf(b_0 + b_1*x_i);
likelihood_product_list = p_i.^y_i.*(1-p_i).^(1-y_i);
likelihood = prod(likelihood_product_list);
end
Related texts can be found here:
neither are using matlab.

采纳的回答

Arnav
Arnav 2024-11-5
Hi Moosejr,
According to the sources mentioned, Jeffrey’s prior is proportional to the square root of the determinant of the fisher information. Fisher Information can be calculated as shown in the code snippet shown below:
function I = FisherInformation(b_0, b_1, x_i)
I = zeros(2, 2);
for i = 1:length(x_i)
eta_i = b_0 + b_1 * x_i(i);
p_i = normcdf(eta_i);
phi_i = normpdf(eta_i);
% Clip probabilities to avoid division by zero
p_i = max(min(p_i, 1 - 1e-10), 1e-10);
w = (phi_i^2) / (p_i * (1 - p_i));
I(1, 1) = I(1, 1) + w;
I(1, 2) = I(1, 2) + w * x_i(i);
I(2, 1) = I(1, 2);
I(2, 2) = I(2, 2) + w * x_i(i)^2;
end
% Regularization term to avoid singular matrix
I = I + 1e-10 * eye(2);
end
Then, the likelihood with Jeffrey’s prior can be found as shown in the code snippet below:
likelihood = LikelihoodFunction(b_0, b_1, x_i, y_i);
fisher_info = FisherInformation(b_0, b_1, x_i);
jeffrey_prior = sqrt(det(fisher_info));
LikelihoodList(i, j) = likelihood * jeffrey_prior;
The plot of the likelihood is shown below:
  1 个评论
Moosejr
Moosejr 2024-11-20
编辑:Moosejr 2024-11-20
Hi @Arnav,
Thank you for your excellent response! I'm very sorry for replying so late, I have been travelling without access to my computer.
I have tried to implement your suggestion in a minimal example, I hope I got it right. I think we get the same curve (I'm not 100% sure because it is a bit hard to compare your png/jpg to my matlab file since I can't locally rotate the former).
However, I noticed that the maximum likelihood value differs from the ML estimate provided by matlab's glmfit function (the 'ko'-point differs from the maximum value of the likelihood function that I plotted). I therefore assume that something is amiss. Do you know what is wrong and how it can be fixed?
x_i = 800:5:900;
y_i = [0 0 0 0 0, ...
0 0 0 0 0, ...
1 1 1 1 1, ...
1 1 1 1 1, ...
1];
%Jeffrey prior - nonsingular coefficients
b_jeff = glmfit(x_i', y_i', 'binomial','link', 'probit', LikelihoodPenalty= "jeffreys-prior");
LikelihoodList = [];
mu = linspace(830,870,50);
sigma = linspace(-1,15,50);
[X,Y] = meshgrid(sigma, mu);
for i = 1:length(mu)
for j = 1:length(sigma)
likelihood = LikelihoodFunction(-mu(i)./sigma(j), 1./sigma(j), x_i, y_i);
fisher_info = FisherInformation(-mu(i)./sigma(j), 1./sigma(j), x_i);
jeffrey_prior = sqrt(det(fisher_info));
LikelihoodList(i, j) = likelihood * jeffrey_prior;
end
end
figure(3)
surf(X, Y, LikelihoodList)
hold on
plot3(1/b_jeff(2), -b_jeff(1)/b_jeff(2), max(LikelihoodList(:)), 'ko');
%% Functions
function likelihood = LikelihoodFunction(b_0,b_1,x_i,y_i)
%Computes the likelihood
%Take in a single value of b0 and b1 and a list of experimental data x_i and y_i
p_i = normcdf(b_0 + b_1*x_i);
likelihood_product_list = p_i.^y_i.*(1-p_i).^(1-y_i);
likelihood = prod(likelihood_product_list);
end
function I = FisherInformation(b_0, b_1, x_i)
I = zeros(2, 2);
for i = 1:length(x_i)
eta_i = b_0 + b_1 * x_i(i);
p_i = normcdf(eta_i);
phi_i = normpdf(eta_i);
% Clip probabilities to avoid division by zero
p_i = max(min(p_i, 1 - 1e-10), 1e-10);
w = (phi_i^2) / (p_i * (1 - p_i));
I(1, 1) = I(1, 1) + w;
I(1, 2) = I(1, 2) + w * x_i(i);
I(2, 1) = I(1, 2);
I(2, 2) = I(2, 2) + w * x_i(i)^2;
end
% Regularization term to avoid singular matrix
I = I + 1e-10 * eye(2);
end

请先登录,再进行评论。

更多回答(0 个)

产品


版本

R2024a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by