Understanding the update equation in logistic regression/classifier

Following a tutorial, I implemented the steps for building a logistic classifier, shown below:
%% logistic regression tutorial; https://machinelearningmastery.com/logistic-regression-tutorial-for-machine-learning/
% initialize variables
% columns: X1, X2, class label Y
temp = [2.7810836 2.550537003 0
    1.465489372 2.362125076 0
    3.396561688 4.400293529 0
    1.38807019 1.850220317 0
    3.06407232 3.005305973 0
    7.627531214 2.759262235 1
    5.332441248 2.088626775 1
    6.922596716 1.77106367 1
    8.675418651 -0.2420686549 1
    7.673756466 3.508563011 1];
X1 = temp(:,1);
X2 = temp(:,2);
Y = temp(:,3);
% define anonymous function "log_trans" (the logistic function)
log_trans = @(x)(1 ./ (1 + exp(-x)));
% initialize parameters
B0 = 0;
B1 = 0;
B2 = 0;
alpha = 0.3;
epoc = 1;
dataSize = size(Y,1);
Y_pred = zeros(dataSize,1); % preallocate predicted labels
for i2 = 1:dataSize*10
    i1 = mod(i2-1,dataSize) + 1; % cycle through rows 1..dataSize
    x = B0*1 + B1*X1(i1) + B2*X2(i1);
    prediction = log_trans(x);
    B0 = B0 + alpha*(Y(i1) - prediction)*prediction*(1 - prediction)*1;
    B1 = B1 + alpha*(Y(i1) - prediction)*prediction*(1 - prediction)*X1(i1);
    B2 = B2 + alpha*(Y(i1) - prediction)*prediction*(1 - prediction)*X2(i1);
    if prediction > 0.5
        Y_pred(i1,1) = 1;
    else
        Y_pred(i1,1) = 0;
    end
    if i1 == dataSize % end of an epoch: record accuracy
        Acc(epoc,1) = ((dataSize-sum(abs(Y - Y_pred)))/(dataSize));
        epoc = epoc + 1;
    end
end
It produces the same final coefficient values as the tutorial.
My question is regarding these lines:
B0 = B0 + alpha*(Y(i1) - prediction)*prediction*(1 - prediction)*1;
B1 = B1 + alpha*(Y(i1) - prediction)*prediction*(1 - prediction)*X1(i1);
B2 = B2 + alpha*(Y(i1) - prediction)*prediction*(1 - prediction)*X2(i1);
I understand that each iteration updates the previous coefficient values (B0, B1, B2), and that the update is scaled by alpha (set to 0.3 per the tutorial). For the remaining three factors, (Y(i1) - prediction), prediction, and (1 - prediction), I cannot arrive at a satisfyingly intuitive understanding.
prediction is the output of the logistic function (excuse my lack of formal language) and ranges from 0 to 1. Y is a column vector of 0/1 labels. So I can at least see that the closer prediction is to Y(i1), the better the coefficients are performing, and so the smaller the incremental adjustment. What I cannot see is why the factors prediction and (1 - prediction) are included, and I would appreciate some help here.
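One way to probe where the two extra factors might come from is a numerical check. The sketch below rests on an assumption that is not stated in the code above: that each update is a gradient descent step on the squared error 0.5*(y - prediction)^2. Under that assumption, the chain rule gives a gradient of -(y - prediction) times the slope of the logistic function times the input, and that slope equals prediction*(1 - prediction). The values B, xin, y and the step h are hypothetical, chosen only for illustration.

```matlab
% Logistic function, as in the question
log_trans = @(x)(1 ./ (1 + exp(-x)));

% Hypothetical values chosen only for illustration
B   = [0.1; -0.2; 0.3];              % [B0; B1; B2]
xin = [1; 2.7810836; 2.550537003];   % [1; X1; X2] for the first training row
y   = 0;                             % label of that row
h   = 1e-6;                          % finite-difference step

% 1) Slope of the logistic curve vs. prediction*(1 - prediction)
z = B' * xin;
p = log_trans(z);
slope_numeric = (log_trans(z + h) - log_trans(z - h)) / (2*h);
slope_formula = p * (1 - p);

% 2) Numerical gradient of the ASSUMED loss 0.5*(y - p)^2 w.r.t. each coefficient
loss = @(b) 0.5 * (y - log_trans(b' * xin))^2;
grad_numeric = zeros(3,1);
for k = 1:3
    e = zeros(3,1);
    e(k) = h;
    grad_numeric(k) = (loss(B + e) - loss(B - e)) / (2*h);
end

% Update direction used in the question (without alpha)
update_dir = (y - p) * p * (1 - p) * xin;

% Both differences should be close to zero if the assumption holds
slope_err = abs(slope_numeric - slope_formula);
grad_err  = max(abs(update_dir + grad_numeric));
fprintf('slope difference: %g\n', slope_err);
fprintf('update vs. negative gradient difference: %g\n', grad_err);
```

If both differences come out near zero, the update direction matches the negative gradient of that squared error, and prediction*(1 - prediction) is simply the slope of the logistic curve at the current prediction. That slope peaks at 0.25 when prediction is 0.5 and shrinks toward 0 as prediction approaches 0 or 1, so the same error (Y(i1) - prediction) moves the coefficients less where the curve is flat.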
