Understanding the update equation in logistic regression/classifier

Question

Z Liang 2019-10-16

Following a tutorial, I tried implementing the steps of building a logistic classifier below

%% logistic regression tutorial; https://machinelearningmastery.com/logistic-regression-tutorial-for-machine-learning/
% initialize variables 
temp = [2.7810836	2.550537003	0
1.465489372	2.362125076	0
3.396561688	4.400293529	0
1.38807019	1.850220317	0
3.06407232	3.005305973	0
7.627531214	2.759262235	1
5.332441248	2.088626775	1
6.922596716	1.77106367	1
8.675418651	-0.2420686549	1
7.673756466	3.508563011	1];
X1 = temp(:,1);   % first input feature
X2 = temp(:,2);   % second input feature
Y = temp(:,3);    % class label (0 or 1)
% define anonymous function "log_trans" (the logistic, or sigmoid, function)
log_trans = @(x)(1 ./ (1 + exp(-x)));
% initialize parameters
B0 = 0;
B1 = 0;
B2 = 0;
alpha = 0.3;
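epoc = 1;                             % epoch counter
dataSize = size(Y,1);
Y_pred = zeros(dataSize,1);           % preallocate predicted labels
for i2 = 1:dataSize*10                % 10 passes over the data, one sample per step
    i1 = mod(i2 - 1, dataSize) + 1;   % cycle through rows 1..dataSize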
    x = B0*1 + B1*X1(i1) + B2*X2(i1);
    prediction = log_trans(x);
    B0 = B0 + alpha*(Y(i1) - prediction)*prediction*(1 - prediction)*1;
    B1 = B1 + alpha*(Y(i1) - prediction)*prediction*(1 - prediction)*X1(i1);
    B2 = B2 + alpha*(Y(i1) - prediction)*prediction*(1 - prediction)*X2(i1);
    if prediction > 0.5
        Y_pred(i1,1) = 1;
    else
        Y_pred(i1,1) = 0;
    end
    if i1 == dataSize                     % end of an epoch: record accuracy
        Acc(epoc,1) = mean(Y == Y_pred);  % fraction of correctly predicted labels
        epoc = epoc + 1;
    end
end

It produces the same final coefficient values as the tutorial.

My question is regarding these lines:

    B0 = B0 + alpha*(Y(i1) - prediction)*prediction*(1 - prediction)*1;
    B1 = B1 + alpha*(Y(i1) - prediction)*prediction*(1 - prediction)*X1(i1);
    B2 = B2 + alpha*(Y(i1) - prediction)*prediction*(1 - prediction)*X2(i1);
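
Since the three lines differ only in their last factor, they can be read as one vector update with a shared scalar factor. A minimal sketch for a single sample, assuming the hypothetical names `b` (coefficient vector), `xi` (the input with a leading 1 for the intercept), `yi`, and `delta`, none of which appear in the code above:

```matlab
% Hypothetical names (b, xi, yi, delta) are for illustration only.
log_trans = @(x)(1 ./ (1 + exp(-x)));    % same logistic function as above
alpha = 0.3;                             % learning rate used in the tutorial
b  = [0; 0; 0];                          % [B0; B1; B2], all starting at 0
xi = [1; 2.7810836; 2.550537003];        % [1; X1(1); X2(1)], first training row
yi = 0;                                  % Y(1)

prediction = log_trans(b' * xi);         % B0*1 + B1*X1 + B2*X2 passed through the sigmoid
delta = (yi - prediction) * prediction * (1 - prediction);  % scalar factor shared by all three lines
b = b + alpha * delta * xi;              % updates B0, B1, and B2 in one step
```

Written this way, the question reduces to the meaning of the single scalar `delta`.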

I understand that each iteration updates the previous coefficient values (B0, B1, B2) and that the update is scaled by alpha (set to 0.3, per the tutorial). What I cannot arrive at is an intuitive understanding of the remaining three factors: (Y(i1) - prediction), prediction, and (1 - prediction).

Prediction is the output of a "logistic curve" (excuse my lack of formal language) and ranges from 0 to 1. Y is a column vector of 0/1 labels. So I can at least intuit that the closer prediction is to Y(i1), the better the coefficients are performing, and so the smaller the incremental adjustment. What I cannot intuit is why prediction and (1 - prediction) are included, and I would appreciate some help here.
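
One way to probe the extra factors is to compare the update with a numerical gradient. The sketch below assumes (this is not stated in the code above) that the rule is gradient descent on the per-sample squared error 0.5*(y - prediction)^2. Under that assumption, prediction*(1 - prediction) is the derivative of the logistic function with respect to its input, so by the chain rule the gradient with respect to each coefficient is -(y - prediction)*prediction*(1 - prediction)*input. The names `b`, `xi`, `yi`, `sq_err`, `analytic`, `numeric`, and `h` are hypothetical.

```matlab
% Hypothetical check: does the update direction match the gradient of a squared error?
log_trans = @(x)(1 ./ (1 + exp(-x)));      % logistic function, as in the code above
b  = [0.1; -0.2; 0.3];                     % arbitrary coefficients, chosen only for the check
xi = [1; 1.465489372; 2.362125076];        % [1; X1(2); X2(2)], second training row
yi = 0;                                    % Y(2)

% Assumed loss (not confirmed by the tutorial code): half the squared error
sq_err = @(b) 0.5 * (yi - log_trans(b' * xi))^2;

% Analytic gradient via the chain rule; p*(1 - p) is the sigmoid's derivative
p = log_trans(b' * xi);
analytic = -(yi - p) * p * (1 - p) * xi;

% Central finite differences, one coefficient at a time
h = 1e-6;
numeric = zeros(3,1);
for k = 1:3
    e = zeros(3,1);
    e(k) = h;
    numeric(k) = (sq_err(b + e) - sq_err(b - e)) / (2*h);
end

disp(max(abs(analytic - numeric)));        % largest disagreement between the two gradients
```

If the two gradients agree, each update line is a step of size alpha against this gradient. The factor prediction*(1 - prediction) peaks at 0.25 when prediction = 0.5 and shrinks toward 0 as prediction approaches 0 or 1, so under this reading the updates are largest when the model is undecided. For comparison, gradient ascent on the log-likelihood, the more common objective for logistic regression, gives an update of alpha*(Y - prediction)*input with no such factor.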