fminunc stopped because it cannot decrease the objective function along the current search direction.
I am trying to use `fminunc` to obtain the optimal theta in logistic regression, but I keep getting this message:
fminunc stopped because it cannot decrease the objective function
along the current search direction.
Searching online, I found that this is usually caused by an error in the gradient, which I compute in `logistic_costFunction.m`. I re-checked my work but cannot spot the root cause.
I am not sure how to solve this issue. Any help would be appreciated.
Here is my code:
clear all; close all; clc;
%% Plotting data
x1 = linspace(0,3,50);
mqtrue = 5;
cqtrue = 30;
dat1 = mqtrue*x1+5*randn(1,50);
x2 = linspace(7,10,50);
dat2 = mqtrue*x2 + (cqtrue + 5*randn(1,50));
x = [x1 x2]'; % X
subplot(2,2,1);
dat = [dat1 dat2]'; % Y
scatter(x1, dat1); hold on;
scatter(x2, dat2, '*'); hold on;
classdata = (dat>40);
%% Compute Cost and Gradient
% Setup the data matrix appropriately, and add ones for the intercept term
[m, n] = size(x);
% Add intercept term to x and X_test
x = [ones(m, 1) x];
% Initialize fitting parameters
initial_theta = zeros(n + 1, 1);
% Compute and display initial cost and gradient
[cost, grad] = logistic_costFunction(initial_theta, x, dat);
fprintf('Cost at initial theta (zeros): %f\n', cost);
fprintf('Gradient at initial theta (zeros): \n');
fprintf(' %f \n', grad);
%% ============= Part 3: Optimizing using fminunc =============
% In this exercise, you will use a built-in function (fminunc) to find the
% optimal parameters theta.
% Set options for fminunc
options = optimset('GradObj', 'on', 'MaxIter', 400);
% Run fminunc to obtain the optimal theta
% This function will return theta and the cost
[theta, cost] = ...
fminunc(@(t)(logistic_costFunction(t, x, dat)), initial_theta, options);
logistic_costFunction.m
-----------------------
function [J, grad] = logistic_costFunction(theta, X, y)
% Initialize some useful values
m = length(y); % number of training examples
grad = zeros(size(theta));
H = sigmoid(X*theta);
T = y.*log(H) + (1 - y).*log(1 - H);
J = -1/m*sum(T);
for i = 1 : m
grad = grad + (H(i) - y(i)) * X(i,:)';
end
grad = 1/m*grad;
end
sigmoid.m
function g = sigmoid(z)
% Computes the sigmoid of z
g = zeros(size(z));
g = 1 ./ (1 + (1 ./ exp(z)));
end
Answers (4)
Wasiq Malik
2019-7-20
Edited: Wasiq Malik
2019-7-20
I was having the same issue until I found a mistake in my own code.
I was using exp(z) in my sigmoid function, and it looks like your sigmoid also uses exp(z).
The sigmoid function is 1/(1 + e^(-z)),
so write the sigmoid definition directly with exp(-z).
That resolved the fminunc message for me.
function g = sigmoid(z)
% Computes the sigmoid of z
g = 1 ./ (1 + exp(-z));
end
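The form used in the question, 1 ./ (1 + 1 ./ exp(z)), is algebraically the same as 1 ./ (1 + exp(-z)), so it is worth confirming whether the sigmoid is really the culprit before relying on this change. A minimal sketch of such a check follows; the test points in `z` and the names `gQuestion`, `gStandard`, and `maxDiff` are illustrative choices, not from the original post.

```matlab
% Compare the sigmoid written as in the question with the standard form.
z = [-800; -30; -1; 0; 1; 30; 800];          % includes extreme inputs on purpose
gQuestion = 1 ./ (1 + (1 ./ exp(z)));        % form from the question
gStandard = 1 ./ (1 + exp(-z));              % form from this answer
maxDiff = max(abs(gQuestion - gStandard));   % largest element-wise difference
fprintf('Max difference between the two forms: %g\n', maxDiff);
```

If the printed difference is at the level of floating-point round-off, the sigmoid is not what stops `fminunc`, and the labels or the gradient deserve a closer look.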
1 comment
Raghav Gopal Rao Netrakanti
2019-11-8
Hi,
I have the sigmoid function written correctly, but I still get the same error. :/ Is there anything else I need to change?
Alan Weiss
2019-4-16
Without running your example, I wonder if you could make two little changes to see if things are OK:
- Change the initial point to not be all zeros. Random is OK, but you might want to set the seed first to make things reproducible.
- Set the CheckGradients option to true (well, since you are using optimset, set the DerivativeCheck option to 'on') to determine if the gradient calculation is OK.
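Both suggestions can be tried on a small problem before touching the full script. The following is a minimal sketch, assuming toy data `X` and `y` and a manual central-difference check that stands in for the built-in `DerivativeCheck`; the names `costFn`, `gradFn`, `numGrad`, and `relErr` are hypothetical and do not appear in the original code.

```matlab
% Seeded random start plus a manual central-difference gradient check.
% With the Optimization Toolbox, the built-in check is enabled instead via
% optimset('GradObj','on','MaxIter',400,'DerivativeCheck','on').
rng(0);                                        % fix the seed for reproducibility
X = [ones(20,1) linspace(-2,2,20)'];           % toy design matrix with intercept
y = double(X(:,2) + 0.5*randn(20,1) > 0);      % toy 0/1 labels
sigmoid = @(z) 1 ./ (1 + exp(-z));
costFn  = @(t) -mean(y.*log(sigmoid(X*t)) + (1 - y).*log(1 - sigmoid(X*t)));
gradFn  = @(t) X' * (sigmoid(X*t) - y) / numel(y);

theta0 = 0.1*randn(size(X,2),1);               % random, non-zero initial point
h = 1e-6;                                      % finite-difference step
numGrad = zeros(size(theta0));
for k = 1:numel(theta0)
    e = zeros(size(theta0));
    e(k) = h;                                  % perturb one coordinate at a time
    numGrad(k) = (costFn(theta0 + e) - costFn(theta0 - e)) / (2*h);
end
relErr = norm(numGrad - gradFn(theta0)) / max(1, norm(numGrad));
fprintf('Relative gradient error: %g\n', relErr);
```

A relative error far above round-off level would point to a mistake in the analytic gradient; a tiny one suggests the problem lies elsewhere, for example in the data passed to the cost function.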
Alan Weiss
MATLAB mathematical toolbox documentation
Matt J
2019-4-16
Edited: Matt J
2019-4-16
You will need a dedicated function for computing the log-sigmoid. Composing log and sigmoid as separate functions is numerically unstable. This FEX contribution may be useful as a way of stably computing log(sum(exp(x))).
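To illustrate the instability, here is a small sketch comparing the naive composition with one stable formulation. It assumes the identity log(sigmoid(z)) = min(z,0) - log1p(exp(-abs(z))), which is a standard rewrite rather than the FEX routine mentioned above; the names `logsig`, `naive`, and `stable` are illustrative.

```matlab
% Naive log(sigmoid(z)) versus a stable rewrite using log1p.
sigmoid = @(z) 1 ./ (1 + exp(-z));
logsig  = @(z) min(z, 0) - log1p(exp(-abs(z)));   % never forms log(0)
z = [-800; -40; 0; 40];
naive  = log(sigmoid(z));      % sigmoid(-800) underflows to 0, so log gives -Inf
stable = logsig(z);            % stays finite for every input in z
disp([z naive stable]);
```

Once a -Inf enters the cost, a product such as 0 .* (-Inf) yields NaN, and the optimizer can no longer compare objective values along its search direction.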
1 comment
Matt J
2019-4-16
Edited: Matt J
2019-4-16
Or, try this. Note that you should be using "classdata" where you are currently using "dat".
[theta, cost, exitflag,stats,grad] = ...
fminunc(@(t)(logistic_costFunction(t, x, classdata)), initial_theta, options)
function [J, grad] = logistic_costFunction(theta, X, y)
    Xt = X*theta;
    % log(1 - sigmoid(z)) equals logsigmoid(-z), so log(0) is never formed
    T = y.*logsigmoid(Xt) + (1 - y).*logsigmoid(-Xt);
    J = -mean(T);
    if nargout > 1
        % Vectorized gradient: X' * (sigmoid(X*theta) - y) / m
        grad = (sigmoid(Xt) - y).' * X;
        grad = grad.' / numel(y);
    end
end
function g = sigmoid(z)
    % Computes the sigmoid of z
    g = 1 ./ (1 + exp(-z));
end
function y = logsigmoid(z)
    % Computes the log-sigmoid of z in a numerically stable fashion.
    z = -z;
    idx = z <= 33;   % above 33, log1p(exp(z)) differs from z by less than 5e-15
    y = z;
    y(idx) = log1p(exp(z(idx)));
    y = -y;
end
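The switch from "dat" to "classdata" matters because the logistic cost assumes labels of 0 or 1. With raw targets well above 1, the factor (1 - y) becomes negative and the cost is no longer bounded below, so a line search can fail. A quick check of the labels, as a sketch: the data generation mirrors the question's script, the seed is an arbitrary choice, and `labels` is an illustrative name.

```matlab
% Confirm that the targets passed to the cost function are 0/1 class labels.
rng(1);                                    % arbitrary seed, for reproducibility only
x1 = linspace(0,3,50);
x2 = linspace(7,10,50);
dat = [5*x1 + 5*randn(1,50), 5*x2 + 30 + 5*randn(1,50)]';   % raw targets
classdata = double(dat > 40);              % thresholded 0/1 labels, as in the question
labels = unique(classdata);
fprintf('dat spans [%.1f, %.1f]; labels are %s\n', ...
        min(dat), max(dat), mat2str(labels'));
```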
AOULADHADJ Driss
2020-10-18
Use this instead of the last line of the code, the one that contains the fminunc call:
fminunc(@(t)(logistic_costFunction(t, x, dat)), initial_theta, options)
P.S. Leave off the trailing semicolon.