Fminsearch: more initial values than parameters

Question

Conor Donnelly 2016-1-12


https://ww2.mathworks.cn/matlabcentral/answers/263545-fminsearch-more-initial-values-than-parameters

Commented: Conor Donnelly 2016-1-14

Hi,

I am using fminsearch to estimate 4 parameters: x(1), x(2), x(3), and x(4). The problem is that I wish to use two objective functions to estimate these parameters: the two functions share the parameters x(2) and x(3), whereas x(1) is unique to the first objective function and x(4) is unique to the last objective function. I have included the two functions below:

@(x) x(1)*exp(-t*(x(1)+x(3)))+(x(3)*((x(2)*exp(-t*x(2))*(x(1)+x(3)))/(x(1)-x(2)+x(3))-(x(2)*exp(-t*(x(1)+x(3)))*(x(1)+x(3)))/(x(1)-x(2)+x(3))))/(x(1)+x(3))
@(x)-exp(-t*x(3))/(x(3)/x(4)-1)-exp(-t*x(4))/(x(4)/x(3)-1)

I have a condition within my code which states which objective function should be used for each value of t. Finally, 4 initial values are specified for fminsearch each time.

I originally thought that as x(4) is not included within the first function that fminsearch would not attempt to calculate this parameter. The issue is that this seems to not be the case; random estimates of this 4th parameter are still made, despite the fact that it is not within the objective function. Why is that?

I realise that this question may seem complicated (and like an atypical use of fminsearch), so I suppose a more straightforward scenario is that of an objective function with only one parameter: if you provide a vector of 5 initial values, then you will get back a vector of 5 parameter estimates, where only the first is a legitimate estimate and the other 4 seem like nonsense. For some reason, however, these nonsense parameters aren't set to 0. Is there any way that I can make sure that they are? Is it possible to specify that x(4) is a fixed value not to be estimated, i.e. x(4)=0?

2 Comments

jgg 2016-1-12

Edited: jgg 2016-1-12

I'm a little confused. Are you running fminsearch for a fixed value of t? If so, why not separate this into two objective functions and run each with just that value of t, instead of doing it this way? That way, you would get 3 parameters estimated in the case where you have 3, and vice versa (and it would save on computation time).

The basic reason is that the optimizer uses a reflection pattern to search, which means it will frequently poll values of x(4) that are not zero. These don't affect the objective function, since x(4) doesn't enter into it, so the solver could very well find values of the other parameters that are optimal while x(4) is not zero. Both are minima of your function.

If you know ex ante that the value for x(4) is junk, then you could just set it to zero after the solver ends.
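If x(4) should instead be held at a fixed value throughout, one option is to let fminsearch search over the free parameters only and append the fixed value inside a wrapper. The following is a minimal sketch under stated assumptions: obj is a made-up toy objective that ignores x(4) (it stands in for the real one), and the names x4_fixed, reduced, p_hat and x_hat are purely illustrative.

```matlab
% Toy objective (hypothetical stand-in): depends on x(1:3) only, so it is
% flat in x(4) and every value of x(4) gives the same function value.
obj  = @(x) (x(1)-1)^2 + (x(2)-2)^2 + (x(3)-3)^2;
flat = obj([1 2 3 0]) == obj([1 2 3 99]);   % true: x(4) has no effect

% Hold x(4) at a fixed value and let fminsearch search over x(1:3) only.
x4_fixed = 0;
reduced  = @(p) obj([p, x4_fixed]);         % p is a 1-by-3 row vector
p_hat    = fminsearch(reduced, [0.5 0.5 0.5]);
x_hat    = [p_hat, x4_fixed];               % full vector; x(4) stays at 0
```

Because fminsearch never sees x(4) here, it cannot move it. The same wrapper pattern should work for any subset of parameters you want to pin.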

Conor Donnelly 2016-1-12

t is a vector of times and the distribution of these times is described by x(1), x(2) and x(3) (where the first function is the probability density function of the distribution). So each observed time, t, is run through the function to get the best estimates of x(1), x(2) and x(3).

The problem is that some individuals are censored, and for these individuals I wish instead to use the second function (where the censored individuals can be used to help estimate x(3)). In this case, however, nonsense estimates are being generated for x(1) and x(2), introducing bias to the overall optimal estimates produced. I want to somehow stop fminsearch from making estimates of x(1) and x(2) if they don't appear in the objective function.


Answer 1

jgg 2016-1-13


https://ww2.mathworks.cn/matlabcentral/answers/263545-fminsearch-more-initial-values-than-parameters#answer_205964


Okay, so based on what I outlined earlier, I think this is the way you should try to solve this. First of all, structure your data like this: for each observation i (an individual), have a number t, which is the time (censored or not), and an indicator ind, which records the censoring status. In the formulas below, ind = 1 marks an uncensored time and ind = 0 a censored one. Put each into a big vector, so the index of the vector corresponds to the given person (for instance, ind(i) is the censoring indicator for person i, and so forth).

Then, you just want to form the likelihood function. Your first function is the probability of observing a time t (given the parameters x) for an uncensored individual. So define:

 f1 = @(x,t)(x(1)*exp(-t*(x(1)+x(3)))+(x(3)*((x(2)*exp(-t*x(2))*(x(1)+x(3)))/(x(1)-x(2)+x(3))-(x(2)*exp(-t*(x(1)+x(3)))*(x(1)+x(3)))/(x(1)-x(2)+x(3))))/(x(1)+x(3))); %or whatever
 f2 = @(x,t)(-exp(-t*x(3))/(x(3)/x(4)-1)-exp(-t*x(4))/(x(4)/x(3)-1)); %or whatever

The second function corresponds to the individuals who are censored. Then, the (log) probability of an observation is:

f = @(x,t,ind)(ind*log(f1(x,t)) + (1-ind)*log(f2(x,t)));

This is the log-likelihood function you want to maximize. The only issue now is wrapping it; put this all together as follows:

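function [ll] = log_likelihood(x,t,ind)
    f1 = @(x,t)(...); % as above
    f2 = @(x,t)(...); % as above
    f = @(x,t,ind)(ind*log(f1(x,t)) + (1-ind)*log(f2(x,t)));
    n = numel(t); % works for row or column vectors
    ll = zeros(n,1);
    for i=1:n
        % pass the whole parameter vector x (not x(i)): f1 and f2 index x(1)..x(4)
        ll(i) = f(x,t(i),ind(i));
    end
    ll = -sum(ll); % negative log-likelihood, because fminsearch minimizes
end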

Now, you have a function which takes in your data (t,ind) and parameters x and returns you the (negative) log-likelihood of that data given the parameters. To optimize this, run in your main script by passing the data:

%load in t,ind first
func = @(x)log_likelihood(x,t,ind);
x_0 = [1,1,1,2];%some starting point; keep x(3) ~= x(4), otherwise f2 divides by zero
optimum = fminsearch(func,x_0);

This will use all of the information you have to produce a single estimate of x(1), x(2), x(3), x(4) without the issue you were encountering. In fact, this is probably the statistically correct way to do it, since estimating either part without the other is inefficient and potentially biased.

This code might not be exactly what you need, but it should provide a basic structure. You can also look up the technique: maximum likelihood estimation, and then look for dealing with right-censored data.
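To see the censored-likelihood idea working end to end, here is a small self-contained sketch. It is only an illustration under loud assumptions: the data are made up, and a one-parameter exponential model (density lambda*exp(-lambda*t), survival exp(-lambda*t)) stands in for the f1 and f2 above, chosen because its censored MLE has a closed form to check against. The names log_pdf, log_surv, negll and theta are hypothetical.

```matlab
% Toy right-censored data (made up for illustration).
t   = [0.5; 1.2; 0.3; 2.0; 0.8; 1.5];   % observed or censoring times
ind = [1;   1;   1;   0;   1;   0];     % 1 = uncensored, 0 = censored

% Stand-in model: exponential with rate lambda (NOT the thread's f1/f2).
log_pdf  = @(lambda, t) log(lambda) - lambda .* t;   % log density
log_surv = @(lambda, t) -lambda .* t;                % log survival function

% Negative log-likelihood: uncensored rows use the density, censored rows
% use the survival function. Optimize over theta = log(lambda) so that
% lambda = exp(theta) stays positive without constraints.
negll = @(theta) -sum(ind .* log_pdf(exp(theta), t) ...
                      + (1 - ind) .* log_surv(exp(theta), t));

theta_hat  = fminsearch(negll, 0);
lambda_hat = exp(theta_hat);

% Closed-form MLE for this model: number of events / total time at risk.
lambda_closed = sum(ind) / sum(t);
```

The numerical estimate lambda_hat should agree closely with lambda_closed. Optimizing on the log scale is one simple way to keep a rate parameter positive, since fminsearch itself is unconstrained.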