different results from fmincon with nonlcon

4 views (last 30 days)
Hi, my problem is that an optimization with fmincon (see below) is not stable, i.e. different runs deliver different results. The optimization problem is set up as follows:
Many thanks in advance for your help!
Best Wishes, Alex
The function to be optimized:
fun=@(x)x*meanOrRegressed
The nonlinear constraint (sum of all positive weights not above 1.3 and sum of all negative weights not below -0.3):
function [c,ceq,gradc,gradceq]=nonlcon_g(x)  % the 1.3 / 0.3 limits are hard-coded below
nrAssets=size(x,1);
weightSumMax = sum(max(0,x(1:nrAssets-1)));
c(1) = weightSumMax-1.3;
weightSumMin = sum(min(0, x(1:nrAssets-1)));
c(2) = -weightSumMin-0.3;
gradc1=1/2*(sign(x(1:nrAssets-1))+ones(nrAssets-1,1));
gradc1=vertcat(gradc1,0);
gradc2=1/2*(sign(x(1:nrAssets-1))-ones(nrAssets-1,1));
gradc2=vertcat(gradc2,0);
gradc=horzcat(gradc1,gradc2);
ceq =[];
gradceq=[];
end
[weights,optimizedValue,exitflag] = fmincon(fun,initialWeights',A,b,Aeq,beq,lowerBound,upperBound,@nonlcon_g,options)
A = -1 -1 -1 -1 -1 0
1 1 1 1 1 0
0 0 0 0 0 -1
0 0 0 0 0 1
b = 0.3 1.3 0.3 1.3
Aeq = 1 1 1 1 1 1
beq = 1
lowerBound = -0.3 -0.3 -0.3 -0.3 -0.3 -0.3
upperBound = 1.3 1.3 1.3 1.3 1.3 1.3
meanOrRegressed =
2.349096891796729e-004
-7.582259013820250e-005
1.190461785891006e-003
2.529756213317396e-003
1.066862350689632e-003
5.133561643835617e-005
options =
Display: []
MaxFunEvals: 60000
MaxIter: 40000
TolFun: 1.000000000000000e-010
TolX: []
FunValCheck: []
OutputFcn: []
PlotFcns: []
ActiveConstrTol: []
Algorithm: 'active-set'
AlwaysHonorConstraints: []
BranchStrategy: []
DerivativeCheck: []
Diagnostics: []
DiffMaxChange: []
DiffMinChange: []
FinDiffRelStep: []
FinDiffType: 'forward'
GoalsExactAchieve: []
GradConstr: 'on'
GradObj: []
HessFcn: []
Hessian: []
HessMult: []
HessPattern: []
HessUpdate: []
InitialHessType: []
InitialHessMatrix: []
InitBarrierParam: []
InitTrustRegionRadius: []
Jacobian: []
JacobMult: []
JacobPattern: []
LargeScale: []
LineSearchType: []
MaxNodes: []
MaxPCGIter: []
MaxProjCGIter: []
MaxRLPIter: []
MaxSQPIter: 200000
MaxTime: []
MeritFunction: []
MinAbsMax: []
NodeDisplayInterval: []
NodeSearchStrategy: []
NoStopIfFlatInfeas: []
ObjectiveLimit: []
PhaseOneTotalScaling: []
Preconditioner: []
PrecondBandWidth: []
RelLineSrchBnd: []
RelLineSrchBndDuration: []
ScaleProblem: []
Simplex: []
SubproblemAlgorithm: []
TolCon: 1.000000000000000e-008
TolConSQP: []
TolGradCon: []
TolPCG: []
TolProjCG: []
TolProjCGAbs: []
TolRLPFun: []
TolXInteger: []
TypicalX: []
UseParallel: 'always'

Accepted Answer

Matt J 2014-4-14
Edited: Matt J 2014-4-14
Another solution, one which avoids the need for epsilon-approximation, is to recognize that your original "nonlinear" constraints are really linear ones. In particular, one can show that
sum(max(x,0))<=K
is equivalent to the set of linear inequalities
sum(x(J))<=K
for every index subset, J. This can be expressed by adding more rows to your A,b data. I do so as follows and get a highly accurate result.
load AlexData
N=length(initialWeights);
AA=dec2bin(0:2^N-1,N)-'0';
AA(sum(AA,2)<=1,:)=[];
M=size(AA,1);
bb(1:M,1)=maxLong;
bb(M+1:2*M)=maxShort;
A=[A;AA;-AA]; %Append new constraints
b=[b(:);bb];
lowerBound=max(lowerBound,-maxShort);
upperBound=min(upperBound, maxLong);
options.Algorithm='interior-point';
options.TolX=1e-20;
options.TolFun=1e-20;
options.UseParallel=[];
[weights,optimizedValue,exitflag,output,lambda,grad,hessian] = ...
fmincon(@(x)dot(-x,ret), initialWeights', A, b, Aeq, beq,...
lowerBound, upperBound,[],options);
weightError=weights(:).' - [0, 1.3, -.3 0 0 0],
Unfortunately, this means that your constraint data A,b consume O(N*2^N) memory where N is the number of unknown weights. It should be fine, though, for small problems (N<=15 or so).
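As a quick sanity check (not part of the original answer; the names below are illustrative), one can verify the subset equivalence numerically: for any test vector, the largest of the enumerated linear left-hand sides equals sum(max(x,0)), provided the empty and singleton subsets are kept. Matt drops the rows with at most one nonzero entry only because those are already enforced by the bounds.

% Hedged sanity check of the subset equivalence
N = 5;
x = randn(N,1);                      % arbitrary test point
AAall = dec2bin(0:2^N-1, N) - '0';   % every index subset J as a 0/1 row
lhsNonlinear = sum(max(x,0));        % original "nonlinear" constraint value
lhsLinear    = max(AAall*x);         % tightest enumerated linear row
fprintf('nonlinear: %.6f   linear: %.6f\n', lhsNonlinear, lhsLinear)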
  5 Comments
alexander 2014-4-22
Dear Matt, thanks again for all your help. The least I can do is to click the accept-answer button, but I've looked three times now and still cannot find it! Hope you can help! Thanks & Kind Regards, Alex
Matt J 2014-4-23
Maybe because you're not logged in to your mathworks account?


More Answers (2)

Matt J 2014-3-28
The nonlinear constraints are not differentiable, so that could well be the cause of the instability.
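One workaround that comes up later in this thread is an epsilon-approximation of the constraints. The max_smooth/min_smooth functions referred to below are not posted, so the following is only a sketch: the smoothing form and the parameter epsSmooth are assumptions, but they show how max(0,x) and min(0,x) can be replaced by differentiable surrogates so that fmincon sees smooth constraints.

% Sketch of an epsilon-smoothed constraint pair (epsSmooth is an assumption)
epsSmooth  = 1e-6;
max_smooth = @(x) 0.5*(x + sqrt(x.^2 + epsSmooth));   % approximates max(0,x)
min_smooth = @(x) 0.5*(x - sqrt(x.^2 + epsSmooth));   % approximates min(0,x)
c1 = @(x)  sum(max_smooth(x(1:end-1))) - 1.3;   % long exposure  <= 1.3
c2 = @(x) -sum(min_smooth(x(1:end-1))) - 0.3;   % short exposure <= 0.3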
  14 Comments
Alexander 2014-4-8
Hi Matt, is there anything else which might be useful for you to tackle the problem? I'd be glad to be of any assistance. Thanks & Kind Regards, Alex
Matt J 2014-4-11
Edited: Matt J 2014-4-11
Hi Alex,
I've run your code, and also made some modifications of my own, which I've attached.
I do not see the instability that you describe when I run your version. I ran 10 consecutive trials and verified that the answers were identical across all of them. However, I vaguely wonder if you might get slightly different results because of UseParallel being turned on. Parallel processing routines can cause things to be executed in a somewhat random order. In theory the order of parallel tasks shouldn't matter, but in practice, for parallel reduction operations, there can be slight order-dependencies in the results due to floating point precision issues. That's just a guess, though -- I'm not even sure there are any reduction operations when fmincon runs in parallel mode. In any case, I don't really see that UseParallel would help you in terms of speed for such a small computation as this. I think it might even hurt you.
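To illustrate the reduction-order point (an added example, not from the thread), the same numbers summed in two different orders need not give identical floating-point results:

% Summation order can change a floating-point result
v  = [1e16; 1; -1e16; 1];        % values of very different magnitudes
s1 = sum(v);                     % left-to-right: ((1e16+1)-1e16)+1 = 1
s2 = sum(v([1 3 2 4]));          % regrouped:     (1e16-1e16)+1+1   = 2
fprintf('s1 = %g, s2 = %g\n', s1, s2)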
I do see limitations on the accuracy of the result. However, because you are not solving the original problem, but rather an epsilon-approximation of it, you can only expect to get approximate results. There is also an influence on accuracy of doing finite difference derivative computations.
In my version, I turned on the 'GradConstr' option and also switched back to the 'active-set' method. I also introduced a refinement step. It basically takes the solution of the epsilon-problem and uses that to decide which weights should be positive and which weights should be negative. It then constrains the search to that specific pattern of positives/negatives (similar to what I suggested earlier that you do combinatorially). This allows you to get rid of the nonlinear constraints, whereupon the solution becomes well-conditioned and highly accurate.
Finally, I corrected a bug in your implementations of max_smooth/min_smooth. You set nrAssets=length(x)-1 for some reason, when it really needs to be nrAssets=length(x).
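A minimal sketch of the refinement step Matt describes (the attached file is not shown in the thread, so the variable names and the maximize-return objective below are assumptions): use the sign pattern of the epsilon-solution to tighten the bounds, after which the exposure limits reduce to two ordinary linear rows and the nonlinear constraint function can be dropped.

% Refinement sketch: fix the long/short pattern found by the first solve
isLong = weights(:) >= 0;                 % sign pattern from the epsilon-run
lb = lowerBound(:);  ub = upperBound(:);
lb(isLong)  = max(lb(isLong), 0);         % long weights stay nonnegative
ub(~isLong) = min(ub(~isLong), 0);        % short weights stay nonpositive

% With the signs fixed, sum(max(0,x)) is just the sum of the long weights,
% so the exposure limits become linear (last asset excluded, as originally):
n = numel(weights);
rowLong  =  double(isLong(:)).';   rowLong(n)  = 0;
rowShort = -double(~isLong(:)).';  rowShort(n) = 0;

[weightsRefined, valueRefined] = fmincon(@(x) -dot(x, meanOrRegressed), ...
    weights, [rowLong; rowShort], [1.3; 0.3], Aeq, beq, lb, ub, [], options);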



Alexander 2014-4-23
Hi Matt, it finally worked!! Thanks again, and all the best! Alex
