Quadprog message: The problem is non-convex
18 views (last 30 days)
Show older comments
S.
2020-1-23
Hi,
First of all, I am using MATLAB R2017b and the 2019 Optimization Toolbox. I am using quadprog (not surprisingly ;)) to solve a QP problem. In reality, my problem is convex; however, I get a message from quadprog in the command window that my problem is non-convex. I am not using lower and upper bounds, but inequality constraints. Furthermore, I checked whether my Hessian has negative eigenvalues. This was indeed the case: I had one or two really small negative eigenvalues (on the order of -1e-20). However, I am using quadprog for multiple time steps, whereby the linear term f (in my case F) is updated and therefore changes a little bit after every time step. If I run the code for, say, 10 time steps, I get the message that the problem is non-convex 8 times, and 2 times that the problem is convex and a minimum has been found. This is, in my opinion, quite weird, because the Hessian stays the same every time step (and with that the one or two really small negative eigenvalues).
Does anyone have a potential solution to this issue, or an idea of its cause? If you think you need more information, please let me know.
G = (G+G')/2;   % symmetrize the Hessian before passing it to quadprog
tic;
[X,fval,exitflag,output,lambda] = quadprog(G,F,Aineq_SparseCondensed,bineq_SparseCondensed,[],[],[],[],[],options);
toc
Thanks in advance!
Kind regards
1 Comment
Walter Roberson
2020-3-9
You cannot use the 2019 toolboxes with 2017 MATLAB without expecting inexplicable errors.
Accepted Answer
Matt J
2020-1-23
Edited: Matt J
2020-1-23
You cannot submit a problem that is borderline convex to quadprog if it is using an algorithm that expects convexity. If you do, quadprog can see the problem as convex or non-convex arbitrarily, depending on floating-point numerical errors that essentially contribute random noise to the calculations it is doing.
I would guess that your Hessian is supposed to be more distinctly positive definite, so you should review the way you set up the problem.
42 Comments
S.
2020-1-24
Thanks for your fast response.
The problem is not borderline convex, but completely convex in reality. The problem starts to occur when the size of the Hessian becomes bigger. Furthermore, if I use, for example, lower and upper bounds instead of the inequality constraints, the problem is reported as convex, which is strange, because the function of both is almost the same. If I use both lb and ub and the inequality constraints, the solving time is extremely long and the solution is wrong.
Do you maybe know a systematic approach I could follow to find out what causes this problem (for example, to determine whether numerical issues cause it), or settings in quadprog I could try? Furthermore, how could I make my Hessian more positive definite by just changing the setup of my problem? Could you give an example?
Matt J
2020-1-24
Edited: Matt J
2020-1-24
The problem is not borderline convex, but completely convex in reality... Furthermore, if I use, for example, lower and upper bounds instead of the inequality constraints, the problem is reported as convex
No, whether a quadratic problem is strongly convex (which you want, so that numerical behavior is good) is dictated entirely by the eigenvalues of your Hessian. Adding or removing bounds and other linear inequality constraints does not influence that.
However, the ability of quadprog to recognize that your problem is strongly convex can be impaired by floating point errors when your Hessian matrix is nearly singular. Moreover, changing the constraints can conceivably affect the floating point errors that are encountered and what software flags they trigger. For example, under some particular constraints, the algorithm might try to step in a direction that is in the null space of your Hessian. Along such directions, the quadratic objective function should have zero curvature, but due to floating point errors, the curvature that gets calculated by the algorithm may be slightly negative and so it will raise a flag that the problem is non-convex.
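For instance, here is a minimal MATLAB sketch (with a made-up rank-deficient Hessian, not your actual data) of how rounding alone can make the computed curvature along a null-space direction come out slightly negative:
Qmat = orth(randn(3));                 % a random orthonormal basis
H = Qmat*diag([1e6 1 0])*Qmat';        % Hessian with an exact zero eigenvalue
H = (H+H')/2;                          % symmetrize, as quadprog recommends
d = Qmat(:,3);                         % direction in the null space of H
curvature = d'*H*d                     % 0 in exact arithmetic; in floating point,
                                       % a tiny value (~1e-10) of arbitrary sign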
Do you maybe know a systematic approach I could follow to find out what causes this problem
In addition to checking the Hessian eigenvalues as you have done (they should all be strictly positive), you should check its reciprocal condition number rcond(H). If it is very close to zero, it means H is nearly singular and hence not strongly convex, for numerical purposes.
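For example (a minimal check, assuming the Hessian is stored in G as in the code you posted):
G = (G+G')/2;        % symmetrize before testing
e = eig(full(G));
[min(e), max(e)]     % every eigenvalue should be strictly positive
rcond(full(G))       % values near eps (~2.2e-16) mean numerically singular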
Furthermore, how could I make my Hessian more positive definite by just changing the setup of my problem? Could you give an example?
Often, a near-singular Hessian arises because there aren't enough equations to determine the optimum uniquely, or because there are redundant variables. For example, in a linear least squares problem min_x ||A*x - b||^2, the Hessian 2*transpose(A)*A will be singular if A does not have full column rank. Poor choices of units can also cause this behavior. For example, the quadratic function f(x) = 1e12*x1^2 + x2^2 is theoretically strongly convex, but numerically it has a very singular-looking Hessian:
>> H=[1e12, 0; 0 1]*2; rcond(H),
ans =
1.0000e-12
But if I just change the units of x via the change of variables y1 = 1e6*x1, y2 = x2, then the new objective f(y) = y1^2 + y2^2 has a distinctly non-singular Hessian:
>> H=eye(2)*2; rcond(H)
ans =
1
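The same rescaling trick can be applied to a whole QP. A hedged sketch, with a diagonal scaling built from the Hessian (the names S, H, f, Aineq, bineq are illustrative):
S  = diag(1./sqrt(diag(H)));     % one simple choice of variable scaling
Hs = S'*H*S;  fs = S'*f;         % the QP in the scaled variable y, where x = S*y
y  = quadprog(Hs,fs,Aineq*S,bineq);
x  = S*y;                        % map the solution back to the original units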
S.
2020-2-11
Edited: S.
2020-2-11
Thanks for your response and the clarification. Sorry for my delayed response.
However, I have still some additional questions and will provide you with more information.
In your second reply, you mention that if the reciprocal condition number is very close to zero, the matrix is nearly singular. Could you indicate what ''very close to zero'' means, because it is very subjective? Is there a boundary for the order of magnitude in which the reciprocal condition number should lie? For example: if the reciprocal condition number is smaller than 1e-10, then the Hessian is nearly singular. Furthermore, is looking at the condition number also useful? If yes, from which order of magnitude can it be said that the matrix is ill-conditioned?
The optimization problem I am trying to solve is of the form
minimize sum_{i=1..N} [ transpose(z_i|k - z_r)*Q*(z_i|k - z_r) + transpose(u_i|k - u_r)*R*(u_i|k - u_r) ]
with z_i|k = H*x_i|k, u_i|k the predicted inputs, and x_i|k the predicted states.
In this optimal control problem, Q and R are both symmetric positive definite matrices, with Q a diagonal matrix (all diagonal entries have the value 100) and R a diagonal matrix (all diagonal entries have the value 0.1). Furthermore, the mean of the vector z_r is around 5 and the mean of the vector u_r is around 0.1. The predicted inputs are constrained to lie between 0 and 10. Therefore, I think the units are fine.
Moreover, the future states can be written as a function of the future inputs, and after some substitution the optimization problem becomes a QP in the predicted inputs with Hessian 2*transpose(Z)*Omega*Z,
with x_i|k = Z*u_i|k + z_k_hat, Z a banded null-space matrix (it should keep a specific structure), Omega a diagonal matrix with transpose(H)*Q*H and R on the diagonal entries, C_1 a vector with -2*transpose(z_r)*Q*H and -2*transpose(u_r)*R, and z_k_hat a matrix [A; A^2; ...; A^N], where A represents the dynamics of the system.
The condition number of Z is 1e7.
Do you have any ideas/suggestions for how I can reformulate this model? It is hard for me to know where to start, and I think the units are fine.
Matt J
2020-2-11
Edited: Matt J
2020-2-11
What are the condition numbers of Omega and transpose(Z)*Omega*Z? In any case, with a condition number of 1e7, it would appear that Z has some essentially linearly dependent columns. One thing that could help is if you attach these matrices (Z, Q, Omega, H, R) in a .mat file, so that others can examine them.
S.
2020-2-16
The condition numbers are as follows:
For Omega: Inf
For transpose(Z)*Omega*Z: 7.4375e+19
For Z: 5.6311e+08
Z is too big (7.5 MB). How could I send it?
Q, Ω, H and R can be found in Optimization2.zip.
Could you still answer the first question, please (about the reciprocal condition number and the condition number)?
Matt J
2020-2-16
If Z is too big, just attach its columnwise max and min:
zmax=max(abs(Z),[],1);
zmin=min(abs(Z),[],1);
S.
2020-2-17
Edited: S.
2020-2-17
This question:
In your second reply, you mention that if the reciprocal condition number is very close to zero, the matrix is nearly singular. Could you indicate what ''very close to zero'' means, because it is very subjective? Is there a boundary for the order of magnitude in which the reciprocal condition number should lie? For example: if the reciprocal condition number is smaller than 1e-10, then the Hessian is nearly singular. Furthermore, is looking at the condition number also useful? If yes, from which order of magnitude can it be said that the matrix is ill-conditioned?
I have now also attached the zmax and zmin of the Z matrix. So all the matrices have been added.
If you or somebody else could take a look at the matrices and point out something that could be improved, or the cause of the problem, that would be really nice.
Matt J
2020-2-17
Edited: Matt J
2020-2-17
Is there a boundary for the order of magnitude in which the reciprocal condition number should lie?
You won't be able to identify a sharp boundary that applies to your situation, but a condition number of less than 1000 would definitely be considered good, and a matrix with a condition number of Inf, like your Omega matrix, can definitely be considered singular. So your objective function is clearly not strongly convex. Since Omega is a diagonal matrix, I would simply investigate why its maximum diagonal element is so much larger than its minimum diagonal element...
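A quick way to do that (a minimal sketch, assuming Omega is the diagonal matrix from your condensed problem):
d = full(diag(Omega));
[min(abs(d)), max(abs(d))]   % for a diagonal matrix, their ratio is the condition number
find(d == 0)                 % any exact zeros would explain cond(Omega) == Inf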
To get a quantitative perspective on the condition number, you should understand that it is the factor by which percent numerical errors get magnified when solving a set of linear equations involving the matrix. So, if quadprog reaches a step where it must solve a set of linear equations with condition number K, and there is a percent error delta in the equation data, then the results will have percent error K*delta. Since you have a condition number of K=1e19, you would need a percent error in the input data of delta=1e-19 % (which is beyond what double floating-point precision can do) in order to have only a 1% error in the solution.
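You can see this magnification directly in MATLAB with a toy 2x2 system (illustrative numbers only, not your problem data):
A = [1 0; 0 1e-12];              % cond(A) = 1e12
xTrue = [1; 1];
b = A*xTrue;
bNoisy = b + 1e-14*randn(2,1);   % data error of roughly 1e-12 percent of norm(b)
x = A\bNoisy;
norm(x - xTrue)/norm(xTrue)      % about 1e-2: the tiny data error got magnified ~1e12 times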
S.
2020-3-8
Hi,
Thanks for your response.
I am also using the Gurobi Optimizer. Gurobi uses the barrier method and quadprog the interior-point method. However, quadprog handles ill-conditioned matrices better than Gurobi. Where quadprog gives me correct results for small and medium-size Hessians, Gurobi already gives bad results (and the warning that the range of the quadratic objective terms is large) for the same QP problem. Only for very large, ill-conditioned Hessians does quadprog also report that the problem is non-convex. Do you maybe have an explanation for this: that even though they use essentially the same algorithm, quadprog handles the same QP problem better?
Matt J
2020-3-8
No, I can't, but I don't think it matters. One shouldn't try to solve ill-conditioned problems; one should make them well-conditioned.
S.
2020-3-8
I tried to make it well-conditioned by scaling the matrices, but that didn't succeed.
Is there more information to be found about the interior-point method that quadprog specifically uses (so not general information, but specific to quadprog)?
S.
2020-3-9
This is about nonlinear optimization and the fmincon function, not the barrier method?
Is there, by the way, a relationship between an ill-conditioned Hessian and the number of iterations needed to converge?
It seems that, because the problem is ill-conditioned, the solver needs a lot of iterations to find the global minimum?
Matt J
2020-3-9
Edited: Matt J
2020-3-9
This is about nonlinear optimization and the fmincon function, not the barrier method?
The link I gave you should take you specifically to the section on the interior-point method.
Is there, by the way, a relationship between an ill-conditioned Hessian and the number of iterations needed to converge? It seems that, because the problem is ill-conditioned, the solver needs a lot of iterations to find the global minimum?
That is certainly true for gradient descent methods (e.g. conjugate gradient, steepest descent). See, for example,
Some methods will automatically rescale the problem so that the condition number is small. Unconstrained Newton's method, for example, always converges in a single iteration on a quadratic objective. This assumes, however, that the Hessian, while ill-conditioned, is still reasonably non-singular. As we've discussed, your problem is essentially singular to within double precision.
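To illustrate with a toy MATLAB example (not your problem data): on a quadratic objective 0.5*x'*H*x + f'*x, one Newton step from any starting point lands exactly on the minimizer, even for a badly conditioned (but nonsingular) H:
H = diag([1e8, 1]);  f = [1; -2];   % ill-conditioned but numerically nonsingular
x0 = randn(2,1);                    % arbitrary starting point
g  = H*x0 + f;                      % gradient at x0
x1 = x0 - H\g;                      % a single Newton step
norm(H*x1 + f)                      % ~0: x1 already satisfies the optimality condition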
S.
2020-3-11
Edited: S.
2020-3-11
Does the interior-point method of quadprog make use of gradient descent methods?
And does quadprog generally use many iterations that are each cheap (little time), or few iterations that are each expensive (a lot of time)? I think the latter, based on my results, but I am not sure. If so, is there an explanation for that?
Furthermore, I found the following on the website: ''The algorithm has different implementations for a sparse Hessian matrix H and for a dense matrix. Generally, the sparse implementation is faster on large, sparse problems, and the dense implementation is faster on dense or small problems.''
Why is this the case?
And, maybe if you know, what is the difference between the interior-point method and GPAD (a gradient projection method)? I know, for example, that the interior-point method needs the Hessian online to solve an MPC problem, whereas with GPAD this step can be performed offline.
Thanks in advance.
Matt J
2020-3-11
Edited: Matt J
2020-3-11
And does quadprog generally use many iterations that are each cheap (little time), or few iterations that are each expensive (a lot of time)? I think the latter, based on my results, but I am not sure. If so, is there an explanation for that?
If there is, I am sure it is in the algorithm description that I gave you the link for. It should be possible to deduce the computational expense of the different steps of the algorithm from that.
''The algorithm has different implementations for a sparse Hessian matrix H and for a dense matrix. Generally, the sparse implementation is faster on large, sparse problems, and the dense implementation is faster on dense or small problems.'' Why is this the case?
Matrix arithmetic is faster for large matrices if they are sparse and you manipulate them in sparse form, as MATLAB's sparse type does.
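You can see the effect with a quick experiment (illustrative sizes; sprandsym generates a random sparse symmetric matrix):
n  = 5000;
As = sprandsym(n, 1e-3);    % sparse storage: a matrix-vector product costs ~nnz(As) operations
x  = randn(n,1);
tic; As*x; toc              % fast
Af = full(As);              % the same matrix in dense storage
tic; Af*x; toc              % ~n^2 operations: much slower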
And maybe if you know, what is the difference between interior-point method and GPAD (Gradient Projection method)?
Well, for one thing, GPAD is not confined to travel through the interior of the feasible set as the interior-point method is (hence its name).
S.
2020-3-11
If there is, I am sure it is in the algorithm description that I gave you the link for. It should be possible to deduce the computational expense of the different steps of the algorithm from that.
The problem is that the math is quite hard for me to understand. So I have no idea.
''The algorithm has different implementations for a sparse Hessian matrix H and for a dense matrix. Generally, the sparse implementation is faster on large, sparse problems, and the dense implementation is faster on dense or small problems.'' Why is this the case?
Shouldn't it be dense, small problems instead of dense or small problems?
Well, for one thing, GPAD is not confined to travel through the interior of the feasible set as the interior-point method is (hence its name).
What is the advantage/disadvantage of this?
Matt J
2020-3-12
Edited: Matt J
2020-3-12
The problem is that the math is quite hard for me to understand. So I have no idea.
I gave you the wrong link earlier. It is now corrected. In any case, you will see in the description that there are various steps within each iteration requiring the solution of linear equations. Those can be computationally burdensome, depending on the size and sparsity of your Hessian and constraints.
Shouldn't it be dense, small problems instead of dense or small problems?
I don't think so. If your Hessian is large and dense, it would not be a good idea to use a sparse matrix data-type to represent it. You would want to use a dense type.
What is the advantage/disadvantage of this?
Roughly speaking, it lets you decompose the optimization into a series of unconstrained minimizations, which are easier than constrained ones. The solutions of those optimization problems lie in the interior of the feasible set, so you can find them just by setting gradients to zero.
S.
2020-3-18
What is the reason that quadprog can only solve convex problems and not non-convex problems? Is there another function that can solve non-convex QP problems?
What gives the best results? Using lower and upper bounds, or incorporating these bounds into inequality constraints? What are the differences? And the advantages and disadvantages of each method?
Matt J
2020-3-18
Edited: Matt J
2020-3-18
What is the reason that quadprog can only solve convex problems and not non-convex problems? Is there another function that can solve non-convex QP problems?
It is only quadprog's interior-point algorithm that requires convexity. The trust-region-reflective method doesn't require convexity, but it does have certain restrictions on the kinds of constraints you can have (only equality constraints or bounds, but not both).
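For example, a minimal sketch of selecting that algorithm for a bound-constrained QP (H, f, lb, ub stand for your problem data):
options = optimoptions('quadprog','Algorithm','trust-region-reflective');
x0 = zeros(size(f));                               % this algorithm accepts a start point
x  = quadprog(H,f,[],[],[],[],lb,ub,x0,options);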
You can always use fmincon. However, the ill-conditioning of your problem can have undesirable effects no matter what solver you use. It generally slows convergence, for one thing. It also makes the result sensitive to floating point errors. Effectively, you have a non-unique set of optimal points when your problem is ill-conditioned. Which solution you get can vary greatly from computer to computer depending on its CPU architecture and the way it does floating point arithmetic.
It is still not clear to me why you got nowhere with the investigation of your Omega matrix and why cond(Omega)=Inf. This has to mean that the diagonal elements are very different in scale, and might be expressed in a poor choice of units. It should be very easy to determine whether this is the case.
What gives the best results? Using lower and upper bounds, or incorporating these bounds into inequality constraints? What are the differences? And the advantages and disadvantages of each method?
Bounds are simpler for many solvers to deal with than more general inequalities. If you lump the bounds in with the other linear inequalities, the solver does not know that some of your constraints are simpler than others and won't take advantage of that.
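Concretely, for the 0 <= x <= 10 bounds you mentioned earlier, the two formulations look like this (a sketch; H and f stand for your quadratic and linear terms):
n  = numel(f);
lb = zeros(n,1);  ub = 10*ones(n,1);
x1 = quadprog(H,f,[],[],[],[],lb,ub);      % bounds passed as bounds
A  = [eye(n); -eye(n)];  b = [ub; -lb];
x2 = quadprog(H,f,A,b);                    % same feasible set, written as 2n general inequalities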
S.
2020-3-18
In my case, no extra inequality constraints are present, just the lower and upper bounds written as inequality constraints, so each constraint is equally simple/complex. Does the solver then still not see that each inequality constraint is just an upper or lower bound, and that they are therefore equally simple/complex?
Is a possible advantage of using lower and upper bounds also that a solution may be found faster, because the space in which the algorithm can search is predefined, whereas with inequality constraints the space is not directly restricted?
Furthermore, an advantage of inequality constraints is that it is possible to have a lower/upper bound on a linear combination of the decision variables, which is not possible with lower and upper bounds.
Matt J
2020-3-18
Edited: Matt J
2020-3-18
Just the lower and upper bounds written as inequality constraints, so each constraint is equally simple/complex. Does the solver then still not see that each inequality constraint is just an upper or lower bound, and that they are therefore equally simple/complex?
Not as far as I'm aware. It could probably deduce that from a pre-analysis of your Aineq matrix, but I don't think it would bother. It has no reason to expect you to express bounds through Aineq,bineq when you have the option of doing so more directly through lb,ub.
Is a possible advantage of using lower and upper bounds also that a solution may be found faster, because the space in which the algorithm can search is predefined, whereas with inequality constraints the space is not directly restricted?
For some algorithms, yes. Some of the legacy algorithms in the toolbox, however, do restrict their search to the polyhedron specified by Aineq*x<=bineq (the active-set methods do this, I think). It is, of course, harder to do that for general linear inequalities than for simple bounds.
Furthermore, an advantage of inequality constraints is that it is possible to have a lower/upper bound on a linear combination of the decision variables, which is not possible with lower and upper bounds.
That's not a reason, though, to lump your bounds into Aineq, bineq. You can use lb,ub to specify the bounds, but still use Aineq,bineq for the more complicated constraints that can't be represented through lb,ub.
S.
2020-3-18
Edited: S.
2020-3-18
For some algorithms, yes. Some of the legacy algorithms in the toolbox, however, do restrict their search to the polyhedron specified by Aineq*x<=bineq (the active-set methods do this, I think). It is, of course, harder to do that for general linear inequalities than for simple bounds
What exactly is a polyhedron? I read a lot about it, but the interpretation is not really clear to me. For example, people also talk about polyhedral input and state constraints. Are input and state constraints always polyhedral? What does polyhedral exactly mean in that context?
Does the interior-point-convex algorithm also belong to the ''some algorithms, yes''?
So, what you are saying is that the specification of the polyhedron is harder/costs more time for linear inequality constraints than for bounds? And therefore my sentence, ''because the space in which the algorithm can search is predefined and for inequality constraints the space is not directly restricted'', is not correct. It is just that the shape of the polyhedron is restricted faster by the algorithm for bounds than for linear inequality constraints.
That's not a reason, though, to lump your bounds into Aineq, bineq. You can use lb,ub to specify the bounds, but still use Aineq,bineq for the more complicated constraints that can't be represented through lb,ub.
But it is still a kind of advantage with respect to bounds, because with bounds, constraints on linear combinations cannot be specified.
Matt J
2020-3-18
Edited: Matt J
2020-3-18
What exactly is a polyhedron? I read a lot about it, but the interpretation is not really clear to me. For example, people also talk about polyhedral input and state constraints.
I was abusing terminology somewhat. A polyhedron is a 3D shape with flat sides. A convex polyhedron can always be described by a set of linear inequalities Aineq*x<=bineq. An N-dimensional shape described by a set of linear inequalities Aineq*x<=bineq is called a convex polytope. It is a generalization of a polyhedron from 3D to N-D.
Does the interior-point-convex algorithm also belong to the ''some algorithms, yes''?
I believe so. I haven't seen documentation to confirm it, but an interior-point algorithm by definition is one that tries to confine its search to the interior of the feasible set.
So, what you are saying is that the specification of the polyhedron is harder/costs more time for linear inequality constraints than for bounds?
There are various advantages to constraints consisting of simple bounds, which different algorithms may take advantage of in different ways. For one thing, it is easier to find an interior point in a region specified by bounds. This is of course of high importance to an "interior-point" algorithm.
Incidentally, I was mistaken earlier when I said that the interior-point-convex algorithm will not pre-analyze your linear inequalities Aineq*x<=bineq to see if any of them are pure bounds. I found a section of the documentation that says otherwise:
"The algorithm first tries to simplify the problem by removing redundancies and simplifying constraints. The tasks performed during the presolve step can include the following:...Check if any linear inequality constraint involves only one variable. If so, check for feasibility, and then change the linear constraint to a bound. "
But as you can see, the fact that it does this step confirms that quadprog does give bound constraints special handling.
S.
2020-3-20
So, if there are polyhedral input and state constraints, the constraints describe a space, ''the polyhedron'', in which the algorithm should search for a solution.
And if your Hessian is ill-conditioned and convergence is therefore slow, the ''interior-point'' algorithm cannot find a solution in this polyhedron and jumps around in the solution space. Why does it then jump around? Because the solution does not change with every step the interior-point algorithm makes in the space?
And what is the reason that a Hessian should be positive definite?
Matt J
2020-3-20
Edited: Matt J
2020-3-20
All good questions, but not really on topic for this thread. If you think your original question about why your problem is seen as non-convex by MATLAB has been answered, I think you should Accept-click the Answer and start a new thread focused on your new question(s).
S.
2020-3-23
That's true. My questions were going a bit off track. I will open a new thread. You can find it here:
One more question that is relevant regarding quadprog: it seems that quadprog, i.e. the (conventional) interior-point algorithm, is not exploiting the sparsity and structure of the sparse QP formulation. Do you have an explanation for that?
In MPC, the computational complexity should scale linearly with the prediction horizon N. However, my results show that the complexity scales quadratically with N.
S.
2020-3-23
Edited: S.
2020-3-23
Yes, I did. For example, if the Hessian is H = 2*transpose(Z)*Omega*Z, then Z and Omega were constructed as sparse(Omega) and sparse(Z).
However, another point I realize now: for the dense matrices, I passed the matrices in sparse format. Is that a big deal, that I did not pass them as full()?
S.
2020-3-23
OK, but do you know why the (conventional) interior-point algorithm of quadprog is not exploiting the sparsity and structure of the sparse QP formulation? Or possible reasons?
Matt J
2020-3-23
OK, but do you know why the (conventional) interior-point algorithm of quadprog is not exploiting the sparsity and structure of the sparse QP formulation?
I can't be certain that it isn't. I have not run your code myself.
S.
2020-3-23
Well, if I look at my plot, the computation time scales quadratically with N instead of linearly with N. So it is not exploiting it fully, but only partly. If the algorithm did not exploit the sparse structure at all, the computation time would scale cubically with N, i.e. the algorithm would require O(N^3*(number of inputs + number of states)^3) operations.
If it is not exploiting it fully, is there maybe some possible explanation you can think of? It does not necessarily have to be the actual cause; a possible explanation would already help me further.
Matt J
2020-3-23
Edited: Matt J
2020-3-24
Well, the details of your test are invisible to me, so any guess is going to be very arbitrary. However, one issue might be in the way that you are calculating computation time. Are you calculating the time per iteration, or is it the total time to a solution? It sounds like the computation time per iteration is expected to be O(N), but if the number of required iterations is also O(N), it would make sense that you observe O(N^2) total computation time.
S.
2020-3-23
I'm calculating the computation time as the time required to solve one QP (and this for multiple time steps). So the computation time is not the time of solving one QP divided by the number of iterations needed to reach that solution (and that for multiple time steps).
I said it wrong earlier: quadprog is not exploiting the sparsity at all, as the time scales cubically with N (see the attached figure). I also added the scripts; Optimalisatieprobleem2.m is the main script.
Hopefully you can now make a better guess.
Matt J
2020-3-23
So the computation time is not the time of solving one QP divided by the number of iterations to come to that solution
You should measure that first.
Matt J
2020-3-23
By putting tic..toc around the call to quadprog and dividing by the number of iterations.
Matt J
2020-3-23
The "output" argument from quadprog will contain the number of iterations used, as well as other info.
[x,fval,exitflag,output] = quadprog(___)
I will not have time to look at the scripts anytime soon, but it is irrelevant for now. The computation time per iteration is the first thing you should check.
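Putting those pieces together, a sketch of the measurement (buildQP is a hypothetical stand-in for however you assemble the QP for a given horizon N):
Ns = [10 20 40 80];                           % prediction horizons to test
for k = 1:numel(Ns)
    [G,F,Aineq,bineq] = buildQP(Ns(k));       % hypothetical problem builder
    t = tic;
    [x,~,~,output] = quadprog(G,F,Aineq,bineq,[],[],[],[],[],options);
    tPerIter(k) = toc(t)/output.iterations;   % this per-iteration time is what should scale with N
end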
S.
2020-3-23
OK, then I now know how to do that.
So I have to check whether the computation time per iteration scales with O(N), instead of whether the computation time of one QP scales with O(N). So it could also be that just the number of required iterations scales with O(N)?
Furthermore, one very last question :): if another solver gets the same optimized values x, how can the fval be different? For example, quadprog and Gurobi (another solver) give me the same x and fval, but GPAD gives me the same x with an fval that is a factor of 10 bigger than that of quadprog and Gurobi.
More Answers (0)