Transform linear inequality constraints into bound constraints for optimization solvers

Suppose there are linear inequality constraints A*y >= b, where y are the parameters optimized by fmincon, lsqnonlin, or friends.
Since linear inequality constraints generally cannot be enforced at intermediate iterations, I want to transform them into bound constraints, which can be enforced at all iterations.
General idea:
If A is dense and has m rows (i.e., there are m constraints), then we can make a "change of variables"
z = A*y, where the components are z_i = a_i*y and a_i are the rows of A.
This leads to simple bound constraints z >= b. Does that make sense?
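To make the "change of variables" idea concrete, here is a minimal sketch in Python (the thread is MATLAB, but the algebra is identical); the 2x2 matrix A and the vectors below are made-up numbers, and the construction assumes A is square and invertible:

```python
# Hedged sketch: for a SQUARE, invertible A, the substitution z = A*y turns
# A*y >= b into the simple bounds z >= b; y is recovered as y = inv(A)*z.
# Toy example with A = [[1, 0], [-1, 1]], whose inverse is [[1, 0], [1, 1]].

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

A     = [[1.0, 0.0], [-1.0, 1.0]]   # constraint matrix in A*y >= b
A_inv = [[1.0, 0.0], [ 1.0, 1.0]]   # its inverse, computed by hand

b = [0.0, 0.0]
z = [0.5, 2.0]                       # any z >= b is feasible by construction
y = matvec(A_inv, z)                 # recover the original parameters

# the original inequality constraints hold automatically
assert all(zi >= bi for zi, bi in zip(matvec(A, y), b))
print(y)                             # [0.5, 2.5]
```

The catch, discussed in the answers below, is that this only works when A is square and invertible, i.e., when the number of constraints does not exceed the number of parameters.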
Specific problem:
My parameters y represent interpolation values corresponding to interpolation points x, and the resulting interpolation function is supposed to be convex on the interpolation domain. For the specific case of linear interpolation, @Matt J showed that convexity is realized by increasing slopes, i.e., non-negative second-order divided differences
(y(i+1)-y(i))/(x(i+1)-x(i)) - (y(i)-y(i-1))/(x(i)-x(i-1)) >= 0, i = 2,...,n-1
(which represents a set of linear inequality constraints A*y >= 0).
This can be transformed into simpler bounds z(i) >= 0 on the new parameters z by the change of variables
z(i) = (y(i+1)-y(i))/(x(i+1)-x(i)) - (y(i)-y(i-1))/(x(i)-x(i-1)), i = 2,...,n-1,
with the remaining two degrees of freedom (e.g., y(1) and y(2)) kept as free, unbounded parameters.
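A small Python sketch of this reparameterization (for illustration only; the grid x and the values below are made-up numbers): given the two free values and the non-negative second divided differences z, the original interpolation values y can be rebuilt by accumulating slopes, and convexity holds by construction:

```python
# Hedged sketch: parameterize a piecewise-linear interpolant by its second
# divided differences z (>= 0 for convexity) plus the two free values y0, y1.

def rebuild(x, y0, y1, z):
    """Invert the change of variables: recover y from y0, y1 and the
    second divided differences z[i-1] at the interior points x[i]."""
    y = [y0, y1]
    for i, zi in enumerate(z, start=1):
        h_prev = x[i] - x[i - 1]
        h_next = x[i + 1] - x[i]
        slope_prev = (y[i] - y[i - 1]) / h_prev
        # zi = slope_next - slope_prev  =>  slope_next = slope_prev + zi
        y.append(y[i] + (slope_prev + zi) * h_next)
    return y

x = [0.0, 1.0, 2.0, 3.0]
y = rebuild(x, 1.0, 1.0, [2.0, 2.0])   # z >= 0  =>  slopes nondecreasing
print(y)  # [1.0, 1.0, 3.0, 7.0] -- convex: slopes are 0, 2, 4
```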
What I am really working with are smoother B-spline basis functions of higher degree using spapi function. @Bruno Luong showed that the linear inequality matrix A is given by
% k: spline order, x: points where values y live
B = spapi(k,x,eye(length(x)));
Bdd = fnder(B,2);
A = Bdd.coefs'
and
is a sufficient condition for convexity.
I am wondering whether a change of variables (like for linear interpolation functions) can be applied here to transform the inequalities into bounds?
Any opinions are greatly appreciated!

1 Comment

For example, it is possible to convert a set of linear inequality constraints into a bound constrained problem by the introduction of what are called slack variables.
That is, if you have an inequality
A*x >= b
then you CAN write it as an equality
A*x == b + y
where y is a new (slack) variable that is constrained to be non-negative. If you have multiple constraints, then you would introduce one slack variable for each inequality constraint.
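A minimal Python sketch of this slack construction (Python rather than the thread's MATLAB, purely for illustration; the matrix and vectors are made-up numbers). For a feasible x, the slack y is simply the residual A*x - b:

```python
# Hedged sketch: rewrite A*x >= b as the equality A*x == b + y with y >= 0,
# introducing one slack variable per inequality constraint.

def slack(A, x, b):
    Ax = [sum(a * xj for a, xj in zip(Ai, x)) for Ai in A]
    return [axi - bi for axi, bi in zip(Ax, b)]

A = [[1.0, 1.0], [2.0, -1.0]]
b = [1.0, 0.0]
x = [1.0, 1.0]                 # feasible point: A*x = [2.0, 1.0] >= b

y = slack(A, x, b)             # then A*x == b + y holds with y >= 0
print(y)                       # [1.0, 1.0]
```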
Such slack variables are a common artifice in mathematics, used for example to solve the minimum sum of absolute values problem (versus the traditional linear least squares) but also used to formulate optimization problems using Lagrange multipliers. You can also view the question of whether a point lies inside a convex polytope, in terms of slack variables, one for each facet of the polytope. I can think of at least one other example.
However, you are asking to solve the problem in a different way, by use of a transformation. For example, given a vector x, which must form a monotonic increasing sequence, then we can transform the problem such that y=diff(x)>=0. And this is not always so easy.
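The monotonicity example above can be sketched as a transformation (a Python illustration with made-up numbers): the nondecreasing sequence x is parameterized by its first element plus bound-constrained increments, and cumulative sums invert the change of variables:

```python
# Hedged sketch: a nondecreasing sequence x is equivalent to its first
# element x0 plus increments y = diff(x) subject only to the bounds y >= 0.
from itertools import accumulate

def to_increments(x):
    return [b - a for a, b in zip(x, x[1:])]   # y = diff(x), must be >= 0

def from_increments(x0, y):
    return list(accumulate(y, initial=x0))     # x = x0 + cumsum(y)

x = [1.0, 2.0, 2.0, 5.0]
y = to_increments(x)            # [1.0, 0.0, 3.0] -- simple bounds y >= 0
assert from_increments(x[0], y) == x
print(y)
```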


Accepted Answer

No, you cannot. It is well known that any generic linear constraints
Aeq * x = beq
A * x <= b
lo <= x <= up
are equivalent to the "standard form"
D * y = e
y >= 0
after some linear transformation of x to y.
In general, simple box constraints are NOT equivalent to the generic constraints (in the two forms expressed above).
It is not a proof, but geometrically, box constraints describe a polytope with parallel faces, whereas the other two forms describe generic polytopes.
You cannot simply do a variable change, for instance, if the number of inequality constraints is larger than the number of decision parameters.
In any case, solvers such as linprog, quadprog, and lsqlin do such transformations behind the scenes for us, and users should not worry about doing them beforehand.
Here is a video explaining the standard form and the conversion of some constraint types using slack variables: https://www.youtube.com/watch?v=db979cMaQ0M
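One step of the conversion to standard form can be sketched in Python (for illustration; the numbers are made up). For x >= 0 and A*x <= b, appending identity columns for the slack s gives D*y = b with y = [x; s] >= 0:

```python
# Hedged sketch: convert "A*x <= b, x >= 0" to standard form "D*y = b, y >= 0"
# by appending one slack variable per row, i.e., D = [A | I].

def augment(A):
    m = len(A)
    return [row + [1.0 if i == j else 0.0 for j in range(m)]
            for i, row in enumerate(A)]

A = [[1.0, 2.0], [3.0, 1.0]]
b = [4.0, 6.0]
x = [1.0, 1.0]                                  # feasible: A*x = [3, 4] <= b
s = [bi - sum(a * xj for a, xj in zip(Ai, x)) for Ai, bi in zip(A, b)]

D = augment(A)                                  # [[1, 2, 1, 0], [3, 1, 0, 1]]
y = x + s                                       # y >= 0 componentwise
Dy = [sum(d * yj for d, yj in zip(Di, y)) for Di in D]
print(Dy)                                       # equals b: [4.0, 6.0]
```

Handling free variables additionally requires splitting them into positive and negative parts, which is part of why the standard form and box form are not interchangeable.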

14 Comments

I see. My motivation for doing this is that, currently, I scale the matrix A by a large number c so that fmincon pays "enough" attention to the linear constraints. Whether this is a good idea is a different question.
You cannot simply do a variable change, for instance, if the number of inequality constraints is larger than the number of decision parameters.
If we have n=numel(y) parameters (interpolation values), then we have n-2 inequality constraints, size(A,1) = n-2, to implement convexity of the spline.
However, if I understand correctly, you don't see any option to eliminate the inequality constraints and only work with simpler bounds, or some other clever transformation?
You can: just select a basis for the second derivative, for instance a B-spline (Bernstein) basis of appropriate order. Take the second antiderivative of this basis (call fnint twice), then add a global linear function (a 2-dimensional space). That forms the basis of the splines.
Then you can do whatever fitting you like with your data using this basis, requiring that the (n-2) coefficients of the decomposition be positive, while the (2) coefficients of the linear part stay free. This is more or less what BSFK does, as I showed you a few posts back, IIRC.
Sounds like the approach I sketched in the question, inspired by Matt.
Can you show these steps in code, please, and how to reconstruct the original params from the new coefficients?
% Convex Data
n = 5;
x = 1:n;
y = exp(x); % x.^3;
k = 4; % Order: cubic spline
fref = spapi(k, x, y); % Reference spline
tref = aptknt(x, k); % Reference knots
% Knots for the second derivative basis
tdd = tref(3:end-2);
% New function basis
N = cell(1, n-2);
% Generate basis functions
alphas = eye(n-2);
for i = 1:n-2
    % Create spline using ith unit vector
    Ndd = spmak(tdd, alphas(:, i)');
    % Integrate twice to get function basis
    N{i} = fnint(fnint(Ndd));
end
% Collocation matrix
B = zeros(n, n);
fnvals = zeros(n-2, n);
for j = 1:n-2
    fnvals(j, :) = fnval(N{j}, x);
end
B(:, 1:n-2) = fnvals';
B(:, n-1) = x(:); % linear term
B(:, n) = 1; % constant term
% Solve for coefficients of basis
coef = B \ y(:);
% ############ Plot data check #################
xq = linspace(min(x), max(x), 100); % Query points
yq = zeros(1, numel(xq)); % Preallocate output
% First contribution
for j = 1:n-2
    yq = yq + fnval(N{j}, xq) * coef(j);
end
yq = yq + coef(n-1) * xq + coef(n); % Add linear and constant terms
f1 = plot(xq, fnval(fref, xq)); hold on;
f2 = plot(xq, yq);
f3 = plot(xq, yq - fnval(fref, xq));
legend([f1, f2, f3], 'spapi - reference bases', 'new bases', 'error');
I implemented your suggestion, however, with a basis produced by spmak. That was my only idea for producing a spline without using your BSFK hammer or coding de Boor myself. Is what I did always valid, and is the chosen basis appropriate?
I add noise and enforce convexity to show the effect
% Convex Data
n = 50;
x = linspace(1,5,n);
y = exp(x); % x.^3;
sigma = 4;
y = y + sigma*randn(size(y));
k = 4; % Order: cubic spline
fref = spapi(k, x, y); % Reference spline
tref = aptknt(x, k); % Reference knots
% Knots for the second derivative basis
tdd = tref(3:end-2);
% New function basis
N = cell(1, n-2);
% Generate basis functions
alphas = eye(n-2);
for i = 1:n-2
    % Create spline using ith unit vector
    Ndd = spmak(tdd, alphas(:, i)');
    % Integrate twice to get function basis
    N{i} = fnint(fnint(Ndd));
end
% Collocation matrix
B = zeros(n, n);
fnvals = zeros(n-2, n);
for j = 1:n-2
    fnvals(j, :) = fnval(N{j}, x);
end
B(:, 1:n-2) = fnvals';
B(:, n-1) = x(:); % linear term
B(:, n) = 1; % constant term
% lower bounds to enforce positive second derivative
lb = [zeros(1,n-2), -Inf(1,2)];
% Solve for coefficients of basis with constraints
coef = lsqlin(B, y(:), [],[],[],[],lb,[]);
Minimum found that satisfies the constraints. Optimization completed because the objective function is non-decreasing in feasible directions, to within the value of the optimality tolerance, and constraints are satisfied to within the value of the constraint tolerance.
% ############ Plot data check #################
xq = linspace(min(x), max(x), 400); % Query points
yq = zeros(1, numel(xq)); % Preallocate output
% First contribution
for j = 1:n-2
    yq = yq + fnval(N{j}, xq) * coef(j);
end
yq = yq + coef(n-1) * xq + coef(n); % Add linear and constant terms
f1 = plot(xq, fnval(fref, xq)); hold on;
f2 = plot(xq, yq);
f3 = plot(xq, yq - fnval(fref, xq));
legend([f1, f2, f3], 'spapi - reference bases', 'new bases', 'error');
The idea is, of course, to create a data set (x,y) sampled from a noise-free function like x^2. This y represents the start vector (re-parameterized in the new basis) of the optimization. Since lower bounds can be enforced at all iterations, I should never end up with a y-vector which has significant noise in it.
Maybe I am missing something here and did not understand what you wanted to show with your example. I think what you show is function approximation, but I always resort to interpolation.
I only showed that the hypothetical constraint on the second derivative in the original basis,
A*c1 >= 0
can be converted to
c2 >= 0
in the second basis; as I understand it, c1 and c2 are the coefficients in the two respective spline bases.
If you don't have to enforce A*c1 >= 0, then indeed nothing changes in the result, since both are interpolants over the same spanned space, just in two different bases.
The math looks consistent to me.
Thanks, last question: if we want to fix s(x(1)) = 0, can this constraint be implemented via lb/ub, or do we have to introduce a single equality constraint on the new coefficients c2? In the original basis c1, this was easy to do via lb(1) = ub(1) = 0.
If you shift the x vector to
xx = x - x(1)
then the n-2 basis functions all have value B(x(1)) = 0; you can then remove the constant function from the basis, or alternatively force the corresponding lower/upper bounds to be 0 during the least-squares fit.
I mostly work with cubic splines and currently also implemented the constraint s'''(x(end)) >=0 in addition to s''(x) >=0 on [x(1), x(end)] to ensure convexity beyond x(end).
Is there a way to express the third derivative constraint at x(end) in terms of the new basis (as a linear inequality constraint)? Or, even better, is there another basis that allows both constraints to be expressed by bounds only?
Is there a way to express the third derivative constraint at x(end) in terms of the new basis (as a linear inequality constraint)?
Uh, just compute the third derivative of the basis at x(end); it gives you the inequality constraint on the coefficients (the third derivative at any point is a linear operator).
Or, even better, is there another basis that allows both constraints to be expressed by bounds only?
No. You have already used the new basis to constrain the second derivative; there is nothing left (the constant and linear terms are free, but they have no third derivative). As I said in an earlier post, you only have a chance of choosing an appropriate basis when the number of inequalities is smaller than or equal to the dimension of the problem. If you add more constraints, they cannot be converted to bound constraints.
The slack-variable trick works well for LP objective functions, but it is rarely helpful for other objectives. For example, the Weyl-Minkowski representation shown by Matt can be viewed as slack variables (lambda + mu), but they are not associated with a change of basis functions (only a bijective transformation is a change of basis); you need the Weyl-Minkowski formula to compute a point x in the admissible polyhedron, then compute the fit objective from x. That is OK, but the number of slack variables must stay reasonable to keep the problem manageable on a computer.
Overall, transforming to bounds looks like a sterile approach to me: just use linear inequality constraints directly within the numerical solver rather than doing all the non-trivial transformations to convert them to bound constraints. It might shed some light theoretically, but it has no numerical advantage IMO. That said, I'm happy to be proven wrong on this view.
No. You have already used the new basis to constrain the second derivative; there is nothing left (the constant and linear terms are free, but they have no third derivative). As I said in an earlier post, you only have a chance of choosing an appropriate basis when the number of inequalities is smaller than or equal to the dimension of the problem. If you add more constraints, they cannot be converted to bound constraints.
Makes total sense.
The slack-variable trick works well for LP objective functions, but it is rarely helpful for other objectives. For example, the Weyl-Minkowski representation shown by Matt can be viewed as slack variables (lambda + mu), but they are not associated with a change of basis functions (only a bijective transformation is a change of basis); you need the Weyl-Minkowski formula to compute a point x in the admissible polyhedron, then compute the fit objective from x. That is OK, but the number of slack variables must stay reasonable to keep the problem manageable on a computer.
I agree that this new basis only makes sense if we can get rid of all inequality constraints. The single constraint on the third derivative is the only additional constraint I can think of for my problems. Given that, do you think a combined scheme (new basis for the second derivative + Weyl-Minkowski representation for the third derivative) is manageable? We only have a halfspace for the constraints. Maybe this does not even work for a single constraint (A must be a square matrix instead); then you can quickly sort this idea out, and I am happy with what we discussed here.

I can't see how Weyl-Minkowski could easily be combined with other ideas.
Brute-force enumeration of all combinations (vertices and extreme rays) rarely works when implemented on a computer. Its value is on the theoretical side.


More Answers (1)

Yes, the set A*y >= 0 is a cone and so can equivalently be expressed as y = sum_j lambda_j * r_j, where the r_j are the so-called extreme rays of the cone and the lambda_j are non-negative coefficients. So, in theory, you could re-express the problem with the lambda_j as the unknowns, which require only non-negativity constraints. However, finding the extreme rays for a given matrix A can be computationally demanding, depending on the dimension of the problem.
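To illustrate the ray representation on a toy case (Python, with a made-up 2-D constraint matrix): for A = [[1, 0], [-1, 1]], i.e., y1 >= 0 and y2 >= y1, the extreme rays are r1 = (0, 1) and r2 = (1, 1), each making one constraint row active:

```python
# Hedged sketch: every point of the toy cone {y1 >= 0, y2 >= y1} is a
# non-negative combination of its two extreme rays, so the reparameterized
# problem in the lambda_j has only bound constraints lambda_j >= 0.

rays = [(0.0, 1.0), (1.0, 1.0)]

def from_rays(lambdas):
    """y = sum_j lambda_j * r_j with lambda_j >= 0."""
    return [sum(l * r[i] for l, r in zip(lambdas, rays)) for i in range(2)]

y = from_rays([1.0, 2.0])               # lambda = (1, 2)
print(y)                                # [2.0, 3.0]
assert y[0] >= 0 and y[1] - y[0] >= 0   # A*y >= 0 holds by construction
```

In this 2-D example the rays can be read off by hand; as the comments below note, in higher dimensions their number can grow exponentially.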

9 Comments

The number of unknowns is small, not more than 50.
Could you show how to determine the extreme rays r_j? Are they fully determined by A?
For a more general polyhedron A*y >= b, the Weyl-Minkowski theorem says that any y in this set can be expressed in the form
y = sum_i mu_i * v_i + sum_j lambda_j * r_j, with mu_i, lambda_j >= 0 and sum_i mu_i = 1,
where the v_i are the vertices of the polyhedron. Unfortunately, this requires an equality constraint on the mu_i and so is not as neat a case.
I always have b = 0 (i.e., A*y >= 0) and, as I said, a very small number of params. Maybe this info helps to detail your answer on how to get the r_j?
The number of unknowns is small, not more than 50.
That's probably going to be way too large. For N unknowns, there can be O(2^N) extreme rays. I think you're barking up the wrong tree.
One can remove the condition lambda_i <= 1 in the Weyl-Minkowski theorem.
So it also becomes a standard form with a single equality constraint.
The trade-off is that one must enumerate all the vertices v_i of the polytope, and this strategy is absolutely out of the question numerically for problems of non-toy dimension, above 4 or 5 maybe.
Yes, but there are no v_i and no equality constraints in this case, because the constraint set A*y>=0, y>=0 is just a cone. Still, I think that enumerating the r_j is also going to be exponentially hard.
Just out of curiosity and if it is not too much effort, would you show for a small n the procedure to determine the r_j?
For a small example, you can look here:
There is also 3rd party code for higher dimensional cases. The following is one offering, though I've never used it myself:
In the theorem on page 23 of this document, the extreme directions (ray directions) can be seen as vertices of an almost similar form of the original polyhedron.
To find the vertices of a polyhedron, you need to pick many combinations of vectors (50) from a large number of vectors (96 = 2*48). This brute force would be impossible to achieve.

