Minimization with Dense Structured Hessian, Linear Equalities
Hessian Multiply Function for Lower Memory
The fmincon
interior-point
and trust-region-reflective
algorithms, and the fminunc
trust-region
algorithm, can solve problems where the Hessian is dense but structured. For these problems, fmincon
and fminunc
do not compute H*Y
with the Hessian H
directly, because forming H
would be memory-intensive. Instead, you must provide fmincon
or fminunc
with a function that, given a matrix Y
and information about H
, computes W = H*Y
.
In this example, the objective function is nonlinear and linear equalities exist so fmincon
is used. The description applies to the trust-region reflective
algorithm; the fminunc
trust-region
algorithm is similar. For the interior-point
algorithm, see the HessianMultiplyFcn
option in Hessian Multiply Function. The objective function has the structure
where is a 1000-by-2 matrix. The Hessian of is dense, but the Hessian of is sparse. If the Hessian of is , then , the Hessian of , is
To avoid excessive memory usage that could happen by working with directly, the example provides a Hessian multiply function, hmfleq1
. This function, when passed a matrix Y
, uses sparse matrices Hinfo
, which corresponds to , and V
to compute the Hessian matrix product
W = H*Y = (Hinfo - V*V')*Y.
In this example, the Hessian multiply function needs , and V
to compute the Hessian matrix product. V
is a constant, so you can capture V
in a function handle to an anonymous function.
However, is not a constant and must be computed at the current x
. You can do this by computing in the objective function and returning as Hinfo
in the third output argument. By using optimoptions
to set the HessianMultiplyFcn
option to a handle to the hmfleq1
Hessian multiply function listed here, fmincon
knows to get the Hinfo
value from the objective function and pass it to the Hessian multiply function hmfleq1
.
Step 1: Write a file brownvv.m that computes the objective function, the gradient, and the sparse part of the Hessian.
The example passes brownvv
to fmincon
as the objective function. The brownvv.m
file is long and is not listed here. You can view the code with the command
type brownvv
Because brownvv
computes the gradient as well as the objective function, the example (Step 3) uses optimoptions
to set the SpecifyObjectiveGradient
option to true
.
Step 2: Write a function to compute Hessian-matrix products for H given a matrix Y.
Now, define a function hmfleq1
that uses Hinfo
, which is computed in brownvv
, and V
, which you can capture in a function handle to an anonymous function, to compute the Hessian matrix product W
where W = H*Y = (Hinfo - V*V')*Y
. This function must have the form
W = hmfleq1(Hinfo,Y)
The first argument must be the same as the third argument returned by the objective function brownvv
. The second argument to the Hessian multiply function is the matrix Y
(of W = H*Y
).
Because fmincon
expects the second argument Y
to be used to form the Hessian matrix product, Y
is always a matrix with n
rows, where n
is the number of dimensions in the problem. The number of columns in Y
can vary. Finally, you can use a function handle to an anonymous function to capture V
, so V
can be the third argument to hmfleq1
. This technique is described in more detail in Passing Extra Parameters. Here is a listing of the file hmfleq1.m
.
type hmfleq1.m
function W = hmfleq1(Hinfo,Y,V) %HMFLEQ1 Hessian-matrix product function for BROWNVV objective. % Documentation example. % W = hmfbx4(Hinfo,Y,V) computes W = (Hinfo-V*V')*Y % where Hinfo is a sparse matrix computed by BROWNVV % and V is a 2 column matrix. % Copyright 1984-2008 The MathWorks, Inc. W = Hinfo*Y - V*(V'*Y);
Step 3: Call a nonlinear minimization routine with a starting point and linear equality constraints.
Load the problem parameter, V
, and the sparse equality constraint matrices, Aeq
and beq
, from fleq1.mat
, which is available when you run this example. Use optimoptions
to set the SpecifyObjectiveGradient
option to true
and to set the HessianMultiplyFcn
option to a function handle that points to hmfleq1
. Call fmincon
with objective function brownvv
and with V
as an additional parameter. View the code that runs this procedure:
type runfleq1.m
function [fval,exitflag,output,x] = runfleq1 % RUNFLEQ1 demonstrates 'HessMult' option for FMINCON with linear % equalities. problem = load('fleq1'); % Get V, Aeq, beq V = problem.V; Aeq = problem.Aeq; beq = problem.beq; n = 1000; % problem dimension xstart = -ones(n,1); xstart(2:2:n,1) = ones(length(2:2:n),1); % starting point options = optimoptions(@fmincon,... 'Algorithm','trust-region-reflective',... 'SpecifyObjectiveGradient',true, ... 'HessianMultiplyFcn',@(Hinfo,Y)hmfleq1(Hinfo,Y,V),... 'Display','iter',... 'OptimalityTolerance',1e-9,... 'FunctionTolerance',1e-9); [x,fval,exitflag,output] = fmincon(@(x)brownvv(x,V),xstart,[],[],Aeq,beq,[],[], ... [],options);
To run this code, enter
[fval,exitflag,output,x] = runfleq1;
Norm of First-order Iteration f(x) step optimality CG-iterations 0 2297.63 1.41e+03 1 1084.59 6.3903 578 1 2 1084.59 100 578 3 3 1084.59 25 578 0 4 1084.59 6.25 578 0 5 1047.61 1.5625 240 0 6 761.592 3.125 62.4 2 7 761.592 6.25 62.4 4 8 746.478 1.5625 163 0 9 546.578 3.125 84.1 2 10 274.311 6.25 26.9 2 11 55.6193 11.6597 40 2 12 55.6193 25 40 3 13 22.2964 6.25 26.3 0 14 -49.516 6.25 78 1 15 -93.2772 1.5625 68 1 16 -207.204 3.125 86.5 1 17 -434.162 6.25 70.7 1 18 -681.359 6.25 43.7 2 19 -681.359 6.25 43.7 4 20 -698.041 1.5625 191 0 21 -723.959 3.125 256 7 22 -751.33 0.78125 154 3 23 -793.974 1.5625 24.4 3 24 -820.831 2.51937 6.11 3 25 -823.069 0.562132 2.87 3 26 -823.237 0.196753 0.486 3 27 -823.245 0.0621202 0.386 3 28 -823.246 0.0199951 0.11 6 29 -823.246 0.00731333 0.0404 7 30 -823.246 0.00505883 0.0185 8 31 -823.246 0.00126471 0.00268 9 32 -823.246 0.00149326 0.00521 9 33 -823.246 0.000373314 0.00091 9 Local minimum possible. fmincon stopped because the final change in function value relative to its initial value is less than the value of the function tolerance.
Convergence is rapid for a problem of this size with the PCG iteration cost increasing modestly as the optimization progresses. Feasibility of the equality constraints is maintained at the solution.
problem = load('fleq1'); % Get V, Aeq, beq V = problem.V; Aeq = problem.Aeq; beq = problem.beq; norm(Aeq*x-beq,inf)
ans = 2.3093e-14
Preconditioning
In this example, fmincon
cannot use H
to compute a preconditioner because H
only exists implicitly. Instead of H
, fmincon
uses Hinfo
, the third argument returned by brownvv
, to compute a preconditioner. Hinfo
is a good choice because it is the same size as H
and approximates H
to some degree. If Hinfo
were not the same size as H
, fmincon
would compute a preconditioner based on some diagonal scaling matrices determined from the algorithm. Typically, this would not perform as well.