Main Content

求解多变量非线性问题

此示例说明如何处理非线性问题中的大量变量。一般情况下,对于这类问题:

  • 使用占用内存较少的 BFGS (LBFGS) 黑塞矩阵逼近。此选项在默认 fminuncfmincon 算法中可用。

  • 如果有显式梯度,还可以使用有限差分黑塞矩阵和 'cg' 子问题算法。

  • 如果您有显式黑塞矩阵,请将黑塞矩阵表示为稀疏形式。

  • 虽然不是本示例的一部分,但如果您有结构问题并且可以计算黑塞矩阵与任意向量的乘积,则可尝试使用黑塞矩阵乘法函数。请参阅使用密集结构 Hessian 和线性等式进行最小化

示例使用了本示例末尾所示的用于一般非线性求解器 fminuncfminconhfminunc0obj 辅助函数。该函数是罗森布罗克函数的 N 维泛化,这是一个在数值上很难最小化的函数。在唯一点 x = ones(N,1) 处达到最小值 0。

该函数是显式平方和。因此,该示例还展示了使用最小二乘求解器的效率。对于最小二乘求解器 lsqnonlin,此示例使用本示例末尾所示的 hlsqnonlin0obj 辅助函数作为等效于 hfminunc0obj 函数的向量目标函数。

问题设立

将问题设置为对 1000 个变量使用 hfminunc0obj 目标函数。对于每个变量,将初始点 x0 设置为 -2。

fun = @hfminunc0obj;
N = 1e3;
x0 = -2*ones(N,1);

对于初始选项,指定不显示并且不限制函数计算或迭代的次数。

options = optimoptions("fminunc",Display="none",...
    MaxFunctionEvaluations=Inf,MaxIterations=Inf);

建立一个表来保留数据。为八次求解器运行指定标签,并定义用于收集运行时间、返回的函数值、退出标志、迭代次数和每次迭代时间的位置。

ExperimentLabels = ["BFGS_NoGrad", "LBFGS_NoGrad",...
    "BFGS_Grad", "LBFGS_Grad", "Analytic", "fin-diff-grads",...
    "LSQ_NoJacob", "LSQ_Jacob"];
timetable = table('Size',[8 5],'VariableTypes',["double" "double" "double" "double" "double"],...
    'VariableNames',["Time" "Fval" "Eflag" "Iters" "TimePerIter"],...
    'RowNames',ExperimentLabels);

以下代码节为每次求解器运行创建适当的选项,并尽可能将输出直接收集到表中。

BFGS·黑塞矩阵逼近,无梯度

opts = options;
opts.HessianApproximation = 'bfgs';
opts.SpecifyObjectiveGradient = false;
overallTime = tic;
[~,timetable.Fval("BFGS_NoGrad"),timetable.Eflag("BFGS_NoGrad"),output] =...
    fminunc(fun, x0, opts);
timetable.Time("BFGS_NoGrad") = toc(overallTime);
timetable.Iters("BFGS_NoGrad") = output.iterations;
timetable.TimePerIter("BFGS_NoGrad") =...
    timetable.Time("BFGS_NoGrad")/timetable.Iters("BFGS_NoGrad");

LBFGS 黑塞矩阵逼近,无梯度

opts = options;
opts.HessianApproximation = 'lbfgs';
opts.SpecifyObjectiveGradient = false;
overallTime = tic;
[~,timetable.Fval("LBFGS_NoGrad"),timetable.Eflag("LBFGS_NoGrad"),output] =...
    fminunc(fun, x0, opts);
timetable.Time("LBFGS_NoGrad") = toc(overallTime);
timetable.Iters("LBFGS_NoGrad") = output.iterations;
timetable.TimePerIter("LBFGS_NoGrad") =...
    timetable.Time("LBFGS_NoGrad")/timetable.Iters("LBFGS_NoGrad");

BFGS,带梯度

opts = options;
opts.HessianApproximation = 'bfgs';
opts.SpecifyObjectiveGradient = true;
overallTime = tic;
[~,timetable.Fval("BFGS_Grad"),timetable.Eflag("BFGS_Grad"),output] =...
    fminunc(fun, x0, opts);
timetable.Time("BFGS_Grad") = toc(overallTime);
timetable.Iters("BFGS_Grad") = output.iterations;
timetable.TimePerIter("BFGS_Grad") =...
    timetable.Time("BFGS_Grad")/timetable.Iters("BFGS_Grad");

LBFGS,带梯度

opts = options;
opts.HessianApproximation = 'lbfgs';
opts.SpecifyObjectiveGradient = true;
overallTime = tic;
[~,timetable.Fval("LBFGS_Grad"),timetable.Eflag("LBFGS_Grad"),output] =...
    fminunc(fun, x0, opts);
timetable.Time("LBFGS_Grad") = toc(overallTime);
timetable.Iters("LBFGS_Grad") = output.iterations;
timetable.TimePerIter("LBFGS_Grad") =...
    timetable.Time("LBFGS_Grad")/timetable.Iters("LBFGS_Grad");

解析黑塞矩阵,'trust-region' 算法

opts = options;
opts.Algorithm = 'trust-region';
opts.SpecifyObjectiveGradient = true;
opts.HessianFcn = "objective";
overallTime = tic;
[~,timetable.Fval("Analytic"),timetable.Eflag("Analytic"),output] =...
    fminunc(fun, x0, opts);
timetable.Time("Analytic") = toc(overallTime);
timetable.Iters("Analytic") = output.iterations;
timetable.TimePerIter("Analytic") =...
    timetable.Time("Analytic")/timetable.Iters("Analytic");

带梯度的有限差分黑塞矩阵,fmincon 求解器

opts = optimoptions("fmincon","SpecifyObjectiveGradient",true,...
    "Display","none","HessianApproximation","finite-difference",...
    SubproblemAlgorithm="cg",MaxFunctionEvaluations=Inf,MaxIterations=Inf);
overallTime = tic;
[~,timetable.Fval("fin-diff-grads"),timetable.Eflag("fin-diff-grads"),output] =...
    fmincon(fun, x0, [],[],[],[],[],[],[],opts);
timetable.Time("fin-diff-grads") = toc(overallTime);
timetable.Iters("fin-diff-grads") = output.iterations;
timetable.TimePerIter("fin-diff-grads") =...
    timetable.Time("fin-diff-grads")/timetable.Iters("fin-diff-grads");

最小二乘法,无雅可比矩阵

lsqopts = optimoptions("lsqnonlin","Display","none",...
    "MaxFunctionEvaluations",Inf,"MaxIterations",Inf);
fun = @hlsqnonlin0obj;
overallTime = tic;
[~,timetable.Fval("LSQ_NoJacob"),~,timetable.Eflag("LSQ_NoJacob"),output] =...
    lsqnonlin(fun, x0, [],[],lsqopts);
timetable.Time("LSQ_NoJacob") = toc(overallTime);
timetable.Iters("LSQ_NoJacob") = output.iterations;
timetable.TimePerIter("LSQ_NoJacob") =...
    timetable.Time("LSQ_NoJacob")/timetable.Iters("LSQ_NoJacob");

最小二乘法,使用雅可比矩阵

lsqopts.SpecifyObjectiveGradient = true;
overallTime = tic;
[~,timetable.Fval("LSQ_Jacob"),~,timetable.Eflag("LSQ_Jacob"),output] =...
    lsqnonlin(fun, x0, [],[],lsqopts);
timetable.Time("LSQ_Jacob") = toc(overallTime);
timetable.Iters("LSQ_Jacob") = output.iterations;
timetable.TimePerIter("LSQ_Jacob") =...
    timetable.Time("LSQ_Jacob")/timetable.Iters("LSQ_Jacob");

检查结果

disp(timetable)
                       Time        Fval       Eflag    Iters    TimePerIter
                      ______    __________    _____    _____    ___________

    BFGS_NoGrad       110.44    5.0083e-08      1      7137       0.015475 
    LBFGS_NoGrad      53.143     2.476e-07      1      4902       0.010841 
    BFGS_Grad         35.491    2.9865e-08      1      7105      0.0049952 
    LBFGS_Grad        1.2056    9.7505e-08      1      4907      0.0002457 
    Analytic          7.0991     1.671e-10      3      2301      0.0030852 
    fin-diff-grads     5.217    1.1422e-15      1      1382       0.003775 
    LSQ_NoJacob       94.708    3.7969e-25      1      1664       0.056916 
    LSQ_Jacob         6.5225    3.0056e-25      1      1664      0.0039197 

计时结果显示如下:

  • 对于此问题,带梯度的 LBFGS 黑塞矩阵逼近是迄今为止最快的。

  • 接下来运行速度最快的求解器是使用梯度有限差分黑塞矩阵的 fmincon、使用解析梯度和黑塞矩阵的信赖域 fminunc,以及使用解析雅可比矩阵的 lsqnonlin

  • 没有梯度的 fminunc BFGS 算法与没有雅可比矩阵的 lsqnonlin 求解器具有相似的速度。请注意,对于此问题,lsqnonlin 需要的迭代次数比 fminunc 少得多,但每次迭代花费的时间要长得多。

  • 导数对所有求解器的速度都有很大影响。

辅助函数

以下代码创建 hfminunc0obj 辅助函数。

function [f,G,H] = hfminunc0obj(x)
% Rosenbrock function in N dimensions
N = numel(x);
xx = x(1:N-1);
xx_plus = x(2:N);
f_vec = 100*(xx.^2 - xx_plus).^2 + (xx - 1).^2;
f = sum(f_vec);
if (nargout >= 2) % Gradient
    G = zeros(N,1);
    for k = 1:N
        if (k == 1)
            G(k) = 2*(x(k)-1) + 400*x(k)*(x(k)^2-x(k+1));
        elseif (k == N)
            G(k) = 200*x(k) - 200*x(k-1)^2;
        else
            G(k) = 202*x(k) - 200*x(k-1)^2 - 400*x(k)*(x(k+1) - x(k)^2) - 2;
        end
    end
    if nargout >= 3 % Hessian
        H = spalloc(N,N,3*N);
        for i = 2:(N-1)
            H(i,i) = 202 + 1200*x(i)^2 - 400*x(i+1);
            H(i,i-1) = -400*x(i-1);
            H(i,i+1) = -400*x(i);
        end
        H(1,1) = 2 + 1200*x(1)^2 - 400*x(2);
        H(1,2) = -400*x(1);
        H(N,N) = 200;
        H(N,N-1) = -400*x(N-1);
    end
end
end

以下代码创建 hlsqnonlin0obj 辅助函数。

function [f,G] = hlsqnonlin0obj(x)
% Rosenbrock function in N dimensions
N = numel(x);
xx = x(1:N-1);
xx_plus = x(2:N);
f_vec = [10*(xx.^2 - xx_plus), (xx - 1)];
f = reshape(f_vec',[],1); % Vector of length 2*(N-1)
% Jacobian
if (nargout >= 2)
    G = spalloc(2*(N-1),N,3*N); % Jacobian size 2*(N-1)-by-N with 3*N nonzeros
    for k = 1:(N-1)
        G(2*k-1,k) = 10*2*x(k);
        G(2*k-1,k+1) = -10;
        G(2*k,k) = 1;
    end
end
end

另请参阅

|

相关主题