求解多变量非线性问题
此示例说明如何处理非线性问题中的大量变量。一般情况下,对于这类问题:
使用占用内存较少的 BFGS (LBFGS) 黑塞矩阵逼近。此选项在默认
fminunc
和fmincon
算法中可用。如果有显式梯度,还可以使用有限差分黑塞矩阵和
'cg'
子问题算法。如果您有显式黑塞矩阵,请将黑塞矩阵表示为稀疏形式。
虽然不是本示例的一部分,但如果您有结构问题并且可以计算黑塞矩阵与任意向量的乘积,则可尝试使用黑塞矩阵乘法函数。请参阅使用密集结构 Hessian 和线性等式进行最小化。
示例使用了本示例末尾所示的用于一般非线性求解器 fminunc
和 fmincon
的 hfminunc0obj
辅助函数。该函数是罗森布罗克函数的 N 维泛化,这是一个在数值上很难最小化的函数。在唯一点 x = ones(N,1)
处达到最小值 0。
该函数是显式平方和。因此,该示例还展示了使用最小二乘求解器的效率。对于最小二乘求解器 lsqnonlin
,此示例使用本示例末尾所示的 hlsqnonlin0obj
辅助函数作为等效于 hfminunc0obj
函数的向量目标函数。
问题设立
将问题设置为对 1000 个变量使用 hfminunc0obj
目标函数。对于每个变量,将初始点 x0
设置为 -2。
fun = @hfminunc0obj; N = 1e3; x0 = -2*ones(N,1);
对于初始选项,指定不显示并且不限制函数计算或迭代的次数。
options = optimoptions("fminunc",Display="none",... MaxFunctionEvaluations=Inf,MaxIterations=Inf);
建立一个表来保留数据。为八次求解器运行指定标签,并定义用于收集运行时间、返回的函数值、退出标志、迭代次数和每次迭代时间的位置。
ExperimentLabels = ["BFGS_NoGrad", "LBFGS_NoGrad",... "BFGS_Grad", "LBFGS_Grad", "Analytic", "fin-diff-grads",... "LSQ_NoJacob", "LSQ_Jacob"]; timetable = table('Size',[8 5],'VariableTypes',["double" "double" "double" "double" "double"],... 'VariableNames',["Time" "Fval" "Eflag" "Iters" "TimePerIter"],... 'RowNames',ExperimentLabels);
以下代码节为每次求解器运行创建适当的选项,并尽可能将输出直接收集到表中。
BFGS·黑塞矩阵逼近,无梯度
opts = options; opts.HessianApproximation = 'bfgs'; opts.SpecifyObjectiveGradient = false; overallTime = tic; [~,timetable.Fval("BFGS_NoGrad"),timetable.Eflag("BFGS_NoGrad"),output] =... fminunc(fun, x0, opts); timetable.Time("BFGS_NoGrad") = toc(overallTime); timetable.Iters("BFGS_NoGrad") = output.iterations; timetable.TimePerIter("BFGS_NoGrad") =... timetable.Time("BFGS_NoGrad")/timetable.Iters("BFGS_NoGrad");
LBFGS 黑塞矩阵逼近,无梯度
opts = options; opts.HessianApproximation = 'lbfgs'; opts.SpecifyObjectiveGradient = false; overallTime = tic; [~,timetable.Fval("LBFGS_NoGrad"),timetable.Eflag("LBFGS_NoGrad"),output] =... fminunc(fun, x0, opts); timetable.Time("LBFGS_NoGrad") = toc(overallTime); timetable.Iters("LBFGS_NoGrad") = output.iterations; timetable.TimePerIter("LBFGS_NoGrad") =... timetable.Time("LBFGS_NoGrad")/timetable.Iters("LBFGS_NoGrad");
BFGS,带梯度
opts = options; opts.HessianApproximation = 'bfgs'; opts.SpecifyObjectiveGradient = true; overallTime = tic; [~,timetable.Fval("BFGS_Grad"),timetable.Eflag("BFGS_Grad"),output] =... fminunc(fun, x0, opts); timetable.Time("BFGS_Grad") = toc(overallTime); timetable.Iters("BFGS_Grad") = output.iterations; timetable.TimePerIter("BFGS_Grad") =... timetable.Time("BFGS_Grad")/timetable.Iters("BFGS_Grad");
LBFGS,带梯度
opts = options; opts.HessianApproximation = 'lbfgs'; opts.SpecifyObjectiveGradient = true; overallTime = tic; [~,timetable.Fval("LBFGS_Grad"),timetable.Eflag("LBFGS_Grad"),output] =... fminunc(fun, x0, opts); timetable.Time("LBFGS_Grad") = toc(overallTime); timetable.Iters("LBFGS_Grad") = output.iterations; timetable.TimePerIter("LBFGS_Grad") =... timetable.Time("LBFGS_Grad")/timetable.Iters("LBFGS_Grad");
解析黑塞矩阵,'trust-region'
算法
opts = options; opts.Algorithm = 'trust-region'; opts.SpecifyObjectiveGradient = true; opts.HessianFcn = "objective"; overallTime = tic; [~,timetable.Fval("Analytic"),timetable.Eflag("Analytic"),output] =... fminunc(fun, x0, opts); timetable.Time("Analytic") = toc(overallTime); timetable.Iters("Analytic") = output.iterations; timetable.TimePerIter("Analytic") =... timetable.Time("Analytic")/timetable.Iters("Analytic");
带梯度的有限差分黑塞矩阵,fmincon
求解器
opts = optimoptions("fmincon","SpecifyObjectiveGradient",true,... "Display","none","HessianApproximation","finite-difference",... SubproblemAlgorithm="cg",MaxFunctionEvaluations=Inf,MaxIterations=Inf); overallTime = tic; [~,timetable.Fval("fin-diff-grads"),timetable.Eflag("fin-diff-grads"),output] =... fmincon(fun, x0, [],[],[],[],[],[],[],opts); timetable.Time("fin-diff-grads") = toc(overallTime); timetable.Iters("fin-diff-grads") = output.iterations; timetable.TimePerIter("fin-diff-grads") =... timetable.Time("fin-diff-grads")/timetable.Iters("fin-diff-grads");
最小二乘法,无雅可比矩阵
lsqopts = optimoptions("lsqnonlin","Display","none",... "MaxFunctionEvaluations",Inf,"MaxIterations",Inf); fun = @hlsqnonlin0obj; overallTime = tic; [~,timetable.Fval("LSQ_NoJacob"),~,timetable.Eflag("LSQ_NoJacob"),output] =... lsqnonlin(fun, x0, [],[],lsqopts); timetable.Time("LSQ_NoJacob") = toc(overallTime); timetable.Iters("LSQ_NoJacob") = output.iterations; timetable.TimePerIter("LSQ_NoJacob") =... timetable.Time("LSQ_NoJacob")/timetable.Iters("LSQ_NoJacob");
最小二乘法,使用雅可比矩阵
lsqopts.SpecifyObjectiveGradient = true; overallTime = tic; [~,timetable.Fval("LSQ_Jacob"),~,timetable.Eflag("LSQ_Jacob"),output] =... lsqnonlin(fun, x0, [],[],lsqopts); timetable.Time("LSQ_Jacob") = toc(overallTime); timetable.Iters("LSQ_Jacob") = output.iterations; timetable.TimePerIter("LSQ_Jacob") =... timetable.Time("LSQ_Jacob")/timetable.Iters("LSQ_Jacob");
检查结果
disp(timetable)
Time Fval Eflag Iters TimePerIter ______ __________ _____ _____ ___________ BFGS_NoGrad 110.44 5.0083e-08 1 7137 0.015475 LBFGS_NoGrad 53.143 2.476e-07 1 4902 0.010841 BFGS_Grad 35.491 2.9865e-08 1 7105 0.0049952 LBFGS_Grad 1.2056 9.7505e-08 1 4907 0.0002457 Analytic 7.0991 1.671e-10 3 2301 0.0030852 fin-diff-grads 5.217 1.1422e-15 1 1382 0.003775 LSQ_NoJacob 94.708 3.7969e-25 1 1664 0.056916 LSQ_Jacob 6.5225 3.0056e-25 1 1664 0.0039197
计时结果显示如下:
对于此问题,带梯度的 LBFGS 黑塞矩阵逼近是迄今为止最快的。
接下来运行速度最快的求解器是使用梯度有限差分黑塞矩阵的
fmincon
、使用解析梯度和黑塞矩阵的信赖域fminunc
,以及使用解析雅可比矩阵的lsqnonlin
。没有梯度的
fminunc
BFGS 算法与没有雅可比矩阵的lsqnonlin
求解器具有相似的速度。请注意,对于此问题,lsqnonlin
需要的迭代次数比fminunc
少得多,但每次迭代花费的时间要长得多。导数对所有求解器的速度都有很大影响。
辅助函数
以下代码创建 hfminunc0obj
辅助函数。
function [f,G,H] = hfminunc0obj(x) % Rosenbrock function in N dimensions N = numel(x); xx = x(1:N-1); xx_plus = x(2:N); f_vec = 100*(xx.^2 - xx_plus).^2 + (xx - 1).^2; f = sum(f_vec); if (nargout >= 2) % Gradient G = zeros(N,1); for k = 1:N if (k == 1) G(k) = 2*(x(k)-1) + 400*x(k)*(x(k)^2-x(k+1)); elseif (k == N) G(k) = 200*x(k) - 200*x(k-1)^2; else G(k) = 202*x(k) - 200*x(k-1)^2 - 400*x(k)*(x(k+1) - x(k)^2) - 2; end end if nargout >= 3 % Hessian H = spalloc(N,N,3*N); for i = 2:(N-1) H(i,i) = 202 + 1200*x(i)^2 - 400*x(i+1); H(i,i-1) = -400*x(i-1); H(i,i+1) = -400*x(i); end H(1,1) = 2 + 1200*x(1)^2 - 400*x(2); H(1,2) = -400*x(1); H(N,N) = 200; H(N,N-1) = -400*x(N-1); end end end
以下代码创建 hlsqnonlin0obj
辅助函数。
function [f,G] = hlsqnonlin0obj(x) % Rosenbrock function in N dimensions N = numel(x); xx = x(1:N-1); xx_plus = x(2:N); f_vec = [10*(xx.^2 - xx_plus), (xx - 1)]; f = reshape(f_vec',[],1); % Vector of length 2*(N-1) % Jacobian if (nargout >= 2) G = spalloc(2*(N-1),N,3*N); % Jacobian size 2*(N-1)-by-N with 3*N nonzeros for k = 1:(N-1) G(2*k-1,k) = 10*2*x(k); G(2*k-1,k+1) = -10; G(2*k,k) = 1; end end end