## Background on Automatic Differentiation

### What Is Automatic Differentiation?

• Forward mode automatic differentiation evaluates a numerical derivative by performing elementary derivative operations concurrently with the operations of evaluating the function itself. As detailed in the next section, the software performs these computations on a computational graph.

• Reverse mode automatic differentiation uses an extension of the forward mode computational graph to enable the computation of a gradient by a reverse traversal of the graph. As the software runs the code to compute the function and its derivative, it records operations in a data structure called a trace.

### Forward Mode

To illustrate, consider the function

$$f(x) = x_1 \exp\!\left(-\tfrac{1}{2}\left(x_1^2 + x_2^2\right)\right).$$

Write $f$ as a composition of intermediate variables $u_i$: $u_{-1} = x_1$, $u_0 = x_2$, $u_1 = u_{-1}^2$, $u_2 = u_0^2$, $u_3 = u_1 + u_2$, $u_4 = -u_3/2$, $u_5 = \exp(u_4)$, and $u_6 = u_{-1}u_5 = f(x)$. Repeated application of the chain rule then expands the derivative of $f$ with respect to $x_1$:

$$\begin{aligned}
\frac{df}{dx_1} &= \frac{du_6}{dx_1} \\
&= \frac{\partial u_6}{\partial u_{-1}} + \frac{\partial u_6}{\partial u_5}\frac{\partial u_5}{\partial x_1} \\
&= \frac{\partial u_6}{\partial u_{-1}} + \frac{\partial u_6}{\partial u_5}\frac{\partial u_5}{\partial u_4}\frac{\partial u_4}{\partial x_1} \\
&= \frac{\partial u_6}{\partial u_{-1}} + \frac{\partial u_6}{\partial u_5}\frac{\partial u_5}{\partial u_4}\frac{\partial u_4}{\partial u_3}\frac{\partial u_3}{\partial x_1} \\
&= \frac{\partial u_6}{\partial u_{-1}} + \frac{\partial u_6}{\partial u_5}\frac{\partial u_5}{\partial u_4}\frac{\partial u_4}{\partial u_3}\frac{\partial u_3}{\partial u_1}\frac{\partial u_1}{\partial x_1}.
\end{aligned}$$

Here $\dot{u}_i$ denotes the derivative of the expression $u_i$ with respect to $x_1$. Using the values of the $u_i$ obtained during the function evaluation, compute the partial derivative of $f$ with respect to $x_1$ as shown in the following figure. Note that all of the values $\dot{u}_i$ become available as you traverse the graph from top to bottom.
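Forward mode can be sketched with dual numbers, where every quantity carries its value together with its derivative with respect to $x_1$ (the pair $(u_i, \dot{u}_i)$). The following Python sketch is illustrative only, not the Optimization Toolbox implementation; it evaluates the example function and $df/dx_1$ in a single pass:

```python
import math

class Dual:
    """A value paired with its derivative (u_i, u_dot_i), propagated in lockstep."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot

    def __add__(self, other):
        # Sum rule: (u + v)' = u' + v'
        return Dual(self.val + other.val, self.dot + other.dot)

    def __mul__(self, other):
        # Product rule: (u v)' = u'v + u v'
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)

def dexp(u):
    # Chain rule for exp: (exp u)' = exp(u) * u'
    e = math.exp(u.val)
    return Dual(e, e * u.dot)

def f(x1, x2):
    # f(x) = x1 * exp(-(x1^2 + x2^2)/2), built from elementary operations
    u3 = x1 * x1 + x2 * x2
    u4 = Dual(-0.5) * u3
    return x1 * dexp(u4)

# Seed x1 with derivative 1 (and x2 with 0) to obtain df/dx1 at (2, 0.5).
x1, x2 = Dual(2.0, 1.0), Dual(0.5)
y = f(x1, x2)
print(y.val, y.dot)   # f and df/dx1; analytically df/dx1 = (1 - x1^2) exp(-(x1^2+x2^2)/2)
```

Computing the gradient with respect to both variables this way requires two forward passes, one per seed vector, which is why forward mode becomes costly when the number of input variables is large.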

### Reverse Mode

Reverse mode traverses the computational graph in the opposite direction, from the function value back to the inputs. For each intermediate variable $u_i$, define the adjoint

$$\bar{u}_i = \frac{\partial f}{\partial u_i}.$$

The variable $u_{-1}$ appears in both $u_1$ and $u_6$, so by the chain rule

$$\begin{aligned}
\frac{\partial f}{\partial u_{-1}} &= \frac{\partial f}{\partial u_1}\frac{\partial u_1}{\partial u_{-1}} + \frac{\partial f}{\partial u_6}\frac{\partial u_6}{\partial u_{-1}} \\
&= \bar{u}_1\frac{\partial u_1}{\partial u_{-1}} + \bar{u}_6\frac{\partial u_6}{\partial u_{-1}}.
\end{aligned}$$

Because $u_1 = u_{-1}^2$ and $u_6 = u_{-1}u_5$, the local partial derivatives evaluate to

$$\bar{u}_{-1} = \bar{u}_1\,2u_{-1} + \bar{u}_6\,u_5.$$

A single reverse traversal of the graph in this way yields the entire gradient of the example function

$$f(x) = x_1 \exp\!\left(-\tfrac{1}{2}\left(x_1^2 + x_2^2\right)\right),$$

whereas forward mode requires one traversal per input variable.
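The reverse pass can likewise be sketched with an explicit trace: a forward evaluation records each elementary operation with its local partial derivatives on a tape, and a single backward sweep over the tape accumulates the adjoints $\bar{u}_i$. This Python sketch is again illustrative, not the toolbox implementation; it recovers the full gradient of the example function in one reverse traversal:

```python
import math

tape = []  # the trace: one record per elementary operation, in execution order

class Var:
    def __init__(self, val):
        self.val = val
        self.grad = 0.0   # adjoint u_bar, accumulated during the reverse sweep

def record(val, parents):
    # parents: list of (input Var, local partial derivative d_out / d_input)
    out = Var(val)
    tape.append((out, parents))
    return out

def mul(a, b):   return record(a.val * b.val, [(a, b.val), (b, a.val)])
def add(a, b):   return record(a.val + b.val, [(a, 1.0), (b, 1.0)])
def scale(a, c): return record(c * a.val, [(a, c)])
def vexp(a):
    e = math.exp(a.val)
    return record(e, [(a, e)])

# Forward pass: evaluate f(x) = x1*exp(-(x1^2 + x2^2)/2) at (2, 0.5),
# recording the intermediate variables u_1 ... u_6 on the tape.
x1, x2 = Var(2.0), Var(0.5)
u1 = mul(x1, x1)
u2 = mul(x2, x2)
u3 = add(u1, u2)
u4 = scale(u3, -0.5)
u5 = vexp(u4)
u6 = mul(x1, u5)          # f = u6; note that x1 feeds both u1 and u6

# Reverse pass: sweep the tape backward, accumulating adjoints.
u6.grad = 1.0
for out, parents in reversed(tape):
    for parent, local in parents:
        parent.grad += out.grad * local

# One reverse sweep yields the whole gradient:
print(x1.grad, x2.grad)
```

Note how `x1.grad` receives one contribution through `u1` and one through `u6`, exactly as in the adjoint formula $\bar{u}_{-1} = \bar{u}_1\,2u_{-1} + \bar{u}_6\,u_5$ above.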

### Automatic Differentiation in Optimization Toolbox

• For a general nonlinear objective function, `fmincon` defaults to reverse AD. For nonlinear constraint functions, `fmincon` defaults to reverse AD when the number of nonlinear constraints is less than the number of variables; otherwise, it defaults to forward AD.

• For a general nonlinear objective function, `fminunc` defaults to reverse AD.

• For a least-squares objective function, `fmincon` and `fminunc` default to forward AD for the objective function. For the definition of a problem-based least-squares objective function, see Write Objective Function for Problem-Based Least Squares.

• `lsqnonlin` defaults to forward AD when the number of elements in the objective vector is greater than or equal to the number of variables; otherwise, it defaults to reverse AD.

• `fsolve` defaults to forward AD when the number of equations is greater than or equal to the number of variables; otherwise, it defaults to reverse AD.

For example, the following options indicate that you supply the gradients of the objective and nonlinear constraint functions yourself:

```matlab
options = optimoptions('fmincon','SpecifyObjectiveGradient',true,...
    'SpecifyConstraintGradient',true);
problem.options = options;
```
