Preconditioned stochastic gradient descent

Version 1.2.0.0 (567.8 KB) by Xilin Li
Upgrading the stochastic gradient descent method to a second-order optimization method
742 downloads
Updated 23 Jul 2016

View License

This package demonstrates the method proposed in the paper http://arxiv.org/abs/1512.04202, which shows how to upgrade a stochastic gradient descent (SGD) method to a second-order optimization method by preconditioning. More materials (pseudocode, further examples, and papers) are available at https://sites.google.com/site/lixilinx/home/psgd.
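
At its core, the method replaces the plain SGD update theta = theta - mu*g with the preconditioned update theta = theta - mu*P*g, where the preconditioner P = Q'*Q is estimated on the fly from gradient perturbations. A minimal sketch of one such step with toy, made-up values (in the demos, Q is adapted every iteration rather than held fixed):

    rng(0); n = 5;
    theta = randn(n, 1);            % parameter vector
    g = randn(n, 1);                % stand-in for a stochastic gradient
    Q = eye(n);                     % factor of the preconditioner, P = Q'*Q
    mu = 0.05;                      % normalized step size
    theta = theta - mu*(Q'*(Q*g)); % applies P*g without ever forming P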

Descriptions of enclosed files
binary_pattern.m
This file generates the zebra-stripe-like binary pattern to be learned by the four tested algorithms.

plain_SGD.m
This demo shows how to use standard SGD to train a neural network by minimizing the logistic loss. As usual, SGD requires some tuning: convergence is too slow with small step sizes and poor with large ones.
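
For reference, a self-contained toy version of the kind of update plain_SGD.m performs, minimizing the logistic loss on made-up data (the data, sizes, and step size below are illustrative, not those of the demo):

    rng(0);
    X = randn(100, 10); y = sign(randn(100, 1));  % toy data (assumption)
    w = zeros(10, 1);
    eta = 0.1;                                    % step size; needs tuning
    for iter = 1:1000
        i = randi(100);                           % draw one random sample
        m = y(i)*X(i, :)*w;                       % classification margin
        g = -y(i)*X(i, :)'/(1 + exp(m));          % gradient of log(1 + exp(-m))
        w = w - eta*g;                            % plain SGD update
    end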

preconditioned_SGD_dense.m
This demo shows how to precondition SGD to improve its convergence using a dense preconditioner. The gradient must be calculated twice at each iteration, but convergence is much faster and less tuning effort is required. The step size is normalized, and a value in the range [0.01, 0.1] seems to work well.
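
The second gradient evaluation probes how the gradient changes under a small random perturbation; that pair is what drives the preconditioner update. A hedged sketch of the per-iteration pattern (the quadratic toy loss is an assumption, and the comment marks where the enclosed preconditioner.m would adapt Q):

    rng(0); n = 5;
    theta = randn(n, 1); Q = eye(n); mu = 0.05;
    sg = @(t) 2*t + 0.1*randn(n, 1);    % toy noisy gradient (assumption)
    dtheta = 1e-4*randn(n, 1);          % small random perturbation
    g1 = sg(theta);                     % first gradient evaluation
    g2 = sg(theta + dtheta);            % second one, at the perturbed point
    % ...preconditioner.m would adapt Q here from the pair (dtheta, g2 - g1)
    theta = theta - mu*(Q'*(Q*g1));     % preconditioned step with current Q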

preconditioned_SGD_sparse.m
This demo shows how to approximate a preconditioner as direct sums and/or Kronecker products of smaller matrices. In practice, the scale of the problem can be so large that the preconditioner must be represented sparsely to make its estimation affordable.
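
A direct sum simply gives each parameter block its own small factor, so the full block-diagonal preconditioner is never formed. A toy illustration with made-up block sizes:

    rng(0);
    ga = randn(6, 1); gb = randn(4, 1); % gradients of two parameter blocks
    Qa = eye(6); Qb = eye(4);           % per-block factors; the implied
                                        % preconditioner is blkdiag(Qa'*Qa, Qb'*Qb)
    pga = Qa'*(Qa*ga);                  % preconditioned gradient, block a
    pgb = Qb'*(Qb*gb);                  % preconditioned gradient, block b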

preconditioner_kron.m
This function shows how to adaptively estimate a Kronecker product approximation of a preconditioner for parameters in matrix form.
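
For a matrix-shaped gradient G, a Kronecker-product preconditioner can be applied without building the large matrix, using the identity kron(Q2, Q1)*G(:) = reshape(Q1*G*Q2', [], 1). A toy sketch with assumed sizes:

    rng(0);
    G = randn(5, 3);             % gradient of a 5-by-3 matrix parameter (toy)
    Q1 = eye(5); Q2 = eye(3);    % factors; P = kron(Q2, Q1)'*kron(Q2, Q1)
    pre_G = Q1'*(Q1*G*Q2')*Q2;   % equals reshape(P*G(:), size(G)), P never formed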

preconditioner.m
This function shows how to adaptively estimate a preconditioner via gradient perturbation analysis.
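
Concretely, given a perturbation dtheta and the resulting gradient change dg, Q is nudged so that the preconditioner balances the two. The sketch below is a hedged reading of the paper's dense update, with the step-size normalization quoted in the 1.2.0.0 release note; consult the enclosed preconditioner.m for the exact code:

    rng(0); n = 4;
    Q = eye(n);                       % P = Q'*Q, initialized to identity
    dtheta = 1e-4*randn(n, 1);        % random parameter perturbation
    dg = randn(n, 1);                 % stand-in for g(theta + dtheta) - g(theta)
    a = Q*dg;
    b = Q'\dtheta;                    % solves Q'*b = dtheta
    grad = triu(a*a' - b*b');         % criterion gradient over triangular Q (per the paper)
    Q = Q - 0.01*grad*Q/max(max(abs(grad)));  % cf. the 1.2.0.0 release note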

RMSProp_SGD.m
This demo implements RMSProp, a popular variant of SGD for neural network training. Like standard SGD, it can be difficult to tune.
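
For comparison, the textbook RMSProp update on a toy quadratic loss (this generic form may differ in details from the enclosed RMSProp_SGD.m):

    rng(0);
    w = randn(10, 1);                  % toy parameter
    eta = 0.01; rho = 0.99; ep = 1e-8; % step size, decay rate, damping
    v = zeros(size(w));
    for iter = 1:1000
        g = 2*w + 0.1*randn(size(w));  % noisy gradient of ||w||^2 (toy loss)
        v = rho*v + (1 - rho)*g.^2;    % running average of squared gradients
        w = w - eta*g./sqrt(v + ep);   % element-wise adaptive step
    end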

Cite As

Xilin Li (2024). Preconditioned stochastic gradient descent (https://www.mathworks.com/matlabcentral/fileexchange/54525-preconditioned-stochastic-gradient-descent), MATLAB Central File Exchange. Retrieved .

MATLAB Release Compatibility
Created with R2015a
Compatible with any release
Platform Compatibility
Windows macOS Linux
Categories
Pattern Recognition and Classification

Version Published Release Notes
1.2.0.0

The step size normalization factor in preconditioner estimation is changed to max(max(abs(grad))).

1.1.0.0

Revised the preconditioner estimation method. Specifically,
Q = Q - step_size*grad*Q/(max(abs(diag(grad))) + eps);
is changed to
Q = Q - step_size*grad*Q/max(max(abs(diag(grad))), 1);

1.0.0.0