trainbr
Bayesian regularization backpropagation
Description
net.trainFcn = 'trainbr' sets the network trainFcn property.

[trainedNet,tr] = train(net,...) trains the network with trainbr.
trainbr is a network training function that updates the weight and bias values according to Levenberg-Marquardt optimization. It minimizes a combination of squared errors and weights, and then determines the correct combination so as to produce a network that generalizes well. The process is called Bayesian regularization.
Training occurs according to trainbr training parameters, shown here with their default values. A sketch of overriding them appears after the list.

net.trainParam.epochs — Maximum number of epochs to train. The default value is 1000.
net.trainParam.goal — Performance goal. The default value is 0.
net.trainParam.mu — Marquardt adjustment parameter. The default value is 0.005.
net.trainParam.mu_dec — Decrease factor for mu. The default value is 0.1.
net.trainParam.mu_inc — Increase factor for mu. The default value is 10.
net.trainParam.mu_max — Maximum value for mu. The default value is 1e10.
net.trainParam.max_fail — Maximum validation failures. The default value is 0.
net.trainParam.min_grad — Minimum performance gradient. The default value is 1e-7.
net.trainParam.show — Epochs between displays (NaN for no displays). The default value is 25.
net.trainParam.showCommandLine — Generate command-line output. The default value is false.
net.trainParam.showWindow — Show training GUI. The default value is true.
net.trainParam.time — Maximum time to train in seconds. The default value is inf.
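As a minimal sketch, any of these parameters can be overridden before calling train; the property names are the toolbox's, while the values chosen here are arbitrary:

net = feedforwardnet(10,'trainbr');
net.trainParam.epochs = 300;   % train for at most 300 epochs
net.trainParam.mu = 0.01;      % larger initial Marquardt parameter
net.trainParam.show = 10;      % report progress every 10 epochs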
Validation stops are disabled by default (max_fail = 0) so that training can continue until an optimal combination of errors and weights is found. However, some weight/bias minimization can still be achieved with shorter training times if validation is enabled by setting max_fail to 6 or some other strictly positive value, as in the sketch below.
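For example, a minimal sketch of enabling validation stopping, assuming the dividerand data division supplies a validation subset:

net = feedforwardnet(10,'trainbr');
net.divideFcn = 'dividerand';     % hold out a validation subset
net.trainParam.max_fail = 6;      % stop after 6 consecutive validation failures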
Examples
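A minimal sketch of typical use, fitting a noisy sine wave with a small feedforward network; the data and layer size are illustrative choices rather than requirements:

x = -1:0.05:1;                        % inputs
t = sin(2*pi*x) + 0.1*randn(size(x)); % noisy sine-wave targets
net = feedforwardnet(10,'trainbr');   % two-layer network trained with trainbr
net = train(net,x,t);                 % train with Bayesian regularization
y = net(x);                           % network response to the inputs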
Input Arguments
Output Arguments
Limitations
This function uses the Jacobian for calculations, which assumes that performance is a mean or sum of squared errors. Therefore, networks trained with this function must use either the mse or sse performance function.
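As a brief sketch, the performance function can be set explicitly before training (mse is already the default for feedforwardnet, so the assignment here is illustrative):

net = feedforwardnet(10,'trainbr');
net.performFcn = 'sse';   % sum of squared errors, compatible with trainbr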
More About
Algorithms
trainbr can train any network as long as its weight, net input, and transfer functions have derivative functions.
Bayesian regularization minimizes a linear combination of squared errors and weights. It also modifies the linear combination so that at the end of training the resulting network has good generalization qualities. See MacKay (Neural Computation, Vol. 4, No. 3, 1992, pp. 415–447) and Foresee and Hagan (Proceedings of the International Joint Conference on Neural Networks, June 1997) for more detailed discussions of Bayesian regularization.
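Concretely, the minimized objective has the form F = beta*Ed + alpha*Ew, where Ed is the sum of squared errors, Ew is the sum of squared weights and biases, and alpha and beta are hyperparameters that trainbr re-estimates during training. A minimal sketch, with alpha and beta fixed at arbitrary values rather than estimated:

[x,t] = simplefit_dataset;              % sample dataset shipped with the toolbox
net = configure(feedforwardnet(4,'trainbr'),x,t);
y = net(x);                             % untrained network output
alpha = 0.01; beta = 1;                 % illustrative fixed values only
Ed = sum((t - y).^2);                   % sum of squared errors
Ew = sum(getwb(net).^2);                % sum of squared weights and biases
F = beta*Ed + alpha*Ew                  % the regularized objective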
This Bayesian regularization takes place within the Levenberg-Marquardt algorithm. Backpropagation is used to calculate the Jacobian jX of performance perf with respect to the weight and bias variables X. Each variable is adjusted according to Levenberg-Marquardt,

jj = jX' * jX
je = jX' * E
dX = -(jj + I*mu) \ je

where E is all errors and I is the identity matrix.
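A toy numeric sketch of one such step; the values are arbitrary and serve only to make the update concrete:

jX = [1 2; 3 4; 5 6];           % Jacobian: 3 errors, 2 parameters
E = [0.5; -0.2; 0.1];           % error vector
mu = 0.005;                     % default Marquardt adjustment parameter
jj = jX' * jX;                  % Gauss-Newton approximation to the Hessian
je = jX' * E;                   % gradient
dX = -(jj + eye(2)*mu) \ je     % step applied to the weights and biases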
The adaptive value mu is increased by mu_inc until the change shown above results in a reduced performance value. The change is then made to the network, and mu is decreased by mu_dec. A sketch of this loop follows.
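A pseudocode-style sketch of that adaptation, assuming a hypothetical helper lmPerf (not a toolbox function) that evaluates performance for a candidate parameter vector:

while lmPerf(X + dX) >= lmPerf(X) && mu <= mu_max   % step did not improve performance
    mu = mu * mu_inc;                               % move toward gradient descent
    dX = -(jX'*jX + eye(numel(X))*mu) \ (jX'*E);    % recompute the step
end
X = X + dX;                                         % accept the improving step
mu = mu * mu_dec;                                   % relax back toward Gauss-Newton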
Training stops when any of these conditions occurs:
The maximum number of epochs (repetitions) is reached.
The maximum amount of time is exceeded.
Performance is minimized to the goal.
The performance gradient falls below min_grad.
mu exceeds mu_max.
References
[1] MacKay, David J. C. "Bayesian interpolation." Neural Computation, Vol. 4, No. 3, 1992, pp. 415–447.
[2] Foresee, F. Dan, and Martin T. Hagan. "Gauss-Newton approximation to Bayesian learning." Proceedings of the International Joint Conference on Neural Networks, June 1997.
Version History
Introduced before R2006a