Optimize Neural Network Training Speed and Memory
Memory Reduction
Depending on the particular neural network, simulation and gradient calculations can occur in MATLAB® or MEX. MEX is more memory efficient, but MATLAB can be made more memory efficient in exchange for time.
To determine whether MATLAB or MEX is being used, use the 'showResources' option, as shown in this general form of the syntax:
net2 = train(net1,x,t,'showResources','yes')
If MATLAB is being used and memory limitations are a problem, the amount of temporary storage needed can be reduced by a factor of N, in exchange for performing the computations N times sequentially on each of N subsets of the data.
net2 = train(net1,x,t,'reduction',N);
This is called memory reduction.
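The trade-off can be illustrated with a minimal sketch. It does not show how train implements the 'reduction' option internally; it only demonstrates the general idea of processing N column subsets one after another so that the temporary error vector is about 1/N of its full size. The data, the linear model w, and all variable names are hypothetical.

```matlab
% Illustrative only: accumulate a sum of squared errors over N column
% subsets so that just one subset's intermediate values exist at a time.
x = rand(4,1000);                 % hypothetical inputs, one sample per column
t = rand(1,1000);                 % hypothetical targets
w = rand(1,4);                    % hypothetical linear model weights
N = 5;                            % number of subsets (the reduction factor)

Q = size(x,2);                    % number of samples
edges = round(linspace(0,Q,N+1)); % subset boundaries
sse = 0;
for k = 1:N
    idx = edges(k)+1:edges(k+1);  % columns that belong to subset k
    e = t(idx) - w*x(:,idx);      % temporary storage is about 1/N of the full size
    sse = sse + sum(e.^2);        % accumulate the partial result
end

sseFull = sum((t - w*x).^2);      % one-shot computation, for comparison
```

Both sse and sseFull hold the same value up to rounding error; the chunked version simply trades one large temporary array for N smaller sequential passes.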
Fast Elliot Sigmoid
Some simple computing hardware might not support the exponential function directly, and software implementations can be slow. The Elliot sigmoid function, elliotsig, performs the same role as the symmetric sigmoid function, tansig, but avoids the exponential function.
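To see why no exponential is needed, the two functions can be written out by hand. The sketch below uses only basic MATLAB arithmetic rather than the toolbox functions; the variable names are illustrative.

```matlab
n = -10:0.01:10;

% Elliot sigmoid: only an absolute value, an addition, and a division
aElliot = n ./ (1 + abs(n));

% Symmetric sigmoid: requires an exponential; mathematically equal to tanh(n)
aTansig = 2 ./ (1 + exp(-2*n)) - 1;

% Difference between the hand-written symmetric sigmoid and tanh
maxDiffFromTanh = max(abs(aTansig - tanh(n)));
```

Both outputs lie in the open interval (-1, 1) and pass through zero at n = 0, but the Elliot sigmoid approaches its limits more slowly than the symmetric sigmoid.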
Here is a plot of the Elliot sigmoid:
n = -10:0.01:10; a = elliotsig(n); plot(n,a)
Next, elliotsig is compared with tansig.
a2 = tansig(n); h = plot(n,a,n,a2); legend(h,'elliotsig','tansig','Location','NorthWest')
To train a neural network using elliotsig instead of tansig, transform the network’s transfer functions:
[x,t] = bodyfat_dataset;
net = feedforwardnet;
view(net)
net.layers{1}.transferFcn = 'elliotsig';
view(net)
net = train(net,x,t);
y = net(x)
Here, the times to execute elliotsig and tansig are compared. elliotsig is approximately four times faster on the test system.
n = rand(5000,5000);
tic, for i=1:100, a = tansig(n); end, tansigTime = toc;
tic, for i=1:100, a = elliotsig(n); end, elliotTime = toc;
speedup = tansigTime / elliotTime

speedup = 4.1406
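As an alternative to a hand-written tic/toc loop, the same comparison can be sketched with the MATLAB timeit function, which calls the supplied function handle several times and returns a typical execution time. The smaller matrix size here is an assumption made to keep the run short, and the resulting ratio depends on the hardware.

```matlab
n = rand(1000,1000);                     % smaller test input (assumed size)
tansigTime = timeit(@() tansig(n));      % typical time for one tansig call
elliotTime = timeit(@() elliotsig(n));   % typical time for one elliotsig call
speedup = tansigTime / elliotTime        % ratio varies by system
```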
However, while simulation is faster with elliotsig, training is not guaranteed to be faster, due to the different shapes of the two transfer functions. Here, 10 networks are each trained for tansig and elliotsig, but training times vary significantly even on the same problem with the same network.
[x,t] = bodyfat_dataset;
tansigNet = feedforwardnet;
tansigNet.trainParam.showWindow = false;
elliotNet = tansigNet;
elliotNet.layers{1}.transferFcn = 'elliotsig';
for i=1:10, tic, net = train(tansigNet,x,t); tansigTime = toc, end
for i=1:10, tic, net = train(elliotNet,x,t); elliotTime = toc, end
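Because the loops above print each time separately, the spread can be hard to judge by eye. A possible variation, sketched below, stores the times in vectors and summarizes them. The variable names numRuns, tansigTimes, and elliotTimes are illustrative, and the figures printed differ from run to run and from system to system.

```matlab
[x,t] = bodyfat_dataset;
tansigNet = feedforwardnet;
tansigNet.trainParam.showWindow = false;       % suppress the training window
elliotNet = tansigNet;
elliotNet.layers{1}.transferFcn = 'elliotsig';

numRuns = 10;
tansigTimes = zeros(1,numRuns);                % preallocate timing results
elliotTimes = zeros(1,numRuns);
for i = 1:numRuns
    tic; train(tansigNet,x,t); tansigTimes(i) = toc;
    tic; train(elliotNet,x,t); elliotTimes(i) = toc;
end

% Summarize the mean and spread of the training times
fprintf('tansig:    mean %.3f s, std %.3f s\n', mean(tansigTimes), std(tansigTimes));
fprintf('elliotsig: mean %.3f s, std %.3f s\n', mean(elliotTimes), std(elliotTimes));
```

Each call to train starts from a fresh random initialization of the weights, which is one reason the individual times differ even though the problem and architecture are the same.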