Edit Shallow Neural Network Properties
Tip
To learn how to define your own layers for deep learning networks, see Define Custom Deep Learning Layers.
Deep Learning Toolbox™ software provides a flexible network object type that allows many kinds of networks to be created and then used with functions such as init, sim, and train.
Type the following to see all the network creation functions in the toolbox.
help nnnetwork
This flexibility is possible because networks have an object-oriented representation. The representation allows you to define various architectures and assign various algorithms to those architectures.
To create custom networks, start with an empty network (obtained with the network function) and set its properties as desired.
net = network
The network object consists of many properties that you can set to specify the structure and behavior of your network.
The following sections show how to create a custom network by using these properties.
Custom Network
Before you can build a network, you need to know what it looks like. For dramatic purposes (and to give the toolbox a workout) this section leads you through the creation of the wild and complicated network shown below.
Each of the two elements of the first network input is to accept values ranging between 0 and 10. Each of the five elements of the second network input ranges from −2 to 2.
Before you can complete your design of this network, the algorithms it employs for initialization and training must be specified.
Each layer's weights and biases are initialized with the Nguyen-Widrow layer initialization method (initnw). The network is trained with Levenberg-Marquardt backpropagation (trainlm), so that, given example input vectors, the outputs of the third layer learn to match the associated target vectors with minimal mean squared error (mse).
Network Definition
The first step is to create a new network. Type the following code to create a network and view its many properties:
net = network
Architecture Properties
The first group of properties displayed is labeled architecture properties. These properties allow you to select the number of inputs and layers and their connections.
Number of Inputs and Layers. The first two properties displayed in the dimensions group are numInputs and numLayers. These properties allow you to select how many inputs and layers you want the network to have.
net =

    dimensions:
         numInputs: 0
         numLayers: 0
    ...
Note that the network has no inputs or layers at this time.
Change that by setting these properties to the number of inputs and number of layers in the custom network diagram.
net.numInputs = 2;
net.numLayers = 3;
net.numInputs is the number of input sources, not the number of elements in an input vector (net.inputs{i}.size).
Bias Connections. Type net and press Enter to view its properties again. The network now has two inputs and three layers.
net =

    Neural Network:

    dimensions:
         numInputs: 2
         numLayers: 3
Examine the next four properties in the connections group:
     biasConnect: [0; 0; 0]
    inputConnect: [0 0; 0 0; 0 0]
    layerConnect: [0 0 0; 0 0 0; 0 0 0]
   outputConnect: [0 0 0]
These matrices of 1s and 0s represent the presence and absence of bias, input weight, layer weight, and output connections. They are currently all zeros, indicating that the network does not have any such connections.
The bias connection matrix is a 3-by-1 vector. To create a bias connection to the ith layer, set net.biasConnect(i) to 1. Specify that the first and third layers are to have bias connections, as the diagram indicates, by typing the following code:
net.biasConnect(1) = 1;
net.biasConnect(3) = 1;
You could also define those connections with a single line of code.
net.biasConnect = [1; 0; 1];
Input and Layer Weight Connections. The input connection matrix is 3-by-2, representing the presence of connections from two sources (the two inputs) to three destinations (the three layers). Thus, net.inputConnect(i,j) represents the presence of an input weight connection going to the ith layer from the jth input.
To connect the first input to the first and second layers, and the second input to the second layer (as indicated by the custom network diagram), type
net.inputConnect(1,1) = 1;
net.inputConnect(2,1) = 1;
net.inputConnect(2,2) = 1;
or this single line of code:
net.inputConnect = [1 0; 1 1; 0 0];
Similarly, net.layerConnect(i,j) represents the presence of a layer-weight connection going to the ith layer from the jth layer. Connect layers 1, 2, and 3 to layer 3 as follows:
net.layerConnect = [0 0 0; 0 0 0; 1 1 1];
Output Connections. The output connections are a 1-by-3 matrix, indicating that they connect to one destination (the external world) from three sources (the three layers).
To connect layers 2 and 3 to the network output, type
net.outputConnect = [0 1 1];
Number of Outputs. Type net and press Enter to view the updated properties. The final three architecture properties are read-only values, which means their values are determined by the choices made for other properties. The first read-only property in the dimensions group is the number of outputs:
numOutputs: 2
By defining output connections from layers 2 and 3, you specified that the network has two outputs.
Subobject Properties
The next group of properties in the output display is subobjects:

subobjects:

            inputs: {2x1 cell array of 2 inputs}
            layers: {3x1 cell array of 3 layers}
           outputs: {1x3 cell array of 2 outputs}
            biases: {3x1 cell array of 2 biases}
      inputWeights: {3x2 cell array of 3 weights}
      layerWeights: {3x3 cell array of 3 weights}
Inputs. When you set the number of inputs (net.numInputs) to 2, the inputs property becomes a cell array of two input structures. Each ith input structure (net.inputs{i}) contains additional properties associated with the ith input.
To see how the input structures are arranged, type
net.inputs
ans =
    [1x1 nnetInput]
    [1x1 nnetInput]
To see the properties associated with the first input, type
net.inputs{1}
The properties appear as follows:
ans =

               name: 'Input'
     feedbackOutput: []
        processFcns: {}
      processParams: {1x0 cell array of 0 params}
    processSettings: {0x0 cell array of 0 settings}
     processedRange: []
      processedSize: 0
              range: []
               size: 0
           userdata: (your custom info)
If you set the exampleInput property, the range, size, processedSize, and processedRange properties are automatically updated to match the properties of the value of exampleInput.
Set the exampleInput property as follows:
net.inputs{1}.exampleInput = [0 10 5; 0 3 10];
If you examine the structure of the first input again, you see that it now has new values.
The property processFcns can be set to one or more processing functions. Type help nnprocess to see a list of these functions. Specify that the first input is to be processed with removeconstantrows and mapminmax as follows:
net.inputs{1}.processFcns = {'removeconstantrows','mapminmax'};
View the new input properties. You will see that processParams, processSettings, processedRange, and processedSize have all been updated to reflect that inputs will be processed using removeconstantrows and mapminmax before being given to the network when the network is simulated or trained. The property processParams contains the default parameters for each processing function. You can alter these values, if you like. See the reference page for each processing function to learn more about their parameters.
You can set the size of an input directly when no processing functions are used:
net.inputs{2}.size = 5;
Layers. When you set the number of layers (net.numLayers) to 3, the layers property becomes a cell array of three layer structures. Type the following line of code to see the properties associated with the first layer.
net.layers{1}
ans =

    Neural Network Layer

              name: 'Layer'
        dimensions: 0
       distanceFcn: (none)
     distanceParam: (none)
         distances: []
           initFcn: 'initwb'
       netInputFcn: 'netsum'
     netInputParam: (none)
         positions: []
             range: []
              size: 0
       topologyFcn: (none)
       transferFcn: 'purelin'
     transferParam: (none)
          userdata: (your custom info)
Type the following three lines of code to change the first layer's size to 4 neurons, its transfer function to tansig, and its initialization function to the Nguyen-Widrow function, as required for the custom network diagram.
net.layers{1}.size = 4;
net.layers{1}.transferFcn = 'tansig';
net.layers{1}.initFcn = 'initnw';
The second layer is to have three neurons, use the logsig transfer function, and be initialized with initnw. Set the second layer's properties to the desired values as follows:
net.layers{2}.size = 3;
net.layers{2}.transferFcn = 'logsig';
net.layers{2}.initFcn = 'initnw';
The third layer’s size and transfer function properties don't need to be changed, because the defaults match those shown in the network diagram. You need to set only its initialization function, as follows:
net.layers{3}.initFcn = 'initnw';
Outputs. Use this line of code to see how the outputs property is arranged:
net.outputs
ans =
     []    [1x1 nnetOutput]    [1x1 nnetOutput]
Note that outputs contains two output structures, one for layer 2 and one for layer 3. This arrangement occurs automatically when net.outputConnect is set to [0 1 1].
View the second layer’s output structure with the following expression:
net.outputs{2}
ans =

    Neural Network Output

               name: 'Output'
      feedbackInput: []
      feedbackDelay: 0
       feedbackMode: 'none'
        processFcns: {}
      processParams: {1x0 cell array of 0 params}
    processSettings: {0x0 cell array of 0 settings}
     processedRange: [3x2 double]
      processedSize: 3
              range: [3x2 double]
               size: 3
           userdata: (your custom info)
The size is automatically set to 3 when the second layer's size (net.layers{2}.size) is set to that value. Look at the third layer's output structure if you want to verify that it also has the correct size.
Outputs have processing properties that are automatically applied to target values before they are used by the network during training. The same processing settings are applied in reverse on layer output values before they are returned as network output values during network simulation or training.
Similar to input-processing properties, setting the exampleOutput property automatically causes size, range, processedSize, and processedRange to be updated.
Setting processFcns to a cell array list of processing function names causes processParams, processSettings, and processedRange to be updated. You can then alter the processParams values, if you want to.
Biases, Input Weights, and Layer Weights. Enter the following commands to see how bias and weight structures are arranged:
net.biases
net.inputWeights
net.layerWeights
Here are the results of typing net.biases:

ans =
    [1x1 nnetBias]
    []
    [1x1 nnetBias]
Each contains a structure where the corresponding connections (net.biasConnect, net.inputConnect, and net.layerConnect) contain a 1.
Look at their structures with these lines of code:
net.biases{1}
net.biases{3}
net.inputWeights{1,1}
net.inputWeights{2,1}
net.inputWeights{2,2}
net.layerWeights{3,1}
net.layerWeights{3,2}
net.layerWeights{3,3}
For example, typing net.biases{1} results in the following output:

       initFcn: (none)
         learn: true
      learnFcn: (none)
    learnParam: (none)
          size: 4
      userdata: (your custom info)
Specify the weights' tap delay lines in accordance with the network diagram by setting each weight's delays property:
net.inputWeights{2,1}.delays = [0 1];
net.inputWeights{2,2}.delays = 1;
net.layerWeights{3,3}.delays = 1;
Network Functions
Type net and press Enter again to see the next set of properties.
functions:

      adaptFcn: (none)
    adaptParam: (none)
      derivFcn: 'defaultderiv'
     divideFcn: (none)
   divideParam: (none)
    divideMode: 'sample'
       initFcn: 'initlay'
    performFcn: 'mse'
  performParam: .regularization, .normalization
      plotFcns: {}
    plotParams: {1x0 cell array of 0 params}
      trainFcn: (none)
    trainParam: (none)
Each of these properties defines a function for a basic network operation.
Set the initialization function to initlay so that the network initializes itself according to the layer initialization functions already set to initnw, the Nguyen-Widrow initialization function.
net.initFcn = 'initlay';
This meets the initialization requirement of the network.
Set the performance function to mse (mean squared error) and the training function to trainlm (Levenberg-Marquardt backpropagation) to meet the final requirement of the custom network.
net.performFcn = 'mse';
net.trainFcn = 'trainlm';
Set the divide function to dividerand (divide training data randomly).
net.divideFcn = 'dividerand';
During supervised training, the input and target data are randomly divided into training, test, and validation data sets. The network is trained on the training data until its performance begins to decrease on the validation data, which signals that generalization has peaked. The test data provides a completely independent test of network generalization.
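If the default split does not suit your data, the division ratios can be adjusted through the divideParam property of dividerand. The ratios below are illustrative values, not settings required by this example:

```matlab
% Illustrative sketch: control how dividerand splits the samples.
net.divideParam.trainRatio = 0.7;   % 70% of samples for training
net.divideParam.valRatio   = 0.15;  % 15% for validation (early stopping)
net.divideParam.testRatio  = 0.15;  % 15% for an independent test
```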
Set the plot functions to plotperform (plot training, validation, and test performance) and plottrainstate (plot the state of the training algorithm with respect to epochs).
net.plotFcns = {'plotperform','plottrainstate'};
Weight and Bias Values
Before initializing and training the network, type net and press Enter, then look at the weight and bias group of network properties.
weight and bias values:

    IW: {3x2 cell} containing 3 input weight matrices
    LW: {3x3 cell} containing 3 layer weight matrices
     b: {3x1 cell} containing 2 bias vectors
These cell arrays contain weight matrices and bias vectors in the same positions that the connection properties (net.inputConnect, net.layerConnect, net.biasConnect) contain 1s and the subobject properties (net.inputWeights, net.layerWeights, net.biases) contain structures.
Evaluating each of the following lines of code reveals that all the bias vectors and weight matrices are set to zeros.
net.IW{1,1}, net.IW{2,1}, net.IW{2,2}
net.LW{3,1}, net.LW{3,2}, net.LW{3,3}
net.b{1}, net.b{3}
Each input weight net.IW{i,j}, layer weight net.LW{i,j}, and bias vector net.b{i} has as many rows as the size of the ith layer (net.layers{i}.size).
Each input weight net.IW{i,j} has as many columns as the size of the jth input (net.inputs{j}.size) multiplied by the number of its delay values (length(net.inputWeights{i,j}.delays)).
Likewise, each layer weight net.LW{i,j} has as many columns as the size of the jth layer (net.layers{j}.size) multiplied by the number of its delay values (length(net.layerWeights{i,j}.delays)).
Network Behavior
Initialization
Initialize your network with the following line of code:
net = init(net);
Check the network's biases and weights again to see how they have changed:
net.IW{1,1}, net.IW{2,1}, net.IW{2,2}
net.LW{3,1}, net.LW{3,2}, net.LW{3,3}
net.b{1}, net.b{3}
For example,
net.IW{1,1}
ans =
   -0.3040    0.4703
   -0.5423   -0.1395
    0.5567    0.0604
    0.2667    0.4924
Training
Define the following cell array of two input vectors (one with two elements, one with five) for two time steps (i.e., two columns).
X = {[0; 0] [2; 0.5]; [2; -2; 1; 0; 1] [-1; -1; 1; 0; 1]};
You want the network to respond with the following target sequences for the second layer, which has three neurons, and for the third layer, which has one neuron:
T = {[1; 1; 1] [0; 0; 0]; 1 -1};
Before training, you can simulate the network to see whether the initial network's response Y is close to the target T.
Y = sim(net,X)
Y =
    [3x1 double]    [3x1 double]
    [    1.7148]    [    2.2726]
The cell array Y is the output sequence of the network, which is also the output sequence of the second and third layers. The values you got for the second row can differ from those shown because of different initial weights and biases. However, they will almost certainly not be equal to the targets T, which is also true of the values shown.
The next task is optional. On some occasions you may wish to alter the training parameters before training. The following line of code displays the default Levenberg-Marquardt training parameters (defined when you set net.trainFcn to trainlm).
net.trainParam
The following properties should be displayed.
ans =

    Show Training Window Feedback    showWindow: true
    Show Command Line Feedback  showCommandLine: false
    Command Line Frequency                 show: 25
    Maximum Epochs                       epochs: 1000
    Maximum Training Time                  time: Inf
    Performance Goal                       goal: 0
    Minimum Gradient                   min_grad: 1e-07
    Maximum Validation Checks          max_fail: 6
    Mu                                       mu: 0.001
    Mu Decrease Ratio                    mu_dec: 0.1
    Mu Increase Ratio                    mu_inc: 10
    Maximum mu                           mu_max: 10000000000
You will not often need to modify these values. See the documentation for the training function for information about what each of these means. They have been initialized with default values that work well for a large range of problems, so there is no need to change them here.
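If you do decide to adjust one, set the corresponding field of net.trainParam. The values below are illustrative choices for the fields shown above, not settings this example requires:

```matlab
% Illustrative sketch: tweak a few trainlm stopping criteria.
net.trainParam.epochs   = 500;    % stop after at most 500 epochs
net.trainParam.goal     = 1e-5;   % stop once performance falls below this goal
net.trainParam.max_fail = 10;     % allow more validation failures before stopping
```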
Next, train the network with the following call:
net = train(net,X,T);
Training launches the neural network training window. To open the performance and training state plots, click the plot buttons.
After training, you can simulate the network to see if it has learned to respond correctly:
Y = sim(net,X)
Y =
    [3x1 double]    [3x1 double]
    [    1.0000]    [   -1.0000]
The second network output (i.e., the second row of the cell array Y), which is also the third layer's output, matches the target sequence T.