resnet3dNetwork
Syntax
net = resnet3dNetwork(inputSize,numClasses)
net = resnet3dNetwork(inputSize,numClasses,Name=Value)
Description
net = resnet3dNetwork(inputSize,numClasses) creates a 3-D residual neural network with the specified image input size and number of classes.
To create a 2-D residual network, use resnetNetwork.
net = resnet3dNetwork(inputSize,numClasses,Name=Value) specifies additional options using one or more name-value arguments. For example, BottleneckType="none" returns a 3-D residual neural network without bottleneck components.
Examples
Create 3-D Residual Network
Create a 3-D residual network with a bottleneck architecture.
imageSize = [224 224 224 3];
numClasses = 10;
net = resnet3dNetwork(imageSize,numClasses)

net =
  dlnetwork with properties:

         Layers: [176x1 nnet.cnn.layer.Layer]
    Connections: [191x2 table]
     Learnables: [214x3 table]
          State: [106x3 table]
     InputNames: {'input'}
    OutputNames: {'softmax'}
    Initialized: 1

  View summary with summary.

Analyze the network using the analyzeNetwork function.
analyzeNetwork(net)
3-D Residual Network with Custom Stack Depth
Create a 3-D ResNet-101 network using a custom stack depth.
imageSize = [224 224 64 3];
numClasses = 10;
stackDepth = [3 4 23 3];
numFilters = [64 128 256 512];
net = resnet3dNetwork(imageSize,numClasses, ...
    StackDepth=stackDepth, ...
    NumFilters=numFilters)

net =
  dlnetwork with properties:

         Layers: [346x1 nnet.cnn.layer.Layer]
    Connections: [378x2 table]
     Learnables: [418x3 table]
          State: [208x3 table]
     InputNames: {'input'}
    OutputNames: {'softmax'}
    Initialized: 1

  View summary with summary.
Analyze the network.
analyzeNetwork(net)
Input Arguments
inputSize
— Network image input size
vector of positive integers
Network image input size, specified as one of these values:
Vector of positive integers of the form [h w d] — Input has a height, width, and depth of h, w, and d, respectively.
Vector of positive integers of the form [h w d c] — Input has a height, width, depth, and number of channels of h, w, d, and c, respectively. For RGB images, c is 3, and for grayscale images, c is 1.
The values of inputSize depend on the InitialPoolingLayer argument:
If InitialPoolingLayer is "max" or "average", then the spatial dimension sizes must be greater than or equal to k*2^(D+1), where k is the value of InitialStride in the first convolutional layer in the corresponding direction, and D is the number of downsampling blocks.
If InitialPoolingLayer is "none", then the spatial dimension sizes must be greater than or equal to k*2^D, where k is the value of InitialStride in the first convolutional layer in the corresponding direction, and D is the number of downsampling blocks.
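The size constraint above can be checked by hand before calling the function. The following is a minimal sketch, not toolbox code: the variable names are hypothetical, and it assumes that D equals the number of stacks minus one, because the first stack starts with an initial residual block rather than a downsampling block (see Residual Network).

```matlab
% Hypothetical helper variables; these are not toolbox names.
stackDepth = [3 4 6 3];          % default StackDepth
initialStride = [2 2 2];         % default InitialStride, expanded to [h w d]
initialPoolingLayer = "max";     % default InitialPoolingLayer

% Assumption: every stack after the first begins with one downsampling block.
D = numel(stackDepth) - 1;

if strcmp(initialPoolingLayer, "none")
    minSize = initialStride .* 2^D;        % k*2^D
else
    minSize = initialStride .* 2^(D+1);    % k*2^(D+1)
end

% Smallest height, width, and depth under the stated assumption.
disp(minSize)
```

Under these assumptions, the default options require each spatial dimension to be at least 32.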
Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
numClasses
— Number of classes for classification tasks
positive integer
Number of classes for classification tasks, specified as a positive integer.
The function returns a neural network for classification tasks with the specified number of classes by setting the output size of the last fully connected layer to numClasses.
Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
Name-Value Arguments
Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.
Example: net = resnet3dNetwork(inputSize,numClasses,BottleneckType="none") returns a 3-D residual neural network without bottleneck components.
InitialFilterSize
— Filter size in first convolutional layer
7 (default) | positive integer | vector of positive integers
Filter size in the first convolutional layer, specified as one of these values:
Positive integer — First convolutional layer has filters with a height, width, and depth of the specified value.
Vector of positive integers of the form [h w d] — First convolutional layer has filters with a height, width, and depth of h, w, and d, respectively.
Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
InitialNumFilters
— Number of filters in first convolutional layer
64 (default) | positive integer
Number of filters in the first convolutional layer, specified as a positive integer. The number of initial filters determines the number of channels (feature maps) in the output of the first convolutional layer in the residual network.
Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
InitialStride
— Stride in first convolutional layer
2 (default) | positive integer | vector of positive integers
Stride in the first convolutional layer, specified as one of these values:
Positive integer — First convolutional layer has a stride of the specified value in the horizontal, vertical, and depth directions.
Vector of positive integers of the form [h w d] — First convolutional layer has a stride of h, w, and d in the horizontal, vertical, and depth directions, respectively.
The stride defines the step size for traversing the input in the horizontal, vertical, and depth directions.
Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
InitialPoolingLayer
— First pooling layer
"max" (default) | "average" | "none"
First pooling layer before the initial residual block, specified as one of these values:
"max" — Use a max pooling layer before the initial residual block. For more information, see maxPooling3dLayer.
"average" — Use an average pooling layer before the initial residual block. For more information, see averagePooling3dLayer.
"none" — Do not use a pooling layer before the initial residual block.
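As an illustration of the "none" option, the sketch below builds a network for a small single-channel volume. The input size and class count are made-up values chosen only for the example. It requires Deep Learning Toolbox.

```matlab
% Illustrative sizes only: a small single-channel volume.
inputSize = [32 32 32 1];
numClasses = 5;

% Omit the pooling layer before the initial residual block
% to reduce the amount of downsampling.
net = resnet3dNetwork(inputSize,numClasses, ...
    InitialPoolingLayer="none");
```

Removing the pooling layer lowers the minimum spatial size that the network accepts, as described for the inputSize argument.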
ResidualBlockType
— Residual block type
"batchnorm-before-add" (default) | "batchnorm-after-add"
Residual block type, specified as "batchnorm-before-add" or "batchnorm-after-add".
The ResidualBlockType argument specifies the location of the batch normalization layer in the standard and downsampling residual blocks: either before or after the addition layer. For more information, see Residual Network.
BottleneckType
— Block bottleneck type
"downsample-first-conv" (default) | "none"
Block bottleneck type, specified as one of these values:
"downsample-first-conv" — Use bottleneck residual blocks that perform downsampling, using a stride of 2, in the first convolutional layer of the downsampling residual blocks. A bottleneck residual block consists of three layers: a convolutional layer with filters of size 1 for downsampling the channel dimension, a convolutional layer with filters of size 3, and a convolutional layer with filters of size 1 for upsampling the channel dimension. The number of filters in the final convolutional layer is four times that in the first two convolutional layers.
"none" — Do not use bottleneck residual blocks. The residual blocks consist of two convolutional layers with filters of size 3.
A bottleneck block reduces the number of channels by a factor of four by performing a convolution with filters of size 1 before performing convolution with filters of size 3. Networks with and without bottleneck blocks have a similar level of computational complexity, but the total number of features propagating in the residual connections is four times larger when you use bottleneck units. Therefore, using a bottleneck increases the efficiency of the network [1].
For more information on the layers in each residual block, see Residual Network.
StackDepth
— Number of residual blocks in each stack
[3 4 6 3] (default) | vector of positive integers
Number of residual blocks in each stack, specified as a vector of positive integers.
For example, if the stack depth is [3 4 6 3], the network has four stacks, with three blocks, four blocks, six blocks, and three blocks.
Specify the number of filters in the convolutional layers of each stack using the NumFilters argument. StackDepth must have the same number of elements as NumFilters.
Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
NumFilters
— Number of filters in convolutional layers of each stack
[64 128 256 512] (default) | vector of positive integers
Number of filters in the convolutional layers of each stack, specified as a vector of positive integers.
If BottleneckType is "downsample-first-conv", then the number of filters in each of the first two convolutional layers in each block of each stack is NumFilters. The final convolutional layer has four times the number of filters in each of the first two convolutional layers.
For example, if NumFilters is [4 5] and BottleneckType is "downsample-first-conv", then in the first stack, the first two convolutional layers in each block have 4 filters and the final convolutional layer in each block has 16 filters. In the second stack, the first two convolutional layers in each block have 5 filters and the final convolutional layer has 20 filters.
If BottleneckType is "none", then the number of filters in each convolutional layer in each stack is NumFilters.
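The filter counts in the example above can be tabulated with a few lines of arithmetic. This is only a sketch of the rule stated in the text; filtersPerLayer and the other variable names are hypothetical and are not part of the toolbox.

```matlab
numFilters = [4 5];                          % example values from the text
bottleneckType = "downsample-first-conv";

% Rows are stacks; columns are the convolutional layers of one block.
if strcmp(bottleneckType, "downsample-first-conv")
    % Two layers with NumFilters filters, then one with four times as many.
    filtersPerLayer = [numFilters(:) numFilters(:) 4*numFilters(:)];
else
    % Without a bottleneck, both layers use NumFilters filters.
    filtersPerLayer = [numFilters(:) numFilters(:)];
end

disp(filtersPerLayer)
```

For the bottleneck case the rows are [4 4 16] and [5 5 20], matching the counts given in the text.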
NumFilters must have the same number of elements as StackDepth.
The NumFilters value determines the layers on the residual connection in the initial residual block. The residual connection has a convolutional layer when you meet one of these conditions:
BottleneckType is "downsample-first-conv", and InitialNumFilters is not equal to four times the first element of NumFilters.
BottleneckType is "none", and InitialNumFilters is not equal to the first element of NumFilters.
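The two conditions reduce to one comparison between InitialNumFilters and the number of channels that the first stack outputs. The sketch below expresses that check with the default option values. The variable names are hypothetical, and the code only mirrors the rule stated above; it does not inspect an actual network.

```matlab
initialNumFilters = 64;                  % default InitialNumFilters
numFilters = [64 128 256 512];           % default NumFilters
bottleneckType = "downsample-first-conv";

% Channel count on the main branch at the end of the initial residual block.
if strcmp(bottleneckType, "downsample-first-conv")
    blockOutputChannels = 4 * numFilters(1);
else
    blockOutputChannels = numFilters(1);
end

% A convolutional layer is needed on the residual connection when
% the channel counts differ.
hasConvOnResidual = initialNumFilters ~= blockOutputChannels;
disp(hasConvOnResidual)
```

With the defaults, 64 differs from 4*64 = 256, so the rule predicts a convolutional layer on the residual connection of the initial block.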
For more information about the layers in each residual block, see Residual Network.
Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
Normalization
— Data normalization to apply
"zerocenter" (default) | "zscore"
Data normalization to apply every time data forward-propagates through the input layer, specified as one of these options:
"zerocenter" — Subtract the mean of the training data.
"zscore" — Subtract the mean and then divide by the standard deviation of the training data.
The trainnet function automatically calculates the mean and standard deviation of the training data.
Initialize
— Flag to initialize learnable parameters
true or 1 (default) | false or 0
Flag to initialize learnable parameters, specified as a logical 1 (true) or 0 (false).
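A possible use of this flag is to build the architecture first and initialize it later. The sketch below assumes the dlnetwork initialize function and the Initialized property shown in the example output; the input size and class count are illustrative values. It requires Deep Learning Toolbox.

```matlab
% Illustrative sizes only.
inputSize = [64 64 64 1];
numClasses = 2;

% Create the network without initializing its learnable parameters.
net = resnet3dNetwork(inputSize,numClasses,Initialize=false);

% Initialize the learnable parameters and state in a separate step.
net = initialize(net);
```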
Output Arguments
net
— Residual neural network
dlnetwork object
Residual neural network, returned as a dlnetwork object.
More About
Residual Network
Residual networks (ResNets) are a type of deep network that consists of building blocks that have residual connections (also known as skip or shortcut connections). These connections allow the input to skip the convolutional units of the main branch, thus providing a simpler path through the network. By allowing the parameter gradients to flow more easily from the final layers to the earlier layers of the network, residual connections mitigate the problem of vanishing gradients during early training.
The structure of a residual network is flexible. The key component is the inclusion of the residual connections within residual blocks. A group of residual blocks is called a stack. A ResNet architecture consists of initial layers, followed by stacks containing residual blocks, and then the final layers. A network has three types of residual blocks:
Initial residual block — This block occurs at the start of the first stack. The layers in the residual connection of the initial residual block determine if the block preserves the activation sizes or performs downsampling.
Standard residual block — This block occurs multiple times in each stack, after the first downsampling residual block. The standard residual block preserves the activation sizes.
Downsampling residual block — This block occurs once, at the start of each stack. The first convolutional unit in the downsampling block downsamples the spatial dimensions by a factor of two.
A typical stack has a downsampling residual block, followed by m standard residual blocks, where m is a positive integer. The first stack is the only stack that begins with an initial residual block.
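The stack layout described above can be listed for a given StackDepth value. This is a sketch under one assumption: each element of StackDepth counts the first block of the stack (initial or downsampling) together with the standard blocks that follow it. The variable names are hypothetical.

```matlab
stackDepth = [3 4 6 3];   % default StackDepth

% Assumption: the first block of each stack counts toward its depth.
numStandard = stackDepth - 1;

for i = 1:numel(stackDepth)
    if i == 1
        firstBlock = 'initial';        % only the first stack starts this way
    else
        firstBlock = 'downsampling';
    end
    fprintf('Stack %d: 1 %s block + %d standard blocks\n', ...
        i, firstBlock, numStandard(i));
end
```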
The initial, standard, and downsampling residual blocks can be bottleneck or nonbottleneck blocks.
A bottleneck block reduces the number of channels by a factor of four by performing a convolution with filters of size 1 before performing convolution with filters of size 3. Networks with and without bottleneck blocks have a similar level of computational complexity, but the total number of features propagating in the residual connections is four times larger when you use bottleneck units. Therefore, using a bottleneck increases the efficiency of the network [1].
The options you set determine the layers inside each block.
Block Layers
Name | Description | Example Visualization
Initial Layers | The initial layers of a residual network include the image input layer, the first convolutional layer, and an optional pooling layer (see InitialPoolingLayer). |
Initial Residual Block | The main branch of the initial residual block has the same layers as a standard residual block. The layers on the residual connection depend on the NumFilters, InitialNumFilters, and BottleneckType values (see NumFilters). | Example of an initial residual block for a network without a bottleneck and with the batch normalization layer before the addition layer.
Standard Residual Block (BottleneckType="downsample-first-conv") | The standard residual block with bottleneck units is built around three convolutional layers, in order: filters of size 1, filters of size 3, and filters of size 1. The standard block has a residual connection from the output of the previous block to the addition layer. Set the position of the addition layer using the ResidualBlockType argument. | Example of the standard residual block for a network with a bottleneck and with the batch normalization layer before the addition layer.
Standard Residual Block (BottleneckType="none") | The standard residual block without bottleneck units is built around two convolutional layers with filters of size 3. The standard block has a residual connection from the output of the previous block to the addition layer. Set the position of the addition layer using the ResidualBlockType argument. | Example of the standard residual block for a network without a bottleneck and with the batch normalization layer before the addition layer.
Downsampling Residual Block | The downsampling residual block is the same as the standard block (either with or without the bottleneck) but with a stride of size 2 in the first convolutional unit. The layers on the residual connection depend on the value of ResidualBlockType. The downsampling block halves the height, width, and depth of the input, and increases the number of channels. | Example of a downsampling residual block for a network without a bottleneck and with the batch normalization layer before the addition layer.
Final Layers | The final layers of a residual network include a fully connected layer with an output size of numClasses and a softmax layer. |
The convolution and fully connected layer weights are initialized using the He weight initialization method [3].
Tips
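When working with small images, set the InitialPoolingLayer option to "none" to remove the initial pooling layer and reduce the amount of downsampling.
Residual networks are usually named ResNet-X, where X is the depth of the network. The depth of a network is defined as the largest number of sequential convolutional or fully connected layers on a path from the network input to the network output. Counting the first convolutional layer, the convolutional layers in each residual block (two without a bottleneck, three with a bottleneck), and the final fully connected layer, you can use this formula to compute the depth of your network:
$$\text{depth} = \begin{cases} 1 + 2\sum_{i=1}^{N} s_i + 1, & \text{if the network has no bottleneck} \\ 1 + 3\sum_{i=1}^{N} s_i + 1, & \text{if the network has a bottleneck} \end{cases}$$
where $s_i$ is the depth of stack $i$ and $N$ is the number of stacks. For example, a bottleneck network with a StackDepth of [3 4 23 3] has a depth of 1 + 3*33 + 1 = 101, which corresponds to ResNet-101.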
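The depth count can be sketched directly from the definition above. The code assumes one initial convolutional layer, three convolutional layers per bottleneck block (two per block without a bottleneck), and one final fully connected layer. The variable names are hypothetical.

```matlab
stackDepth = [3 4 23 3];                 % stack depth from the ResNet-101 example
bottleneckType = "downsample-first-conv";

% Sequential convolutional layers on the main branch of each block.
if strcmp(bottleneckType, "downsample-first-conv")
    convPerBlock = 3;
else
    convPerBlock = 2;
end

% First convolutional layer + residual blocks + fully connected layer.
depth = 1 + convPerBlock*sum(stackDepth) + 1;
fprintf('ResNet-%d\n', depth);
```

With this stack depth the count is 101, consistent with the ResNet-101 example. The default StackDepth of [3 4 6 3] with bottleneck blocks gives 50 under the same assumptions.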
References
[1] He, Kaiming, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. “Deep Residual Learning for Image Recognition.” Preprint, submitted December 10, 2015. https://arxiv.org/abs/1512.03385.
[2] He, Kaiming, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. “Identity Mappings in Deep Residual Networks.” Preprint, submitted July 25, 2016. https://arxiv.org/abs/1603.05027.
[3] He, Kaiming, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. "Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification." In Proceedings of the 2015 IEEE International Conference on Computer Vision, 1026–34. Washington, DC: IEEE Computer Vision Society, 2015.
Version History
Introduced in R2024a