
imageInputLayer

Image input layer

Description

An image input layer inputs 2-D images to a neural network and applies data normalization.

For 3-D image input, use image3dInputLayer.

Creation

Description

layer = imageInputLayer(inputSize) returns an image input layer and specifies the InputSize property.

layer = imageInputLayer(inputSize,Name=Value) sets optional properties using one or more name-value arguments.


Input Arguments


inputSize

Size of the input data, specified as a row vector of integers [h w c], where h, w, and c correspond to the height, width, and number of channels, respectively.

  • For grayscale images, specify a vector with c equal to 1.

  • For RGB images, specify a vector with c equal to 3.

  • For multispectral or hyperspectral images, specify a vector with c equal to the number of channels.

For 3-D image or volume input, use image3dInputLayer.
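As a brief illustration of the three cases above, the following sketch creates one layer per image type. The image sizes and the 8-channel count are arbitrary values chosen for illustration, and the variable names are hypothetical.

```matlab
% Grayscale images: c = 1
grayLayer = imageInputLayer([28 28 1]);

% RGB images: c = 3
rgbLayer = imageInputLayer([224 224 3]);

% Multispectral images: c equals the number of channels (8 is an assumed value)
msLayer = imageInputLayer([64 64 8]);

% InputSize holds the [h w c] vector passed at creation
disp(rgbLayer.InputSize)
```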

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: imageInputLayer([28 28 3],Name="input") creates an image input layer with input size [28 28 3] and name 'input'.
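The two calling conventions described above can be compared side by side. This is a minimal sketch; the variable names are hypothetical.

```matlab
% R2021a and later: Name=Value syntax
layerNew = imageInputLayer([28 28 3],Name="input");

% Before R2021a: comma-separated pairs with the name enclosed in quotes
layerOld = imageInputLayer([28 28 3],'Name','input');

% Both calls set the same property value
tf = isequal(layerNew.Name,layerOld.Name)
```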

Normalization

Data normalization to apply every time data is forward propagated through the input layer, specified as one of the following:

  • "zerocenter" — Subtract the mean specified by Mean.

  • "zscore" — Subtract the mean specified by Mean and divide by StandardDeviation.

  • "rescale-symmetric" — Rescale the input to be in the range [-1, 1] using the minimum and maximum values specified by Min and Max, respectively.

  • "rescale-zero-one" — Rescale the input to be in the range [0, 1] using the minimum and maximum values specified by Min and Max, respectively.

  • "none" — Do not normalize the input data.

  • function handle — Normalize the data using the specified function. The function must be of the form Y = f(X), where X is the input data and the output Y is the normalized data.

If the input data is complex-valued and the SplitComplexInputs option is 0 (false), then the Normalization option must be "zerocenter", "zscore", "none", or a function handle. (since R2024a)

Before R2024a: To input complex-valued data into the network, the SplitComplexInputs option must be 1 (true).

Tip

The software, by default, automatically calculates the normalization statistics when you use the trainnet function. To save time when training, specify the required statistics for normalization and set the ResetInputNormalization option in trainingOptions to 0 (false).

The ImageInputLayer object stores the Normalization property as a character vector or a function handle.
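To make the options above concrete, the following sketch reproduces each normalization by hand on a small array. It assumes that the two rescale options use a plain linear mapping between the minimum and maximum values; the layer's internal implementation may differ in detail. The toy data and variable names are hypothetical.

```matlab
X = [0 50; 100 200];              % toy single-channel "image"

mu    = mean(X,"all");            % 87.5
sigma = std(X,0,"all");
minX  = min(X,[],"all");          % 0
maxX  = max(X,[],"all");          % 200

Yzerocenter = X - mu;                            % "zerocenter"
Yzscore     = (X - mu)./sigma;                   % "zscore"
Yzeroone    = (X - minX)./(maxX - minX);         % "rescale-zero-one": [0 0.25; 0.5 1]
Ysymmetric  = 2*(X - minX)./(maxX - minX) - 1;   % "rescale-symmetric": [-1 -0.5; 0 1]

% A function handle of the form Y = f(X) can also be passed as Normalization
layer = imageInputLayer([2 2 1],Normalization=@(X) X./255);
```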

NormalizationDimension

Normalization dimension, specified as one of the following:

  • "auto" – If the ResetInputNormalization training option is 0 (false) and you specify any of the normalization statistics (Mean, StandardDeviation, Min, or Max), then normalize over the dimensions matching the statistics. Otherwise, recalculate the statistics at training time and apply channel-wise normalization.

  • "channel" – Channel-wise normalization.

  • "element" – Element-wise normalization.

  • "all" – Normalize all values using scalar statistics.

The ImageInputLayer object stores the NormalizationDimension property as a character vector.
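One way to picture the three explicit options is through the shape of the statistics each one uses. The sketch below computes a mean over a batch of images in three ways; it assumes the data is stored as an h-by-w-by-c-by-N array, and the random data and variable names are hypothetical.

```matlab
% Stand-in image batch: 28-by-28 pixels, 3 channels, 10 observations
X = rand(28,28,3,10);

% "channel": one statistic per channel, a 1-by-1-by-c array
muChannel = mean(X,[1 2 4]);

% "element": one statistic per pixel and channel, an h-by-w-by-c array
muElement = mean(X,4);

% "all": a single scalar statistic
muAll = mean(X,"all");

size(muChannel)   % 1 1 3
size(muElement)   % 28 28 3
```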

Mean

Mean for zero-center and z-score normalization, specified as an h-by-w-by-c array, a 1-by-1-by-c array of means per channel, a numeric scalar, or [], where h, w, and c correspond to the height, width, and the number of channels of the mean, respectively.

To specify the Mean property, the Normalization property must be "zerocenter" or "zscore". If Mean is [], then the software automatically sets the property at training or initialization time:

  • The trainnet function calculates the mean using the training data and uses the resulting value.

  • The initialize function, and the dlnetwork function when the Initialize option is 1 (true), set the property to 0.

Mean can be complex-valued. (since R2024a) If Mean is complex-valued, then the SplitComplexInputs option must be 0 (false).

Before R2024a: Split the mean into real and imaginary parts and split the input data into real and imaginary parts by setting the SplitComplexInputs option to 1 (true).

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
Complex Number Support: Yes
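The following sketch shows one way to supply a per-channel mean up front and keep it during training, as the tip for the Normalization argument suggests. The mean values are placeholders rather than statistics from any real data set, and the "sgdm" solver is an arbitrary choice.

```matlab
% Per-channel means as a 1-by-1-by-c array (placeholder values)
channelMeans = reshape([0.485 0.456 0.406],1,1,3);

layer = imageInputLayer([224 224 3], ...
    Normalization="zerocenter", ...
    Mean=channelMeans);

% Keep the supplied statistics instead of recalculating them at training time
options = trainingOptions("sgdm",ResetInputNormalization=false);
```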

StandardDeviation

Standard deviation for z-score normalization, specified as an h-by-w-by-c array, a 1-by-1-by-c array of standard deviations per channel, a numeric scalar, or [], where h, w, and c correspond to the height, width, and the number of channels of the standard deviation, respectively.

To specify the StandardDeviation property, the Normalization property must be "zscore". If StandardDeviation is [], then the software automatically sets the property at training or initialization time:

  • The trainnet function calculates the standard deviation using the training data and uses the resulting value.

  • The initialize function, and the dlnetwork function when the Initialize option is 1 (true), set the property to 1.

StandardDeviation can be complex-valued. (since R2024a) If StandardDeviation is complex-valued, then the SplitComplexInputs option must be 0 (false).

Before R2024a: Split the standard deviation into real and imaginary parts and split the input data into real and imaginary parts by setting the SplitComplexInputs option to 1 (true).

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
Complex Number Support: Yes
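As a sketch of z-score normalization with precomputed statistics, the example below derives per-channel values from an in-memory array and passes them to the layer. XTrain is a hypothetical stand-in filled with random data; in practice the statistics would come from the actual training images.

```matlab
% Stand-in training images: h-by-w-by-c-by-N (random data for illustration)
XTrain = rand(32,32,3,100);

% Per-channel statistics, each a 1-by-1-by-3 array
mu    = mean(XTrain,[1 2 4]);
sigma = std(XTrain,0,[1 2 4]);

layer = imageInputLayer([32 32 3], ...
    Normalization="zscore", ...
    Mean=mu, ...
    StandardDeviation=sigma);
```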

Min

Minimum value for rescaling, specified as an h-by-w-by-c array, a 1-by-1-by-c array of minima per channel, a numeric scalar, or [], where h, w, and c correspond to the height, width, and the number of channels of the minima, respectively.

To specify the Min property, the Normalization property must be "rescale-symmetric" or "rescale-zero-one". If Min is [], then the software automatically sets the property at training or initialization time:

  • The trainnet function calculates the minimum value using the training data and uses the resulting value.

  • The initialize function, and the dlnetwork function when the Initialize option is 1 (true), set the property to -1 when Normalization is "rescale-symmetric" and to 0 when Normalization is "rescale-zero-one".

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

Max

Maximum value for rescaling, specified as an h-by-w-by-c array, a 1-by-1-by-c array of maxima per channel, a numeric scalar, or [], where h, w, and c correspond to the height, width, and the number of channels of the maxima, respectively.

To specify the Max property, the Normalization property must be "rescale-symmetric" or "rescale-zero-one". If Max is [], then the software automatically sets the property at training or initialization time:

  • The trainnet function calculates the maximum value using the training data and uses the resulting value.

  • The initialize function, and the dlnetwork function when the Initialize option is 1 (true), set the property to 1.

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
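For data with a known value range, Min and Max can be given as scalars. The sketch below assumes 8-bit image data with values from 0 to 255; the range and the variable names are illustrative assumptions.

```matlab
% Rescale to [0, 1] using scalar limits
layer01 = imageInputLayer([28 28 1], ...
    Normalization="rescale-zero-one",Min=0,Max=255);

% Rescale to [-1, 1] using the same limits
layerSym = imageInputLayer([28 28 1], ...
    Normalization="rescale-symmetric",Min=0,Max=255);
```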

SplitComplexInputs

Flag to split input data into real and imaginary components, specified as one of these values:

  • 0 (false) – Do not split input data.

  • 1 (true) – Split data into real and imaginary components.

When SplitComplexInputs is 1 (true), the layer outputs twice as many channels as the input data. For example, if the input data is complex-valued with numChannels channels, then the layer outputs data with 2*numChannels channels, where channels 1 through numChannels contain the real components of the input data and channels numChannels+1 through 2*numChannels contain the imaginary components of the input data. If the input data is real, then channels numChannels+1 through 2*numChannels are all zero.

If the input data is complex-valued and SplitComplexInputs is 0 (false), then the layer passes the complex-valued data to the next layers. (since R2024a)

Before R2024a: To input complex-valued data into a neural network, the SplitComplexInputs option of the input layer must be 1 (true).

For an example showing how to train a network with complex-valued data, see Train Network with Complex-Valued Data.
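The channel layout described above can be reproduced by hand with a concatenation along the channel dimension. This is only a sketch of the resulting arrangement, not the layer's implementation; the array size and variable names are hypothetical.

```matlab
numChannels = 2;
X = complex(rand(4,4,numChannels),rand(4,4,numChannels));

% Real components in channels 1 through numChannels,
% imaginary components in channels numChannels+1 through 2*numChannels
Y = cat(3,real(X),imag(X));
size(Y)      % 4 4 4

% Request the same split from the input layer
layer = imageInputLayer([4 4 numChannels],SplitComplexInputs=true);
```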

Name

Layer name, specified as a character vector or a string scalar. For Layer array input, the trainnet and dlnetwork functions automatically assign names to layers with the name "".

The ImageInputLayer object stores the Name property as a character vector.

Data Types: char | string

Properties


Image Input

InputSize

This property is read-only.

Size of the input data, specified as a row vector of integers [h w c], where h, w, and c correspond to the height, width, and number of channels, respectively.

  • For grayscale images, specify a vector with c equal to 1.

  • For RGB images, specify a vector with c equal to 3.

  • For multispectral or hyperspectral images, specify a vector with c equal to the number of channels.

For 3-D image or volume input, use image3dInputLayer.

Normalization

This property is read-only.

Data normalization to apply every time data is forward propagated through the input layer, specified as one of the following:

  • "zerocenter" — Subtract the mean specified by Mean.

  • "zscore" — Subtract the mean specified by Mean and divide by StandardDeviation.

  • "rescale-symmetric" — Rescale the input to be in the range [-1, 1] using the minimum and maximum values specified by Min and Max, respectively.

  • "rescale-zero-one" — Rescale the input to be in the range [0, 1] using the minimum and maximum values specified by Min and Max, respectively.

  • "none" — Do not normalize the input data.

  • function handle — Normalize the data using the specified function. The function must be of the form Y = f(X), where X is the input data and the output Y is the normalized data.

If the input data is complex-valued and the SplitComplexInputs option is 0 (false), then the Normalization option must be "zerocenter", "zscore", "none", or a function handle. (since R2024a)

Before R2024a: To input complex-valued data into the network, the SplitComplexInputs option must be 1 (true).

Tip

The software, by default, automatically calculates the normalization statistics when you use the trainnet function. To save time when training, specify the required statistics for normalization and set the ResetInputNormalization option in trainingOptions to 0 (false).

The ImageInputLayer object stores this property as a character vector or a function handle.

NormalizationDimension

Normalization dimension, specified as one of the following:

  • "auto" – If the ResetInputNormalization training option is 0 (false) and you specify any of the normalization statistics (Mean, StandardDeviation, Min, or Max), then normalize over the dimensions matching the statistics. Otherwise, recalculate the statistics at training time and apply channel-wise normalization.

  • "channel" – Channel-wise normalization.

  • "element" – Element-wise normalization.

  • "all" – Normalize all values using scalar statistics.

The ImageInputLayer object stores this property as a character vector.

Mean

Mean for zero-center and z-score normalization, specified as an h-by-w-by-c array, a 1-by-1-by-c array of means per channel, a numeric scalar, or [], where h, w, and c correspond to the height, width, and the number of channels of the mean, respectively.

To specify the Mean property, the Normalization property must be "zerocenter" or "zscore". If Mean is [], then the software automatically sets the property at training or initialization time:

  • The trainnet function calculates the mean using the training data and uses the resulting value.

  • The initialize function, and the dlnetwork function when the Initialize option is 1 (true), set the property to 0.

Mean can be complex-valued. (since R2024a) If Mean is complex-valued, then the SplitComplexInputs option must be 0 (false).

Before R2024a: Split the mean into real and imaginary parts and split the input data into real and imaginary parts by setting the SplitComplexInputs option to 1 (true).

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
Complex Number Support: Yes

StandardDeviation

Standard deviation for z-score normalization, specified as an h-by-w-by-c array, a 1-by-1-by-c array of standard deviations per channel, a numeric scalar, or [], where h, w, and c correspond to the height, width, and the number of channels of the standard deviation, respectively.

To specify the StandardDeviation property, the Normalization property must be "zscore". If StandardDeviation is [], then the software automatically sets the property at training or initialization time:

  • The trainnet function calculates the standard deviation using the training data and uses the resulting value.

  • The initialize function, and the dlnetwork function when the Initialize option is 1 (true), set the property to 1.

StandardDeviation can be complex-valued. (since R2024a) If StandardDeviation is complex-valued, then the SplitComplexInputs option must be 0 (false).

Before R2024a: Split the standard deviation into real and imaginary parts and split the input data into real and imaginary parts by setting the SplitComplexInputs option to 1 (true).

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
Complex Number Support: Yes

Min

Minimum value for rescaling, specified as an h-by-w-by-c array, a 1-by-1-by-c array of minima per channel, a numeric scalar, or [], where h, w, and c correspond to the height, width, and the number of channels of the minima, respectively.

To specify the Min property, the Normalization property must be "rescale-symmetric" or "rescale-zero-one". If Min is [], then the software automatically sets the property at training or initialization time:

  • The trainnet function calculates the minimum value using the training data and uses the resulting value.

  • The initialize function, and the dlnetwork function when the Initialize option is 1 (true), set the property to -1 when Normalization is "rescale-symmetric" and to 0 when Normalization is "rescale-zero-one".

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

Max

Maximum value for rescaling, specified as an h-by-w-by-c array, a 1-by-1-by-c array of maxima per channel, a numeric scalar, or [], where h, w, and c correspond to the height, width, and the number of channels of the maxima, respectively.

To specify the Max property, the Normalization property must be "rescale-symmetric" or "rescale-zero-one". If Max is [], then the software automatically sets the property at training or initialization time:

  • The trainnet function calculates the maximum value using the training data and uses the resulting value.

  • The initialize function, and the dlnetwork function when the Initialize option is 1 (true), set the property to 1.

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

SplitComplexInputs

This property is read-only.

Flag to split input data into real and imaginary components, specified as one of these values:

  • 0 (false) – Do not split input data.

  • 1 (true) – Split data into real and imaginary components.

When SplitComplexInputs is 1 (true), the layer outputs twice as many channels as the input data. For example, if the input data is complex-valued with numChannels channels, then the layer outputs data with 2*numChannels channels, where channels 1 through numChannels contain the real components of the input data and channels numChannels+1 through 2*numChannels contain the imaginary components of the input data. If the input data is real, then channels numChannels+1 through 2*numChannels are all zero.

If the input data is complex-valued and SplitComplexInputs is 0 (false), then the layer passes the complex-valued data to the next layers. (since R2024a)

Before R2024a: To input complex-valued data into a neural network, the SplitComplexInputs option of the input layer must be 1 (true).

For an example showing how to train a network with complex-valued data, see Train Network with Complex-Valued Data.

Layer

Name

Layer name, specified as a character vector or a string scalar. For Layer array input, the trainnet and dlnetwork functions automatically assign names to layers with the name "".

The ImageInputLayer object stores this property as a character vector.

Data Types: char | string

NumInputs

This property is read-only.

Number of inputs to the layer. The layer has no inputs.

Data Types: double

InputNames

This property is read-only.

Input names of the layer. The layer has no inputs.

Data Types: cell

NumOutputs

This property is read-only.

Number of outputs from the layer, returned as 1. This layer has a single output only.

Data Types: double

OutputNames

This property is read-only.

Output names, returned as {'out'}. This layer has a single output only.

Data Types: cell
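The read-only layer properties can be inspected directly on a layer object. The sketch below assumes the property names NumInputs, NumOutputs, and OutputNames, which match the descriptions above; the layer size and name are arbitrary.

```matlab
layer = imageInputLayer([28 28 3],Name="input");

% An input layer has no inputs and a single output
numIn    = layer.NumInputs
numOut   = layer.NumOutputs
outNames = layer.OutputNames

% Name is stored as a character vector even when given as a string
nameClass = class(layer.Name)
```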

Examples


Create an image input layer for 28-by-28 color images.

inputlayer = imageInputLayer([28 28 3])
inputlayer = 
  ImageInputLayer with properties:

                      Name: ''
                 InputSize: [28 28 3]
        SplitComplexInputs: 0

   Hyperparameters
          DataAugmentation: 'none'
             Normalization: 'zerocenter'
    NormalizationDimension: 'auto'
                      Mean: []

Include an image input layer in a Layer array.

layers = [
    imageInputLayer([28 28 1])
    convolution2dLayer(5,20)
    reluLayer
    maxPooling2dLayer(2,Stride=2)
    fullyConnectedLayer(10)
    softmaxLayer]
layers = 
  6x1 Layer array with layers:

     1   ''   Image Input       28x28x1 images with 'zerocenter' normalization
     2   ''   2-D Convolution   20 5x5 convolutions with stride [1  1] and padding [0  0  0  0]
     3   ''   ReLU              ReLU
     4   ''   2-D Max Pooling   2x2 max pooling with stride [2  2] and padding [0  0  0  0]
     5   ''   Fully Connected   10 fully connected layer
     6   ''   Softmax           softmax
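As a possible next step, a layer array like the one above can be converted to a dlnetwork object and evaluated on a batch of images. This sketch uses random input data and a batch size of 4, both chosen only for illustration. With the default Initialize option, the Mean of the input layer is expected to be set to 0 at initialization, as described for the Mean property.

```matlab
layers = [
    imageInputLayer([28 28 1])
    convolution2dLayer(5,20)
    reluLayer
    maxPooling2dLayer(2,Stride=2)
    fullyConnectedLayer(10)
    softmaxLayer];

% Build and initialize the network (untrained, random weights)
net = dlnetwork(layers);

% Mean was [] at creation; inspect the value assigned at initialization
inputMean = net.Layers(1).Mean

% Four random 28-by-28 grayscale images, labeled as
% spatial-spatial-channel-batch data
X = dlarray(rand(28,28,1,4,"single"),"SSCB");

% Forward pass; Y holds 10 class scores for each of the 4 images
Y = predict(net,X);
size(Y)
```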

Algorithms



Extended Capabilities

Version History

Introduced in R2016a
