patchGANDiscriminator

Create PatchGAN discriminator network

Since R2021a

Description

net = patchGANDiscriminator(inputSize) creates a PatchGAN discriminator network for input of size inputSize. For more information about the PatchGAN network architecture, see PatchGAN Discriminator Network.

This function requires Deep Learning Toolbox™.

net = patchGANDiscriminator(inputSize,Name=Value) controls properties of the PatchGAN network using name-value arguments.

You can create a 1-by-1 PatchGAN discriminator network, called a pixel discriminator network, by specifying the NetworkType name-value argument as "pixel". For more information about the pixel discriminator network architecture, see Pixel Discriminator Network.

Examples

Create PatchGAN Discriminator Network

Specify the input size of the network for a color image of size 256-by-256 pixels.

inputSize = [256 256 3];

Create the PatchGAN discriminator network with the specified input size.

net = patchGANDiscriminator(inputSize)
net = 
  dlnetwork with properties:

         Layers: [13x1 nnet.cnn.layer.Layer]
    Connections: [12x2 table]
     Learnables: [16x3 table]
          State: [6x3 table]
     InputNames: {'input_top'}
    OutputNames: {'conv2d_final'}
    Initialized: 1

  View summary with summary.

Display the network.

analyzeNetwork(net)

Create Pixel Discriminator Network

Specify the input size of the network for a color image of size 256-by-256 pixels.

inputSize = [256 256 3];

Create the pixel discriminator network with the specified input size.

net = patchGANDiscriminator(inputSize,"NetworkType","pixel")
net = 
  dlnetwork with properties:

         Layers: [7x1 nnet.cnn.layer.Layer]
    Connections: [6x2 table]
     Learnables: [8x3 table]
          State: [2x3 table]
     InputNames: {'input_top'}
    OutputNames: {'conv2d_final'}
    Initialized: 1

  View summary with summary.

Display the network.

analyzeNetwork(net)

Input Arguments

inputSize

Network input size, specified as a 3-element vector of positive integers. inputSize has the form [H W C], where H is the height, W is the width, and C is the number of channels. If the input to the discriminator is a channel-wise concatenated dlarray (Deep Learning Toolbox) object, then C must be the total number of channels after concatenation.

Example: [28 28 3] specifies an input size of 28-by-28 pixels for a 3-channel image.
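
As a hedged illustration of the concatenated case, the sketch below builds a discriminator for a conditional GAN that receives an input image and a target image stacked along the channel dimension. The variable names and the random placeholder images are assumptions made for this example; only patchGANDiscriminator and the [H W C] convention come from this page.

```matlab
% Placeholder image pair (random data stands in for real images).
imageSize = [256 256 3];
inputImage  = dlarray(rand(imageSize,"single"),"SSCB");
targetImage = dlarray(rand(imageSize,"single"),"SSCB");

% Concatenate along the channel (third) dimension: 3 + 3 = 6 channels.
discriminatorInput = cat(3,inputImage,targetImage);

% C must be the concatenated channel count, not the count of one image.
inputSize = [imageSize(1:2) size(discriminatorInput,3)];
net = patchGANDiscriminator(inputSize);

% Run the concatenated pair through the discriminator.
scores = predict(net,discriminatorInput);
```

Deriving C from the concatenated array, rather than hard-coding it, keeps the network input size in step with the data if the number of stacked images changes.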

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Example: net = patchGANDiscriminator(inputSize,FilterSize=5) creates a discriminator whose convolution layers have a filter of size 5-by-5 pixels.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: net = patchGANDiscriminator(inputSize,"FilterSize",5) creates a discriminator whose convolution layers have a filter of size 5-by-5 pixels.

NetworkType

Type of discriminator network, specified as one of these values.

  • "patch" – Create a PatchGAN discriminator

  • "pixel" – Create a pixel discriminator, which is a 1-by-1 PatchGAN discriminator

Data Types: char | string

NumDownsamplingBlocks

Number of downsampling operations of the network, specified as a positive integer. The discriminator network downsamples the input by a factor of 2^NumDownsamplingBlocks. This argument is ignored when you specify NetworkType as "pixel".
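
The downsampling factor can be worked out before the network is built. The following is a minimal sketch; the value 3 and the variable names are illustrative assumptions. The computed size is nominal, because the stride-1 convolutions in the final block may change the spatial size further depending on the filter size and padding.

```matlab
inputSize = [256 256 3];
numDownsamplingBlocks = 3;   % illustrative value

% Each downsampling operation halves the height and width.
downsamplingFactor = 2^numDownsamplingBlocks;        % 2^3 = 8
nominalSize = inputSize(1:2) / downsamplingFactor;   % [256 256]/8 = [32 32]

net = patchGANDiscriminator(inputSize, ...
    NumDownsamplingBlocks=numDownsamplingBlocks);
```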

NumFilters

Number of filters in the first discriminator block, specified as a positive integer.

FilterSize

Filter size of convolution layers, specified as a positive integer or 2-element vector of positive integers of the form [height width]. When you specify the filter size as a scalar, the filter has equal height and width. Typical filters have height and width between 1 and 4. This argument has an effect only when you specify NetworkType as "patch".

ConvolutionPaddingValue

Style of padding used in the network, specified as one of these values.

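Each entry lists the padding value, its description, and an example. Every example pads the 3-by-3 matrix [3 1 4; 1 5 9; 2 6 5] by two elements on each side.

  • Numeric scalar – Pad with the specified numeric value. Example with a padding value of 2:

    2 2 2 2 2 2 2
    2 2 2 2 2 2 2
    2 2 3 1 4 2 2
    2 2 1 5 9 2 2
    2 2 2 6 5 2 2
    2 2 2 2 2 2 2
    2 2 2 2 2 2 2

  • "symmetric-include-edge" – Pad using mirrored values of the input, including the edge values. Example:

    5 1 1 5 9 9 5
    1 3 3 1 4 4 1
    1 3 3 1 4 4 1
    5 1 1 5 9 9 5
    6 2 2 6 5 5 6
    6 2 2 6 5 5 6
    5 1 1 5 9 9 5

  • "symmetric-exclude-edge" – Pad using mirrored values of the input, excluding the edge values. Example:

    5 6 2 6 5 6 2
    9 5 1 5 9 5 1
    4 1 3 1 4 1 3
    9 5 1 5 9 5 1
    5 6 2 6 5 6 2
    9 5 1 5 9 5 1
    4 1 3 1 4 1 3

  • "replicate" – Pad using repeated border elements of the input. Example:

    3 3 3 1 4 4 4
    3 3 3 1 4 4 4
    3 3 3 1 4 4 4
    1 1 1 5 9 9 9
    2 2 2 6 5 5 5
    2 2 2 6 5 5 5
    2 2 2 6 5 5 5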
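
The four padding styles can be reproduced with plain index arithmetic, which may help when checking what a convolution layer sees at the image border. This is a sketch for illustration only, using base MATLAB indexing on a square matrix. It is not how the toolbox implements padding, and all variable names are assumptions.

```matlab
A = [3 1 4; 1 5 9; 2 6 5];
p = 2;             % number of padded elements on each side
n = size(A,1);     % A is assumed to be square

% Numeric scalar: fill the border with a constant value.
padValue = 2;
constantPad = padValue*ones(n + 2*p);
constantPad(p+1:p+n,p+1:p+n) = A;

% "symmetric-include-edge": mirror about the border, repeating the edge.
idxInclude = [p:-1:1, 1:n, n:-1:n-p+1];     % [2 1 1 2 3 3 2]
includeEdge = A(idxInclude,idxInclude);

% "symmetric-exclude-edge": mirror about the border, skipping the edge.
idxExclude = [p+1:-1:2, 1:n, n-1:-1:n-p];   % [3 2 1 2 3 2 1]
excludeEdge = A(idxExclude,idxExclude);

% "replicate": clamp out-of-range indices to the nearest valid index.
idxReplicate = min(max((1-p):(n+p),1),n);   % [1 1 1 2 3 3 3]
replicated = A(idxReplicate,idxReplicate);
```

Each result is a 7-by-7 matrix that matches the corresponding example in the list of padding values.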

ConvolutionWeightsInitializer

Weight initialization used in convolution layers, specified as "glorot", "he", "narrow-normal", or a function handle. For more information, see Specify Custom Weight Initialization Function (Deep Learning Toolbox).

ActivationLayer

Activation function to use in the network, specified as one of these values. For more information and a list of available layers, see Activation Layers (Deep Learning Toolbox).

  • "relu" — Use a reluLayer (Deep Learning Toolbox)

  • "leakyRelu" — Use a leakyReluLayer (Deep Learning Toolbox) with a scale factor of 0.2

  • "elu" — Use an eluLayer (Deep Learning Toolbox)

  • A layer object

FinalActivationLayer

Activation function after the final convolution layer, specified as one of these values. For more information and a list of available layers, see Activation Layers (Deep Learning Toolbox).

  • "tanh" — Use a tanhLayer (Deep Learning Toolbox)

  • "sigmoid" — Use a sigmoidLayer (Deep Learning Toolbox)

  • "softmax" — Use a softmaxLayer (Deep Learning Toolbox)

  • "none" — Do not use a final activation layer

  • A layer object
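
The two activation arguments accept either an option name or a layer object. The sketch below shows both forms. The input size and the leaky ReLU scale of 0.1 are illustrative assumptions, not recommended settings.

```matlab
inputSize = [256 256 3];

% Option names for both activation arguments.
netA = patchGANDiscriminator(inputSize, ...
    ActivationLayer="relu", ...
    FinalActivationLayer="sigmoid");

% A layer object in place of an option name; the 0.1 scale is illustrative.
netB = patchGANDiscriminator(inputSize, ...
    ActivationLayer=leakyReluLayer(0.1), ...
    FinalActivationLayer="none");
```

Passing a layer object is one way to get a leaky ReLU scale other than the 0.2 used by the "leakyRelu" option.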

NormalizationLayer

Normalization operation to use after each convolution, specified as one of these values. For more information and a list of available layers, see Normalization Layers (Deep Learning Toolbox).

  • "none" – Do not use a normalization layer

  • "batch" – Use a batchNormalizationLayer (Deep Learning Toolbox)

  • "instance" – Use an instanceNormalizationLayer (Deep Learning Toolbox)

  • A layer object

NamePrefix

Prefix to all layer names in the network, specified as a string or character vector.

Data Types: char | string

Output Arguments

PatchGAN discriminator network, returned as a dlnetwork (Deep Learning Toolbox) object.

More About

PatchGAN Discriminator Network

A PatchGAN discriminator network consists of an encoder module that downsamples the input by a factor of 2^NumDownsamplingBlocks. The default network follows the architecture proposed by Zhu et al. [2].

The encoder module consists of an initial block of layers that performs one downsampling operation, NumDownsamplingBlocks–1 downsampling blocks, and a final block.

The table describes the blocks of layers that comprise the encoder module.

Block type: Initial block

Layers:

  • An imageInputLayer (Deep Learning Toolbox)

  • A convolution2dLayer (Deep Learning Toolbox) with a stride of [2 2] that performs downsampling

  • An activation layer specified by the ActivationLayer name-value argument

Diagram of default block: image input layer, 2-D convolution layer, leaky ReLU layer

Block type: Downsampling block

Layers:

  • A convolution2dLayer (Deep Learning Toolbox) with a stride of [2 2] that performs downsampling

  • An optional normalization layer, specified by the NormalizationLayer name-value argument

  • An activation layer specified by the ActivationLayer name-value argument

Diagram of default block: 2-D convolution layer, batch normalization layer, leaky ReLU layer

Block type: Final block

Layers:

  • A convolution2dLayer (Deep Learning Toolbox) with a stride of [1 1]

  • An optional normalization layer, specified by the NormalizationLayer name-value argument

  • An activation layer specified by the ActivationLayer name-value argument

  • A second convolution2dLayer (Deep Learning Toolbox) with a stride of [1 1] and 1 output channel

  • An optional activation layer specified by the FinalActivationLayer name-value argument

Diagram of default block: 2-D convolution layer, batch normalization layer, leaky ReLU layer, 2-D convolution layer
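
The block structure implies a simple layer count, which can serve as a sanity check on a created network. The sketch below assumes a normalization layer in every downsampling and final block and no final activation layer, as in the default-block diagrams. The value of 3 for numBlocks is an assumption chosen because it is consistent with the 13-layer network shown in the first example.

```matlab
numBlocks = 3;   % assumed; consistent with the 13-layer default network

initialLayers      = 3;                  % input, convolution, activation
downsamplingLayers = 3*(numBlocks - 1);  % convolution, normalization, activation
finalLayers        = 4;                  % convolution, normalization, activation, convolution
expectedLayers = initialLayers + downsamplingLayers + finalLayers;   % 3 + 6 + 4 = 13

% Compare with the network that the function actually creates.
net = patchGANDiscriminator([256 256 3],NumDownsamplingBlocks=numBlocks);
actualLayers = numel(net.Layers);
```

Removing the normalization layers or adding a final activation layer changes the count accordingly.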

Pixel Discriminator Network

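A pixel discriminator network consists of an initial block and a final block. This network does not perform downsampling, so the output has the same height H and width W as the input. Because the final convolution layer has 1 output channel, the output has a single channel. The default network follows the architecture proposed by Zhu et al. [2].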

The table describes the blocks of layers that comprise the network.

Block type: Initial block

Layers:

  • An imageInputLayer (Deep Learning Toolbox)

  • A convolution2dLayer (Deep Learning Toolbox) with a stride of [1 1]

  • An activation layer specified by the ActivationLayer name-value argument

Diagram of default block: image input layer, 2-D convolution layer, leaky ReLU layer

Block type: Final block

Layers:

  • A convolution2dLayer (Deep Learning Toolbox) with a stride of [1 1]

  • An optional normalization layer, specified by the NormalizationLayer name-value argument

  • An activation layer specified by the ActivationLayer name-value argument

  • A second convolution2dLayer (Deep Learning Toolbox) with a stride of [1 1] and 1 output channel

  • An optional activation layer specified by the FinalActivationLayer name-value argument

Diagram of default block: 2-D convolution layer, batch normalization layer, leaky ReLU layer, 2-D convolution layer
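
Because every convolution in the pixel discriminator uses a stride of [1 1], the output is expected to keep the height and width of the input. The sketch below checks this with a forward pass rather than assuming it. The small input size, the random placeholder data, and the variable names are assumptions made for this example.

```matlab
inputSize = [64 64 3];   % small illustrative size
net = patchGANDiscriminator(inputSize,NetworkType="pixel");

% Random placeholder image, labeled spatial-spatial-channel-batch.
X = dlarray(rand(inputSize,"single"),"SSCB");
Y = predict(net,X);

% Height and width of the per-pixel discriminator output.
outputSize = [size(Y,1) size(Y,2)]
```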

References

[1] Isola, Phillip, Jun-Yan Zhu, Tinghui Zhou, and Alexei A. Efros. "Image-to-Image Translation with Conditional Adversarial Networks." In 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 5967–76. Honolulu, HI: IEEE, 2017. https://arxiv.org/abs/1611.07004.

[2] Zhu, Jun-Yan, Taesung Park, and Tongzhou Wang. "CycleGAN and pix2pix in PyTorch." https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix.

Version History

Introduced in R2021a