How to change GAN example to generate images with a larger size?

How can I change the original GAN example (https://www.mathworks.com/help/deeplearning/ug/train-generative-adversarial-network.html) to generate larger images, e.g., 128x128?
The example works with 64x64 color images and thus produces low-resolution images. I guess this size was chosen to shorten the training time.
The images are resized and augmented before they are used for training:
augimds = augmentedImageDatastore([64 64],imds,'DataAugmentation',augmenter);
The generator contains 4 transposed convolutional layers and the discriminator contains 5 convolutional layers. Both the generator and the discriminator use 64 as the base number of filters.
I edited the code by changing the image size of augimds, adding an additional transposed convolutional layer (with modified parameters) to the generator, and modifying the discriminator accordingly. The code is shown below:
% The generator
filterSize = 5;
numFilters = 128;
numLatentInputs = 100;
projectionSize = [4 4 512];
layersGenerator = [
imageInputLayer([1 1 numLatentInputs],'Normalization','none','Name','in')
projectAndReshapeLayer(projectionSize,numLatentInputs,'proj');
transposedConv2dLayer(filterSize,8*numFilters,'Name','tconv1')
batchNormalizationLayer('Name','bnorm1')
reluLayer('Name','relu1')
transposedConv2dLayer(filterSize,4*numFilters,'Stride',1,'Cropping','same','Name','tconv2')
batchNormalizationLayer('Name','bnorm2')
reluLayer('Name','relu2')
transposedConv2dLayer(filterSize,2*numFilters,'Stride',2,'Cropping','same','Name','tconv3')
batchNormalizationLayer('Name','bnorm3')
reluLayer('Name','relu3')
transposedConv2dLayer(filterSize,numFilters,'Stride',2,'Cropping','same','Name','tconv4')
batchNormalizationLayer('Name','bnorm4')
reluLayer('Name','relu4')
transposedConv2dLayer(filterSize,3,'Stride',2,'Cropping','same','Name','tconv5')
tanhLayer('Name','tanh')];
% The discriminator
dropoutProb = 0.5;
numFilters = 128;
scale = 0.2;
inputSize = [128 128 3];
filterSize = 5;
layersDiscriminator = [
imageInputLayer(inputSize,'Normalization','none','Name','in')
dropoutLayer(0.5,'Name','dropout')
convolution2dLayer(filterSize,numFilters,'Stride',2,'Padding','same','Name','conv1')
leakyReluLayer(scale,'Name','lrelu1')
convolution2dLayer(filterSize,2*numFilters,'Stride',2,'Padding','same','Name','conv2')
batchNormalizationLayer('Name','bn2')
leakyReluLayer(scale,'Name','lrelu2')
convolution2dLayer(filterSize,4*numFilters,'Stride',2,'Padding','same','Name','conv3')
batchNormalizationLayer('Name','bn3')
leakyReluLayer(scale,'Name','lrelu3')
convolution2dLayer(filterSize,8*numFilters,'Stride',2,'Padding','same','Name','conv4')
batchNormalizationLayer('Name','bn4')
leakyReluLayer(scale,'Name','lrelu4')
convolution2dLayer(4,1,'Name','conv5')];
lgraphDiscriminator = layerGraph(layersDiscriminator);
But the modifications produced an error in this line of code:
[gradientsGenerator, gradientsDiscriminator, stateGenerator, scoreGenerator, scoreDiscriminator] = ...
dlfeval(@modelGradients, dlnetGenerator, dlnetDiscriminator, dlX, dlZ, flipFactor);
Specifically, in the dlfeval call:
Error using dlfeval (line 43)
Value to differentiate must be a traced dlarray scalar.
I am trying to figure out the relationship between the image size and the other parameters (number of filters, number of layers, ...) so that I can modify them to generate images with sizes other than 64x64.
Thanks
1 comment
Anthony Herdman 2020-9-23
Tarunbir indicated that you need to change the stride in 'tconv2' to 2, but I believe you also need to change a convolution layer in the discriminator.
The following solved it for me (see the 'conv3' line below). For the discriminator, I changed the 'Stride' from 2 to 4 on one of the convolution layers (in this case 'conv3') to get activations of 1x1x1 in conv5. If you don't do this, the activations in conv5 will be 5x5x1, which leads to 5x5x1xM predictions when calling the function modelGradients (see the line "dlYPred = forward(dlnetDiscriminator, dlX);").
Hope this works for you.
layersDiscriminator = [
imageInputLayer(inputSize,'Normalization','none','Name','in')
dropoutLayer(0.5,'Name','dropout')
convolution2dLayer(filterSize,numFilters,'Stride',2,'Padding','same','Name','conv1')
leakyReluLayer(scale,'Name','lrelu1')
convolution2dLayer(filterSize,2*numFilters,'Stride',2,'Padding','same','Name','conv2')
batchNormalizationLayer('Name','bn2')
leakyReluLayer(scale,'Name','lrelu2')
convolution2dLayer(filterSize,4*numFilters,'Stride',4,'Padding','same','Name','conv3')
batchNormalizationLayer('Name','bn3')
leakyReluLayer(scale,'Name','lrelu3')
convolution2dLayer(filterSize,8*numFilters,'Stride',2,'Padding','same','Name','conv4')
batchNormalizationLayer('Name','bn4')
leakyReluLayer(scale,'Name','lrelu4')
convolution2dLayer(4,1,'Name','conv5')];
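As a rough illustration of why 5x5x1xM predictions end in the "traced dlarray scalar" error: the sketch below assumes the example's loss is computed with a plain mean(...) over the discriminator probabilities (an assumption about modelGradients, which is not shown in this thread) and uses random numbers in place of real network outputs. All variable names are hypothetical.

```matlab
% Sketch: how the shape of the discriminator output affects the loss shape.
M = 8;                          % hypothetical mini-batch size
probBad  = rand(5, 5, 1, M);    % stand-in for probabilities of size 5x5x1xM
probGood = rand(1, 1, 1, M);    % stand-in for one probability per image

% mean without a dimension argument reduces only the first non-singleton dimension.
lossBad  = -mean(log(probBad));   % reduces dim 1 only -> size 1x5x1xM
lossGood = -mean(log(probGood));  % first non-singleton dim is 4 -> size 1x1

disp(size(lossBad));       % 1 5 1 8
disp(isscalar(lossBad));   % 0
disp(isscalar(lossGood));  % 1
```

With one probability per image the loss collapses to a scalar, which is what the gradient computation inside dlfeval expects.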


Answers (5)

Tarunbir Gambhir 2020-9-3
A generative adversarial network (GAN) consists of a generator network and a discriminator network that train together to generate data with the characteristics of the real data. If the size of the real data changes, both networks need to be altered to accommodate the change.
The output of the generator network needs to be the same size as the real images. In your case, with images of size 128 x 128 x 3, the layer "tconv2" of the generator network should have the following specification:
transposedConv2dLayer(filterSize,4*numFilters,'Stride',2,'Cropping','same','Name','tconv2')
Explanation: After the "proj" layer, the data has the shape 4 x 4 x 512. After the "tconv1" layer, the data has the shape 8 x 8 x 1024. You can look up how to calculate the output size of a transposed convolution layer. For the remaining 4 layers, the output size needs to double at every layer in order to reach the output shape of 128 x 128 x 3. Keeping "Cropping" as "same" ensures that the output size equals inputSize .* Stride; refer to the Transposed Convolution Layer documentation.
After this, you need to ensure that the discriminator network outputs a single value for every input image of size 128 x 128 x 3. This is because the discriminator outputs a probability (after the sigmoid function) for every data point. In your case, the layer "conv5" should have the following specification:
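The size bookkeeping in this explanation can be checked with a few lines of plain MATLAB. This is only a sketch: it assumes square feature maps, the 5x5 filters quoted in the thread, and the usual transposed-convolution relations (output = (input - 1)*stride + filterSize without cropping, output = input*stride with Cropping set to "same"). The helper names are made up for illustration.

```matlab
% Sketch: trace the spatial size through the generator's transposed convolutions.
filterSize = 5;

% Hypothetical helpers for the two cases used in the generator.
tconvNoCrop = @(in, stride) (in - 1)*stride + filterSize;  % no cropping
tconvSame   = @(in, stride) in*stride;                     % Cropping = "same"

sz = 4;                     % spatial size after the "proj" layer (4 x 4 x 512)
sz = tconvNoCrop(sz, 1);    % tconv1, default stride 1 -> 8
for s = [2 2 2 2]           % tconv2..tconv5, with tconv2 changed to stride 2
    sz = tconvSame(sz, s);
end
generatorOutputSize = sz;
fprintf('Generator output: %d x %d\n', sz, sz);            % 128 x 128

% Same trace with the setting from the question (tconv2 stride 1).
szOld = tconvNoCrop(4, 1);
for s = [1 2 2 2]
    szOld = tconvSame(szOld, s);
end
fprintf('With tconv2 stride 1: %d x %d\n', szOld, szOld);  % 64 x 64
```

The second trace suggests that the generator in the question still produces 64 x 64 images, which no longer matches the 128 x 128 real images.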
convolution2dLayer(8,1,'Name','conv5')
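Explanation: After the four stride-2 convolutions with "same" padding, the 128 x 128 input is reduced to 8 x 8, so an 8 x 8 kernel without padding produces a 1 x 1 output. You can go through the MATLAB documentation on 2-D convolution layers to understand how the kernel size affects the output size of that layer.
For debugging, I suggest you run the following script to check that the generator and discriminator networks give outputs of the correct size.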
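The same kind of check works for the discriminator. The sketch below assumes square inputs and the standard convolution size relations (output = ceil(input/stride) with "same" padding, output = input - filterSize + 1 for an unpadded stride-1 convolution); the helper names are hypothetical. It compares the original 4x4 final filter, the 8x8 filter suggested here, and the stride-4 variant mentioned in the comment on the question.

```matlab
% Sketch: trace the spatial size through the discriminator's convolutions.
convSame  = @(in, stride) ceil(in/stride);   % Padding = "same"
convNoPad = @(in, f) in - f + 1;             % no padding, stride 1

% conv1..conv4 all with stride 2.
sz = 128;
for s = [2 2 2 2]
    sz = convSame(sz, s);
end
outFilter4 = convNoPad(sz, 4);   % conv5 with a 4x4 filter
outFilter8 = convNoPad(sz, 8);   % conv5 with an 8x8 filter
fprintf('Before conv5: %d, 4x4 filter -> %d, 8x8 filter -> %d\n', ...
    sz, outFilter4, outFilter8);             % 8, 5, 1

% Alternative: keep the 4x4 filter but use stride 4 in conv3.
szAlt = 128;
for s = [2 2 4 2]
    szAlt = convSame(szAlt, s);
end
outAlt = convNoPad(szAlt, 4);
fprintf('Stride-4 variant: before conv5 %d, output %d\n', szAlt, outAlt);  % 4, 1
```

Either change brings the discriminator output down to 1 x 1 per image; which one trains better is not something this arithmetic can tell.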
[outputY,~] = forward(dlnetMODEL,inputX);
disp(size(outputY));
Note: The required result can also be obtained by modifying any of the "conv"/"tconv" layers, by adding more "conv" layers, or by adding a global average pooling layer at the end of the network, provided the layer parameters are chosen so that the output sizes work out.
2 comments
Whussa 2020-9-17
Has anyone solved this yet? I tried to implement Tarunbir's answer but still get the same error.
This is my code (general GAN example):
datasetFolder = fullfile('/Users/bilder gan');
imds = imageDatastore(datasetFolder, ...
'IncludeSubfolders',true);
augmenter = imageDataAugmenter('RandXReflection',false);
augimds = augmentedImageDatastore([128 128],imds,'DataAugmentation',augmenter);
%%
filterSize = 5;
numFilters = 128;
numLatentInputs = 100;
projectionSize = [4 4 512];
layersGenerator = [
imageInputLayer([1 1 numLatentInputs],'Normalization','none','Name','in')
projectAndReshapeLayer(projectionSize,numLatentInputs,'proj');
transposedConv2dLayer(filterSize,8*numFilters,'Name','tconv1')
batchNormalizationLayer('Name','bnorm1')
reluLayer('Name','relu1')
transposedConv2dLayer(filterSize,4*numFilters,'Stride',1,'Cropping','same','Name','tconv2')
batchNormalizationLayer('Name','bnorm2')
reluLayer('Name','relu2')
transposedConv2dLayer(filterSize,2*numFilters,'Stride',2,'Cropping','same','Name','tconv3')
batchNormalizationLayer('Name','bnorm3')
reluLayer('Name','relu3')
transposedConv2dLayer(filterSize,numFilters,'Stride',2,'Cropping','same','Name','tconv4')
batchNormalizationLayer('Name','bnorm4')
reluLayer('Name','relu4')
transposedConv2dLayer(filterSize,3,'Stride',2,'Cropping','same','Name','tconv5')
tanhLayer('Name','tanh')];
lgraphGenerator = layerGraph(layersGenerator);
%%
dlnetGenerator = dlnetwork(lgraphGenerator);
%%
dropoutProb = 0.5;
numFilters = 128;
scale = 0.2;
inputSize = [128 128 3];
filterSize = 5;
layersDiscriminator = [
imageInputLayer(inputSize,'Normalization','none','Name','in')
dropoutLayer(0.5,'Name','dropout')
convolution2dLayer(filterSize,numFilters,'Stride',2,'Padding','same','Name','conv1')
leakyReluLayer(scale,'Name','lrelu1')
convolution2dLayer(filterSize,2*numFilters,'Stride',2,'Padding','same','Name','conv2')
batchNormalizationLayer('Name','bn2')
leakyReluLayer(scale,'Name','lrelu2')
convolution2dLayer(filterSize,4*numFilters,'Stride',2,'Padding','same','Name','conv3')
batchNormalizationLayer('Name','bn3')
leakyReluLayer(scale,'Name','lrelu3')
convolution2dLayer(filterSize,8*numFilters,'Stride',2,'Padding','same','Name','conv4')
batchNormalizationLayer('Name','bn4')
leakyReluLayer(scale,'Name','lrelu4')
convolution2dLayer(8,1,'Name','conv5')];
lgraphDiscriminator = layerGraph(layersDiscriminator);
%%
dlnetDiscriminator = dlnetwork(lgraphDiscriminator);
%%
numEpochs = 500;
miniBatchSize = 128;
augimds.MiniBatchSize = miniBatchSize;
%%
learnRate = 0.0002;
gradientDecayFactor = 0.5;
squaredGradientDecayFactor = 0.999;
%%
executionEnvironment = "auto";
%%
flipFactor = 0.3;
%%
validationFrequency = 100;
%% TRAINING
trailingAvgGenerator = [];
trailingAvgSqGenerator = [];
trailingAvgDiscriminator = [];
trailingAvgSqDiscriminator = [];
%%
numValidationImages = 25;
ZValidation = randn(1,1,numLatentInputs,numValidationImages,'single');
%%
dlZValidation = dlarray(ZValidation,'SSCB');
%%
f = figure;
f.Position(3) = 2*f.Position(3);
%%
imageAxes = subplot(1,2,1);
scoreAxes = subplot(1,2,2);
%%
lineScoreGenerator = animatedline(scoreAxes,'Color',[0 0.447 0.741]);
lineScoreDiscriminator = animatedline(scoreAxes, 'Color', [0.85 0.325 0.098]);
legend('Generator','Discriminator');
ylim([0 1])
xlabel("Iteration")
ylabel("Score")
grid on
%%
iteration = 0;
start = tic;
% Loop over epochs.
for epoch = 1:numEpochs
    % Reset and shuffle datastore.
    reset(augimds);
    augimds = shuffle(augimds);
    % Loop over mini-batches.
    while hasdata(augimds)
        iteration = iteration + 1;
        % Read mini-batch of data.
        data = read(augimds);
        % Ignore last partial mini-batch of epoch.
        if size(data,1) < miniBatchSize
            continue
        end
        % Concatenate mini-batch of data and generate latent inputs for the
        % generator network.
        X = cat(4,data{:,1}{:});
        X = single(X);
        Z = randn(1,1,numLatentInputs,size(X,4),'single');
        % Rescale the images in the range [-1 1].
        X = rescale(X,-1,1,'InputMin',0,'InputMax',255);
        % Convert mini-batch of data to dlarray and specify the dimension labels
        % 'SSCB' (spatial, spatial, channel, batch).
        dlX = dlarray(X, 'SSCB');
        dlZ = dlarray(Z, 'SSCB');
        % If training on a GPU, then convert data to gpuArray.
        if (executionEnvironment == "auto" && canUseGPU) || executionEnvironment == "gpu"
            dlX = gpuArray(dlX);
            dlZ = gpuArray(dlZ);
        end
        % Evaluate the model gradients and the generator state using
        % dlfeval and the modelGradients function listed at the end of the
        % example.
        [gradientsGenerator, gradientsDiscriminator, stateGenerator, scoreGenerator, scoreDiscriminator] = ...
            dlfeval(@modelGradients, dlnetGenerator, dlnetDiscriminator, dlX, dlZ, flipFactor);
        dlnetGenerator.State = stateGenerator;
        % Update the discriminator network parameters.
        [dlnetDiscriminator,trailingAvgDiscriminator,trailingAvgSqDiscriminator] = ...
            adamupdate(dlnetDiscriminator, gradientsDiscriminator, ...
            trailingAvgDiscriminator, trailingAvgSqDiscriminator, iteration, ...
            learnRate, gradientDecayFactor, squaredGradientDecayFactor);
        % Update the generator network parameters.
        [dlnetGenerator,trailingAvgGenerator,trailingAvgSqGenerator] = ...
            adamupdate(dlnetGenerator, gradientsGenerator, ...
            trailingAvgGenerator, trailingAvgSqGenerator, iteration, ...
            learnRate, gradientDecayFactor, squaredGradientDecayFactor);
        % Every validationFrequency iterations, display batch of generated images using the
        % held-out generator input
        if mod(iteration,validationFrequency) == 0 || iteration == 1
            % Generate images using the held-out generator input.
            dlXGeneratedValidation = predict(dlnetGenerator,dlZValidation);
            % Tile and rescale the images in the range [0 1].
            I = imtile(extractdata(dlXGeneratedValidation));
            I = rescale(I);
            % Display the images.
            subplot(1,2,1);
            image(imageAxes,I)
            xticklabels([]);
            yticklabels([]);
            title("Generated Images");
        end
        % Update the scores plot
        subplot(1,2,2)
        addpoints(lineScoreGenerator,iteration,...
            double(gather(extractdata(scoreGenerator))));
        addpoints(lineScoreDiscriminator,iteration,...
            double(gather(extractdata(scoreDiscriminator))));
        % Update the title with training progress information.
        D = duration(0,0,toc(start),'Format','hh:mm:ss');
        title(...
            "Epoch: " + epoch + ", " + ...
            "Iteration: " + iteration + ", " + ...
            "Elapsed: " + string(D))
        drawnow
    end
end
%% generate new images
ZNew = randn(1,1,numLatentInputs,25,'single');
dlZNew = dlarray(ZNew,'SSCB');
%%
dlXGeneratedNew = predict(thirdTryNet,dlZNew);
%%
I = imtile(extractdata(dlXGeneratedNew));
I = rescale(I);
figure
image(I)
axis off
title("Generated Images")
%%
thirdTryNet=dlnetGenerator;
save thirdTryNet

Stavros 2022-7-6
I managed to produce 128x128 images, but what about bigger images such as 512x512?
2 comments
Ziqi Sun 2022-10-17
Hey man, can you share the code? I followed the code above, but I got a negative training variance error...
Cecilia Di Ruberto 2022-11-2
Hi, I got the same error: "Expected TrainedVariance to be positive." I followed all the suggestions.
Please, if you produced the bigger images can you share the code? Thanks in advance
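For the 512x512 question, the shape arithmetic used elsewhere in this thread can at least be extended to count the layers. The sketch below assumes the same building blocks as the 128x128 answers (a 4 x 4 projection, a first 5x5 transposed convolution that yields 8 x 8, then stride-2 layers with "same" cropping or padding). It covers only layer counts and sizes, not training stability or the TrainedVariance error, and the variable names are hypothetical.

```matlab
% Sketch: layer counts needed for a larger target size (shape arithmetic only).
targetSize = 512;   % desired output height/width
baseSize = 8;       % size after "proj" (4) and tconv1 (5x5 filter, stride 1, no cropping)

% Generator: each stride-2 transposed convolution with Cropping="same" doubles the size.
numDoublingLayers = log2(targetSize/baseSize);
assert(numDoublingLayers == round(numDoublingLayers), ...
    'targetSize must be baseSize times a power of two');

% Discriminator: each stride-2 convolution with Padding="same" halves the size.
% Halving all the way back to 8 x 8 lets the final unpadded convolution keep an
% 8 x 8 filter, as in the 128 x 128 answers in this thread.
numHalvingLayers = log2(targetSize/8);
finalFeatureSize = targetSize / 2^numHalvingLayers;

fprintf('Stride-2 tconv layers after tconv1: %d\n', numDoublingLayers);        % 6
fprintf('Stride-2 conv layers before the last conv: %d\n', numHalvingLayers);  % 6
fprintf('Filter size of the last conv: %d\n', finalFeatureSize);               % 8
```

Under these assumptions a 512 x 512 generator would need six doubling layers after tconv1 instead of four, and the discriminator six halving layers instead of four.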

Suhail Mahmud 2022-11-8
I was able to generate 128-by-128-pixel images using the following code:
augimds = augmentedImageDatastore([128 128],imds,DataAugmentation=augmenter,ColorPreprocessing="gray2rgb");
%% This is the Generator
filterSize = 5;
numFilters = 128;
numLatentInputs = 100;
projectionSize = [4 4 512];
layersGenerator = [
featureInputLayer(numLatentInputs)
projectAndReshapeLayer(projectionSize)
transposedConv2dLayer(filterSize,8*numFilters)
batchNormalizationLayer
reluLayer
transposedConv2dLayer(filterSize,4*numFilters,Stride=2,Cropping="same")
batchNormalizationLayer
reluLayer
transposedConv2dLayer(filterSize,2*numFilters,Stride=2,Cropping="same")
batchNormalizationLayer
reluLayer
transposedConv2dLayer(filterSize,numFilters,Stride=2,Cropping="same")
batchNormalizationLayer
reluLayer
transposedConv2dLayer(filterSize,3,Stride=2,Cropping="same")
tanhLayer];
netG = dlnetwork(layersGenerator);
%% This is the Discriminator
dropoutProb = 0.75;
numFilters = 64;
scale = 0.2;
inputSize = [128 128 3];
filterSize = 5;
layersDiscriminator = [
imageInputLayer(inputSize,Normalization="none")
dropoutLayer(dropoutProb)
convolution2dLayer(filterSize,numFilters,Stride=2,Padding="same")
leakyReluLayer(scale)
convolution2dLayer(filterSize,2*numFilters,Stride=2,Padding="same")
batchNormalizationLayer
leakyReluLayer(scale)
convolution2dLayer(filterSize,4*numFilters,Stride=2,Padding="same")
batchNormalizationLayer
leakyReluLayer(scale)
convolution2dLayer(filterSize,8*numFilters,Stride=2,Padding="same")
batchNormalizationLayer
leakyReluLayer(scale)
convolution2dLayer(8,1)
sigmoidLayer];
netD = dlnetwork(layersDiscriminator);
The rest of the code is the same as in the GAN example. Just make sure you have sufficient computational resources to run the code. Best of luck.

Fred Liu 2022-11-11
You can try the following code; I hope it helps.
Thanks also for the code contributed earlier. Unfortunately I only saw it later, and I will try it as well.
Generator
filterSize = 5;
numFilters = 128;
numLatentInputs = 100;
projectionSize = [4 4 512];
layersGenerator = [
featureInputLayer(numLatentInputs)
projectAndReshapeLayer(projectionSize)
transposedConv2dLayer(filterSize,8*numFilters)
batchNormalizationLayer
reluLayer
transposedConv2dLayer(filterSize,4*numFilters,Stride=2,Cropping="same")
batchNormalizationLayer
reluLayer
transposedConv2dLayer(filterSize,2*numFilters,Stride=2,Cropping="same")
batchNormalizationLayer
reluLayer
transposedConv2dLayer(filterSize,numFilters,Stride=2,Cropping="same")
batchNormalizationLayer
reluLayer
transposedConv2dLayer(filterSize,3,Stride=2,Cropping="same")
tanhLayer];
Discriminator
dropoutProb = 0.5;
numFilters = 128;
scale = 0.2;
inputSize = [128 128 3];
filterSize = 5;
layersDiscriminator = [
imageInputLayer(inputSize,Normalization="none")
dropoutLayer(dropoutProb)
convolution2dLayer(filterSize,numFilters,Stride=2,Padding="same")
leakyReluLayer(scale)
dropoutLayer(dropoutProb)
convolution2dLayer(filterSize,2*numFilters,Stride=2,Padding="same")
batchNormalizationLayer
leakyReluLayer(scale)
dropoutLayer(dropoutProb)
convolution2dLayer(filterSize,4*numFilters,Stride=2,Padding="same")
batchNormalizationLayer
leakyReluLayer(scale)
dropoutLayer(dropoutProb)
convolution2dLayer(filterSize,8*numFilters,Stride=2,Padding="same")
batchNormalizationLayer
leakyReluLayer(scale)
convolution2dLayer(8,1)
sigmoidLayer];
Training Options
learnRate = 0.0001;
gradientDecayFactor = 0.5;
squaredGradientDecayFactor = 0.999;

Jonathan 2022-11-11
We show a few low-resolution images to see what we are training our models on.
# Display 10 real images
import matplotlib.pyplot as plt

# data_lowres is assumed to be an indexable collection of at least 10 low-resolution images
fig, axs = plt.subplots(2, 5, sharey=False, tight_layout=True, figsize=(16,9), facecolor='white')
n = 0
for i in range(0, 2):
    for j in range(0, 5):
        axs[i, j].matshow(data_lowres[n])
        n = n + 1
plt.show()
