Augment Images for Deep Learning Workflows
This example shows how you can perform common kinds of randomized image augmentation such as geometric transformations, cropping, and adding noise.
Image Processing Toolbox functions enable you to implement common styles of image augmentation. This example demonstrates five common types of transformations: random image warping, cropping, color transformations, synthetic noise, and synthetic blur.
The example then shows how to apply augmentation to image data in datastores using a combination of multiple types of transformations.
You can use augmented training data to train a network. For an example of training a network using augmented images, see Prepare Datastore for Image-to-Image Regression.
Read and display a sample image. To compare the effect of the different types of image augmentation, each transformation uses the same input image.
imOriginal = imresize(imread("kobi.png"),0.25);
imshow(imOriginal)
Random Image Warping Transformations
The randomAffine2d (Image Processing Toolbox) function creates a randomized 2-D affine transformation from a combination of rotation, translation, scale (resizing), reflection, and shear. You can specify which transformations to include and the range of transformation parameters. If you specify the range as a 2-element numeric vector, then randomAffine2d selects the value of a parameter from a uniform probability distribution over the specified interval. For more control of the range of parameter values, you can specify the range using a function handle.
Control the spatial bounds and resolution of the warped image created by imwarp (Image Processing Toolbox) by using the affineOutputView (Image Processing Toolbox) function.
Rotation
Create a randomized rotation transformation that rotates the input image by an angle selected randomly from the range [-45, 45] degrees.
tform = randomAffine2d(Rotation=[-45 45]);
outputView = affineOutputView(size(imOriginal),tform);
imAugmented = imwarp(imOriginal,tform,OutputView=outputView);
imshow(imAugmented)
Translation
Create a translation transformation that shifts the input image horizontally and vertically by a distance selected randomly from the range [-50, 50] pixels.
tform = randomAffine2d(XTranslation=[-50 50],YTranslation=[-50 50]);
outputView = affineOutputView(size(imOriginal),tform);
imAugmented = imwarp(imOriginal,tform,OutputView=outputView);
imshow(imAugmented)
Scale
Create a scale transformation that resizes the input image using a scale factor selected randomly from the range [1.2, 1.5]. This transformation resizes the image by the same factor in the horizontal and vertical directions.
tform = randomAffine2d(Scale=[1.2,1.5]);
outputView = affineOutputView(size(imOriginal),tform);
imAugmented = imwarp(imOriginal,tform,OutputView=outputView);
imshow(imAugmented)
Reflection
Create a reflection transformation that flips the input image with 50% probability in each dimension.
tform = randomAffine2d(XReflection=true,YReflection=true);
outputView = affineOutputView(size(imOriginal),tform);
imAugmented = imwarp(imOriginal,tform,OutputView=outputView);
imshow(imAugmented)
Shear
Create a horizontal shear transformation with the shear angle selected randomly from the range [-30, 30] degrees.
tform = randomAffine2d(XShear=[-30 30]);
outputView = affineOutputView(size(imOriginal),tform);
imAugmented = imwarp(imOriginal,tform,OutputView=outputView);
imshow(imAugmented)
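The individual transformations shown so far can also be requested together in a single call. The following is a minimal sketch that reuses the parameter ranges from the preceding sections; the particular combination is an illustrative assumption rather than a recommended setting.

```matlab
% Sketch: sample rotation, translation, scale, reflection, and shear in one
% randomized affine transformation. The ranges repeat those used above.
tform = randomAffine2d(Rotation=[-45 45], ...
    XTranslation=[-50 50],YTranslation=[-50 50], ...
    Scale=[1.2 1.5],XReflection=true,XShear=[-30 30]);
outputView = affineOutputView(size(imOriginal),tform);
imAugmented = imwarp(imOriginal,tform,OutputView=outputView);
imshow(imAugmented)
```

Each call to randomAffine2d draws a new set of parameter values, so running the code repeatedly produces a different warp each time.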
Control Range of Transformation Parameters Using Custom Selection Function
In the preceding transformations, the range of transformation parameters was specified by two-element numeric vectors. For more control of the range of the transformation parameters, specify a function handle instead of a numeric vector. The function handle takes no input arguments and yields a valid value for each parameter.
For example, this code selects a rotation angle from a discrete set of 90 degree rotation angles.
angles = 0:90:270;
tform = randomAffine2d(Rotation=@() angles(randi(4)));
outputView = affineOutputView(size(imOriginal),tform);
imAugmented = imwarp(imOriginal,tform,OutputView=outputView);
imshow(imAugmented)
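The same mechanism works with any distribution that the function handle can sample. As a sketch, the code below draws a normally distributed rotation angle and clamps it to [-45, 45] degrees. The standard deviation of 15 degrees and the name clampedAngle are hypothetical choices made for illustration.

```matlab
% Hypothetical selection function: normally distributed angle with a
% standard deviation of 15 degrees, clamped to [-45, 45] degrees.
clampedAngle = @() min(max(15*randn,-45),45);

tform = randomAffine2d(Rotation=clampedAngle);
outputView = affineOutputView(size(imOriginal),tform);
imAugmented = imwarp(imOriginal,tform,OutputView=outputView);
imshow(imAugmented)
```

Compared with a uniform range, this choice favors small rotations while still allowing occasional larger ones.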
Control Fill Value
When you warp an image using a geometric transformation, pixels in the output image can map to a location outside the bounds of the input image. In that case, imwarp assigns a fill value to those pixels in the output image. By default, imwarp selects black as the fill value. You can change the fill value by specifying the FillValues name-value argument.
Create a random rotation transformation, then apply the transformation and specify a gray fill value.
tform = randomAffine2d(Rotation=[-45 45]);
outputView = affineOutputView(size(imOriginal),tform);
imAugmented = imwarp(imOriginal,tform,OutputView=outputView, ...
FillValues=[128 128 128]);
imshow(imAugmented)
Cropping Transformations
To create output images of a desired size, use the randomWindow2d (Image Processing Toolbox) and centerCropWindow2d (Image Processing Toolbox) functions. Be careful to select a window that includes the desired content in the image.
Specify the desired size of the cropped region as a 2-element vector of the form [height, width].
targetSize = [200,100];
Crop the image to the target size from the center of the image.
win = centerCropWindow2d(size(imOriginal),targetSize);
imCenterCrop = imcrop(imOriginal,win);
imshow(imCenterCrop)
Crop the image to the target size from a random location in the image.
win = randomWindow2d(size(imOriginal),targetSize);
imRandomCrop = imcrop(imOriginal,win);
imshow(imRandomCrop)
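When an input image has a matching response image, both typically need the same crop. A minimal sketch of that idea follows, assuming a hypothetical response image imResponse that is simply a grayscale copy of the sample image.

```matlab
% Hypothetical response image: a grayscale copy of the sample image.
imResponse = im2gray(imOriginal);

% Select one random window, then reuse it for both images so that the
% cropped regions stay aligned.
targetSize = [200,100];
win = randomWindow2d(size(imOriginal),targetSize);
imInputCrop = imcrop(imOriginal,win);
imResponseCrop = imcrop(imResponse,win);

montage({imInputCrop,imResponseCrop})
```

Calling randomWindow2d separately for each image would select two different windows, which would break the correspondence between input and response.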
Color Transformations
You can randomly adjust the hue, saturation, brightness, and contrast of a color image by using the jitterColorHSV (Image Processing Toolbox) function. You can specify which color transformations are included and the range of transformation parameters.
You can randomly adjust the brightness and contrast of grayscale images by using basic math operations.
Hue Jitter
Hue specifies the shade of color, or a color's position on a color wheel. As hue varies from 0 to 1, colors vary from red through yellow, green, cyan, blue, purple, magenta, and back to red. Hue jitter shifts the apparent shade of colors in an image.
Adjust the hue of the input image by a small positive offset selected randomly from the range [0.05, 0.15]. Colors that were red now appear more orange or yellow, colors that were orange appear yellow or green, and so on.
imJittered = jitterColorHSV(imOriginal,Hue=[0.05 0.15]);
montage({imOriginal,imJittered})
Saturation Jitter
Saturation is the purity of color. As saturation varies from 0 to 1, hues vary from gray (indicating a mixture of all colors) to a single pure color. Saturation jitter shifts how dull or vibrant colors are.
Adjust the saturation of the input image by an offset selected randomly from the range [-0.4, -0.1]. The colors in the output image appear more muted, as expected when the saturation decreases.
imJittered = jitterColorHSV(imOriginal,Saturation=[-0.4 -0.1]);
montage({imOriginal,imJittered})
Brightness Jitter
Brightness is the amount of hue. As brightness varies from 0 to 1, colors go from black to white. Brightness jitter shifts the darkness and lightness of an input image.
Adjust the brightness of the input image by an offset selected randomly from the range [-0.3, -0.1]. The image appears darker, as expected when the brightness decreases.
imJittered = jitterColorHSV(imOriginal,Brightness=[-0.3 -0.1]);
montage({imOriginal,imJittered})
Contrast Jitter
Contrast jitter randomly adjusts the difference between the darkest and brightest regions in an input image.
Adjust the contrast of the input image by a scale factor selected randomly from the range [1.2, 1.4]. The contrast increases, such that shadows become darker and highlights become brighter.
imJittered = jitterColorHSV(imOriginal,Contrast=[1.2 1.4]);
montage({imOriginal,imJittered})
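The four kinds of jitter can also be requested in one call. This sketch reuses the ranges from the preceding sections; combining all four at these strengths is an illustrative assumption, and practical pipelines often use narrower ranges.

```matlab
% Sketch: apply hue, saturation, brightness, and contrast jitter together.
% The ranges repeat those used in the individual examples above.
imJittered = jitterColorHSV(imOriginal, ...
    Hue=[0.05 0.15], ...
    Saturation=[-0.4 -0.1], ...
    Brightness=[-0.3 -0.1], ...
    Contrast=[1.2 1.4]);
montage({imOriginal,imJittered})
```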
Brightness and Contrast Jitter of Grayscale Images
You can apply randomized brightness and contrast jitter to grayscale images by using basic math operations.
Convert the sample image to grayscale. Specify a random contrast scale factor in the range [0.8, 1] and a random brightness offset in the range [-0.15, 0.15]. Multiply the image by the contrast scale factor, then add the brightness offset.
imGray = im2gray(im2double(imOriginal));
contrastFactor = 1-0.2*rand;
brightnessOffset = 0.3*(rand-0.5);
imJittered = imGray.*contrastFactor + brightnessOffset;
imJittered = im2uint8(imJittered);
montage({imGray,imJittered})
Randomized Color-to-Grayscale
One type of color augmentation randomly drops the color information from an RGB image while preserving the number of channels expected by the network. This code shows a "random grayscale" transformation in which an RGB image is randomly converted with 80% probability to a three channel output image where R == G == B.
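desiredProbability = 0.8;
% Start from the unmodified image so that the output is defined when the
% conversion is not applied
imJittered = imOriginal;
if rand <= desiredProbability
    imJittered = repmat(rgb2gray(imOriginal),[1 1 3]);
end
imshow(imJittered)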
Other Image Processing Operations
Use the transform function to apply any combination of Image Processing Toolbox functions to input images. Adding noise and blur are two common image processing operations used in deep learning applications.
Synthetic Noise
To apply synthetic noise to an input image, use the imnoise (Image Processing Toolbox) function. You can specify which noise model to use, such as Gaussian, Poisson, salt and pepper, and multiplicative noise. You can also specify the strength of the noise.
imSaltAndPepperNoise = imnoise(imOriginal,"salt & pepper",0.1);
imGaussianNoise = imnoise(imOriginal,"gaussian");
montage({imSaltAndPepperNoise,imGaussianNoise})
Synthetic Blur
To apply randomized Gaussian blur to an image, use the imgaussfilt (Image Processing Toolbox) function. You can specify the amount of smoothing.
sigma = 1+5*rand;
imBlurred = imgaussfilt(imOriginal,sigma);
imshow(imBlurred)
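Noise and blur are often randomized and applied together. The sketch below blurs the image with a random standard deviation and then adds zero-mean Gaussian noise with a random variance. The ranges [0.5, 2] for the blur and [0.001, 0.01] for the noise variance are assumptions chosen for illustration.

```matlab
% Assumed range for the blur standard deviation: [0.5, 2] pixels.
sigma = 0.5 + 1.5*rand;
imBlurred = imgaussfilt(imOriginal,sigma);

% Assumed range for the Gaussian noise variance: [0.001, 0.01], zero mean.
noiseVariance = 0.001 + 0.009*rand;
imBlurredNoisy = imnoise(imBlurred,"gaussian",0,noiseVariance);

montage({imOriginal,imBlurredNoisy})
```

Applying the blur before the noise keeps the noise sharp, which resembles sensor noise added to an out-of-focus image. Reversing the order would smooth the noise as well.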
Apply Augmentation to Image Data in Datastores
In practical deep learning problems, the image augmentation pipeline typically combines multiple operations. Datastores are a convenient way to read and augment collections of images.
This section of the example shows how to define data augmentation pipelines that augment datastores in the context of training image classification and image regression problems.
First, create an imageDatastore that contains unprocessed images. The image datastore in this example contains digit images with labels.
digitDatasetPath = fullfile(matlabroot,"toolbox","nnet", ...
    "nndemos","nndatasets","DigitDataset");
imds = imageDatastore(digitDatasetPath, ...
    IncludeSubfolders=true,LabelSource="foldernames");
imds.ReadSize = 6;
Image Classification
In image classification, the classifier should learn that a randomly altered version of an image still represents the same image class. To augment data for image classification, it is sufficient to augment the input images while leaving the corresponding categorical labels unchanged.
Augment images in the pristine image datastore with random Gaussian blur, salt and pepper noise, and randomized scale and rotation. These operations are defined in the helper function classificationAugmentationPipeline at the end of this example. Apply data augmentation to the training data by using the transform function.
dsTrain = transform(imds,@classificationAugmentationPipeline, ...
IncludeInfo=true);
Visualize a sample of the output coming from the augmented pipeline.
dataPreview = preview(dsTrain);
montage(dataPreview(:,1))
title("Augmented Images for Image Classification")
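To confirm that the labels pass through the pipeline unchanged, you can inspect the second column of the preview. This sketch assumes that dataPreview holds one row per image, with the image in the first column and its categorical label in the second, as produced by the helper function.

```matlab
% Collect the label from each row of the preview into one categorical array.
labels = vertcat(dataPreview{:,2});
disp(labels')
```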
Image Regression
Image augmentation for image-to-image regression is more complicated because you must apply identical geometric transformations to the input and response images. Associate pairs of input and response images by using the combine function. Transform one or both images in each pair by using the transform function.
Combine two identical copies of the image datastore imds. When data is read from the combined datastore, image data is returned in a two-column cell array, where the first column represents network input images and the second column contains network responses.
dsCombined = combine(imds,imds);
montage(preview(dsCombined)',Size=[6 2])
title("Combined Input and Response Pairs Before Augmentation")
Augment each pair of training images with a series of image processing operations:
Resize the input and response image to 32-by-32 pixels.
Add salt and pepper noise to the input image only.
Create a transformation that has randomized scale and rotation.
Apply the same transformation to the input and response image.
These operations are defined in the helper function imageRegressionAugmentationPipeline at the end of this example. Apply data augmentation to the training data by using the transform function.
dsTrain = transform(dsCombined,@imageRegressionAugmentationPipeline);
montage(preview(dsTrain)',Size=[6 2])
title("Combined Input and Response Pairs After Augmentation")
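A quick sanity check can confirm that the pipeline keeps each pair consistent. This sketch assumes that reading from the transformed datastore returns a cell array with one row per image pair, as the montage above suggests. The variable names batch, inputImage, and targetImage are illustrative.

```matlab
% Read one batch from the start of the transformed datastore.
reset(dsTrain)
batch = read(dsTrain);
inputImage = batch{1,1};
targetImage = batch{1,2};

% Both images should share the size and data type produced by the pipeline.
assert(isequal(size(inputImage),size(targetImage)))
assert(isa(inputImage,"single") && isa(targetImage,"single"))
```

Because both images are warped with the same transformation and output view, only the added noise should distinguish the input from the response.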
For a complete example that includes training and evaluating an image-to-image regression network, see Prepare Datastore for Image-to-Image Regression.
Supporting Functions
The classificationAugmentationPipeline helper function augments images for classification. dataIn is a cell array of images read from the datastore, and info contains the corresponding labels. dataOut is a two-column cell array, where the first column contains the augmented network input images and the second column contains the categorical labels.
function [dataOut,info] = classificationAugmentationPipeline(dataIn,info)

dataOut = cell([size(dataIn,1),2]);

for idx = 1:size(dataIn,1)
    temp = dataIn{idx};

    % Add randomized Gaussian blur
    temp = imgaussfilt(temp,1.5*rand);

    % Add salt and pepper noise
    temp = imnoise(temp,"salt & pepper");

    % Add randomized rotation and scale
    tform = randomAffine2d(Scale=[0.95,1.05],Rotation=[-30 30]);
    outputView = affineOutputView(size(temp),tform);
    temp = imwarp(temp,tform,OutputView=outputView);

    % Form a two-element cell array with the input image and expected response
    dataOut(idx,:) = {temp,info.Label(idx)};
end

end
The imageRegressionAugmentationPipeline helper function augments images for image-to-image regression. dataIn and dataOut are two-column cell arrays, where the first column contains the network input images and the second column contains the network response images.
function dataOut = imageRegressionAugmentationPipeline(dataIn)

dataOut = cell([size(dataIn,1),2]);

for idx = 1:size(dataIn,1)

    % Resize images to 32-by-32 pixels and convert to data type single
    inputImage = im2single(imresize(dataIn{idx,1},[32 32]));
    targetImage = im2single(imresize(dataIn{idx,2},[32 32]));

    % Add salt and pepper noise
    inputImage = imnoise(inputImage,"salt & pepper");

    % Add randomized rotation and scale
    tform = randomAffine2d(Scale=[0.9,1.1],Rotation=[-30 30]);
    outputView = affineOutputView(size(inputImage),tform);

    % Use imwarp with the same tform and outputView to augment both images
    % the same way
    inputImage = imwarp(inputImage,tform,OutputView=outputView);
    targetImage = imwarp(targetImage,tform,OutputView=outputView);

    dataOut(idx,:) = {inputImage,targetImage};
end

end