Before R2018a, to perform custom image preprocessing for training deep learning
networks, you had to specify a custom read function using the ReadFcn property of
imageDatastore. However, reading files using a custom read function was slow because
imageDatastore did not prefetch files.
In R2018a, four classes, including matlab.io.datastore.MiniBatchable, were introduced
as a solution to perform custom image preprocessing with support for prefetching,
shuffling, and parallel training. Implementing a custom mini-batch datastore using
matlab.io.datastore.MiniBatchable has several challenges and limitations:
In addition to specifying the preprocessing operations, you must also define properties and methods to support reading data in batches, reading data by index, and partitioning and shuffling data.
You must specify a value for the NumObservations property, but this value may be ill-defined or difficult to define in real-world applications.
Custom mini-batch datastores are not flexible enough to support common deep learning workflows, such as deployed workflows using GPU Coder™.
Starting in R2019a, built-in datastores natively support prefetching, shuffling, and
parallel training when reading batches of data. The transform function is the
preferred way to perform custom data preprocessing, or transformations. The combine
function is the preferred way to concatenate read data from multiple datastores,
including transformed datastores. Concatenated data can serve as the network inputs
and expected responses for training deep learning networks.
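
The workflow described above can be sketched as follows. This is a minimal illustration, not a recommended training setup: the temporary folder, the randomly generated image files, and the normalization function are all assumptions made up for the example, and the same files stand in for both the inputs and the responses.

```matlab
% Hypothetical data: write a few small random images to a temporary folder.
folder = tempname;
mkdir(folder);
for k = 1:4
    img = uint8(randi(255, [32 32 3]));
    imwrite(img, fullfile(folder, sprintf('img%d.png', k)));
end

% Two built-in datastores over the same files (for illustration only):
% one provides the network inputs, the other the expected responses.
imdsInput    = imageDatastore(folder);
imdsResponse = imageDatastore(folder);

% transform applies the preprocessing function to the data returned by
% each read. Here the (assumed) preprocessing rescales pixels to [0, 1].
normalize = @(img) single(img) / 255;
tdsInput  = transform(imdsInput, normalize);

% combine horizontally concatenates the data read from each datastore,
% so one read is expected to return {input, response}.
cds  = combine(tdsInput, imdsResponse);
data = read(cds);
```

Only the preprocessing function had to be written here; reading, resetting, and iterating are handled by the built-in datastores.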
The transform and combine functions have several advantages over
matlab.io.datastore.MiniBatchable:
The functions enable data preprocessing and concatenation for all types of datastores, including imageDatastore.
The transform function only requires you to define the data processing pipeline.
When used on a deterministic datastore, the functions support tall data types and MapReduce.
The functions support deployed workflows.
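
As an illustration of the first two points, the sketch below applies transform to a datastore that is not an image datastore. The CSV file, the variable names x and y, and the derived column s are hypothetical and exist only for this example.

```matlab
% Hypothetical data: a small CSV file with two numeric columns.
file = [tempname '.csv'];
T = table((1:6)', (11:16)', 'VariableNames', {'x', 'y'});
writetable(T, file);

% A built-in tabular datastore over the file.
ttds = tabularTextDatastore(file);

% The only user-defined piece is the processing function, which here
% appends a derived column s = x + y to each block of rows that is read.
addSum = @(t) [t, table(t.x + t.y, 'VariableNames', {'s'})];
tds = transform(ttds, addSum);

% Read everything through the transformed datastore.
out = readall(tds);
```

No NumObservations property or batch-reading methods need to be defined; the transformed datastore reuses the reading behavior of the underlying datastore.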
There are no plans to remove the read method of matlab.io.datastore.MiniBatchable at
this time.