How can I make my neural network support any size of image input?
33 views (last 30 days)
There are three levels at which one can write code for a vision-related deep learning task.
Highest level: build a complete layerGraph and train it with the trainNetwork function.
Middle level: build a layerGraph without a loss layer. Instead, compute the loss and gradients in an evaluation function. One can also specify a custom learning-rate schedule. This level allows some customization while still exploiting the easy-to-use features of the highest level.
Lowest level: this level has no concept of a layer. Coders have to manage the parameters themselves. It's really messy and time-consuming to build and train a network this way.
My question is: the highest and middle levels both require an input of a fixed size, i.e., an imageInputLayer, and imageInputLayer only supports a fixed image size. I do not want to trouble myself with lowest-level coding, so how can I make my NN take inputs of any size?
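For reference, the "middle level" described above typically looks like the following sketch: a layerGraph without a loss layer, wrapped in a dlnetwork, with the loss and gradients computed in a model function evaluated via dlfeval. All layer sizes and names here are illustrative, not from any specific code.

```matlab
% Illustrative middle-level pattern (sizes and class count are placeholders).
layers = [
    imageInputLayer([224 224 3], 'Normalization', 'none')
    convolution2dLayer(3, 16, 'Padding', 'same')
    reluLayer
    fullyConnectedLayer(10)
    softmaxLayer];
net = dlnetwork(layerGraph(layers));  % network without an output/loss layer

% Loss and gradients are computed outside the network:
function [loss, gradients] = modelLoss(net, X, T)
    Y = forward(net, X);                         % network prediction
    loss = crossentropy(Y, T);                   % loss chosen by the coder
    gradients = dlgradient(loss, net.Learnables);% gradients for the update step
end

% Called inside the training loop, e.g.:
% [loss, grads] = dlfeval(@modelLoss, net, dlarray(X, 'SSCB'), T);
```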
Ryan Comeau 2020-5-10
I wish it were possible to just dump in images of multiple sizes as well. Unfortunately, each image size would yield convolution maps of a different size and number. How then would it make sense to pass these into a fully connected layer and fit those convolution maps? It would be like sorting oranges by size when half of your input oranges are apples; it would be a strange task.
There is, however, a solution to this problem: your input images need to be scaled to your network's input size. This is one of the preprocessing steps that matters. Here is some code that could resize all of your images:
number_rows=200; %match the height of your network's input layer
number_cols=300; %match the width of your network's input layer
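To complete that idea, here is a hedged sketch of how those dimensions might be used to resize an image set. The folder name 'myImages' is a placeholder; augmentedImageDatastore and imresize are standard Image Processing / Deep Learning Toolbox functions that resize images to a fixed target size.

```matlab
% Sketch: resize all images to the network's input size ('myImages' is a placeholder).
number_rows = 200;  % depending on the input size of your network
number_cols = 300;  % depending on the input size of your network

imds = imageDatastore('myImages');  % datastore over the image folder

% augmentedImageDatastore resizes each image on the fly during training,
% so nothing needs to be rewritten on disk:
augimds = augmentedImageDatastore([number_rows number_cols], imds);

% Alternatively, resize a single image explicitly:
img = imread('myImages/example.png');                    % placeholder file name
resized = imresize(img, [number_rows number_cols]);      % scale to target size
```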
It may seem unintuitive, but computers don't see the same way we do and the scale of things doesn't always matter.
Hope this helps,