Since the convolution2dLayer and imageInputLayer have been replaced, the output of the imageInputLayer will now be different, because the mean originally used for zero-center normalization was different. The features output by the replaced convolution layer will also be different and may not be useful. If you are training the network on the new dataset with image input size 227-by-227-by-5, none of this matters. If instead you are using the network for feature extraction and your data is very different from the original data, the features extracted deeper in the network might be less useful for your task.
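To illustrate the point about the zero-center mean, here is a minimal sketch of computing a per-channel mean from new 5-channel data and passing it to the input layer. The array X is random placeholder data, not your dataset, and the 'Mean' name-value argument assumes a reasonably recent release of Deep Learning Toolbox (on older releases the mean is computed at training time instead).

```matlab
% Placeholder data: 8 random 227-by-227-by-5 images (stand-in for your dataset).
X = rand(227, 227, 5, 8, 'single');

% Per-channel mean over height, width, and observations -> 1-by-1-by-5 array.
mu = mean(X, [1 2 4]);

% Input layer that subtracts the new mean instead of the original network's mean.
inLayer = imageInputLayer([227 227 5], ...
    'Normalization', 'zerocenter', ...
    'Mean', mu, ...
    'Name', 'input_new');
```

If you retrain with trainNetwork, you can usually leave 'Mean' unset and let training compute it from your data.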
Here are a few suggestions for retraining:
- Try freezing the weights of the original layers by setting WeightLearnRateFactor and BiasLearnRateFactor to zero for the convolution2dLayer and fullyConnectedLayer objects.
- Or retrain the complete network without freezing the weights of any layers.
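The freezing suggestion could be sketched roughly as follows for a Layer array. The small network and the layer names (such as 'conv1_new') are hypothetical stand-ins for your modified network, and keeping the replaced convolution layer learnable is an assumption on my part, since its weights are new and have not been trained yet.

```matlab
% Hypothetical stand-in for the modified network; use your own layer array here.
layers = [
    imageInputLayer([227 227 5], 'Name', 'input_new')
    convolution2dLayer(11, 96, 'Stride', 4, 'Name', 'conv1_new') % replaced layer
    reluLayer('Name', 'relu1')
    convolution2dLayer(5, 256, 'Padding', 2, 'Name', 'conv2')    % original layer
    reluLayer('Name', 'relu2')
    fullyConnectedLayer(10, 'Name', 'fc')                        % original layer
    softmaxLayer('Name', 'prob')
    classificationLayer('Name', 'output')];

% Assumption: layers listed here were replaced and should keep learning.
keepLearnable = {'conv1_new'};

for i = 1:numel(layers)
    isConvOrFC = isa(layers(i), 'nnet.cnn.layer.Convolution2DLayer') || ...
                 isa(layers(i), 'nnet.cnn.layer.FullyConnectedLayer');
    if isConvOrFC && ~any(strcmp(layers(i).Name, keepLearnable))
        % A learn rate factor of zero stops these parameters from updating.
        layers(i).WeightLearnRateFactor = 0;
        layers(i).BiasLearnRateFactor = 0;
    end
end
```

For the second suggestion, skip the loop and train with the layers as they are, so every layer keeps its default learn rate factor of 1.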