This note is from the official MATLAB documentation for trainingOptions; I think it's exactly what you're looking for.
"When training finishes, view the Results showing the final validation accuracy and the reason that training finished. The final validation metrics are labeled Final in the plots. If your network contains batch normalization layers, then the final validation metrics can be different to the validation metrics evaluated during training. This is because the mean and variance statistics used for batch normalization can be different after training completes. For example, if the 'BatchNormalizationStatistics' training option is 'population', then after training, the software finalizes the batch normalization statistics by passing through the training data once more and uses the resulting mean and variance. If the 'BatchNormalizationStatistics' training option is 'moving', then the software approximates the statistics during training using a running estimate and uses the latest values of the statistics."
So it seems that, in your case, the network performs better on the validation data once the batch normalization statistics (mean and variance) are finalized after training finishes.
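If you would rather have the final metrics use the same running statistics that were accumulated during training, one thing you could try is setting that option to 'moving'. This is only a minimal sketch: the solver ("sgdm") is an arbitrary choice, and XVal, YVal, XTrain, YTrain and layers are hypothetical placeholders for your own data and network, so those lines are left commented out.

```matlab
% Minimal sketch, assuming a release where trainingOptions supports the
% 'BatchNormalizationStatistics' option. The solver is an arbitrary choice.
options = trainingOptions("sgdm", ...
    "BatchNormalizationStatistics", "moving", ... % running estimate; no extra pass after training
    "Plots", "training-progress");                % show the validation curve and the Final marker

% Hypothetical placeholders: substitute your own validation data, e.g.
% options.ValidationData = {XVal, YVal};
% net = trainNetwork(XTrain, YTrain, layers, options);
```

With 'population' the statistics are recomputed in one more pass over the training data, so a jump in the Final value like the one you are seeing is possible. With 'moving' the gap may shrink, but I have not verified that on your network.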