Folding batch normalization into preceding convolution

Kai Tan

2019-02-17

I am trying to port a trained MATLAB CNN model to another framework. I would like to remove the batch normalization (BN) layers by folding their parameters into the preceding convolution layers. I use the following formulation:

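% layer is the BatchNorm layer; convLayer (assumed name) is the preceding convolution layer
w = convLayer.Weights; % 4-D: height x width x input channels x output channels
b = convLayer.Bias;    % 1 x 1 x output channels
m = layer.TrainedMean;
v = layer.TrainedVariance;
offset = layer.Offset;
scale = layer.Scale;
ep = layer.Epsilon;
% preallocate the folded parameters
new_w = zeros(size(w), 'like', w);
new_b = zeros(size(b), 'like', b);
% adjust the weights:
for j = 1:size(scale, 3) % loop over the output channels
    denom = sqrt(v(1,1,j)+ep);
    % adjust the 4D convolution weights
    new_w(:,:,:,j) = scale(1,1,j)*w(:,:,:,j)/denom;
    % adjust the convolution biases
    new_b(1,1,j) = scale(1,1,j)*(b(1,1,j)-m(1,1,j))/denom + offset(1,1,j);
end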
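
For reference, the algebra behind this formulation can be sketched per output channel j. The symbols are introduced here for illustration and are not from the post: γ is Scale, β is Offset, μ is TrainedMean, σ² is TrainedVariance, ε is Epsilon, x is the layer input, and * denotes convolution. The sketch assumes the BN layer runs in inference mode, using the trained statistics rather than mini-batch statistics.

```latex
\begin{aligned}
z_j &= w_j * x + b_j \\
y_j &= \gamma_j \,\frac{z_j - \mu_j}{\sqrt{\sigma_j^2 + \epsilon}} + \beta_j \\
    &= \underbrace{\frac{\gamma_j}{\sqrt{\sigma_j^2 + \epsilon}}\, w_j}_{\texttt{new\_w(:,:,:,j)}} * x
     \;+\; \underbrace{\gamma_j \,\frac{b_j - \mu_j}{\sqrt{\sigma_j^2 + \epsilon}} + \beta_j}_{\texttt{new\_b(1,1,j)}}
\end{aligned}
```

Under those assumptions, the two underbraced terms correspond to the weight and bias updates in the loop above.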

However, when I compare the output of the BN layer with the output of the convolution layer into which the BN parameters were folded, I get different results.

[Figure: left, output of the BN layer; right, output of the convolution layer with the BN parameters folded in.]