Hi, In general, trainNetwork is faster because it is optimised to handle many things for you. That does not mean a custom training loop will always be slower, though: you can use minibatchqueue, gpuArray and dlarray to speed it up. Typically, if your workflow can be expressed with trainNetwork, prefer that.
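As a rough illustration of how those pieces fit together in a custom loop, here is a minimal sketch. It assumes a recent release (R2021a or later) with Deep Learning Toolbox. The synthetic data, the tiny network, and the helper names modelLoss and preprocessMiniBatch are placeholders of mine, not taken from your code. Save it as a script, since the local functions have to sit at the end of the file.

```matlab
% Hypothetical data: 512 random 28x28 grayscale images with 10 classes.
XTrain = rand(28,28,1,512,'single');
TTrain = categorical(randi(10,512,1),1:10);

% Combine predictors and targets into one datastore.
dsX = arrayDatastore(XTrain,'IterationDimension',4);
dsT = arrayDatastore(TTrain);
ds  = combine(dsX,dsT);

% minibatchqueue builds the mini-batches, converts the formatted outputs to
% dlarray and, with the default OutputEnvironment of 'auto', returns them as
% gpuArray data when a supported GPU is available.
mbq = minibatchqueue(ds, ...
    'MiniBatchSize',128, ...
    'MiniBatchFcn',@preprocessMiniBatch, ...
    'MiniBatchFormat',{'SSCB',''});

% Placeholder network, only here to make the sketch self-contained.
layers = [
    imageInputLayer([28 28 1],'Normalization','none')
    convolution2dLayer(3,8,'Padding','same')
    reluLayer
    fullyConnectedLayer(10)
    softmaxLayer];
net = dlnetwork(layers);

avgGrad = [];
avgSqGrad = [];
iteration = 0;

for epoch = 1:2
    shuffle(mbq);
    while hasdata(mbq)
        iteration = iteration + 1;
        [X,T] = next(mbq);   % X is already a dlarray (on the GPU if there is one)

        % Evaluate loss and gradients with automatic differentiation.
        [loss,gradients] = dlfeval(@modelLoss,net,X,T);

        % Adam update of the learnable parameters.
        [net,avgGrad,avgSqGrad] = adamupdate(net,gradients, ...
            avgGrad,avgSqGrad,iteration);
    end
end

function [loss,gradients] = modelLoss(net,X,T)
% Forward pass, cross-entropy loss, and gradients w.r.t. the learnables.
Y = forward(net,X);
loss = crossentropy(Y,T);
gradients = dlgradient(loss,net.Learnables);
end

function [X,T] = preprocessMiniBatch(dataX,dataT)
% Concatenate the cell arrays read from the datastore and one-hot encode
% the labels so that classes run along the first dimension.
X = cat(4,dataX{:});
T = cat(2,dataT{:});
T = onehotencode(T,1);
end
```

Letting minibatchqueue do the dlarray and gpuArray conversion keeps that logic out of the loop body. How this compares in speed with trainNetwork will depend on your model and hardware.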
If you can share more details about your workflow and the hardware you are using, it will help us investigate this further.