I found a solution to this issue in another resource.
The problem comes from negative values in the returned "state". The original code is as follows:
[gradients,loss,state] = dlfeval(@networkGradients,X,gtBox,gtClass,gtMask,dlnet,params);
dlnet.State = state;
Replace the last line (dlnet.State = state;) with the following lines to ensure that all "TrainedVariance" values in "dlnet.State" are positive.
idx = dlnet.State.Parameter == "TrainedVariance";
boundAwayFromZero = @(X) max(X, eps('single'));
dlnet.State(idx,:) = dlupdate(boundAwayFromZero, dlnet.State(idx,:));
With this change, the code runs.
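One thing that may be worth double-checking: as far as I can tell, the replacement lines above never copy the new "state" into dlnet.State. A possible variant clamps the variance rows of "state" first and assigns the result afterwards. The sketch below is only an illustration under assumptions: the small table is made-up stand-in data with the usual Layer/Parameter/Value columns so that the snippet runs on its own (it needs Deep Learning Toolbox), and the layer name "bn1" is hypothetical.

```matlab
% Stand-in for the "state" table returned by dlfeval (made-up values,
% hypothetical layer name "bn1").
state = table( ...
    ["bn1"; "bn1"], ...
    ["TrainedMean"; "TrainedVariance"], ...
    {dlarray(single([0.1 -0.2])); dlarray(single([0.5 -1e-3]))}, ...
    'VariableNames', {'Layer', 'Parameter', 'Value'});

% Select only the batch-normalization variance rows.
idx = state.Parameter == "TrainedVariance";

% Clamp every variance entry to at least eps('single').
boundAwayFromZero = @(X) max(X, eps('single'));
state(idx,:) = dlupdate(boundAwayFromZero, state(idx,:));

% In the real training loop, the assignment would follow here:
% dlnet.State = state;

% Pull the clamped variance out as a plain array for inspection.
clampedVariance = extractdata(state.Value{2});
```

The mean row is left untouched because only rows matching "TrainedVariance" are passed to dlupdate.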
However, I am now facing another problem: training takes a very long time (days), probably because the network is very large. I thought my GPU would be good enough, but even a mini-batch size of 2 requires more GPU memory than I have. For now, I can only run the training on the CPU.
My GPU is as follows:
Name: 'GeForce GTX 1080'
Index: 1
ComputeCapability: '6.1'
SupportsDouble: 1
DriverVersion: 11.2000
ToolkitVersion: 11
MaxThreadsPerBlock: 1024
MaxShmemPerBlock: 49152
MaxThreadBlockSize: [1024 1024 64]
MaxGridSize: [2.1475e+09 65535 65535]
SIMDWidth: 32
TotalMemory: 8.5899e+09
AvailableMemory: 7.4505e+09
MultiprocessorCount: 20
ClockRateKHz: 1771000
ComputeMode: 'Default'
GPUOverlapsTransfers: 1
KernelExecutionTimeout: 1
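For anyone who wants to check their own card before starting a long run, here is a minimal sketch that reports free GPU memory and falls back to the CPU when no usable GPU is found. It only uses canUseGPU and gpuDevice; the variable name executionEnvironment is my own choice for illustration, not something the Mask R-CNN example requires.

```matlab
% Choose where to run based on whether a supported GPU is usable.
if canUseGPU
    gpu = gpuDevice;                        % currently selected GPU
    freeGB  = gpu.AvailableMemory / 1e9;    % bytes -> GB
    totalGB = gpu.TotalMemory / 1e9;
    fprintf('%s: %.2f of %.2f GB free\n', gpu.Name, freeGB, totalGB);
    executionEnvironment = "gpu";
else
    executionEnvironment = "cpu";
end
```

With the numbers listed above, this would report about 7.45 of 8.59 GB free, which was still not enough for a mini-batch size of 2 in my case.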
I hope this information helps those who want to train their own Mask R-CNN in MATLAB.