Mask R-CNN maximum number of detected instances per image.

Question

ANDREA MACI 2024-6-2

Direct link to this question:

https://ww2.mathworks.cn/matlabcentral/answers/2124696-mask-r-cnn-maximum-number-of-detected-instances-per-image

Commented: ANDREA MACI 2024-6-27

Hi, I've been working on Mask R-CNN following the documentation instructions. I've got everything to work, but I stumbled upon a potential library mistake. Let me explain my situation: I am working on a dataset with ~250 images (split between training and validation) and just 1 category. Each image contains anywhere from 40-50 up to 600-650 instances.

The problem is this: Mask R-CNN can only detect up to 100 instances, a limit set in the class definition. I believe this is hurting the training of the network. However, I cannot confirm this, because I have to run the training remotely from the command prompt, since I don't have a GPU powerful enough to train locally. My evidence is that the trained network performs reasonably well on images with 40-50 instances but performs horribly on images with many instances. In fact, when I evaluate my network on the validation set (something that I can do on my own computer), the network outputs at most 100 masks per image.

My "local" fix: I edited the maskrcnn.m file of the library. I went to the directory "C:\Program\Files\MATLAB\R2023b\toolbox\vision\vision\@maskrcnn\maskrcnn.m" and at line 172 of the code, instead of

NumStrongestRegionsPrediction = 100

I put (I do not expect more than 800 instances per image, given my ground truth data)

NumStrongestRegionsPrediction = 800

This fixes my issue, at least at validation time. However, my training is run without this fix, and given my results I am writing here to ask what I can do about this issue. I am basically certain my code is correct.

Again, all I can observe at training time is the training loss. It converges to a good value but occasionally jumps to a larger one, probably when a batch contains the images with a lot of instances. In other words, the network isn't learning enough from these images and from its mistakes on them (the training loss).

I can provide more information if needed, however for now I want to keep the post simple.

2 Comments

John D'Errico 2024-6-2

It is totally amazing how often people claim they are certain their code is correct. Surely, nothing you could have done could possibly be wrong? SIGH.

I would point out it is a terribly poor idea to edit supplied code. The rule I have always understood and used is, you change it, you own it. Once you change supplied code, then any problems are now yours.

Anyway, I would suggest you post this as a tech support issue, not Answers. They may have had valid reasons to limit that parameter.

ANDREA MACI 2024-6-2

Thank you for your answer, John.

As I've said, my code works fine. The training of the neural network works and it has great results, except on the images with too many instances. I am aware of the dangers of editing supplied code, but I didn't encounter new problems by changing it. If anything, it worked better, given my instance segmentation task. I can post my code here if you want to have a look at it.

I am new to writing here. Which section of the support should I write to? Moreover, I am a student, and as written in the "Product Usage" section of MATLAB Tech Support: "Technical support from MathWorks is available for activation, installation and bug-related issues", so I don't even know if that's going to work.


Answer 1

aditi bagora 2024-6-26

Direct link to this answer:

https://ww2.mathworks.cn/matlabcentral/answers/2124696-mask-r-cnn-maximum-number-of-detected-instances-per-image#answer_1477336

Hi Andrea,

I understand that you are trying to detect objects using Mask R-CNN and that you observed the model performing better on smaller instances than on larger ones.

The parameter "NumStrongestRegionsPrediction" controls the number of regions with high prediction values to output. Setting it to a value 8 times larger (800 instead of 100) will lead to more instances in the output, and it seems that this is solving your issue. However, increasing the value can also increase the number of false positive instances.

I would suggest that you not change the code locally and instead use the "threshold" parameter. Changing the threshold controls how many instances are selected based on their prediction value. Please note that setting the threshold to lower values may also lead to false positive detections.

Also, if the model is unable to detect more instances with higher prediction values, it is a clear indication that the model is unable to learn properly from the training data. In that case, you need to analyse your model and training data, and maybe re-train your model.
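As a rough illustration of how a score threshold and a cap on the number of outputs interact, here is a minimal Python sketch. It is not the toolbox's implementation: the function `select_detections` and the score values are hypothetical and only model a generic "filter by score, then keep the top K" step.

```python
def select_detections(scores, threshold, max_detections):
    """Keep scores at or above threshold, then keep at most max_detections of the highest."""
    kept = sorted((s for s in scores if s >= threshold), reverse=True)
    return kept[:max_detections]


# Hypothetical image: 600 confident candidates and 50 weak ones (made-up scores).
scores = [0.99] * 600 + [0.30] * 50

strict = select_detections(scores, threshold=0.7, max_detections=100)
loose = select_detections(scores, threshold=0.1, max_detections=100)
raised_cap = select_detections(scores, threshold=0.7, max_detections=800)

print(len(strict), len(loose), len(raised_cap))
```

In this toy case the threshold decides which candidates qualify, while the cap decides how many of them are returned, so the two settings are worth checking separately.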

Hope this helps!

1 Comment

ANDREA MACI 2024-6-27

Thank you for your answer,

It is true that I get more false positives; however, the results I get are still more accurate. I will explain what I think happens at training time and why the model doesn't perform well. It doesn't have to do with the threshold parameter at inference time: the model is very sure about the instances I obtain (instances easily have a score >0.99).

If the model can only detect 100 instances at most, then it will be penalized heavily when it trains on images with 600 instances. Measured against the ground truth data, the network appears to be doing a bad job, so it will keep "changing its mind" about what is an instance and what is not, despite all of them being instances. I'm not sure I've explained myself properly. Moreover, I don't have any evidence to support my point.

All I know is that, in the trainMaskRCNN() function, the parameters "NumStrongestRegions" and "NumRegionsToSample" seem not to change the performance of the model while training. I'm also pretty sure it doesn't inherently have to do with the size of the instances (this is taken into account by the anchor boxes).

The paper I'm working from is "Materials swelling revealed through automated semantic segmentation of cavities in electron microscopy images". The authors obtain good results overall with a Mask R-CNN implemented in Python, which I'm not familiar with. I don't know anything else about their implementation of the neural network, so you might be correct that a ResNet50 backbone isn't enough for this task.

Thank you again for taking the time. In the meantime, since I first posted, I've simply accepted the results I've obtained.
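The arithmetic behind this concern can be sketched as follows, in Python for brevity. The helper `max_recall` is hypothetical; it only computes the best recall possible when the number of outputs is capped, using the instance counts mentioned in this thread. It says nothing about how the toolbox actually computes its training loss, which remains unverified.

```python
def max_recall(num_ground_truth, max_detections):
    """Upper bound on recall when at most max_detections outputs are allowed."""
    if num_ground_truth == 0:
        return 1.0  # nothing to find, so nothing can be missed
    return min(num_ground_truth, max_detections) / num_ground_truth


# Instance counts taken from the dataset description; the cap of 100 is the default discussed above.
for n in (50, 100, 600):
    print(n, round(max_recall(n, 100), 3))
```

With a cap of 100, an image with 600 instances can reach a recall of at most 100/600, about 0.167, even if every returned detection is correct, whereas images with 40-50 instances are unaffected.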
