Problems using NVIDIA GeForce RTX 3090 for Deep Learning
I am having trouble using NVIDIA GeForce RTX 3090 cards for training neural networks with the Deep Learning Toolbox in MATLAB. The problems arise both as error messages and strange behaviour during the training processes of several different CNNs using two different MATLAB releases.
When using MATLAB R2020b, the following error is given when trying to start a training process for a CNN:
Error using trainNetwork (line 183)
GPU support for convolutional neural networks requires a GPU device with compute capability 3.0 or higher.
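One quick check before training is to ask MATLAB how it sees the card. The following is a minimal sketch, assuming Parallel Computing Toolbox is installed; the variable name `status` is only illustrative. On a release without native support for the card, the query itself may fail or report the device as unsupported, which is why it is wrapped in try/catch.

```matlab
% Report how this MATLAB release sees the GPU (a diagnostic sketch, not a fix).
status = 'no CUDA device detected';
try
    if gpuDeviceCount > 0
        gpu = gpuDevice;  % selects the default GPU and returns its properties
        status = sprintf('%s, compute capability %s, supported by this release: %d', ...
            gpu.Name, gpu.ComputeCapability, gpu.DeviceSupported);
    end
catch err
    % Older releases can throw here for cards they do not recognise.
    status = ['GPU query failed: ' err.message];
end
disp(status)
```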
After switching to MATLAB R2019a, the following error message occurs, for example when training a CNN for segmentation using 256x256 image data as input and a batch size of 30 (as well as in several other cases with other types of data):
Error using trainNetwork (line 165)
Unexpected error calling cuDNN: CUDNN_STATUS_EXECUTION_FAILED.
When switching to smaller batch sizes for this particular training process (batch sizes 10 and 20), the training behaves strangely, with a slowly decreasing but almost static loss value (see the two images below from two completely different training processes for two different CNNs). This same behaviour could be observed for different segmentation tasks using different data and different CNN architectures. When NVIDIA TITAN RTX cards were used instead, these training processes ran without problems and showed no similarity to each other.
According to https://se.mathworks.com/matlabcentral/answers/631134-rtx-3080-recompiling-issue-in-matlab-2020a#comment_1173538, incorrect behaviour has been observed with the new NVIDIA cards that use the Ampere architecture, especially when training CNNs. However, no workaround solution for the incorrect behaviour is mentioned.
Is there currently any known solution to these problems?
Answers (3)
Stephan
2020-12-7
Even if you don't like it, the best and probably most up-to-date answer to this question is:
2 Comments
Stephan
2020-12-7
Edited: Stephan
2020-12-7
Since the answer is from Sep 2020, and it includes both the workaround and the note that support will be available in future releases, I don't think so. If so, then you should update your release as soon as an update is available and have a look at the release notes of the corresponding update:
But I think this is what will happen:
Walter Adame Gonzalez
2020-12-13
Hello Julius!
I have also tried to run a CNN training on my RTX 3090 GPU using MATLAB 2020b with my own 256x256x3 dataset. It shows exactly the same behavior as what you are reporting (plateau almost immediately at 80% accuracy for my validation images), and sometimes at the end of the training there is a validation accuracy drop to around 60% (only on the last validation accuracy calculation).
I also tried running the training on the 2020a and 2019b releases, with the same abnormal outcome. Running the training on an NVIDIA MX150 and also on my CPU (Core i7 10700) shows normal behavior. I've written code to run the training in Python (since I got tired of failing) on my 3090. Just let me know if you would like me to share my Python code with you.
Best,
Walter
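The CPU comparison described above can be reproduced by forcing the execution environment in the training options. This is a minimal sketch, assuming the `sgdm` solver and a mini-batch size of 10; both are placeholders, not the settings used in this thread, and `imds` and `layers` are hypothetical names.

```matlab
% Training options that keep training off the GPU entirely.
opts = trainingOptions('sgdm', ...
    'ExecutionEnvironment', 'cpu', ...  % the default 'auto' uses a GPU if one is usable
    'MiniBatchSize', 10, ...
    'Verbose', true);

% net = trainNetwork(imds, layers, opts);  % imds and layers are hypothetical placeholders
```

If the loss curve looks normal on the CPU but plateaus on the GPU with identical data and layers, that points at the GPU code path rather than at the network itself.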
Walter Adame Gonzalez
2020-12-17
Good News!
I received the MATLAB 2021a pre-release version and it works now. No backwards compatibility needed, no compiling problems. Works smoothly. The test ran on an RTX 3090 with drivers up to date as of December 16th. Good luck!
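For readers who are still on R2020b, the forward compatibility setting that comes up in this thread can be switched on from the command line. This is a hedged sketch: `parallel.gpu.enableCUDAForwardCompatibility` is part of Parallel Computing Toolbox from R2020b onward, and the version guard and the `compatMode` variable are illustrative additions. As noted in the question, incorrect behaviour has been reported on Ampere cards, so treat this as a stopgap and verify the results.

```matlab
% Enable CUDA forward compatibility where the function exists (R2020b is version 9.9).
if verLessThan('matlab', '9.9')
    compatMode = 'unavailable';  % the function is not present before R2020b
else
    parallel.gpu.enableCUDAForwardCompatibility(true);  % applies to the current MATLAB session
    compatMode = 'enabled';
end
disp(['Forward compatibility: ' compatMode])
```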
2 Comments
M J
2020-12-19
So it's official with the 2021a version? Training networks works fine with the rtx30 series?
Roland Kruse
2021-2-4
Training CNNs with R2021a on RTX 30xx works well for me, too, quite unlike R2020b with forward compatibility. Predict, however, does not work reliably; I get out-of-memory errors.
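One thing that may be worth trying for out-of-memory errors at inference time is a smaller mini-batch, since `predict` on networks returned by `trainNetwork` processes its input in chunks of `MiniBatchSize` (128 by default). Whether this helps on the RTX 30 series is an assumption, not something confirmed in this thread, and `predictSmallBatches` is a hypothetical helper name.

```matlab
function scores = predictSmallBatches(net, X, miniBatchSize)
% Run predict with a reduced mini-batch size; retry on the CPU if the first attempt errors.
if nargin < 3
    miniBatchSize = 8;  % illustrative value, much smaller than the default of 128
end
try
    scores = predict(net, X, 'MiniBatchSize', miniBatchSize);
catch err
    warning('Prediction failed (%s); retrying on the CPU.', err.message);
    scores = predict(net, X, 'MiniBatchSize', miniBatchSize, ...
        'ExecutionEnvironment', 'cpu');
end
end
```

Saved as `predictSmallBatches.m`, it could be called as `scores = predictSmallBatches(net, XTest, 8);`, where `net` and `XTest` stand for your own trained network and test data.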