Why do I get CUDA execution errors when training my network on a GPU?

Why do I get the following error when training my neural network:
An unexpected error occurred during CUDA execution. The CUDA error was:
all CUDA-capable devices are busy or unavailable
The error occurs only when training on a GPU, not on the CPU.

Accepted Answer

MathWorks Support Team
Edited: MathWorks Support Team on 19 May 2021
The most likely cause is a kernel execution timeout.
To confirm this, try running some simple gpuArray commands, such as:
A = gpuArray(rand(10))
B = A+1
If the above runs without any warnings or errors, the GPU itself is usable from MATLAB, and the training error is likely due to kernel timeouts.
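As a further check, the properties of the selected device can be inspected. The following is a minimal sketch, assuming Parallel Computing Toolbox is installed; KernelExecutionTimeout and ComputeMode are properties of the object returned by gpuDevice. A value of 1 for KernelExecutionTimeout would be consistent with the timeout explanation, although it does not prove it.

```matlab
% Inspect the currently selected GPU (requires Parallel Computing Toolbox).
if gpuDeviceCount > 0
    d = gpuDevice;  % selects and returns the default GPU device
    fprintf('Device: %s\n', d.Name);
    % 1 means the operating system can abort long-running kernels.
    fprintf('Kernel execution timeout enabled: %d\n', d.KernelExecutionTimeout);
    fprintf('Compute mode: %s\n', d.ComputeMode);
else
    disp('No supported GPU detected.');
end
```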
Some possible workarounds:
  1. Scale down the problem so that it does not time out (e.g. use a smaller network or less data), or use a different card that does not time out.
  2. Some GPUs can be switched to compute-only mode (TCC, Tesla Compute Cluster), but others cannot. Check whether your GPU allows changing to that mode.
  3. Modify the registry to increase the TDR (Timeout Detection and Recovery) delay value, as explained in the web page below:
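The three workarounds above could be approached from within MATLAB roughly as follows. This is only a sketch under several assumptions that are not part of the original answer: that training uses trainingOptions from Deep Learning Toolbox, that reducing MiniBatchSize is an acceptable way to shrink the work per kernel launch (the value 16 is purely illustrative), that nvidia-smi is installed and on the system path, and that the machine runs Windows. The registry step only reads the current value; it does not change it.

```matlab
% Workaround 1: reduce the amount of work per training iteration.
% Assumption: training is configured through trainingOptions;
% the mini-batch size of 16 is an illustrative value, not a recommendation.
options = trainingOptions('sgdm', ...
    'MiniBatchSize', 16, ...
    'ExecutionEnvironment', 'gpu');

% Workaround 2: report which driver model (WDDM or TCC) each GPU uses.
% Assumption: nvidia-smi is available on the system path.
[status, out] = system('nvidia-smi --query-gpu=name,driver_model.current --format=csv');
if status == 0
    disp(out);
else
    disp('nvidia-smi could not be run from this session.');
end

% Workaround 3: read the current TDR delay (Windows only, read-only check).
if ispc
    try
        tdrDelay = winqueryreg('HKEY_LOCAL_MACHINE', ...
            'SYSTEM\CurrentControlSet\Control\GraphicsDrivers', 'TdrDelay');
        fprintf('TdrDelay is set to %d seconds.\n', tdrDelay);
    catch
        % winqueryreg throws an error when the value does not exist.
        disp('TdrDelay is not set; Windows uses its default timeout.');
    end
end
```

Changing the driver model or the registry value typically requires administrator rights and a restart, so checking the current state first is the less invasive step.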

Release

R2020b
