MATLAB R2013a GPU memory leak
I have been running very long loops (millions of iterations) in which each iteration calls a few CUDA kernels via feval on pre-allocated arrays of fixed size. Host memory grows linearly with the number of iterations, and eventually MATLAB crashes. While trying to isolate the problem I found the following:
- When calling a CUDA kernel via feval, every argument has to be a gpuArray already, even scalar variables, or host memory grows. The same applies to functions like gpuArray.rand or randn:
n = 1e4;
for i = 1:1e6
out = gpuArray.rand(n,1,'single');
end
The above code causes host memory to grow for the whole duration of the run (about 100 MB per 250,000 iterations). If you write n = gpuArray(1e4); instead of n = 1e4; the same loop does not cause memory to grow. However, the loop runs about 3 times faster when n is in host memory than when n is a gpuArray.
- Even more puzzling is the following example:
x = gpuArray.rand(1e4,1,'single');
for i = 1:1e6
out = sqrt(x);
end
The above loop does not cause MATLAB's memory footprint to grow. However, if we replace sqrt(x) with sqrt(1./x), the memory blow-up returns. I am using MATLAB R2013a 64-bit on Windows 7 Professional. My video card is a GTX 650 with 2 GB of memory. Thanks in advance for any insights.
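The four loop bodies described above can be compared with a small harness. This is only a sketch: the function names leakCheck and hostGrowthMB and the iteration count are made up for illustration, and it assumes Windows, since the MEMORY function is Windows-only. Wrapping each body in an anonymous function may itself change the timing or synchronisation behaviour, so the numbers are indicative only.

```matlab
function growthMB = leakCheck()
% leakCheck  Report host-memory growth for the four loop bodies in the post.
% Assumptions: Windows (MEMORY is Windows-only) and a supported GPU.
% The function names and numIter are illustrative, not from the thread.
x     = gpuArray.rand(1e4, 1, 'single');
nHost = 1e4;              % scalar held in host memory
nGpu  = gpuArray(1e4);    % same scalar held on the GPU

bodies = { ...
    'rand, n on host', @() gpuArray.rand(nHost, 1, 'single'); ...
    'rand, n on GPU',  @() gpuArray.rand(nGpu,  1, 'single'); ...
    'sqrt(x)',         @() sqrt(x); ...
    'sqrt(1./x)',      @() sqrt(1 ./ x)};

numIter  = 2.5e5;
growthMB = zeros(size(bodies, 1), 1);
for k = 1:size(bodies, 1)
    growthMB(k) = hostGrowthMB(bodies{k, 2}, numIter);
    fprintf('%-16s %8.1f MB over %d iterations\n', ...
        bodies{k, 1}, growthMB(k), numIter);
end
end

function growthMB = hostGrowthMB(body, numIter)
% Run BODY numIter times and return the change in MATLAB's host memory use.
wait(gpuDevice());            % drain pending GPU work before measuring
before = memory;              % first output is the user view of memory
for ii = 1:numIter
    out = body(); %#ok<NASGU> % result is discarded, as in the original loops
end
after = memory;
growthMB = (after.MemUsedMATLAB - before.MemUsedMATLAB) / 2^20;
end
```

Saved as leakCheck.m and called as g = leakCheck(); it prints one line per loop body and returns the growth figures in MB.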
3 Comments
Ben Tordoff
2013-6-4
Thanks Michael, you are indeed right and this appears to be a bug introduced in R2013a. There is no realistic work-around I can provide right now, but I will post an update here once I have some more helpful suggestions.
The reason the memory does not leak with certain calls is that they force a synchronisation event: in your sqrt(x) example, SQRT can error, so it has to wait to see whether the error was hit; in the gpuArray.rand example with "n" stored as a gpuArray, the scalar parameter has to be transferred back to host memory, which also causes a sync. You could achieve the same by inserting a "wait(gpu)" after every call:
gpu = gpuDevice();
for ii = 1:1e8
    out = gpuArray.rand(1e3,1,'single');
    wait(gpu);   % block until the GPU has finished, forcing a sync
    disp(ii)     % show progress
end
but that will also slow things down a lot and is hardly a practical solution.
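One variation that might reduce the cost is to synchronise only every so often instead of after every call. This is not suggested in the thread and has not been verified against R2013a: it assumes that whatever accumulates between syncs is released once wait returns. The interval syncEvery is a hypothetical value that would need tuning by measurement.

```matlab
gpu       = gpuDevice();
syncEvery = 1000;   % hypothetical interval, not from the thread
numIter   = 1e6;

for ii = 1:numIter
    out = gpuArray.rand(1e3, 1, 'single');
    if mod(ii, syncEvery) == 0
        wait(gpu);  % block until all queued GPU work has completed
    end
end
wait(gpu);          % final sync so nothing is left pending
```

A larger interval means fewer stalls but more unsynchronised calls between them, so host memory use should be watched while adjusting it.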
Accepted Answer
Ben Tordoff
2013-6-18
Hi Michael, could you read the following bug-report and try the workaround it contains (being careful about the backing-up step!):
If this does not fix the problem, please let me know as soon as possible.
Ben