Sparse gpuArray accumulation in for-loop

I get an 'Out of memory' error when accumulating a sparse gpuArray in a for-loop.
The following code is in a function. I need to accumulate the result 'KernelCurrent' of every loop iteration into the global sparse gpuArray 'Kernel'. In this function, 'KernelCurrent' is also a sparse gpuArray and has the same size as 'Kernel' (262144×262144).
I have tested every other line of code in this function, and the 'Out of memory' error is caused by the addition (accumulation). The memory required to store both 'Kernel' and 'KernelCurrent' is less than the 'AvailableMemory' reported by gpuDevice.
Kernel = gpuArray(sparse(num_row, num_col))
for
.
.
.
KernelCurrent = Result_oneLoop; % 'KernelCurrent' has the same size as 'Kernel'
Kernel = Kernel + KernelCurrent; % Causes the 'Out of memory' error
end
The gpuDevice that I can access:
Is there an alternative way to code this that avoids the problem? Thanks in advance!
  2 Comments
Andrea Picciau 2019-10-9
Hi Chen,
How many elements do your sparse matrices have?
CHEN ZIXIANG 2019-10-10
Hi Andrea,
The size of the sparse matrices is 262144×262144 (for both Kernel and KernelCurrent).
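The dimensions alone do not determine how much memory a sparse matrix needs; the number of nonzeros does. As a rough sketch, the estimate below assumes 16 bytes per nonzero (an 8-byte double value plus an 8-byte index) and one 8-byte pointer per column, which is how MATLAB stores double sparse matrices in host memory on 64-bit systems. The layout of a sparse gpuArray may differ, so treat the numbers as an order-of-magnitude guide only. The density values are illustrative, not taken from the thread.

```matlab
% Rough per-matrix memory estimate for a 262144-by-262144 sparse double matrix.
n = 262144;                         % matrix dimension from the question
bytesPerNonzero = 16;               % ASSUMPTION: 8-byte value + 8-byte index
bytesPointers = 8 * (n + 1);        % ASSUMPTION: one 8-byte pointer per column

density_percent = [0.01 0.1 1 10];  % illustrative densities, not measured values
nnzEstimate = density_percent / 100 * n^2;
gib = (nnzEstimate * bytesPerNonzero + bytesPointers) / 2^30;

for k = 1:numel(density_percent)
    fprintf('%6.2f %% dense -> about %.2f GiB per matrix\n', ...
        density_percent(k), gib(k));
end
```

The addition also needs room for its result while both operands are presumably still allocated, so the peak requirement is likely higher than the size of 'Kernel' and 'KernelCurrent' combined.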

Accepted Answer

Matt J 2019-10-10
Edited: Matt J 2019-10-10
I would guess that your Kernel matrix is becoming less and less sparse as you accumulate, until its memory consumption grows beyond the GPU's capacity. Add the line below and re-run to check.
Kernel = gpuArray(sparse(num_row, num_col))
for
.
.
.
KernelCurrent = Result_oneLoop;
Kernel = Kernel + KernelCurrent;
percent_density=nnz(Kernel)/numel(Kernel)*100, %<---- Add this
end
How large does the percent_density become before the "Out of memory" occurs?
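The effect described above can be reproduced on the CPU with a small stand-in problem. This is a minimal sketch, not the original code: the size `n`, the per-iteration density `d`, the iteration count, and the use of `sprand` in place of `Result_oneLoop` are all assumptions chosen so the example runs quickly without a GPU.

```matlab
% Small CPU-only demonstration of density growth during accumulation.
n = 2000;              % ASSUMPTION: far smaller than 262144 so this runs quickly
d = 0.001;             % ASSUMPTION: density of each per-iteration term
numIter = 50;          % ASSUMPTION: number of loop iterations

rng(0);                                  % reproducible random patterns
Kernel = sparse(n, n);                   % all-zero sparse accumulator
percent_density = zeros(numIter, 1);
for k = 1:numIter
    KernelCurrent = sprand(n, n, d);     % stand-in for Result_oneLoop
    Kernel = Kernel + KernelCurrent;
    percent_density(k) = nnz(Kernel) / numel(Kernel) * 100;
end

% sprand produces positive values, so entries never cancel and the
% density can only stay the same or grow from one iteration to the next.
disp(percent_density([1 10 numIter]));
```

When each term has a different sparsity pattern, the sum's pattern is the union of all of them, which is why the accumulator fills in even though every individual term is sparse.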
  1 Comment
CHEN ZIXIANG 2019-10-11
Thank you for your answer!
Yes, the sparsity decreases very quickly as the accumulation goes on.
In the end I kept Kernel sparse by using a sparsity-control vector (size 262144×1) with entries of 1 and 0 (only 6 elements have the value 1). The code is now:
Kernel = sparse([]);
parfor
.
.
.
KernelCurrent = Result_oneLoop; % 'KernelCurrent' has the size of (262144×1)
KernelCurrent = KernelCurrent.*Sparsity_Control_Vector;
Kernel = [Kernel, KernelCurrent];
end
As you can see, I no longer use 'gpuArray'. However, the parallel pool still works, and my problem is solved.
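A possible alternative to growing 'Kernel' by concatenation is to collect (row, column, value) triplets and call `sparse` once at the end. The sketch below is not the poster's method; it uses a plain `for` loop so it runs without a parallel pool, and the sizes, the mask positions in `keepRows`, and the `rand` stand-in for `Result_oneLoop` are hypothetical.

```matlab
% Triplet-based assembly of a column-per-iteration sparse matrix.
num_row = 1000;        % ASSUMPTION: illustrative; the thread uses 262144
num_iter = 20;         % ASSUMPTION: number of loop iterations

% HYPOTHETICAL positions of the 6 ones in the sparsity-control vector.
keepRows = [3; 17; 256; 400; 777; 999];

rowsC = cell(num_iter, 1);
colsC = cell(num_iter, 1);
valsC = cell(num_iter, 1);
for k = 1:num_iter
    KernelCurrent = rand(num_row, 1);          % stand-in for Result_oneLoop
    rowsC{k} = keepRows;                       % rows kept by the mask
    colsC{k} = repmat(k, numel(keepRows), 1);  % iteration k fills column k
    valsC{k} = KernelCurrent(keepRows);        % keep only the masked entries
end

% Build the sparse matrix once from the collected triplets.
Kernel = sparse(vertcat(rowsC{:}), vertcat(colsC{:}), vertcat(valsC{:}), ...
    num_row, num_iter);
```

Building the matrix in one call avoids reallocating the sparse storage on every iteration, which may matter when the number of iterations is large.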

More Answers (0)

Release

R2019a
