CUDA Kernels and long vectors

7 次查看(过去 30 天)
Raphael
Raphael 2013-2-22
Hello!
At the moment i'm writing my first programs with MATLAB and Cuda. I wrote a kernel in .cu file, compiled to ptx and execute it with a feval function.
[dResultGPU]= feval(k_calc,dResultGPU,g_x,g_y,nLines,nColumns);
I have a ThreadBlockSize of 1024 and a maximum gridsize of [65535 65535].
Everything works fine so far, but i have some troubles with the indexing in my kernel.
When I want to add vectors with length 10e7 i am not able to get the index in the kernels right.
In my m-File I set the my grid size like
nBlocks = ceil(nPoints / k_calc.ThreadBlockSize(1));
if nBlocks <= 65535
k_calc.GridSize = nBlocks;
else
k_calc.GridSize = [65535 ceil(nBlocks/65535)];
end
In my example with a vector with 10e7 elements, 10e7 is bigger than 65535*1024, so i have a gridsize of [65535 2].
In my cu-kernel I tried the index
int idx = blockDim.x * blockIdx.x + threadIdx.x;
but this is wrong for elements with index greater than 65535*1024. Which cuda-variable tells me in which row of my grid i am?
gridDim.x gives me only the Dimensions not the current location as far as i know.
Thank you very much, Raphael

回答(1 个)

Ben Tordoff
Ben Tordoff 2013-2-26
编辑:Ben Tordoff 2013-2-26
If you want to go down the "x" dimension first, you probably want
int const globalBlockIdx = blockIdx.y * gridDim.x + blockIdx.x;
int const globalThreadIdx = globalBlockIdx * blockDim.x + threadIdx.x;
or something similar. This assumes your grid is only 2D and your blocks are only 1D.

类别

Help CenterFile Exchange 中查找有关 GPU Computing 的更多信息

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by