ptxas fatal : Unresolved extern function 'cudaGetParameterBufferV2' with MATLAB R2017a on GTX 1080

Hi,
I am getting errors when trying to use dynamic parallelism on my GTX 1080 card. My CUDA code is in .cu files, which I compile and run from MATLAB R2017a.
Call from MATLAB:
system('nvcc child_kernel.cu parent_kernel.cu -dc -gencode=arch=compute_61,code=compute_61 -m64 -rdc=true -lcudadevrt -L"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\lib\x64" -ptx -Wno-deprecated-gpu-targets -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin"');
kernel_test = parallel.gpu.CUDAKernel('parent_kernel.ptx', 'parent_kernel.cu', 'parent_kernel');
kernel_test.ThreadBlockSize = [1,0,0];
kernel_test.GridSize = [1,0,0];
I get the following error:
Error using parallel.gpu.CUDAKernel
An error occurred during PTX compilation of <image>.
The information log was:
The error log was:
ptxas fatal : Unresolved extern function 'cudaGetParameterBufferV2'
The CUDA error code was: CUDA_ERROR_NO_BINARY_FOR_GPU.
Error in test_cuda (line 36)
kernel_test = parallel.gpu.CUDAKernel('test_cuda_fncall_frm_cuda.ptx', 'test_cuda_fncall_frm_cuda.cu', 'test_cuda_fncall_frm_cuda');
Device info:
Name: 'GeForce GTX 1080'
Index: 1
ComputeCapability: '6.1'
SupportsDouble: 1
DriverVersion: 8
ToolkitVersion: 8
cmd:
nvcc child_kernel.cu parent_kernel.cu -dc -gencode=arch=compute_61,code=compute_61 -m64 -rdc=true -lcudadevrt -L"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\lib\x64" -ptx -Wno-deprecated-gpu-targets -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin"

Accepted Answer

Joss Knight 2017-7-15
Edited: Joss Knight 2017-7-15
Dynamic parallelism is not supported in MATLAB CUDAKernel objects. You need to use a MEX function instead. Sorry.
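For anyone wanting a starting point, here is a rough sketch of what the MEX route could look like. It is not taken from the original poster's code: the kernel bodies (a parent that launches a child which doubles each element), the error IDs, and the file name `dp_example.cu` are all made up for illustration. It assumes a real double gpuArray input and a device of compute capability 3.5 or higher.

```cuda
// dp_example.cu (hypothetical file name): minimal CUDA MEX sketch that uses
// dynamic parallelism. Kernel bodies are illustrative assumptions only.
#include "mex.h"
#include "gpu/mxGPUArray.h"

// Child kernel, launched from the device: doubles each element.
__global__ void child_kernel(const double *in, double *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = 2.0 * in[i];
    }
}

// Parent kernel: one thread launches the child grid from device code.
__global__ void parent_kernel(const double *in, double *out, int n)
{
    if (blockIdx.x == 0 && threadIdx.x == 0 && n > 0) {
        int threads = 256;
        int blocks = (n + threads - 1) / threads;
        child_kernel<<<blocks, threads>>>(in, out, n);
    }
}

void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    // Initialize the MathWorks GPU API before any mxGPU* call.
    mxInitGPU();

    if (nrhs != 1 || !mxIsGPUArray(prhs[0])) {
        mexErrMsgIdAndTxt("example:dp:input", "Expected one gpuArray input.");
    }

    const mxGPUArray *A = mxGPUCreateFromMxArray(prhs[0]);
    if (mxGPUGetClassID(A) != mxDOUBLE_CLASS || mxGPUGetComplexity(A) != mxREAL) {
        mxGPUDestroyGPUArray(A);
        mexErrMsgIdAndTxt("example:dp:type", "Input must be a real double gpuArray.");
    }

    // Allocate an output gpuArray with the same size as the input.
    const mwSize *dims = mxGPUGetDimensions(A);
    mxGPUArray *B = mxGPUCreateGPUArray(mxGPUGetNumberOfDimensions(A), dims,
                                        mxDOUBLE_CLASS, mxREAL,
                                        MX_GPU_DO_NOT_INITIALIZE);
    mxFree((void *)dims);

    const double *d_A = (const double *)mxGPUGetDataReadOnly(A);
    double *d_B = (double *)mxGPUGetData(B);
    int n = (int)mxGPUGetNumberOfElements(A);

    // Host-side launch of the parent; the child is launched on the device.
    parent_kernel<<<1, 1>>>(d_A, d_B, n);

    // Hand the result back to MATLAB as a gpuArray and release the wrappers.
    plhs[0] = mxGPUCreateMxArrayOnGPU(B);
    mxGPUDestroyGPUArray(A);
    mxGPUDestroyGPUArray(B);
}
```

Device-side launches need relocatable device code and the `cudadevrt` library at link time (the same `-rdc=true -lcudadevrt` flags as in the question). The `mexcuda` documentation describes a `-dynamic` option for this, e.g. `mexcuda -dynamic dp_example.cu`, but check that your release has it. From MATLAB the call would then be something like `B = dp_example(gpuArray(rand(1000,1)));`.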

More Answers (2)

Pavel Sinha 2017-7-18
Thanks!
Is there any additional advantage to writing a MEX CUDA wrapper over using a CUDAKernel object, given that I do not use dynamic parallelism?
Also, when using CUDAKernel objects, once the kernel objects are compiled, is there any delay in MATLAB actually launching the CUDA kernels compared to the MEX counterpart?
I have all the data loaded on the GPU and want to launch a series of these CUDAKernels one after the other. Is there any speed advantage if I write a single MEX function that calls one CUDA kernel and then internally launches multiple CUDA kernels one after the other? My main objective is speed; even a 10% speedup would matter.

Pavel Sinha 2017-7-18
Also, I am using R2017a. Does MATLAB's convn use cuDNN? If not, is there any convolution function in MATLAB that uses cuDNN, or any wrapper function for calling cuDNN from MATLAB?
