Error using gpucoder.profile (line 41)
I have a problem running:
cfg = coder.gpuConfig('exe');
cfg.GpuConfig.MallocMode = 'discrete';
gpucoder.profile('tvd_sim2_MEX',ARGS{1},'CodegenConfig',cfg,...
'CodegenArguments','-d profilingdir','Threshold',0.001);
I get this error:
Error using gpucoder.profile (line 41)
Incorrect class for expression 'x': expected 'double' but found 'coder.PrimitiveType'.
What should I do?
My setup:
NVIDIA GPU Computing Toolkit\CUDA\v10.2
Microsoft Visual C++ 2019 (C)
MATLAB R2020b
GPU with CUDA compute capability 5.2
Thanks!
6 Comments
Justin Hontz
2022-11-3
This appears to be a limitation of gpucoder.profile: it does not properly handle certain codegen input specification types, such as those created with coder.typeof. I will submit an enhancement request for this.
For your particular example, can you try manually converting the codegen inputs to runtime inputs and then passing the runtime inputs to gpucoder.profile? For instance, coder.typeof(1) should be converted to 1.
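The conversion described above could be automated along these lines. This is only a sketch: specToValue is a hypothetical helper (it is not part of MATLAB Coder or GPU Coder), the two example specifications are made up, and it only handles fixed-size coder.PrimitiveType inputs with numeric or logical classes.

```matlab
% Example specifications (made up for illustration).
specs = {coder.typeof(0, [4 4]), coder.typeof(single(0), [1 8])};

% Replace every type specification with a placeholder runtime value.
runtimeArgs = cellfun(@specToValue, specs, 'UniformOutput', false);

function v = specToValue(t)
% Hypothetical helper: returns a runtime value of zeros (or false)
% matching the class and size of a coder.PrimitiveType specification.
% Values that are not coder type objects are returned unchanged.
if isa(t, 'coder.PrimitiveType')
    sz = t.SizeVector;
    if any(isinf(sz))
        error('specToValue:unbounded', ...
            'Unbounded dimensions need a concrete size chosen by hand.');
    end
    if strcmp(t.ClassName, 'logical')
        v = false(sz);
    else
        v = zeros(sz, t.ClassName);   % numeric classes only in this sketch
        if t.Complex
            v = complex(v);           % keep the complexity of the spec
        end
    end
elseif strncmp(class(t), 'coder.', 6)
    error('specToValue:unsupported', ...
        'Type %s is not handled in this sketch.', class(t));
else
    v = t;                            % already a runtime value
end
end
```

With the inputs from the question, the same idea would be applied to ARGS{1} before calling gpucoder.profile. For variable-size specifications, SizeVector holds upper bounds, so a representative size may be a better choice than the bound itself.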
Emiliano Rosso
2022-11-4
Edited: Emiliano Rosso
2022-11-4
Justin Hontz
2022-11-4
In your example, can you try changing
ARGS{1}{1} = coder.typeof(0,[mex1 mex1]);
to something like
ARGS{1}{1} = zeros(mex1, mex1);
and do something similar for the other inputs? If you want the input to be passed on GPU, you can instead do
ARGS{1}{1} = zeros(mex1, mex1, 'gpuArray');
but for SIL execution, I'm not sure there is much reason to want to use GPU inputs.
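Put together with the configuration from the question, the profiling call might then look like the sketch below. The value of mex1 is a placeholder (the real value comes from the original script), only the first input is shown, and the remaining inputs of tvd_sim2_MEX would need the same treatment.

```matlab
% Assumption: mex1 = 64 is a placeholder, not the value from the thread.
mex1 = 64;

% Runtime value instead of coder.typeof(0,[mex1 mex1]).
ARGS = cell(1, 1);
ARGS{1} = cell(1, 1);
ARGS{1}{1} = zeros(mex1, mex1);
% ... fill ARGS{1}{2}, ARGS{1}{3}, etc. with runtime values as well ...

% Same configuration and options as in the question.
cfg = coder.gpuConfig('exe');
cfg.GpuConfig.MallocMode = 'discrete';
gpucoder.profile('tvd_sim2_MEX', ARGS{1}, 'CodegenConfig', cfg, ...
    'CodegenArguments', '-d profilingdir', 'Threshold', 0.001);
```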
Emiliano Rosso
2022-11-5
Justin Hontz
2022-11-7
The code generated for profiling is roughly the same as the code generated by regular SIL codegen, with some additional profiling API calls inserted so that the profiler can collect its measurements.
The error occurred because the profiler was trying to pass the codegen input specification value (e.g. one produced by coder.typeof) to the SIL executable, which is not valid as a runtime input.
The 'Gpu' option of coder.typeof simply controls whether the input of the generated entry-point function is passed on the GPU or not. When the input already resides on the GPU (e.g. a GPU array input for MEX), this can improve performance by eliminating the cudaMemcpy calls that would otherwise run each time the entry-point function is executed.
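As a rough illustration of the 'Gpu' option for a MEX target, consider the sketch below. The entry-point name myKernel, the output name myKernel_gpu_mex, and the size n are all hypothetical; a function file myKernel.m taking one double matrix is assumed to exist on the path.

```matlab
% Assumptions: myKernel.m exists and takes one n-by-n double input.
n = 512;
cfg = coder.gpuConfig('mex');

% Specification for an input that is passed on the GPU.
gpuSpec = coder.typeof(zeros(n, n), 'Gpu', true);

% Generate a MEX function whose input is expected on the GPU.
codegen -config cfg -args {gpuSpec} myKernel -o myKernel_gpu_mex

% Call it with a gpuArray, so the data does not have to be copied
% from host memory on each call.
x = rand(n, n, 'gpuArray');
y = myKernel_gpu_mex(x);
```

Whether this pays off depends on where the data lives before the call: if the input starts in host memory anyway, the copy to the GPU still has to happen somewhere.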
Emiliano Rosso
2022-11-8
Answers (0)