Why don't MEX files created by MATLAB Coder and GPU Coder give exactly the same results?

Hi,
I created an image processing function and converted it to a MEX file with MATLAB Coder to speed up my processing. As I have an NVIDIA GPU with CUDA compute capability 6.1, I thought I would also create a MEX file with GPU Coder.
But why are the outputs of the two MEX files not exactly the same? I passed the same input image and other parameters, but the output image from the GPU Coder MEX is a little more blurry. See the comparison below.
Can somebody explain why? I used the commands below to create the GPU Coder MEX:
cfg = coder.gpuConfig('mex');
cfg.GpuConfig.CompilerFlags = '--fmad=false';
cfg.GpuConfig.ComputeCapability = '6.1';
codegen -args {inputImage, otherParameters} -config cfg imProcessFunction
For a more detailed analysis, I subtracted the two images above and got the image below:
difference = CPUimage-GPUimage;
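One caveat about the subtraction above: if the images are stored as an unsigned integer class such as uint8 (an assumption, since the class is not stated in the post), MATLAB integer arithmetic saturates, so every pixel where the GPU value exceeds the CPU value shows up as 0. A minimal sketch of a signed comparison, using small made-up arrays in place of the real CPUimage and GPUimage:

```matlab
% Stand-in data (hypothetical); replace with the real CPU and GPU MEX outputs.
CPUimage = uint8([10 20; 30 40]);
GPUimage = uint8([12 20; 29 40]);

% Integer subtraction saturates at 0, so negative differences are lost.
saturatedDiff = CPUimage - GPUimage;

% Converting to double first keeps the sign of each difference.
signedDiff = double(CPUimage) - double(GPUimage);

maxAbsDiff   = max(abs(signedDiff(:)));   % largest per-pixel deviation
numDifferent = nnz(signedDiff);           % number of pixels that differ
fprintf('max |diff| = %g, differing pixels = %d\n', maxAbsDiff, numDifferent);
```

Reporting the maximum absolute difference and the count of differing pixels gives a number to judge against, rather than relying on how the difference image looks.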

11 Comments

How do the results look if you run the code on the GPU interactively, without MEX? In other words, we need to isolate whether this is due to using the GPU at all, or due to building the GPU code as a MEX file.
Differences in GPU results are generally expected due to a different order of operations.
NVIDIA GPU hardware floating point isn't necessarily identical to x86, is it? There are also the rounding modes of the Intel FPU; I have no idea what TMW uses, but I don't think it's at all surprising that the two aren't identical.
There may be things you can do with compiler switches to make the results more nearly alike, but I don't think there's any guarantee they can be made the same. Transcendental functions can be a factor if there are any of those in the algorithm.
@Walter I didn't understand your first comment. What exactly can I do?
But let me go through the link in your second comment.
My quick search didn't uncover the link Walter shows, and while a detailed read will provide a lot of info, I very quickly found two points...
"4.5. Differences from x86: NVIDIA GPUs differ from the x86 architecture in that rounding modes are encoded within each floating point instruction instead of dynamically using a floating point control word. Trap handlers for floating point exceptions are not supported. On the GPU there is no status flag to indicate when calculations have overflowed, underflowed, or have involved inexact arithmetic."
Beyond that, different compilers can use different optimization levels or build a different execution chain for the same calculation, so the order of operations isn't necessarily the same. Floating-point arithmetic isn't exactly associative (regrouping the same additions or multiplications can change the rounding), so those effects can show up.
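To illustrate the order-of-operations point, here is a small sketch that runs entirely on the CPU. The values are arbitrary and it is not the poster's algorithm; the forward and backward loops are only a rough stand-in for a CPU loop versus a differently ordered GPU reduction.

```matlab
% Double precision: the same three numbers, grouped two ways.
a = (0.1 + 0.2) + 0.3;
b = 0.1 + (0.2 + 0.3);
fprintf('a == b ? %d, a - b = %g\n', a == b, a - b);

% Single precision: sum the same values front-to-back and back-to-front.
x = single(1) ./ single(1:100000);
forwardSum = single(0);
for k = 1:numel(x)
    forwardSum = forwardSum + x(k);
end
backwardSum = single(0);
for k = numel(x):-1:1
    backwardSum = backwardSum + x(k);
end
fprintf('forward = %.8f, backward = %.8f\n', forwardSum, backwardSum);
```

The two groupings in double precision differ in the last bit, and the single-precision sums typically differ in more digits. An image filter that accumulates many products per pixel is exposed to the same effect.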
"5.1. Mathematical Function Accuracy ... The consequence is that different math libraries cannot be expected to compute exactly the same result for a given input. This applies to GPU programming as well. Functions compiled for the GPU will use the NVIDIA CUDA math library implementation while functions compiled for the CPU will use the host compiler math library implementation (e.g., glibc on Linux). Because these implementations are independent and neither is guaranteed to be correctly rounded, the results will often differ slightly."
As noted, you may be able to do something with the compiler to make the rounding more nearly the same (provided you can determine what mode TMW is using), but there's probably nothing you can do about differences in the libraries.
What Walter was suggesting in his first comment was to run the GPU code interactively instead of compiled, but I don't think that is likely to help, because the GPU instructions have to run in that environment either way, so whatever is different is, well, "just different".
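For anyone who wants to try the interactive comparison anyway, a rough sketch follows. It assumes Parallel Computing Toolbox is installed and that the processing function accepts gpuArray inputs, which may not hold for imProcessFunction. A 3x3 box filter built with conv2 is used as a hypothetical stand-in, the test image is made up, and the GPU branch is skipped when no device is found.

```matlab
% Made-up test image and a hypothetical stand-in for imProcessFunction.
inputImage = single(magic(8)) / 64;
kernel     = ones(3, 'single') / 9;               % 3x3 box blur
processFcn = @(img) conv2(img, kernel, 'same');

% Reference result computed on the CPU.
cpuOut = processFcn(inputImage);

% Run the same function on the GPU only if a device is available.
hasGpu = exist('gpuDeviceCount', 'file') > 0 && gpuDeviceCount > 0;
if hasGpu
    gpuOut = gather(processFcn(gpuArray(inputImage)));
    maxAbsDiff = max(abs(double(cpuOut(:)) - double(gpuOut(:))));
    fprintf('interactive GPU vs CPU: max |diff| = %g\n', maxAbsDiff);
else
    disp('No supported GPU found; skipping the interactive GPU run.');
end
```

If the interactive gpuArray result differs from the CPU result by about as much as the GPU Coder MEX does, that points at GPU arithmetic and libraries rather than at the MEX build itself.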
You could perhaps look at pieces of the algorithm, try to isolate particular calculations, and maybe eventually find which part is the culprit, but I would not hold out much hope for "fixing" it.
Thanks @dpb for your consideration and the carefully written summary.
What can I do with the compiler, and how?
And what is TMW?
TMW --> The MathWorks, publisher of MATLAB
As for the compilers, you'd have to study their documentation to see what options they have; I have neither product, so I have no knowledge of either. And they're so paranoid they won't even let me read the documentation without a license, so I can't go looking to see what I can find, sorry... :(
@dpb
Which documentation exactly are you referring to?
Maybe I can attach a copy here.


Answers (0)
