GPU performance with short vectors

2 次查看(过去 30 天)
Hello - I see GPU computation underperforming when used for vector manipulation with short lengths.
>> a = rand(1000000, 100,'gpuArray');
>> b= gather(a);
>> tic; for i=1:100 ; eval('q = zeros(1000000,1);for i = 1:100; q = b(:,i)+q;end') ; end;doc
Elapsed time is 45.489811 seconds.
>>tic; for i=1:100 ; eval('qq = zeros(1000000,1);for i = 1:100; q = a(:,i)+q;end') ; end;toc
Elapsed time is 0.875140 seconds.
same when done for short vectors see GPU computation under performing:
>> a = rand(200, 100,'gpuArray');
>>b= gather(a);
>> tic; for i=1:100 ; eval('q = zeros(200,1);for i = 1:100; q = b(:,i)+q;end') ; end;doc
Elapsed time is 0.021727 seconds.
>>tic; for i=1:100 ; eval('qq = zeros(200,1);for i = 1:100; q = a(:,i)+q;end') ; end;toc
Elapsed time is 0.833865 seconds.
Any insight will be appreciated.
Thank you.

采纳的回答

Joss Knight
Joss Knight 2016-4-20
编辑:Joss Knight 2016-4-20
Computation in a GPU core is significantly slower than in a modern CPU core. It makes up for that by having a lot of them - thousands. If you don't give it thousands of things to do at once, you're never going to beat the CPU.
In your simple computation above you are unnecessarily using a loop. This may have been for illustrative purposes, but if it reflects your actual code, you will gain back your performance by removing the loop, i.e.
q = sum(a, [], 2);

更多回答(1 个)

Walter Roberson
Walter Roberson 2016-3-30
Do not use eval() for this. use timeit()
  3 个评论
MatlabNinja
MatlabNinja 2016-3-30
Thank you for your insight. time and gputimeit gives very similar results and shows similar trend where smaller vector(a & b above) had worse run performance when run on GPU.

请先登录,再进行评论。

类别

Help CenterFile Exchange 中查找有关 Mathematics 的更多信息

标签

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by