Question about a major difference in computation speed with gpuArrays

Question


I've recently been trying to optimize my code for a project and noticed an interesting phenomenon. I have searched online for what might cause it, but have not found a good answer so far.

The following code runs extremely fast:

x = linspace(-20,20,25);
z = linspace(0,100,29);
Columns = 5;   % not used below
singleframeofdata = gpuArray(rand(2816,128,'single'));
fgpu = gpuArray(rand(2816,1,'single'));
tofgpu = rand(length(z),length(x),128,'single');
% Tile the frame once per (z,x) point so that it matches the size of y
SingleFrameOfDatarep = repmat(singleframeofdata,1,length(z)*length(x));
y = -2i*pi*-1*fgpu*reshape(tofgpu,1,size(tofgpu,1)*size(tofgpu,2)*size(tofgpu,3),1);
% Case 1: write the product to a new variable
tic
holder = SingleFrameOfDatarep.*y;
toc
clear holder
% Case 2: overwrite the input with the product
tic
SingleFrameOfDatarep = SingleFrameOfDatarep.*y;
toc

The reported time for the holder assignment is around 0.09 s, while the SingleFrameOfDatarep assignment takes around 0.00009 s. I know that the second calculation should be faster because it operates in place.

However, if I change x = linspace(-20,20,25) to x = linspace(-20,20,26), a drastic slowdown occurs. The holder assignment again takes around 0.09 s, while the SingleFrameOfDatarep assignment now takes around 0.07 s. The original code ran roughly 770 times faster than the modified code.

My only explanation is that when an array gets too large, MATLAB creates a new variable, as it does for holder, and that the slowdown comes from this allocation time. However, I am not sure about this, nor do I know how to test for it.

Could anyone point me to something to read on this, or offer a possible explanation or solution?

1 Comment

Joss Knight 2017-6-22

I can't check your code right now, but I can say two things. First, MATLAB has a GPU memory pool, and when GPU memory use overflows the pool, raw allocations are made; those allocations force synchronization, which is slow. Second, your timing with tic and toc is flawed because the GPU operates asynchronously. This means that when toc reports the time, the previous command may still be running. What happens when you insert wait(gpuDevice) before each tic and before each toc? You may find the timings change completely.
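As a minimal sketch of that suggestion, the snippet below times one element-wise multiply with tic and toc while forcing synchronization. It assumes a supported GPU and Parallel Computing Toolbox; the array sizes and the variable names (dev, A, B, C, tSync) are illustrative and are not taken from the question.

```matlab
% Sketch: time a GPU operation with tic/toc while forcing synchronization.
% Sizes are illustrative and smaller than those in the question.
dev = gpuDevice;                              % currently selected GPU
A = rand(2816, 1024, 'single', 'gpuArray');
B = rand(2816, 1024, 'single', 'gpuArray');

wait(dev);                                    % let pending GPU work finish first
tic
C = A .* B;
wait(dev);                                    % block until the multiply has completed
tSync = toc;

fprintf('Synchronized time: %.6f s\n', tSync);
```

Without the second wait, toc can return before the multiply has finished, so the reported time may mostly reflect how long it took to queue the operation.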

Finally, you should use gpuArray.rand not gpuArray(rand(...)). The former creates random data directly on the GPU; the latter does it slowly on the CPU then copies the data over to the device.
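The difference between the two creation paths can be sketched as follows. This version uses the 'gpuArray' option of rand, which should behave like the gpuArray.rand form named above; the variable names are illustrative, and a supported GPU is assumed.

```matlab
% Slow path: generate the data on the CPU, then copy it to the device.
aCopied = gpuArray(rand(2816, 128, 'single'));

% Direct path: generate the data on the GPU, with no host-to-device copy.
% (The comment above names gpuArray.rand for the same purpose.)
aDirect = rand(2816, 128, 'single', 'gpuArray');

% Both results are gpuArray objects of the same size and underlying type.
disp(class(aCopied));
disp(class(aDirect));
```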


Answer 1

Matt J 2017-6-22

Edited: Matt J 2017-6-22


The times that you see are probably not real. tic() and toc() are not suitable for timing GPU operations; use gputimeit() instead, as below. Implemented this way, I see no significant speed difference between any of the cases that you tested.

    x = linspace(-20,20,25);
    z = linspace(0,100,29);
    singleframeofdata = gpuArray(rand(2816,128,'single'));
    fgpu = gpuArray(rand(2816,1,'single'));
    tofgpu = rand(length(z),length(x),128,'single');
    SingleFrameOfDatarep = repmat(singleframeofdata,1,length(z)*length(x));
    y = -2i*pi*-1*fgpu*reshape(tofgpu,1,size(tofgpu,1)*size(tofgpu,2)*size(tofgpu,3),1);

    % gputimeit waits for the GPU to finish before reporting a time
    gputimeit(@() fun(SingleFrameOfDatarep,y) )
    gputimeit(@() hfun(SingleFrameOfDatarep,y) )

    % Local functions (in a script, these go last and each closes with end)
    function SingleFrameOfDatarep = fun(SingleFrameOfDatarep,y)
        SingleFrameOfDatarep = SingleFrameOfDatarep.*y;
    end

    function holder = hfun(SingleFrameOfDatarep,y)
        holder = SingleFrameOfDatarep.*y;
    end

Incidentally, your code should also get a bit faster (and will certainly use less memory) if you use bsxfun instead of repmat.
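One way the bsxfun suggestion could look is sketched below. It is run on small CPU arrays so that the equivalence with the repmat version can be checked anywhere; bsxfun also accepts gpuArray inputs. The sizes and the names nSamples, nChannels, nPoints, data, and y3 are assumptions made for illustration.

```matlab
% Small stand-ins for the arrays in the question (sizes are illustrative).
nSamples = 8; nChannels = 4; nPoints = 6;     % question: 2816, 128, 29*25
data = rand(nSamples, nChannels, 'single');
y    = complex(rand(nSamples, nChannels*nPoints, 'single'), ...
               rand(nSamples, nChannels*nPoints, 'single'));

% Original approach: replicate the data, then multiply element-wise.
viaRepmat = repmat(data, 1, nPoints) .* y;

% bsxfun approach: view y as nSamples-by-nChannels-by-nPoints and let
% bsxfun expand data along the third dimension without a replicated copy.
y3        = reshape(y, nSamples, nChannels, nPoints);
viaBsxfun = reshape(bsxfun(@times, data, y3), nSamples, []);

maxDiff = max(abs(viaRepmat(:) - viaBsxfun(:)));
fprintf('Largest difference between the two results: %g\n', maxDiff);
```

This avoids storing the replicated copy of the data, although y itself is still full size. On R2016b and later, data .* y3 should give the same result through implicit expansion.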

1 Comment

Hunter Palcich 2017-6-23

Using the code you provided, I also see that the times are the same. Thank you as well for letting me know about bsxfun.
