From CPU code to GPU

MartinM 2021-8-19
Hello Everybody
I have a heavy code: a Split-Step Fourier program that calls several external functions.
The main vector (the propagating one) has 2^16 to 2^18 points. It currently runs as a classic CPU program.
My computer has a CUDADevice with properties: Name: 'Quadro RTX 4000', so I am trying to translate my program to run on it.
Before that, I tested the speed of the GPU against the CPU with some code I found here, and the GPU was faster. Perfect.
To translate my program, I made this modification:
A=gpuArray(A);
B=gpuArray(B);
I tried to do this for most of my variables, which is quite long and painful.
I also did it inside the functions I call, to be sure that most of the variables are gpuArrays.
BUT it's slower...
I have never used a GPU before, so I guess I am doing something wrong.
Thanks for your time
Martin

Answers (2)

MartinM 2021-8-19
For example:
clc,clear all
close all
%% CPU
tic
num.n=1*2^18;
num.tspan = 2e-09;
num.dt=num.tspan/(num.n-1);
T = zeros(1, num.n);
for k=1:1:num.n
T(k)=(k-1)*num.dt-num.tspan/2;
end
toc
%% GPU
clear all
close all
tic
num.n=gpuArray(1*2^18);
num.tspan =gpuArray(2e-09);
num.dt=gpuArray(num.tspan/(num.n-1));
T = gpuArray(zeros(1, num.n));
for k=1:1:num.n
T(k)=(k-1)*num.dt-num.tspan/2;
end
toc
The result is (CPU first, then GPU):
Elapsed time is 0.027964 seconds.
Elapsed time is 47.296786 seconds.

Joss Knight
Joss Knight 2021-9-16
You need to vectorize your code. The GPU is not intended for performing this kind of looping series of operations on scalar variables. For instance, use
k = gpuArray(1:num.n);
T=(k-1)*num.dt-num.tspan/2;
to compute every element of T at once.
You do not need to convert every variable to a gpuArray, just your inputs.
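
As an illustration of both points, here is a minimal sketch, not the original poster's program. It builds the time grid in one vectorized expression and then applies a single linear (dispersion-only) split-step update. The pulse width T0, dispersion coefficient beta2, step length dz, and the sign convention of the phase factor are all assumed values chosen for the example. The sketch also assumes canUseGPU is available (R2019b or newer) and falls back to the CPU otherwise.

```matlab
% Sketch only: every physical parameter below is an assumption.
useGPU = canUseGPU();            % true if a supported GPU is available

n     = 2^18;                    % scalars stay ordinary doubles
tspan = 2e-9;
dt    = tspan/(n-1);

% Index vector: the only array explicitly sent to the GPU.
k = 0:n-1;
if useGPU
    k = gpuArray(k);
end

% Vectorized time grid; T has the same class as k.
T = k*dt - tspan/2;

% Angular frequency grid in FFT order, derived from k so that it
% is created wherever k lives.
w = 2*pi*(k - n*(k >= n/2))/(n*dt);

% Hypothetical Gaussian input pulse (assumed width T0).
T0 = 1e-11;
A  = exp(-T.^2/(2*T0^2));
E0 = sum(abs(A).^2);             % energy before the step

% One linear step: multiply by a pure phase in the frequency domain.
beta2 = -2e-26;                  % assumed dispersion, s^2/m
dz    = 1e-3;                    % assumed step length, m
A = ifft(fft(A) .* exp(0.5i*beta2*w.^2*dz));
E1 = sum(abs(A).^2);             % energy after the step

% Bring results back to host memory only when they are needed there.
Aout   = gather(A);
relErr = gather(abs(E1 - E0)/E0);
```

Because T, w, and A are all computed from k, they are gpuArrays whenever k is one, and fft, ifft, and the element-wise operations then run on the GPU without any further conversion. For timing such code, gputimeit (or calling wait on the device before toc) is a safer choice than a bare tic/toc, since GPU operations can run asynchronously.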
