How do I optimize this code to run efficiently on the GPU?

Dear MATLAB users,
I am trying to optimize the function listed below for GPU computing. I have tried many versions of the GPU algorithm, but the GPU always seems to be slower. I would really appreciate any suggestions or help.
%%Declaration of variables
K=4;
C11n = rand(K,508032);
[x1, x2] = size(C11n);
C22b = zeros(x1,x1*(x2/2),'double');
C2 = zeros(K,K,x2/2,'double');
E=eye(x1);
A=reshape(C11n,K,2,x2/2);
AT=permute(A,[2 1 3]);
%%CPU code
tic
for k=1:x2/2
C2(:,:,k)=E-(A(:,:,k)*inv(AT(:,:,k)*A(:,:,k))*AT(:,:,k));
end
toc
%%GPU code
% Declaration of variables
C22 = gpuArray(zeros(K,K,x2/2,'double'));
E=gpuArray(eye(x1));
A=gpuArray(reshape(C11n,K,2,x2/2));
tic
for k=1:x2/2
C22(:,:,k)=E-(A(:,:,k)*inv(AT(:,:,k)*A(:,:,k))*AT(:,:,k));
end
toc
with best regards
Jan
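
For reference, the expression in the loop, E - A*inv(A'*A)*A', is the orthogonal projector onto the complement of the column space of each 4x2 page of A. The following is a minimal CPU-only sketch of that property, assuming a small illustrative page count N (not the original 254016 pages) and using backslash in place of inv(), which is a substitution and not part of the posted code.

```matlab
% CPU-only sketch; N is an illustrative assumption, far smaller than the original data.
K = 4;
N = 5;
A = rand(K, 2, N);
E = eye(K);
C = zeros(K, K, N);
for k = 1:N
    Ak = A(:,:,k);
    % Same formula as the loop above, with backslash swapped in for inv()
    C(:,:,k) = E - Ak * ((Ak.' * Ak) \ Ak.');
end

% Each page should annihilate the columns of A (C*A = 0) and be idempotent (C*C = C)
resid = 0;
idem = 0;
for k = 1:N
    resid = max(resid, norm(C(:,:,k) * A(:,:,k)));
    idem  = max(idem,  norm(C(:,:,k) * C(:,:,k) - C(:,:,k)));
end
fprintf('max ||C*A|| = %g, max ||C*C - C|| = %g\n', resid, idem);
```

Both printed values should be close to machine precision, which gives a quick correctness check for any faster GPU variant of the same computation.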

Answers (2)

Ashish Uthama 2013-11-27
A quick 'air' code using pagefun:
tic
M = pagefun(@mtimes, A(:,:,1:x2/2), AT(:,:,1:x2/2));
M = pagefun(@mtimes, M, M);
C22 = repmat(E,[1 1 x2/2])-M;
toc
I would be curious to know if this works for you, and what times you get on your hardware.
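
When comparing times, keep in mind that GPU calls in MATLAB can return before the GPU has finished the work, so a plain tic/toc pair may under-report. A minimal sketch of two ways to time the pagefun call follows. It assumes Parallel Computing Toolbox and a supported GPU, and the page count N is an illustrative choice rather than the size from the question.

```matlab
% Requires Parallel Computing Toolbox and a supported GPU.
% K matches the question; N is an illustrative assumption.
K = 4;
N = 1000;
A  = gpuArray(rand(K, 2, N));
AT = permute(A, [2 1 3]);          % page-wise transpose, also on the GPU

% Option 1: gputimeit runs the function several times and synchronizes for you
f = @() pagefun(@mtimes, A, AT);
t = gputimeit(f);
fprintf('gputimeit: %g s\n', t);

% Option 2: tic/toc with explicit synchronization before and after
dev = gpuDevice;
wait(dev);
tic
M = pagefun(@mtimes, A, AT);
wait(dev);                         % block until the GPU has finished
tManual = toc;
fprintf('tic/toc with wait: %g s\n', tManual);
```

Note that both A and AT are gpuArrays here. If one operand stays on the CPU, it has to be transferred on each call, which can dominate the measured time.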
1 comment
Jan 2013-11-27
Edited: Jan 2013-11-27
Thank you for the idea, I will try it and let you know.
I apologize: the first version of my code was wrong because I forgot the matrix inversion. The code is now corrected.
BTW, pagefun helps: it gives a 10x speedup (M = pagefun(@mtimes, A(:,:,1:x2/2), AT(:,:,1:x2/2));). Now I need to figure out how to do a fast inversion on every page of the 3D matrix. I will let you know.



Joss Knight 2013-11-28
Edited: Joss Knight 2013-11-28
Are your matrices always 4x2? This results in AT*A being 2x2, so you can just calculate your inverses manually:
function Ainv = batch2x2inv(A)
% Grab each matrix element as a vector
a = A(1,1,:);
b = A(1,2,:);
c = A(2,1,:);
d = A(2,2,:);
% Compute determinants (named detA to avoid shadowing the built-in det)
detA = a.*d - b.*c;
% Construct inverse
Ainv = bsxfun(@rdivide, [d -b; -c a], detA);
end
...and the relevant chunk of your code also uses pagefun as Ashish suggests:
AT = pagefun(@transpose, A);
ATA = pagefun(@mtimes, AT, A);
invATA = batch2x2inv(ATA);
pinvA = pagefun(@mtimes, invATA, AT);
residual = pagefun(@mtimes, A, pinvA);
C22 = bsxfun(@minus, E, residual);
Your code now runs 6x faster than the CPU on my machine.
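
To gain some confidence in the closed-form inverse before running it on the GPU, it can be compared against inv() page by page on the CPU. This is a minimal sketch with the formula inlined so the snippet runs on its own. The page count N and the test matrices are illustrative assumptions, with the diagonal boosted so every page is safely invertible.

```matlab
% CPU-only check of the closed-form 2x2 inverse against inv().
% N and the random test data are illustrative assumptions.
N = 1000;
M = rand(2, 2, N) + 2 * repmat(eye(2), [1 1 N]);   % diagonally dominant pages

% Closed-form inverse, same formula as batch2x2inv above
a = M(1,1,:);
b = M(1,2,:);
c = M(2,1,:);
d = M(2,2,:);
detM = a.*d - b.*c;
Minv = bsxfun(@rdivide, [d -b; -c a], detM);

% Reference: loop over pages with inv()
maxErr = 0;
for k = 1:N
    pageErr = max(max(abs(Minv(:,:,k) - inv(M(:,:,k)))));
    maxErr = max(maxErr, pageErr);
end
fprintf('max abs difference: %g\n', maxErr);
```

The closed form divides by the determinant directly, so pages of AT*A that are close to singular (nearly collinear columns in A) deserve a separate look.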
