Improve the speed of nested for loops through vectorization or similar methods
My MATLAB code has a function that is called between 10^3 and 10^7 times. I'm curious whether I can improve the speed of the function through vectorization or a similar method.
clc; clear all;
% Test data for function
u = rand(32,33);
Nx = 32;
Nz = 32;
Dz = rand(Nx+1,Nz+1);
u_z = zeros(Nx,Nz+1);
u_z_2 = zeros(Nx,Nz+1);
g = zeros(Nz+1,1);
% Method 1 - Original Implementation with double for loop
tic
for j=1:Nx
for ell=0:Nz
g(ell+1) = u(j,ell+1);
end
u_z(j,:) = (2.0)*Dz*g;
end
toc
% Method 2 - Remove one for loop
tic
for j=1:Nx
g=u(j,:)';
u_z_2(j,:) = (2.0)*Dz*g;
end
err = norm(u_z - u_z_2,inf); % named err to avoid shadowing the built-in diff
toc
Repeating these for loops 10,000 times gives
clc; clear all;
u = rand(32,33);
Nx = 32;
Nz = 32;
Dz = rand(Nx+1,Nz+1);
u_z = zeros(Nx,Nz+1);
u_z_2 = zeros(Nx,Nz+1);
g = zeros(Nz+1,1);
tic
for rep=1:10000
for j=1:Nx
for ell=0:Nz
g(ell+1) = u(j,ell+1);
end
u_z(j,:) = (2.0)*Dz*g;
end
end
toc
tic
for rep=1:10000
for j=1:Nx
g=u(j,:)';
u_z_2(j,:) = (2.0)*Dz*g;
end
end
toc
err = norm(u_z - u_z_2,inf); % named err to avoid shadowing the built-in diff
In this test the original implementation is slightly faster. The code above returns:
Elapsed time is 0.771755 seconds.
Elapsed time is 1.079783 seconds.
Could the speed be improved by implementing vectorization or a similar method?
Accepted Answer
DGM
2021-7-18
Edited: DGM
2021-7-18
One big speed improvement is to move the scalar multiplication of Dz outside the loop, since 2*Dz is the same on every iteration. If you don't use a loop at all, though, it doesn't really matter.
% Test data for function (I'm using bigger arrays)
Nx = 320;
Nz = 320;
u = rand(Nx,Nz+1); % random input, as in the question, so the comparisons below are not trivially zero
ntests = 100; % number of test iterations to average exec time
Dz = rand(Nx+1,Nz+1);
u_z = zeros(Nx,Nz+1);
u_z_2 = zeros(Nx,Nz+1);
g = zeros(Nz+1,1);
% Method 1 - Original Implementation with double for loop
tic
for N = 1:ntests
for j=1:Nx
for ell=0:Nz
g(ell+1) = u(j,ell+1);
end
u_z(j,:) = (2.0)*Dz*g;
end
end
toc/ntests
% Method 2 - Remove one for loop
tic
for N = 1:ntests
for j=1:Nx
g=u(j,:)';
u_z_2(j,:) = (2.0)*Dz*g;
end
end
toc/ntests
immse(u_z,u_z_2) % result is identical
% simplified
tic
for N = 1:ntests
uuu = (2*Dz*u.').';
end
toc/ntests
immse(u_z,uuu) % result matches (up to floating-point rounding)
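The first suggestion above, moving the scalar multiplication out of the loop, can be sketched roughly as follows. MATLAB evaluates (2.0)*Dz*g from left to right, so the looped versions rescale the whole Dz matrix on every pass. This is only an illustration under the same assumption as the test above (Nx equal to Nz); the names D2, u_z_ref, u_z_hoisted and max_err are made up for this example.

```matlab
% Sketch: hoist the loop-invariant scaling of Dz out of the loop.
% D2, u_z_ref, u_z_hoisted and max_err are hypothetical names for illustration.
Nx = 320;
Nz = 320;                 % the row assignment below relies on Nx == Nz
u  = rand(Nx,Nz+1);
Dz = rand(Nx+1,Nz+1);

% Reference: scale inside the loop, as in Method 2
u_z_ref = zeros(Nx,Nz+1);
for j = 1:Nx
    u_z_ref(j,:) = (2.0)*Dz*u(j,:).';
end

% Hoisted: scale Dz once, then do only the matrix-vector product per row
D2 = 2*Dz;
u_z_hoisted = zeros(Nx,Nz+1);
for j = 1:Nx
    u_z_hoisted(j,:) = D2*u(j,:).';
end

% Largest absolute difference between the two results
max_err = max(abs(u_z_ref(:) - u_z_hoisted(:)));
```

If a loop has to stay for other reasons, this keeps the per-iteration work down to a single matrix-vector product.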
When you're trying to find out how to make things fast, how you scale the test matters, because the test has to be big enough for the execution time you care about to stand out. Increasing the number of iterations or the size of the inputs may reveal different things. It all depends on what you expect to do.
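As a rough sketch of that idea, the looped and loop-free versions can be timed at a few input sizes in the same tic/toc style used above. The sizes and the repetition count here are assumptions picked only for illustration, and t_loop and t_vec are made-up names; the timings will depend on the machine and MATLAB version.

```matlab
% Sketch: compare the looped and loop-free versions at several sizes.
sizes  = [32 128 320];   % assumed problem sizes (Nx = Nz = N)
ntests = 20;             % assumed number of repetitions to average over
t_loop = zeros(size(sizes));
t_vec  = zeros(size(sizes));

for k = 1:numel(sizes)
    N   = sizes(k);
    u   = rand(N,N+1);
    Dz  = rand(N+1,N+1);
    u_z = zeros(N,N+1);

    % Looped version (equivalent to Method 2)
    tic
    for rep = 1:ntests
        for j = 1:N
            u_z(j,:) = 2*Dz*u(j,:).';
        end
    end
    t_loop(k) = toc/ntests;

    % Loop-free version
    tic
    for rep = 1:ntests
        uuu = (2*Dz*u.').';
    end
    t_vec(k) = toc/ntests;

    fprintf('N = %3d: loop %.3e s, loop-free %.3e s\n', N, t_loop(k), t_vec(k));
end
```

MATLAB's timeit function is another option when the code under test can be wrapped in a function handle.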