Efficient way of Vectorization
Hello, I have searched everywhere for a clear explanation of vectorization. I would like to know how to use vectorization efficiently for a problem like this:
clc
clear
close all
n=1000;
C1=zeros(n,n);
C2=zeros(n,n);
A=rand(n,n);
B=rand(n,n);
tic
for i=2:n-1
    for j=2:n-1
        C1(i,j) = (A(i,j)*B(i,j-1) + A(i-1,j)*B(i+1,j-1))/(A(i,j+1)*B(i+1,j));
    end
end
toc
%VECTORIZATION
tic
C2(2:n-1,2:n-1)=(A(2:n-1,2:n-1).*B(2:n-1,1:n-2) + A(1:n-2,2:n-1).*B(3:n,1:n-2))./(A(2:n-1,3:n).*B(3:n,2:n-1));
toc;
norm(C1-C2)
This is a very basic example. Vectorization shows some improvement here, but not much. If I add more divisions and multiplications to the same expression, the vectorized version becomes even slower than the for loop. If anybody has any suggestions, they would be very helpful to me.
8 comments
Mike Croucher
2022-11-30
Thanks. So for N,M=50, the code runs in 0.01 seconds on my machine.
Increasing to N,M=100, the code runs in 0.22 seconds.
Trying N,M=200, I run out of memory on my 32 GB laptop.
What values of N and M are you interested in and how fast do you need the code to be?
Accepted Answer
Matt J
2022-11-30
Edited: Matt J
2022-11-30
Unfortunately, this is a situation where the for loop is the fastest option. This is because the vectorized solution does much more memory allocation than it should. I have raised this issue with MathWorks staff, but I am not sure what is being done about it.
function test
    n=1000;
    C1=zeros(n,n);
    C2=zeros(n,n);
    A=rand(n,n);
    B=rand(n,n);
    timeit(@()method1)
    timeit(@()method2)

    function method1
        for i=2:n-1
            for j=2:n-1
                C1(i,j) = (A(i,j)*B(i,j-1) + A(i-1,j)*B(i+1,j-1))/(A(i,j+1)*B(i+1,j));
            end
        end
    end

    function method2
        C2(2:n-1,2:n-1)=(A(2:n-1,2:n-1).*B(2:n-1,1:n-2) + A(1:n-2,2:n-1).*B(3:n,1:n-2))./(A(2:n-1,3:n).*B(3:n,2:n-1));
    end
end
ans =
    0.0161
ans =
    0.0210
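As a rough, back-of-the-envelope sketch of why memory traffic might matter here: assuming each indexed sub-block such as A(2:n-1,2:n-1) is materialized as its own copy (an assumption about MATLAB's internals, not something measured in this thread), the six slices in method2 would occupy roughly the amount of memory computed below. The variable names are made up for illustration.

```matlab
% Estimate the size of the indexed sub-blocks in the vectorized expression.
% ASSUMPTION: every A(...) or B(...) slice becomes a separate full copy.
n = 1000;
bytesPerDouble = 8;                        % a double takes 8 bytes
blockBytes = bytesPerDouble * (n-2)^2;     % one (n-2)-by-(n-2) block
nSlices = 6;                               % six slices appear in method2
sliceBytes = nSlices * blockBytes;         % memory for the slices alone

fprintf('one block:  %.2f MB\n', blockBytes / 1e6);
fprintf('six slices: %.2f MB\n', sliceBytes / 1e6);
```

The intermediate products, the sum, and the quotient may need further blocks of the same size, so the total could be higher. The loop version, by contrast, only reads and writes scalars.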
9 comments
Bruno Luong
2022-11-30
Edited: Bruno Luong
2022-11-30
I don't think the problem is allocating memory, but actually indexing with truncation index, which requires elements in memory to be rearranged.
I'm not surprised that making vectorized code as fast as the for loop would require substantial development of the internal engine (for instance, using metadata that describes a subarray of an array without copying the data).
Indexing is always the bottleneck of MATLAB.
Matt J
2022-11-30
"I don't think the problem is allocating memory, but actually indexing with truncation index"
Not sure what a "truncation index" refers to here. In any case, the subsref operations are definitely to blame: when we revise the test so that the indexing is done ahead of time, outside the timed functions, the vectorized version is much more competitive with the loops:
function test
    n=1000;
    C1=zeros(n,n);
    C2=zeros(n,n);
    A=rand(n,n);
    B=rand(n,n);
    [Q1,Q2,Q3,Q4,Q5,Q6]=...
        deal( A(2:n-1,2:n-1) , B(2:n-1,1:n-2), A(1:n-2,2:n-1),...
              B(3:n,1:n-2), A(2:n-1,3:n), B(3:n,2:n-1) );
    timeit(@()method1)
    timeit(@()method2)

    function method1
        for i=2:n-1
            for j=2:n-1
                C1(i,j) = (A(i,j)*B(i,j-1) + A(i-1,j)*B(i+1,j-1))/(A(i,j+1)*B(i+1,j));
            end
        end
    end

    function method2
        C2(2:n-1,2:n-1)=(Q1.*Q2 + Q3.*Q4)./(Q5.*Q6);
    end
end
ans =
    0.0149
ans =
    0.0051
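The same idea can be pushed one step further by timing the sub-block extraction on its own and comparing it with the arithmetic on blocks extracted beforehand. The following is only a sketch: the names extract, Q, tIndex, and tArith are made up for illustration, and the actual timings will depend on the machine and MATLAB release.

```matlab
% Time the six sub-block extractions separately from the arithmetic.
n = 1000;
A = rand(n,n);
B = rand(n,n);

% Anonymous function that performs all six indexing operations and
% returns the resulting blocks in a cell array (illustrative helper).
extract = @() {A(2:n-1,2:n-1), B(2:n-1,1:n-2), A(1:n-2,2:n-1), ...
               B(3:n,1:n-2),   A(2:n-1,3:n),   B(3:n,2:n-1)};

tIndex = timeit(extract);    % cost of the indexing only

Q = extract();               % extract once, outside the timed code
tArith = timeit(@() (Q{1}.*Q{2} + Q{3}.*Q{4})./(Q{5}.*Q{6}));

fprintf('indexing only:   %.4f s\n', tIndex);
fprintf('arithmetic only: %.4f s\n', tArith);
```

If tIndex turns out to be a large share of the total, that would be consistent with the indexing being the expensive part, although it would not by itself say whether copying or allocation is responsible.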
More Answers (1)
Mike Croucher
2022-11-30
Switch the order of the loops around. It will be faster because you'll be operating on the matrix column-wise, which matches MATLAB's column-major storage order.
test
function test
    n=2000;
    C1=zeros(n,n);
    C2=zeros(n,n);
    A=rand(n,n);
    B=rand(n,n);
    disp('loops')
    timeit(@()loops)
    disp('loops 2')
    timeit(@()loops2)
    disp('vector')
    timeit(@()vector)

    function loops
        for i=2:n-1
            for j=2:n-1
                C1(i,j) = (A(i,j)*B(i,j-1) + A(i-1,j)*B(i+1,j-1))/(A(i,j+1)*B(i+1,j));
            end
        end
    end

    function loops2
        for j=2:n-1
            for i=2:n-1
                C1(i,j) = (A(i,j)*B(i,j-1) + A(i-1,j)*B(i+1,j-1))/(A(i,j+1)*B(i+1,j));
            end
        end
    end

    function vector
        C2(2:n-1,2:n-1)=(A(2:n-1,2:n-1).*B(2:n-1,1:n-2) + A(1:n-2,2:n-1).*B(3:n,1:n-2))./(A(2:n-1,3:n).*B(3:n,2:n-1));
    end
end
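To see why the loop order matters, it may help to look at how MATLAB lays a matrix out in memory. The small sketch below uses a 3-by-4 matrix (the names M, idxDown, and idxAcross are made up for illustration) to show that stepping down a column moves to the next element in memory, while stepping across a row jumps by a whole column.

```matlab
% MATLAB stores matrices column by column (column-major order).
M = reshape(1:12, 3, 4);    % 3-by-4 matrix, filled down the columns

% For a 3-by-4 matrix, the linear index of (i,j) is i + (j-1)*3.
idxDown   = sub2ind(size(M), 2, 1);   % one step down a column from (1,1)
idxAcross = sub2ind(size(M), 1, 2);   % one step across a row from (1,1)

fprintf('step down a column: linear index %d\n', idxDown);
fprintf('step across a row:  linear index %d\n', idxAcross);
```

With i as the inner loop variable, as in loops2, consecutive iterations touch neighboring memory locations, which is generally friendlier to the CPU cache than jumping n elements at a time.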