Vec-trick implementation (multiple times)


ConvexHull 2021-8-21


https://ww2.mathworks.cn/matlabcentral/answers/1438154-vec-trick-implementation-multiple-times

Edited: Bruno Luong 2023-9-14

Dear all,

This question is related to my earlier question "Tensorproduct". Since that question was not answered as intended, I want to revisit it.

Introduction:

Suppose you have a matrix-vector multiplication in which the matrix C of size (np x mq) is constructed as the Kronecker product of a matrix A of size (n x m) and a matrix B of size (p x q). The vector is denoted v, of size (mq x 1), and its matrix (reshaped) form is X, of size (m x q).

In two dimensions this operation can be performed with O(npq+qnm) operations instead of O(mqnp) operations, see Wikipedia.

Expensive variant (in terms of flops): u = C*v with C = kron(B,A)

Cheap variant (in terms of flops): U = A*X*B.' with u = U(:)
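As a minimal sketch, the two variants can be checked against each other for a single vector. The sizes below are illustrative assumptions, and the names u1, u2, U and err are not from the original post.

```matlab
% Minimal check of the two variants for one vector (illustrative sizes).
n = 3; m = 4; p = 5; q = 6;
A = rand(n,m);
B = rand(p,q);
v = rand(m*q,1);            % vector of length m*q
X = reshape(v,m,q);         % matrix form of v

u1 = kron(B,A)*v;           % expensive: (n*p x m*q) times (m*q x 1)
U  = A*X*B.';               % cheap: (n x m)(m x q)(q x p)
u2 = U(:);                  % stack the columns of U

err = max(abs(u1-u2));      % expected to be at round-off level
```

The cheap variant never forms the (np x mq) matrix, which is where the lower operation count comes from.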

Main question:

I want to perform many of these operations at once, e.g. 2500000. Example: n=m=p=q=7 with A of size (7 x 7), B of size (7 x 7) and v of size (49 x 2500000).

In "Tensorproduct" I implemented a MEX C version of the cheap variant, which is considerably slower than a MATLAB version of the expensive variant provided by Bruno Luong.

Is it possible to implement the cheap version in Matlab without looping?

5 comments (3 older comments hidden)

Bruno Luong 2021-8-23

Because fewer flops does not necessarily mean faster. Memory access, cache behavior and thread management are just as important, and which method is fastest probably depends on n=m=p=q.

ConvexHull 2021-8-23

Edited: ConvexHull 2021-8-23

Yeah, that's definitely the case here.

The main problem is that, if you want to perform the vec trick many times in a vectorized fashion, you have to reorder the data structure. After applying A*X you cannot directly perform a matrix-matrix multiplication with B.

Stupid Memory access O.o!
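The reordering problem can be sketched as follows. This is only an illustration under assumed small sizes; N (the number of vectors) and the intermediate names Y and Z are hypothetical, and permute is used here in place of a plain transpose.

```matlab
% Sketch: why the data must be reordered between the two products.
n = 3; m = 4; p = 5; q = 6; N = 10;      % illustrative sizes, N vectors
A = rand(n,m);
B = rand(p,q);
v = rand(m*q,N);

Y = A*reshape(v,m,[]);                   % n x (q*N): no data movement needed
Y = reshape(Y,n,q,N);                    % view as n x q x N (no copy)
Y = permute(Y,[2 1 3]);                  % q x n x N: copies and reorders memory
Z = B*reshape(Y,q,[]);                   % p x (n*N): second small product
Z = permute(reshape(Z,p,n,N),[2 1 3]);   % back to n x p x N (another copy)
v2 = reshape(Z,n*p,N);

v1 = kron(B,A)*v;                        % reference result
err = max(abs(v1(:)-v2(:)));             % expected to be at round-off level
```

The first product works on the data as stored, but B acts on the second index of each X, so that index has to be moved to the front before the second product.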


Accepted Answer

ConvexHull 2021-8-24


Edited: ConvexHull 2021-8-25

Here is a version that uses only built-in MATLAB functions and no loop over the vectors. However, it needs two transpose operations and is quite slow.

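n=7;m=7;p=7;q=7;
A = rand(n,m);
B = rand(p,q);
v = rand(m*q,500000,5);   % 2500000 vectors of length m*q
nrep = 5;                 % timing repetitions (separate name, so the size n is not overwritten)
C = kron(B,A);
% Expensive variant: one big (n*p x m*q) matrix product
tic
for i=1:nrep
    v1 = reshape(C*reshape(v,m*q,[]),size(v));
end
toc % Elapsed time is 0.456353 seconds
% Cheap variant: two small products, each followed by a transpose
tic
for i=1:nrep
    v2 = reshape(reshape(B*reshape((A*reshape(v,m,[])).',q,[]),[],n).',n,[]);
end
toc % Elapsed time is 3.879752 seconds
% Largest difference between the two results
max(abs(v1(:)-v2(:))) 
% 1.4211e-14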

22 comments (20 older comments hidden)

ConvexHull 2021-8-24

Edited: ConvexHull 2021-8-24

I don't know what you mean.

The ().' is by far the most expensive operation, no matter what is being done in the background.
Reshape is free.
The small 7 x 7 matrix-matrix multiplication is cheaper than the big 49 x 49 one.
By the way, ()' and ().' cost nearly the same.

n=7;m=7;p=7;q=7;
A = rand(n,m);
B = rand(p,q);
v = rand(m*q,500000,5);
nrep = 5; % timing repetitions (separate name, so the size n is not overwritten)
% Step 1: reshape v to 7 x (7*2500000)
tic
for i=1:nrep
    vv = reshape(v,7,[]); %#ok<*NASGU>
end
toc % Elapsed time is 0.000186 seconds
% Step 2: small product with A
tic
for i=1:nrep
    vvv = A*vv;
end
toc % Elapsed time is 0.350487 seconds
% Step 3: first transpose
tic
for i=1:nrep
    vvvv = (vvv).';
end
toc % Elapsed time is 1.682334 seconds
% Step 4: reshape back to 7 rows
tic
for i=1:nrep
    vvvvv = reshape(vvvv,7,[]);
end
toc % Elapsed time is 0.000181 seconds
% Step 5: small product with B
tic
for i=1:nrep
    vvvvvvv = B*vvvvv;
end
toc % Elapsed time is 0.347840 seconds
% Step 6: reshape to (7*2500000) x 7
tic
for i=1:nrep
    vvvvvvvv = reshape(vvvvvvv,7*2500000,[]);
end
toc % Elapsed time is 0.000174 seconds
% Step 7: second transpose
tic
for i=1:nrep
    vvvvvvvvv = (vvvvvvvv).';
end
toc % Elapsed time is 1.470868 seconds
% Step 8: final reshape to 7 rows
tic
for i=1:nrep
    vvvvvvvvvv = reshape(vvvvvvvvv,7,[]);
end
toc % Elapsed time is 0.000148 seconds

Bruno Luong 2021-8-26


I have added a benchmark with mtimesx.

Conclusion (with s = n = m = p = q):

For versions before R2020b, use the expensive method for s < 44 and mtimesx otherwise;
For R2020b or later, use the expensive method for s < 27 and pagemtimes otherwise.

stab = 5:5:100;
t1 = zeros(size(stab));
t2 = zeros(size(stab));
t3 = zeros(size(stab));
t4 = zeros(size(stab));
for i = 1:length(stab)
    fprintf('%d/%d\n', i, length(stab));
    s = stab(i);
    n=s;
    m=s;
    p=s;
    q=s;
    
    A = rand(n,m);
    B = rand(p,q);
    v = rand(m*p,100000);
    
    tic
    C = kron(B,A);
    v1 = reshape(C*reshape(v,s*s,[]),size(v));
    t1(i) = toc;
    
    tic
    v2 = reshape(reshape(B*reshape((A*reshape(v,s,[])).',s,[]),[],s).',s,[]);
    t2(i) = toc;
    
    tic
    X = reshape(v, size(A,2), size(B,2), []); % m x q x N (size(B,2), so non-square B also works)
    v3 = pagemtimes(pagemtimes(A, X), 'none', B, 'transpose');
    t3(i) = toc;
    
    tic
    X = reshape(v, size(A,2), size(B,2), []); % m x q x N (size(B,2), so non-square B also works)
    v4 = mtimesx(mtimesx(A, X), 'N', B, 'T');
    t4(i) = toc;
end
close all
semilogy(stab, [t1; t2; t3; t4]');
legend('Expensive method', ...
    'Cheap method using transposition', ...
    'Cheap method using pagemtimes', ...
    'Cheap method using mtimesx');
xlabel('s');
ylabel('time [sec]');
grid on;
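The conclusion above could be wrapped in a small dispatch function. The following is only a sketch: kronmv is a hypothetical name, not part of the thread; taking s as the largest of the four sizes is an assumption, since the benchmark only covers n = m = p = q; and the crossover values 27 and 44 come from the timings reported here, so they may well differ on other machines. It would be saved as kronmv.m.

```matlab
function u = kronmv(A, B, v)
% KRONMV  Hypothetical helper (not from the thread): kron(B,A)*v for all columns of v.
%   A is n-by-m, B is p-by-q, v is (m*q)-by-N. Returns u of size (n*p)-by-N.
[n, m] = size(A);
[p, q] = size(B);
N = size(v, 2);
s = max([n m p q]);                      % assumption: largest size plays the role of s
hasPagemtimes = exist('pagemtimes') > 0; %#ok<EXIST> available from R2020b on
hasMtimesx    = exist('mtimesx') > 0;    %#ok<EXIST> separate MEX package, must be installed

if hasPagemtimes && s >= 27
    % Cheap variant with pagemtimes: A*X*B.' page by page
    X = reshape(v, m, q, N);
    U = pagemtimes(pagemtimes(A, X), 'none', B, 'transpose');   % n x p x N
    u = reshape(U, n*p, N);
elseif ~hasPagemtimes && hasMtimesx && s >= 44
    % Cheap variant with mtimesx (same call pattern as in the benchmark)
    X = reshape(v, m, q, N);
    U = mtimesx(mtimesx(A, X), 'N', B, 'T');
    u = reshape(U, n*p, N);
else
    % Expensive variant: one large product
    u = kron(B, A) * v;
end
end
```

A call such as u = kronmv(A, B, v) then gives the same result as kron(B,A)*v up to round-off, whichever branch is taken.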

Stefano Cipolla 2023-9-14

Edited: Stefano Cipolla 2023-9-14

Hi there! May I ask if you are aware of an implementation of a function similar to "pagemtimes" that can work with at least one sparse input? Alternatively, do you see an easy workaround? More precisely, I need something like

pagemtimes(A, V)

where A is an n x n x n sparse real tensor and V is a real dense n x n matrix...

Bruno Luong 2023-9-14

Edited: Bruno Luong 2023-9-14

@Stefano Cipolla "sparse real tensor"

I'm not aware of a native MATLAB class for this.

But you can put the pages of A as the diagonal blocks of an n^2 x n^2 sparse matrix

SA = [A(:,:,1)  0         0 ...  0
      0         A(:,:,2)  0 ...  0
      ...
      0         0 ...            A(:,:,n)]

Do the same expansion for V (with the same block repeated), then compute the product.
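One possible reading of this suggestion is sketched below. It assumes the pages of A are stored as a cell array of sparse matrices (the name Ak is hypothetical), and it stacks V vertically with repmat rather than building a block-diagonal V, which is a variation on the description above.

```matlab
% Sketch: page-wise product A(:,:,k)*V with sparse pages via one block-diagonal matrix.
n = 4;
Ak = cell(1,n);                          % assumption: pages kept as sparse matrices in a cell
for k = 1:n
    Ak{k} = sprand(n,n,0.3);
end
V = rand(n,n);                           % dense n x n matrix

SA = blkdiag(Ak{:});                     % n^2 x n^2 block-diagonal matrix
W  = SA*repmat(V,n,1);                   % stacks A_1*V; A_2*V; ...; A_n*V (n^2 x n)
W  = permute(reshape(W,n,n,n),[1 3 2]);  % reorder so that W(:,:,k) = A_k*V
```

The result W is a dense n x n x n array, so this only pays off when the pages are sparse enough that the single sparse product is cheaper than n dense ones.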


Categories

MATLAB > Mathematics > Linear Algebra

Products

MATLAB

Release

R2018a
