So far I've attempted indexing that replaces the repmat/repelem calls, but I can't get this to work and I'm not sure it will result in a memory improvement. I've also attempeted to use the above component matricies to create new input vectors for a sparse command, but I'm either missing elements or introducing new ones as a result. My updated mat file with the indexing pattern made explicit is attached.
Reduce memory use when doing element-wise multiplication of repeated (repmat/repelem) matrices? (Final product is more sparse than intermediate computations.)
4 次查看(过去 30 天)
显示 更早的评论
Re-Edit: I realized I could use kronecker products in place of each repmat and repelem call. I did this so that I could decompose ABC_output into the following:
ABC_output2 = kron(kron(A,ones(num_s3)).*kron(ones(num_s1,num_s2),B),ones(num_s4)).*kron(ones(num_s1*num_s2,num_s2*num_s3),C);
But Matlab balks on memory, and it's slow as.
Yes, I am aware that you can swap Hadamard and Kronecker products in the case of matrices having the same dimension, but unfortunately, this isn't the case. If it was, I could simplify the above equation greatly and do all of the Hadamards first and one Kronecker later.
-----
I have three matricies, A, B, and C, whose elements are repeated (repmat/repelem) according to a specific pattern, and then element-wise multiplied together. Using the attached .mat file, the following code will do this:
% these four numbers relate to the discretization of four variables
num_s1 = 11;
num_s2 = 15;
num_s3 = 19;
num_s4 = 26;
% here we calculate two intermediaries (to see memory requirements) and the desired result
AB_repeated = repelem(repelem(A,num_s3,num_s3).*repmat(B,num_s1,num_s2),num_s4,num_s4);
C_repeated = repmat(C,num_s1*num_s2,num_s2*num_s3); % particularly large
ABC_output = AB_repeated.*C_repeated; % smaller memory requirement than intermediates
This is fast and the code is easy to understand if the pattern--which is implicit in the last three lines--is known. For reference: we are creating a transition matrix that tracks the evolution of three state variables (output across columns), and takes the values of the three states as well as a shock variable as inputs (input across rows). Each row represents a composite state of these four variables (and three for the columns). The matrix is structured so that as we go down the rows, s4 loops through its state levels first, holding s1-s3 constant, then when s4 resets we increment s3 and loop through s4 again... then once we loop all the way through s3 we would increment s2, etc. The pattern exists in the column direction as well, without s1. This results in the repelem/repmat usage above, and I get my desired result.
As you can see from running the code, the intermediates are massive relative to the final output. This is because the final output is much more sparse because the sparse-pattern of each component matrix is different. I believe there must be a way to take advantage of this structure and not waste memory in the middle of computation. At the moment, I have to keep s1-s4 small, and I should be able to increase these significantly if I can reduce the intermediate step to require only as much ram as the final output.
Any ideas would be appreciated. This code creates an input to a dynamic programming problem, which may suffer from discretization error because of the current coarseness of this transition matrix stemming from the memory-intensive intermediate computation. I can verify/reject this claim by making this step more efficient. Thank you for your help.
回答(2 个)
Matt J
2022-7-19
编辑:Matt J
2022-7-19
or
, where F is ones(1,s3), D is ones(s1,s2), and E is ones(s4),
This solution eliminates the need to form the Kronecker product
. It uses the File Exchange files mat2tiles.
%%%Fake data
[A,B,C]=deal(rand(4), rand([1,5]) , rand(3));
n1 = size(A,1);
n2 = size(A,2);
n3 = size(B,2);
n4 = size(C,1);
%%%% proposed alternative
T1=kron(A,kron(ones(1,n3),C));
[b(1),b(2)]=size(B);
P1=permute( reshape(1:n1*b(1)*n4,[n4,b(1),n1]) , [2,1,3]);
P1=P1(:);
iP1=1:max(P1);iP1(P1)=iP1; %inverse sort
P2=permute( reshape(1:n2*b(2)*n4,[n4,b(2),n2]) , [2,1,3]);
P2=P2(:);
iP2=1:max(P2);iP2(P2)=iP2;
T1=mat2tiles(T1(P1,P2),b);
sz0=size(T1);
T1=cell2mat(T1(:).');
sz1=size(T1);
T1=reshape(T1,prod(b),[]).*B(:);
T1=mat2tiles( reshape(T1,sz1),b);
T1=cell2mat(reshape(T1,sz0));
T1=T1(iP1,iP2);
%%%% baseline computation
T0 = kron(A,kron(ones(1,n3),C)) .* kron(kron(ones(n1,n2),B),ones(n4));
%%%% compare proposed to baselin
[lowError, highError] = bounds(T0-T1,'all')
7 个评论
Matt J
2022-7-25
编辑:Matt J
2022-7-26
however, these do not reduce memory usage relative to my one-liner.
For this reason, I unaccepted the answer. However, here is yet another variant, which I think should use less memory (though it is twice as slow).
load('ABC.mat');
n1 = 11;
n2 = 15;
n3 = 19;
n4 = 26;
tic
%%%% proposed alternative
T1=kron(A,kron(ones(1,n3),C));
b=size(B);
P1=permute( reshape(1:n1*b(1)*n4,[n4,b(1),n1]) , [2,1,3]);
P1=P1(:);
iP1=1:max(P1);iP1(P1)=iP1; %inverse sort
P2=permute( reshape(1:n2*b(2)*n4,[n4,b(2),n2]) , [2,1,3]);
P2=P2(:);
iP2=1:max(P2);iP2(P2)=iP2;
T1=T1(P1,P2);
sz=size(T1);
N=b(2);
BB=repmat(B,sz(1)/b(1),1);
for i=1:N
T1(:,i:N:end)= T1(:,i:N:end).*BB(:,i);
end
T1=T1(iP1,iP2);
toc
tic
%%%% baseline computation
T1=kron(A,kron(ones(1,n3),C));
T0 = T1 .* kron(kron(ones(n1,n2),B),ones(n4));
toc
另请参阅
类别
在 Help Center 和 File Exchange 中查找有关 Logical 的更多信息
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!