Efficient algorithm for a duplication matrix

Can anybody help me to design a Matlab code function that creates a duplication matrix D?
Thanks in advnace.
My codes is very slow...
Any ideas to speed it up?
n=1000;
% Duplication matrix: vec(P)=Dvech(P)
tic
m=1/2*n*(n+1);
nsq=n^2;
DT=sparse(m,nsq);
for j=1:n
for i=j:n
ijth=(j-1)*n+i;
jith=(i-1)*n+j;
vecTij=sparse(ijth,1,1,nsq,1);
vecTij(jith,1)=1;
k=(j-1)*n+i-1/2*j*(j-1);
uij=sparse(k,1,1,m,1);
DT=DT+uij*vecTij';
end
end
D=DT';
toc
% test duplication matrix
C=rand(n,n);
P=1/2*(C+C');
vechP=nonzeros(tril(P));
vecP=P(:);
err_D=vecP-D*vechP;
max(err_D(:))
min(err_D(:))

 采纳的回答

Jan
Jan 2019-7-28
编辑:Jan 2021-8-4
For n=300 this needs 1.3 sec instead of 27.5 sec:
tic
m = n * (n + 1) / 2;
nsq = n^2;
D = spalloc(nsq, m, nsq);
row = 1;
a = 1;
for i = 1:n
b = i;
for j = 0:i-2
D(row + j, b) = 1;
b = b + n - j - 1;
end
row = row + i - 1;
for j = 0:n-i
D(row + j, a + j) = 1;
end
row = row + n - i + 1;
a = a + n - i + 1;
end
toc
But it is much faster to create the index vector at first instead of accessing the sparse matrix repeatedly:
tic
m = n * (n + 1) / 2;
nsq = n^2;
r = 1;
a = 1;
v = zeros(1, nsq);
for i = 1:n
b = i;
for j = 0:i-2
v(r) = b;
b = b + n - j - 1;
r = r + 1;
end
for j = 0:n-i
v(r) = a + j;
r = r + 1;
end
% BUGFIX: Omit "r = r + n - i + 1;" Thanks Trisha Phippard
a = a + n - i + 1;
end
D2 = sparse(1:nsq, v, 1, nsq, m);
toc
Now I get 0.013 sec for n=300. Finally vectorize the 2 inner loops:
tic
m = n * (n + 1) / 2;
nsq = n^2;
r = 1;
a = 1;
v = zeros(1, nsq);
cn = cumsum(n:-1:2); % [EDITED, 2021-08-04], 10% faster
for i = 1:n
% v(r:r + i - 2) = i - n + cumsum(n - (0:i-2));
v(r:r + i - 2) = i - n + cn(1:i - 1); % [EDITED, 2021-08-04]
r = r + i - 1;
v(r:r + n - i) = a:a + n - i;
r = r + n - i + 1;
a = a + n - i + 1;
end
D2 = sparse(1:nsq, v, 1, nsq, m);
toc
0.011 sec. A speedup of factor 2500 for n=300. And 0.12 sec for n=1000. Nice! :-)

7 个评论

@Youngkyu Kim: The runtime behavior of the original code is quadratic and the extrapolated run-time for n=1000 is about 14'000 seconds. My last suggestions needs 0.12 seconds.
I took me some time to figure this out. Don't you think, that the problem is solved? A short reaction would be fine, or at least accepting the answer as a working solution. I do not need any reputation points anymore, but I think, such a success is worth to be honored.
Thanks for your suggesteion.
I modified my code and it worked well enough.
Your suggestion is slightly faster than mine.
function D = duplication(n)
m=1/2*n*(n+1);
nsq=n^2;
Lind=tril(true(n));
Lind=find(Lind(:));
Lind=Lind(:);
Uind=rem(Lind-1,n)*n+ceil(Lind/n);
i=(1:m)';
a=(i-1)*nsq+Lind;
b=(i-1)*nsq+Uind;
c=union(a,b);
[I,J]=ind2sub([nsq,m],c);
D=sparse(I,J,1,nsq,m);
end
A very useful algorithm!
In your second version, it would seem that the following line is incorrect:
r = r + n - i + 1;
The whole series of increments in r work as they should, there is no need to jump to a new location. In the vectorized version it is needed, but in the scalar version it causes problems.
I wonder if there is a way to simply "store" the duplication matrices for n=1 until n=1000 in matlab and then just accessing them when needed instead of recreating them every time. If yes, would that speed up things even more?
@Michael Stollenwerk: Yes, of course. Simply store the calcuated matrices in a persistent cell array:
function D = duplication(n)
persistent C
if isempty(C)
nCache = 1000; % Set according to your needs
C = cell(1, nCache);
end
if n <= numel(C) && ~isempty(C{n})
D = C{n};
else
m = n * (n + 1) / 2;
nsq = n^2;
r = 1;
a = 1;
v = zeros(1, nsq);
cn = cumsum(n:-1:2);
for i = 1:n
v(r:r + i - 2) = i - n + cn(1:i - 1);
r = r + i - 1;
v(r:r + n - i) = a:a + n - i;
r = r + n - i + 1;
a = a + n - i + 1;
end
D = sparse(1:nsq, v, 1, nsq, m);
if n <= numel(C)
C{n} = D;
end
end
end
When this function is called the first time for a specific n, D is calculated. On further calls D is taken from the persistently stored list C.
% Timings for i7, Win 10 64, Matlab R2018b:
tic; for k = 1:1000; D = duplication(k); end; toc
tic; for k = 1:1000; D = duplication(k); end; toc
% First run with calculations:
% Elapsed time is 35.022836 seconds.
% Second run with copies from the list:
% Elapsed time is 0.006062 seconds.
The list C of my code needs 6GB of RAM, if all 1000 matrices are created. Writing it as -v7.3 MAT file creates an 1GB file. Reading it takes 15 seconds on a HDD instead of 35 seconds of creating all matrices dynamically.

请先登录,再进行评论。

更多回答(0 个)

类别

帮助中心File Exchange 中查找有关 Multidimensional Arrays 的更多信息

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by