Compute only a few entries of a big matrix product

1 次查看(过去 30 天)
Suppose I have a tall matrix x (say 10000 by 10), and a collection of indices {N1, N2, ..... Nk} where each Ni is a subset of {1, 2, ......, 10000}, and has small size (say |Ni| = 4), and {Ni} can overlap. Suppose I want to compute X:=xx', but am only interested in X(Ni, Ni) for all i, and I can set other entries of X to 0. Is there a way to quickly form such an X without doing the entire matrix product xx' and without using forloop?
  6 个评论
Matt J
Matt J 2023-11-20
编辑:Matt J 2023-11-20
each Ni is a subset of {1, 2, ......, 10000}, and has small size (say |Ni| = 4)
Is |Ni| fixed, i.e., independent of i?

请先登录,再进行评论。

采纳的回答

Bruno Luong
Bruno Luong 2023-11-20
编辑:Bruno Luong 2023-11-20
It will not store the result in 2D matrix but 3D array. I hope you can switch to this format of storage.
x=rand(10000,10);
k = 4000;
m = 4;
N = arrayfun(@(varargin)randi(size(x,1), 2+randi(m-2), 1), 1:k, 'unif', 0);
tic
X=x*x';
toc
Elapsed time is 0.325735 seconds.
tic
[p,q] = size(x);
x(end+1,:)=NaN;
l = cellfun('prodofsize', N);
m = max(l);
NA = cellfun(@(n) paddata(n,m,'FillValue',p+1), N, 'unif', 0);
NA = [NA{:}];
y = x(NA,:);
y = reshape(y,[m k q]);
y = permute(y, [1 3 2]);
Y = pagemtimes(y,'none',y,'ctranspose'); % each page of Y contain X(N1,N1)
toc
Elapsed time is 0.126183 seconds.
% Check first and last results
X(N{1},N{1})
ans = 3×3
4.5222 2.2467 3.3031 2.2467 1.5639 1.8717 3.3031 1.8717 3.3260
Y(1:l(1),1:l(1),1)
ans = 3×3
4.5222 2.2467 3.3031 2.2467 1.5639 1.8717 3.3031 1.8717 3.3260
X(N{k},N{k})
ans = 3×3
3.2240 1.9555 3.6699 1.9555 1.8531 2.7131 3.6699 2.7131 5.1233
Y(1:l(k),1:l(k),k)
ans = 3×3
3.2240 1.9555 3.6699 1.9555 1.8531 2.7131 3.6699 2.7131 5.1233
  2 个评论
Bruno Luong
Bruno Luong 2023-11-21
编辑:Bruno Luong 2023-11-21
This version putback the 3D array result to (sparse) matrix
x=rand(10000,10);
k = 4000;
m = 4;
N = arrayfun(@(varargin)randi(size(x,1), 2+randi(m-2), 1), 1:k, 'unif', 0);
tic
X=x*x';
toc
Elapsed time is 0.307213 seconds.
tic
[p,q] = size(x);
x(end+1,:)=0; % NOTE: x is modified here, not difficult to fix it if not wanted
l = cellfun('prodofsize', N);
m = max(l);
NA = cellfun(@(n) paddata(n,m,'FillValue',p+1), N, 'unif', 0);
NA = [NA{:}];
y = x(NA,:);
y = reshape(y,[m k q]);
y = permute(y, [1 3 2]);
Y = pagemtimes(y,'none',y,'ctranspose'); % each page of Y contain X(N1,N1)
I=repmat(NA,m,1);
J=repelem(NA,m,1);
[IJ,loc] = unique([I(:) J(:)],'rows');
keep = all(IJ<=p,2);
XX=sparse(IJ(keep,1),IJ(keep,2),Y(loc(keep)),p,p);
toc
Elapsed time is 0.162196 seconds.
% Check first and last results
X(N{1},N{1})
ans = 3×3
3.9532 2.9068 2.4660 2.9068 3.2922 2.5177 2.4660 2.5177 2.8154
full(XX(N{1},N{1}))
ans = 3×3
3.9532 2.9068 2.4660 2.9068 3.2922 2.5177 2.4660 2.5177 2.8154
X(N{k},N{k})
ans = 4×4
4.5085 3.3695 3.3036 2.9847 3.3695 3.4139 2.2194 1.9003 3.3036 2.2194 3.6579 2.7151 2.9847 1.9003 2.7151 3.0315
full(XX(N{k},N{k}))
ans = 4×4
4.5085 3.3695 3.3036 2.9847 3.3695 3.4139 2.2194 1.9003 3.3036 2.2194 3.6579 2.7151 2.9847 1.9003 2.7151 3.0315
Bruno Luong
Bruno Luong 2023-11-22
A variation, avoid filter with subindexing, since the elements of the last row and column are 0
x=rand(10000,10);
k = 4000;
m = 4;
N = arrayfun(@(varargin)randi(size(x,1), 2+randi(m-2), 1), 1:k, 'unif', 0);
tic
[p,q] = size(x);
x(end+1,:) = 0; % NOTE: x is modified here, not difficult to fix it if not wanted
l = cellfun('prodofsize', N);
m = max(l);
NA = cellfun(@(n) paddata(n,m,'FillValue',p+1), N, 'unif', 0);
NA = [NA{:}];
y = x(NA,:);
y = reshape(y,[m k q]);
y = permute(y, [1 3 2]);
Y = pagemtimes(y,'none',y,'ctranspose'); % each page of Y contain X(N1,N1)
I = repmat(NA,m,1);
J = repelem(NA,m,1);
[IJ,loc] = unique([I(:) J(:)],'rows');
IJ(IJ>p) = 1;
XX = sparse(IJ(:,1),IJ(:,2),Y(loc), p, p);
toc
Elapsed time is 0.108971 seconds.

请先登录,再进行评论。

更多回答(0 个)

类别

Help CenterFile Exchange 中查找有关 Assessments, Criteria, and Verification 的更多信息

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by