How to calculate cosine similarity on a codistributed array?

I have to calculate the cosine similarity between the rows of an array. This works in serial execution with pdist, but pdist does not work with codistributed arrays on MDCS. In the parallel setup, 4 compute nodes are used and the (large) array is distributed row-wise over the 4 nodes. I wrote a naive function to calculate the cosine similarity, but it takes ages; even with a small array it takes too long.
This is the test I use currently: I generate a random array
r = floor(rand(100, codistributor('1d', 1)))
q = cosineSimilarityNaive(r)
The code of the function:
function [res] = cosineSimilarityNaive(data)
% get the dimensions
[n_row n_col] = size(data);
% calculate the norm for each row
%
norm_r = sqrt(sum(abs(data).^2,2));
%
for i = 1:n_row
%
for j = i:n_row
%
res(i,j) = dot(data(i,:), data(j,:)) / (norm_r(i) * norm_r(j));
res(j,i) = res(i,j);
end
end
Currently I have no idea how to make it run faster. Codistributed arrays on different nodes are necessary, since the array is so large that it does not fit on one compute node. I did some testing with svd on a distributed array over 4 nodes, and this works fine. I think I am missing something in my code, but I have no clue what. Any tips?

Accepted Answer

Jill Reese 2012-7-2
It would be much more efficient to lump all of the multiplications together. Also, when you use for loops with codistributed arrays you need to use the drange command to make sure that the workers only operate on the data that they own. I think rewriting your code a bit will speed things up:
spmd
% Create the data. Don't use floor because that will return all zeros.
r = rand(100,codistributor1d(1));
end
% Find the norm of each row
norm_r = sqrt(sum(abs(r).^2,2));
% get the dimensions
[n_row, n_col] = size(r);
% Scale each row by its norm first.
% Use drange so that each worker operates only on the data it owns.
spmd
for i=drange(1:n_row)
r(i,:) = r(i,:)/norm_r(i);
end
end
% Transpose the data so we can use matrix multiplication to
% perform the dot products all at once. A transpose is cheap and
% incurs no communication. Of course this is only useful if you have
% enough memory to store another copy of the local part on each worker.
tr = transpose(r);
res = r*tr;
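
To check on a small example that the "normalize the rows, then multiply" approach gives the same numbers as the naive double loop from the question, a serial sketch along these lines may help. It uses only base MATLAB functions and no codistributed arrays, so it says nothing about parallel performance. The test matrix and the names r_unit, res_fast, res_naive and max_diff are made up for illustration.

```matlab
% Small hypothetical test matrix; every row must have a nonzero norm.
r = [1 2 3; 4 5 6; 7 8 10; 2 0 1];

% Vectorized version: scale each row to unit length, then one matrix product.
norm_r = sqrt(sum(abs(r).^2, 2));
r_unit = bsxfun(@rdivide, r, norm_r);
res_fast = r_unit * r_unit';

% Naive reference: explicit double loop over all row pairs.
n_row = size(r, 1);
res_naive = zeros(n_row);   % preallocate instead of growing inside the loop
for i = 1:n_row
    for j = i:n_row
        res_naive(i,j) = dot(r(i,:), r(j,:)) / (norm_r(i) * norm_r(j));
        res_naive(j,i) = res_naive(i,j);
    end
end

% Largest elementwise difference between the two results.
max_diff = max(abs(res_fast(:) - res_naive(:)));
```

Here max_diff should be at round-off level and the diagonal of res_fast should be 1. A row of all zeros (which is what floor(rand(...)) produces) has norm 0, so the division yields NaN for that row in both versions.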
