Hi John Doe,
As I understand it, you have a 1000x1000x3 matrix A and an Mx3 matrix B of 3D vectors. For each vector in A, you want the index of the vector in B with the highest cosine similarity, and you would like to replace your slow nested-loop method with a faster vectorized solution.
You can reshape the original 1000x1000x3 matrix into a 2D matrix of size 1000000x3. Each 3D vector then becomes one row of the 2D matrix, so once the rows of both matrices are normalized to unit length, all cosine similarities reduce to a single matrix product. The example code below checks that the vectorized approach gives the same result as the nested-loop approach and reports the run time of both implementations:
N = 1000;
matrix3D = rand(N, N, 3); % An NxNx3 matrix for testing
vectorSet = rand(100, 3); % A 100x3 vector set
tic
% Vectorized approach
% Reshape the 3D matrix to 2D
matrix2D = reshape(matrix3D, [], 3);
% Normalize the vectors in the 2D matrix
normMatrix2D = sqrt(sum(matrix2D.^2, 2));
normalizedMatrix2D = matrix2D ./ normMatrix2D;
% Normalize the vectors in the vector set
normVectorSet = sqrt(sum(vectorSet.^2, 2));
normalizedVectorSet = vectorSet ./ normVectorSet;
% Compute the cosine similarity
cosineSimilarity = normalizedMatrix2D * normalizedVectorSet';
% Find the index of the maximum cosine similarity for each vector
[~, maxIndexVectorized] = max(cosineSimilarity, [], 2);
% Reshape the result back to an NxN matrix
maxIndexMatrixVectorized = reshape(maxIndexVectorized, N, N);
vectime = toc;
tic
% Non-vectorized approach (using nested loops)
maxIndexMatrixLoop = zeros(N, N);
for i = 1:N
    for j = 1:N
        maxCosine = -Inf;
        maxIndex = 0;
        vector1 = squeeze(matrix3D(i, j, :))';
        for k = 1:size(vectorSet, 1)
            vector2 = vectorSet(k, :);
            cosine = dot(vector1, vector2) / (norm(vector1) * norm(vector2));
            if cosine > maxCosine
                maxCosine = cosine;
                maxIndex = k;
            end
        end
        maxIndexMatrixLoop(i, j) = maxIndex;
    end
end
looptime = toc;
disp("Time taken by vectorized approach = " + vectime + "s");
disp("Time taken by loop approach = " + looptime + "s");
% Verify if both methods give the same results
if isequal(maxIndexMatrixVectorized, maxIndexMatrixLoop)
fprintf('Both methods produce the same results.\n');
else
fprintf('There is a discrepancy between the methods.\n');
end
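One thing to keep in mind: the vectorized code above builds the full 1000000xM similarity matrix, which for M = 100 is roughly 800 MB in double precision. If M is large or memory is tight, a possible workaround is to process the rows in blocks. The snippet below is only a minimal sketch of that idea; blockSize and the "Chunked" variable names are my own illustrative choices, and N is reduced to 200 just to keep the demo quick.

```matlab
% Sketch: process the reshaped rows in blocks so that the full
% (N*N)-by-M similarity matrix is never held in memory at once.
% blockSize is a hypothetical tuning parameter, not part of the code above.
N = 200;                        % smaller than 1000 to keep the demo quick
matrix3D  = rand(N, N, 3);
vectorSet = rand(100, 3);
blockSize = 10000;              % rows handled per iteration (assumed value)

% Same reshaping and row normalization as in the vectorized approach
matrix2D = reshape(matrix3D, [], 3);
normalizedMatrix2D  = matrix2D  ./ sqrt(sum(matrix2D.^2, 2));
normalizedVectorSet = vectorSet ./ sqrt(sum(vectorSet.^2, 2));

numRows = size(normalizedMatrix2D, 1);
maxIndexChunked = zeros(numRows, 1);
for startRow = 1:blockSize:numRows
    rows = startRow:min(startRow + blockSize - 1, numRows);
    % Similarities for this block only (at most blockSize-by-M)
    blockSimilarity = normalizedMatrix2D(rows, :) * normalizedVectorSet';
    [~, maxIndexChunked(rows)] = max(blockSimilarity, [], 2);
end
maxIndexMatrixChunked = reshape(maxIndexChunked, N, N);

% Compare against the single-shot vectorized result
[~, maxIndexFull] = max(normalizedMatrix2D * normalizedVectorSet', [], 2);
disp(isequal(maxIndexMatrixChunked, reshape(maxIndexFull, N, N)));
```

The block size trades memory for loop overhead, so it is worth trying a few values on your machine. The two results should agree except possibly where two similarities are equal to within floating-point rounding.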
Hope this helps!
